dx-packages team mailing list archive
Message #43839
[Bug 2002659] [NEW] Ubuntu AMI (ami-06114b38b9273f7c2) failed to join cluster in UAE region due to 403 on pause container
Public bug reported:
A customer is working on a POC to test EKS in the me-central-1 region and
reported that EC2 instances based on the Ubuntu EKS Optimized AMI fail
to join the cluster when using managed node groups.
I've been able to reproduce and identify the issue with the following
steps:
1. Created a new EKS cluster in me-central-1 with the following cluster
configuration:
---
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: uae-poc-test
  region: me-central-1

managedNodeGroups:
  - name: custom-ng-2
    minSize: 1
    maxSize: 4
    amiFamily: Ubuntu2004
2. The CloudFormation stack rolls back because the Ubuntu node is unable to
join the cluster. It used this AMI: ami-06114b38b9273f7c2.
3. Looking at the cloud-init logs, we can see the following 403 error on the
pause container:
Cloud-init v. 22.4.2-0ubuntu0~20.04.2 running 'modules:config' at Wed, 11 Jan 2023 14:43:31 +0000. Up 40.30 seconds.
eksctl: running /etc/eks/bootstrap
Aliasing EKS k8s snap commands
Added:
- kubelet-eks.kubelet as kubelet
Added:
- kubectl-eks.kubectl as kubectl
Stopping k8s daemons until configured
Stopped.
Cluster "kubernetes" set.
Container runtime is containerd
Attempt 5 of 5
ctr: failed to resolve reference "602401143452.dkr.ecr.me-central-1.amazonaws.com/eks/pause:3.5": pulling from host 602401143452.dkr.ecr.me-central-1.amazonaws.com failed with status code [manifests 3.5]:403 Forbidden
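To make the failing reference easier to read, here is a minimal sketch that splits it into its parts. The function name parse_ecr_ref is hypothetical and is not part of /etc/eks/bootstrap.sh; it only assumes the usual <account>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag> layout seen in the log line above.

```shell
#!/bin/bash
# Hypothetical helper, for illustration only: split an ECR image reference
# of the form <account>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>.
parse_ecr_ref() {
  local ref="$1"
  local host="${ref%%/*}"            # registry host: everything before the first "/"
  local path="${ref#*/}"             # "<repository>:<tag>"
  local account="${host%%.*}"        # first dot-separated label of the host
  local region="${host#*.dkr.ecr.}"  # drop "<account>.dkr.ecr."
  region="${region%%.*}"             # keep only the region label
  echo "account=${account} region=${region} repository=${path%%:*} tag=${path##*:}"
}

# The reference from the cloud-init log above.
parse_ecr_ref "602401143452.dkr.ecr.me-central-1.amazonaws.com/eks/pause:3.5"
```

Run against the logged reference, this shows the pull going to account 602401143452 in me-central-1, which is the combination that returns 403.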
Based on the Amazon container image registries documentation
(https://docs.aws.amazon.com/eks/latest/userguide/add-ons-images.html),
it looks like the bootstrap is pulling from the ECR registry account of the
wrong AWS region. When I specify the AMI used by the managed node group and
override --pause-container-account in the bootstrap command, as in the
configuration below, the node registers as expected.
---
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: uae-poc-test
  region: me-central-1

managedNodeGroups:
  - name: custom-ng-3
    ami: ami-06114b38b9273f7c2
    minSize: 1
    maxSize: 4
    overrideBootstrapCommand: |
      #!/bin/bash
      /etc/eks/bootstrap.sh <cluster> --pause-container-account 759879836304
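The workaround suggests the pause container account needs to be selected per region. A rough sketch of what such a lookup could look like follows. This is not the actual bootstrap script logic, and the function names are hypothetical. Only the me-central-1 account (759879836304) is confirmed by this report; the fallback is simply the account seen in the failing pull, and other regions may need their own entries per the AWS documentation linked above.

```shell
#!/bin/bash
# Sketch only, not the real /etc/eks/bootstrap.sh logic.
# Assumption: the registry account is chosen from the node's region.
pause_container_account() {
  local region="$1"
  case "${region}" in
    me-central-1) echo "759879836304" ;;  # value that worked in this report
    *)            echo "602401143452" ;;  # account seen in the failing pull
  esac
}

# Build the full pause image reference for a region (the tag defaults to
# 3.5, the version shown in the cloud-init log).
pause_container_image() {
  local region="$1"
  local tag="${2:-3.5}"
  echo "$(pause_container_account "${region}").dkr.ecr.${region}.amazonaws.com/eks/pause:${tag}"
}

pause_container_image "me-central-1"
```

With a mapping like this in place, the node would resolve the same image reference that the --pause-container-account override produces, without needing a custom bootstrap command.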
** Affects: compiz-plugins-main (Ubuntu)
Importance: Undecided
Status: New
--
You received this bug notification because you are a member of DX
Packages, which is subscribed to compiz-plugins-main in Ubuntu.
Matching subscriptions: dx-packages
https://bugs.launchpad.net/bugs/2002659
Title:
Ubuntu AMI (ami-06114b38b9273f7c2) failed to join cluster in UAE
region due to 403 on pause container
Status in compiz-plugins-main package in Ubuntu:
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/compiz-plugins-main/+bug/2002659/+subscriptions