[ EKS ] ClusterAutoscaler (CA)

김붕어87 2023. 4. 6. 11:04

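Overview
What is Cluster Autoscaler?
Cluster Autoscaler (CA) adjusts the number of worker nodes in a cluster.

The HPA watches pod resource usage and increases the pod count when usage exceeds a set threshold.
As the HPA keeps adding pods, the worker nodes eventually run out of resources, so new pods cannot be scheduled and stay in the Pending state.
CA detects pods that are Pending because of insufficient resources and adds worker nodes.
Note: Karpenter is an open-source project that serves the same purpose and is worth a look as an alternative.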

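[ Prerequisites (Required) ]

AWS infrastructure setup
- Install the EKS cluster
- Create the IAM OIDC provider
- Add tags to the worker node ASG (required)
  - Tag key : k8s.io/cluster-autoscaler/enabled
  - Tag value : true
  - Tag key : k8s.io/cluster-autoscaler/<cluster name> (the IAM policy condition and the auto-discovery flag used below both reference this tag)
  - Tag value : owned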
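
For a self-managed ASG, the tags can also be added from the CLI instead of the console. The following is a minimal sketch, assuming a hypothetical ASG name my-asg and the cluster name prod-xxx-eks used later in this post. The helper names asg_tag_args and tag_asg are made up for illustration; the second tag matches the IAM policy condition and the auto-discovery flag that appear further down.

```shell
#!/bin/sh
# Hypothetical values: replace with your own ASG and cluster names.
ASG_NAME="my-asg"
CLUSTER_NAME="prod-xxx-eks"

# Build one --tags argument in the shorthand syntax that
# `aws autoscaling create-or-update-tags` accepts.
asg_tag_args() {
    # $1 = tag key, $2 = tag value
    printf 'ResourceId=%s,ResourceType=auto-scaling-group,Key=%s,Value=%s,PropagateAtLaunch=true' \
        "$ASG_NAME" "$1" "$2"
}

# Apply both tags to the ASG (needs AWS credentials; call it manually).
tag_asg() {
    aws autoscaling create-or-update-tags --tags \
        "$(asg_tag_args k8s.io/cluster-autoscaler/enabled true)" \
        "$(asg_tag_args "k8s.io/cluster-autoscaler/$CLUSTER_NAME" owned)"
}

# Print one argument so it can be reviewed before running tag_asg.
asg_tag_args k8s.io/cluster-autoscaler/enabled true; echo
```

Nothing is changed in AWS until tag_asg is called explicitly.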

EKS application setup
- Install metrics-server
  - Installation guide : https://dongwook35.tistory.com/54
- Set up HPA
  - Installation guide : https://dongwook35.tistory.com/66

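[ Installing Cluster Autoscaler ]

Cluster Autoscaler related links

1. Create the policy

1-1. Write the policy JSON

Create a policy that grants the permissions Cluster Autoscaler needs through its IAM role.
Replace my-cluster with the name of your EKS cluster, for example prod-xxx-eks.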

vi cluster-autoscaler-policy.json

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "autoscaling:SetDesiredCapacity",
                "autoscaling:TerminateInstanceInAutoScalingGroup"
            ],
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    "aws:ResourceTag/k8s.io/cluster-autoscaler/my-cluster": "owned"
                }
            }
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": [
                "autoscaling:DescribeAutoScalingInstances",
                "autoscaling:DescribeAutoScalingGroups",
                "ec2:DescribeLaunchTemplateVersions",
                "ec2:DescribeInstanceTypes",
                "autoscaling:DescribeTags",
                "autoscaling:DescribeLaunchConfigurations"
            ],
            "Resource": "*"
        }
    ]
}

1-2. Create the policy in IAM

Create the IAM policy from the JSON file written above.

aws iam create-policy \
    --policy-name AmazonEKSClusterAutoscalerPolicy \
    --policy-document file://cluster-autoscaler-policy.json

2. Create the role

2-1. Create the role

1. Open the IAM console at https://console.aws.amazon.com/iam/.
2. In the left navigation pane, choose Roles, then choose Create role.

3. In the Trusted entity type section, choose Web identity.

In the Web identity section:

For Identity provider, choose the OpenID Connect provider URL for your cluster (as shown on the cluster Overview tab in Amazon EKS).
For Audience, enter sts.amazonaws.com.
Choose Next.

4. In the Filter policies box, enter AmazonEKSClusterAutoscalerPolicy, then select the check box to the left of the policy name returned by the search.
Choose Next.

5. For Role name, enter a unique name for the role, such as AmazonEKSClusterAutoscalerRole.
For Description, enter descriptive text such as Amazon EKS - Cluster autoscaler role.
Choose Create role.

6. After the role is created, choose it in the console to open it for editing.
Choose the Trust relationships tab, then choose Edit trust policy.
Add the following entry to the StringEquals condition so that only the cluster-autoscaler service account in kube-system can assume the role:
"oidc.eks.ap-southeast-1.amazonaws.com/id/xxxx:sub": "system:serviceaccount:kube-system:cluster-autoscaler"

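For reference, the edited trust policy usually ends up looking like the sketch below. ACCOUNT_ID and the xxxx OIDC ID are placeholders carried over from this post, and the file name trust-policy.json is an assumption made for illustration.

```shell
#!/bin/sh
# Write an example trust policy to a local file for review.
# ACCOUNT_ID and xxxx are placeholders; trust-policy.json is a hypothetical file name.
cat > trust-policy.json <<'EOF'
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Federated": "arn:aws:iam::ACCOUNT_ID:oidc-provider/oidc.eks.ap-southeast-1.amazonaws.com/id/xxxx"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    "oidc.eks.ap-southeast-1.amazonaws.com/id/xxxx:aud": "sts.amazonaws.com",
                    "oidc.eks.ap-southeast-1.amazonaws.com/id/xxxx:sub": "system:serviceaccount:kube-system:cluster-autoscaler"
                }
            }
        }
    ]
}
EOF

# Optional: apply it from the CLI instead of the console (needs AWS credentials).
# aws iam update-assume-role-policy \
#     --role-name AmazonEKSClusterAutoscalerRole \
#     --policy-document file://trust-policy.json
```

The sub condition is what restricts the role to the cluster-autoscaler service account; the aud condition is the audience entered in the console wizard.
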
3. Deploy Cluster Autoscaler

3-1. Download the Cluster Autoscaler YAML

curl -o cluster-autoscaler-autodiscover.yaml https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml

3-2. Edit the Cluster Autoscaler YAML

In arn:aws:iam::<YOUR Account ID>:role/AmazonEKSClusterAutoscalerRole, replace <YOUR Account ID> with your AWS account ID.
Replace <YOUR CLUSTER NAME> with your cluster name.

vi cluster-autoscaler-autodiscover.yaml
metadata:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::xxx:role/AmazonEKSClusterAutoscalerRole
    
- --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/prod-xxx-eks
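
The placeholder replacement can also be scripted rather than done by hand in vi. This is a small sketch; the function name set_cluster_name is hypothetical, and it assumes the cluster name contains no | character.

```shell
#!/bin/sh
# Replace the <YOUR CLUSTER NAME> placeholder in a manifest file.
# -i.bak edits in place and keeps a backup; this form works with GNU and BSD sed.
set_cluster_name() {
    # $1 = manifest file, $2 = cluster name
    sed -i.bak "s|<YOUR CLUSTER NAME>|$2|g" "$1"
}

# Usage (after downloading the manifest in step 3-1):
# set_cluster_name cluster-autoscaler-autodiscover.yaml prod-xxx-eks
```

The original file is kept next to the edited one with a .bak suffix, which makes it easy to diff the change before applying the manifest.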

3-3. Apply the Cluster Autoscaler manifest

kubectl apply -f cluster-autoscaler-autodiscover.yaml -n kube-system

serviceaccount/cluster-autoscaler created
clusterrole.rbac.authorization.k8s.io/cluster-autoscaler created
role.rbac.authorization.k8s.io/cluster-autoscaler created
clusterrolebinding.rbac.authorization.k8s.io/cluster-autoscaler created
rolebinding.rbac.authorization.k8s.io/cluster-autoscaler created
deployment.apps/cluster-autoscaler created

4. Adjust Cluster Autoscaler options

4-1. Annotate the service account

Attach the AWS IAM role to the ServiceAccount (SA).
Annotate the cluster-autoscaler service account with the ARN of the IAM role created earlier. Replace the example values with your own.

kubectl annotate serviceaccount cluster-autoscaler \
  -n kube-system \
  eks.amazonaws.com/role-arn=arn:aws:iam::ACCOUNT_ID:role/AmazonEKSClusterAutoscalerRole

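4-2. Annotate the deployment

safe-to-evict annotation : setting it to "false" on the Cluster Autoscaler pod keeps CA from evicting that pod, so CA does not remove the worker node it is running on.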

kubectl patch deployment cluster-autoscaler \
  -n kube-system \
  -p '{"spec":{"template":{"metadata":{"annotations":{"cluster-autoscaler.kubernetes.io/safe-to-evict": "false"}}}}}'

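4-3. Set options on the deployment

--balance-similar-node-groups
- Keeps similar node groups balanced in size so that enough compute is available across all Availability Zones
--skip-nodes-with-system-pods=false
- When true, CA never removes a node that runs kube-system pods
- DaemonSet and mirror pods are excluded from this check
- Default is true; setting it to false lets CA scale down such nodes as well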

kubectl edit deploy cluster-autoscaler -n kube-system
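
As a non-interactive alternative to kubectl edit, the flags can be appended with a JSON Patch. This is a sketch under the assumption that the flags live in the first container's command list, as in the downloaded example manifest; the helper names ca_flag_patch and ca_add_flags are hypothetical.

```shell
#!/bin/sh
# Build a JSON Patch that appends flags to the first container's command list.
# Assumes the flags live under .spec.template.spec.containers[0].command.
ca_flag_patch() {
    patch=""
    for flag in "$@"; do
        op="{\"op\":\"add\",\"path\":\"/spec/template/spec/containers/0/command/-\",\"value\":\"$flag\"}"
        if [ -z "$patch" ]; then patch="$op"; else patch="$patch,$op"; fi
    done
    printf '[%s]' "$patch"
}

# Apply the patch to the deployment (needs cluster access; call it manually).
ca_add_flags() {
    kubectl patch deployment cluster-autoscaler -n kube-system \
        --type=json -p "$(ca_flag_patch "$@")"
}

# Usage:
# ca_add_flags --balance-similar-node-groups --skip-nodes-with-system-pods=false
```

The patch only appends, so running it twice adds the same flag twice; check the current command list first if the deployment has already been edited.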

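4-4. Set the image on the deployment

Set the Cluster Autoscaler image version to "v1.22.2". Choose the release whose minor version matches the Kubernetes version of your cluster.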
https://github.com/kubernetes/autoscaler/releases?q=1.23&expanded=true

kubectl edit deploy cluster-autoscaler -n kube-system

image: k8s.gcr.io/autoscaling/cluster-autoscaler:v1.22.2
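
The image can also be changed without opening an editor. The sketch below assumes the container inside the deployment is named cluster-autoscaler, as in the example manifest; the variable and function names are made up for illustration.

```shell
#!/bin/sh
# Compose the image reference from a version tag.
CA_VERSION="v1.22.2"
CA_IMAGE="k8s.gcr.io/autoscaling/cluster-autoscaler:$CA_VERSION"

# Update the image of the container named cluster-autoscaler
# (needs cluster access; call it manually).
set_ca_image() {
    kubectl set image deployment cluster-autoscaler -n kube-system \
        "cluster-autoscaler=$CA_IMAGE"
}

# Print the image reference so it can be checked before calling set_ca_image.
echo "$CA_IMAGE"
```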

5. Verify the Cluster Autoscaler installation

Check the Cluster Autoscaler log

kubectl -n kube-system logs -f deployment.apps/cluster-autoscaler

Cluster Autoscaler test

Preparation
- Configure the EKS worker node ASG
  - (MAX 4 / MIN 2 / Desired 2)

aws autoscaling describe-auto-scaling-groups --query "AutoScalingGroups[? Tags[? (Key=='eks:cluster-name') && Value=='prod-xxx-eks']].[AutoScalingGroupName, MinSize,MaxSize,DesiredCapacity]" --output table

1. Deploy the pod

Write the web pod YAML

vi php-apache.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
spec:
  selector:
    matchLabels:
      run: php-apache
  replicas: 1
  template:
    metadata:
      labels:
        run: php-apache
    spec:
      containers:
      - name: php-apache
        image: registry.k8s.io/hpa-example
        ports:
        - containerPort: 80
        resources:
          limits:
            cpu: 500m
          requests:
            cpu: 200m
      nodeSelector:
        nodegroupname: test-ng
---
apiVersion: v1
kind: Service
metadata:
  name: php-apache
  labels:
    run: php-apache
spec:
  ports:
  - port: 80
  selector:
    run: php-apache
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
  namespace: dw
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50

Deploy the web pod

kubectl apply -f php-apache.yaml -n dw

Check the web pod and HPA

watch "kubectl get pod,hpa -n dw; echo; echo ; kubectl top pod -n dw; echo ;echo; aws autoscaling describe-auto-scaling-groups --query \"AutoScalingGroups[? Tags[? (Key=='eks:cluster-name') && Value=='prod-xxx-eks']].[AutoScalingGroupName, MinSize,MaxSize,DesiredCapacity]\" --output table"

2. Load test

Generate load with wget

kubectl run -i -n dw \
    --tty load-generator \
    --rm --image=busybox \
    --restart=Never \
    -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://php-apache; done"

3. Check for Pending pods

Check that the pods are Pending

kubectl get pod -n dw
pod/php-apache-5f8766dbcb-59jwx   0/1     Pending   0          9s
pod/php-apache-5f8766dbcb-5mzsv   0/1     Pending   0          9s

Check the pod events
- 0/8 nodes are available: 2 Insufficient cpu, 6 node(s) didn't match Pod's node affinity/selector.
  - None of the 8 nodes can take the pod: 2 nodes lack CPU, and the other 6 do not match the pod's nodeSelector.
- pod triggered scale-up: [{prod-xxx-eks-test-node-group-xxx 2->3 (max: 4)}]
  - Cluster Autoscaler detected the pod that could not be scheduled for lack of resources
  - The node group (test-ng) that hosts the web pod grows from 2 to 3 worker nodes

kubectl describe pod/php-apache-5f8766dbcb-59jwx -n dw 

  Type     Reason                  Age                    From                Message
  ----     ------                  ----                   ----                -------
  Normal   TriggeredScaleUp        3m11s                  cluster-autoscaler  pod triggered scale-up: [{prod-xxx-eks-test-node-group-xxx 2->3 (max: 4)}]
  Warning  FailedScheduling        2m12s (x2 over 3m21s)  default-scheduler   0/8 nodes are available: 2 Insufficient cpu, 6 node(s) didn't match Pod's node affinity/selector.

kubectl get ev -n dw |grep pod/php-apache-5f8766dbcb-59jwx

13m         Warning   FailedScheduling               pod/php-apache-5f8766dbcb-59jwx       0/8 nodes are available: 2 Insufficient cpu, 6 node(s) didn't match Pod's node affinity/selector.
14m         Normal    TriggeredScaleUp               pod/php-apache-5f8766dbcb-59jwx       pod triggered scale-up: [{prod-xxx-eks-test-node-group-xxx 2->3 (max: 4)}]

4. Worker node scale-up

View the ASG information

aws autoscaling describe-auto-scaling-groups --query "AutoScalingGroups[? Tags[? (Key=='eks:cluster-name') && Value=='prod-xxx-eks']].[AutoScalingGroupName, MinSize,MaxSize,DesiredCapacity]" --output table

----------------------------------------------------------------------------------------------
|                                  DescribeAutoScalingGroups                                 |
+-----------------------------------------------------------------------------+----+----+----+
|  eks-prod-xxx-eks-app-node-group-xxx                                        |  3 |  3 |  3 |
|  eks-prod-xxx-eks-infra-node-group-xxx                                      |  3 |  3 |  3 |
|  eks-prod-xxx-eks-test-node-group-xxx                                       |  2 |  4 |  3 |
+-----------------------------------------------------------------------------+----+----+----+

kubectl get no --show-labels |grep test-ng |awk '{print $1,$2,$3,$4}'

ip-xxx.ap-southeast-1.compute.internal Ready <none> 15m
ip-xxx.ap-southeast-1.compute.internal Ready <none> 3h31m
ip-xxx.ap-southeast-1.compute.internal Ready <none> 4h9m

Pod deployment complete
- Cluster Autoscaler added a worker node
- The pods that were Pending for lack of resources are now deployed

kubectl describe pod/php-apache-5f8766dbcb-59jwx -n dw

  Normal   SandboxChanged          42s (x12 over 72s)     kubelet             Pod sandbox changed, it will be killed and re-created.
  Normal   Pulling                 41s                    kubelet             Pulling image "registry.k8s.io/hpa-example"

kubectl get ev -n dw |grep pod/php-apache-5f8766dbcb-59jwx

11m         Normal    Pulling                        pod/php-apache-5f8766dbcb-59jwx       Pulling image "registry.k8s.io/hpa-example"

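Cluster Autoscaler summary

Worker node scale-up

Scale-up observations
- Cluster Autoscaler detects Pending pods almost immediately (it checks every 10 seconds)
- Cluster Autoscaler adds a worker node
- EC2 instance launches → joins EKS as a worker node → pod is recreated
  - This took about 4 minutes here; it can take longer depending on the EC2 instance type and the pod

Worker node scale-down

Scale-down observations
- The worker node is removed after 10 minutes

Node event information

Cluster Autoscaler deletes the worker node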

kubectl describe node ip-xxx.ap-southeast-1.compute.internal

  Type    Reason                   Age                From                Message
  ----    ------                   ----               ----                -------
  Normal  ScaleDown                2s                 cluster-autoscaler  node removed by cluster autoscaler

Cluster Autoscaler log information

An unused worker node is deleted after a 10-minute wait

kubectl logs cluster-autoscaler-7f467f7c8b-6nmjf -n kube-system

I1014 10:01:52.850847       1 scale_down.go:448] Node ip-10-223-69-47.ap-southeast-1.compute.internal - cpu utilization 0.164894
I1014 10:01:52.851457       1 cluster.go:345] Pod dw/php-apache-5f8766dbcb-g7r26 can be moved to ip-10-223-69-47.ap-southeast-1.compute.internal
I1014 10:01:52.851587       1 static_autoscaler.go:510] ip-10-223-69-47.ap-southeast-1.compute.internal is unneeded since 2022-10-14 09:51:47.768395971 +0000 UTC m=+24420.518339893 duration 10m4.880789629s
I1014 10:01:52.851799       1 scale_down.go:829] ip-10-223-69-47.ap-southeast-1.compute.internal was unneeded for 10m4.880789629s
I1014 10:01:52.851858       1 scale_down.go:1104] Scale-down: removing empty node ip-10-223-69-47.ap-southeast-1.compute.internal
I1014 10:01:52.852216       1 event_sink_logging_wrapper.go:48] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"kube-system", Name:"cluster-autoscaler-status", UID:"d1fef2a9-fce5-45de-83a7-a8a2c2cf133d", APIVersion:"v1", ResourceVersion:"28608753", FieldPath:""}): type: 'Normal' reason: 'ScaleDownEmpty' Scale-down: removing empty node ip-10-223-69-47.ap-southeast-1.compute.internal
I1014 10:01:52.878177       1 delete.go:103] Successfully added ToBeDeletedTaint on node ip-10-223-69-47.ap-southeast-1.compute.internal
I1014 10:01:53.226093       1 event_sink_logging_wrapper.go:48] Event(v1.ObjectReference{Kind:"Node", Namespace:"", Name:"ip-10-223-69-47.ap-southeast-1.compute.internal", UID:"67a630b6-eeae-400d-ae7e-3729132ce21d", APIVersion:"v1", ResourceVersion:"28607260", FieldPath:""}): type: 'Normal' reason: 'ScaleDown' node removed by cluster autoscaler
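
The 10-minute wait in this log matches the 10m default of CA's --scale-down-unneeded-time flag. To pull just the node name and the unneeded duration out of a noisy log, a small sed filter is enough. This is a sketch; the function name unneeded_nodes is hypothetical, and it relies on the "was unneeded for" wording seen in the log above, which may differ in other CA versions.

```shell
#!/bin/sh
# Print "<node> <duration>" for every "was unneeded for" line read from stdin.
unneeded_nodes() {
    sed -n 's/.* \([^ ]*\) was unneeded for \(.*\)$/\1 \2/p'
}

# Example with one line taken from the log above:
echo 'I1014 10:01:52.851799       1 scale_down.go:829] ip-10-223-69-47.ap-southeast-1.compute.internal was unneeded for 10m4.880789629s' | unneeded_nodes

# Against a live cluster:
# kubectl -n kube-system logs deployment.apps/cluster-autoscaler | unneeded_nodes
```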

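Options

--scale-down-delay-after-add
- How long after a scale-up before scale-down evaluation resumes (default 10m)

--scale-down-delay-after-delete
- How long after a node deletion before scale-down evaluation resumes (defaults to the scan interval)

--scale-down-delay-after-failure
- How long after a failed scale-down before scale-down evaluation resumes (default 3m)

Example : --scale-down-delay-after-add=5m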