Configure HPA
Currently there are no resources in our cluster that enable horizontal pod autoscaling, which you can check with the following command:
~$kubectl get hpa -A
No resources found
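The HPA controller also needs a source of Pod resource metrics, which is typically provided by the Kubernetes Metrics Server. As an optional check, and assuming Metrics Server is already running in this cluster, kubectl top should report CPU and memory usage for the Pods in the ui namespace:
~$kubectl top pods -n ui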
In this case we're going to use the ui service and scale it based on CPU usage. The first thing we'll do is update the ui Pod specification to specify CPU request and limit values. This is important because the HPA calculates CPU utilization as a percentage of what each Pod requests, so it cannot scale a workload whose containers have no CPU request.
Kustomize Patch
~/environment/eks-workshop/modules/autoscaling/workloads/hpa/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ui
spec:
  template:
    spec:
      containers:
        - name: ui
          resources:
            limits:
              cpu: 250m
Deployment/ui
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/created-by: eks-workshop
    app.kubernetes.io/type: app
  name: ui
  namespace: ui
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/component: service
      app.kubernetes.io/instance: ui
      app.kubernetes.io/name: ui
  template:
    metadata:
      annotations:
        prometheus.io/path: /actuator/prometheus
        prometheus.io/port: "8080"
        prometheus.io/scrape: "true"
      labels:
        app.kubernetes.io/component: service
        app.kubernetes.io/created-by: eks-workshop
        app.kubernetes.io/instance: ui
        app.kubernetes.io/name: ui
    spec:
      containers:
        - env:
            - name: JAVA_OPTS
              value: -XX:MaxRAMPercentage=75.0 -Djava.security.egd=file:/dev/urandom
          envFrom:
            - configMapRef:
                name: ui
          image: public.ecr.aws/aws-containers/retail-store-sample-ui:0.4.0
          imagePullPolicy: IfNotPresent
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
            initialDelaySeconds: 45
            periodSeconds: 20
          name: ui
          ports:
            - containerPort: 8080
              name: http
              protocol: TCP
          resources:
            limits:
              cpu: 250m
              memory: 1.5Gi
            requests:
              cpu: 250m
              memory: 1.5Gi
          securityContext:
            capabilities:
              add:
                - NET_BIND_SERVICE
              drop:
                - ALL
            readOnlyRootFilesystem: true
            runAsNonRoot: true
            runAsUser: 1000
          volumeMounts:
            - mountPath: /tmp
              name: tmp-volume
      securityContext:
        fsGroup: 1000
      serviceAccountName: ui
      volumes:
        - emptyDir:
            medium: Memory
          name: tmp-volume
Diff
              name: http
              protocol: TCP
          resources:
            limits:
+             cpu: 250m
              memory: 1.5Gi
            requests:
              cpu: 250m
              memory: 1.5Gi
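If you'd like to inspect the fully rendered manifest this patch produces before applying anything, kubectl can build the kustomization locally; this is an optional check, and the ui Deployment in its output should match the Deployment/ui listing above:
~$kubectl kustomize ~/environment/eks-workshop/modules/autoscaling/workloads/hpa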
Next, we need to create a HorizontalPodAutoscaler resource which defines the parameters HPA will use to determine how to scale our workload.
~/environment/eks-workshop/modules/autoscaling/workloads/hpa/hpa.yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: ui
  namespace: ui
spec:
  maxReplicas: 4
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ui
  targetCPUUtilizationPercentage: 80
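This tells the HPA controller to keep between 1 and 4 replicas of the ui Deployment, targeting an average CPU utilization of 80% measured against each Pod's CPU request. As a rough illustration of how the controller sizes the Deployment (example numbers only), the desired replica count follows the documented formula:
desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization)
                = ceil(1 * 96 / 80)    # one replica using 240m of its 250m request
                = 2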
Let's apply this configuration:
~$kubectl apply -k ~/environment/eks-workshop/modules/autoscaling/workloads/hpa
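Once the resources are applied you can confirm the HorizontalPodAutoscaler exists; note that the TARGETS column may show <unknown> for a short period until metrics for the ui Pods are available:
~$kubectl get hpa ui -n ui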