Generate load

To observe the HPA scaling out in response to the policy we configured, we need to generate some load on our application. We'll do that by calling the home page of the workload with hey, an HTTP load generator.

The command below will run the load generator with:

10 workers running concurrently
Sending 5 queries per second each
Running for a maximum of 60 minutes

~$ kubectl run load-generator \
  --image=williamyeh/hey:latest \
  --restart=Never -- -c 10 -q 5 -z 60m http://ui.ui.svc/home
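As a rough sanity check on how much traffic these flags can produce, the arithmetic can be sketched in shell. This assumes the -q rate limit applies to each worker, as the list above describes; the variable names are illustrative only, and the result is an upper bound rather than a measured rate.

```shell
# Values taken from the hey flags above: -c 10 -q 5 -z 60m
workers=10
qps_per_worker=5
duration_minutes=60

# Aggregate rate across all workers, and the most requests that could be sent.
total_qps=$((workers * qps_per_worker))
max_requests=$((total_qps * duration_minutes * 60))

echo "aggregate rate: ${total_qps} requests/second"   # prints 50
echo "upper bound: ${max_requests} requests"          # prints 180000
```

The application may serve fewer requests than this if it responds slowly, since each worker waits for a response before sending its next request.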

Now that requests are hitting our application, we can watch the HPA resource to follow its progress:

~$ kubectl get hpa ui -n ui --watch
NAME   REFERENCE       TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
ui     Deployment/ui   69%/80%   1         4         1          117m
ui     Deployment/ui   99%/80%   1         4         1          117m
ui     Deployment/ui   89%/80%   1         4         2          117m
ui     Deployment/ui   89%/80%   1         4         2          117m
ui     Deployment/ui   84%/80%   1         4         3          118m
ui     Deployment/ui   84%/80%   1         4         3          118m
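The replica changes in this output are consistent with the scaling rule described in the Kubernetes HPA documentation: desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization), with no change while the ratio stays within a tolerance of the target. The sketch below reproduces that arithmetic in shell. It assumes the default tolerance of 10% and the MINPODS and MAXPODS values shown above; desired_replicas is a hypothetical helper, not part of kubectl, and the real controller works on unrounded metrics and applies further rules.

```shell
# Hypothetical helper: approximate the HPA replica calculation with integer math.
# Usage: desired_replicas CURRENT_REPLICAS UTILIZATION_PCT TARGET_PCT MIN MAX
desired_replicas() {
  dr_current="$1"; dr_util="$2"; dr_target="$3"; dr_min="$4"; dr_max="$5"

  # Absolute distance between observed and target utilization.
  dr_diff=$((dr_util - dr_target))
  if [ "$dr_diff" -lt 0 ]; then dr_diff=$((0 - dr_diff)); fi

  # Assumed default tolerance of 10%: inside that band, keep the current count.
  if [ $((dr_diff * 10)) -le "$dr_target" ]; then
    echo "$dr_current"
    return 0
  fi

  # ceil(current * utilization / target) using integer arithmetic.
  dr_want=$(( (dr_current * dr_util + dr_target - 1) / dr_target ))

  # Clamp to the configured minimum and maximum.
  if [ "$dr_want" -lt "$dr_min" ]; then dr_want="$dr_min"; fi
  if [ "$dr_want" -gt "$dr_max" ]; then dr_want="$dr_max"; fi
  echo "$dr_want"
}

desired_replicas 1 99 80 1 4   # prints 2
desired_replicas 2 89 80 1 4   # prints 3
desired_replicas 3 84 80 1 4   # prints 3 (84% is within 10% of the 80% target)
```

Under these assumptions, 99% on one replica calls for two, 89% on two replicas calls for three, and 84% on three replicas leaves the count unchanged, which matches the REPLICAS column above.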

Once you're satisfied with the autoscaling behavior, you can end the watch with Ctrl+C and stop the load generator like so:

~$ kubectl delete pod load-generator

As the load generator terminates, notice that the HPA slowly brings the replica count back down to the configured minimum.
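One way to follow the scale-down without keeping a watch open is to poll the HPA status. The function below is a hypothetical helper, a minimal sketch assuming the HPA is named ui in the ui namespace, as in the commands above, and that its status.currentReplicas field is populated. If the HPA uses the Kubernetes default scale-down stabilization window of 300 seconds, expect several minutes to pass before the count drops.

```shell
# Hypothetical helper: poll the HPA until it reports the wanted replica count.
# Usage: wait_for_replicas WANTED_REPLICAS [POLL_INTERVAL_SECONDS]
wait_for_replicas() {
  wr_want="$1"
  wr_interval="${2:-15}"   # assumed polling interval, adjust as needed

  while :; do
    # Read the current replica count from the HPA status.
    wr_current="$(kubectl get hpa ui -n ui -o jsonpath='{.status.currentReplicas}')"
    echo "current replicas: ${wr_current}"

    if [ "$wr_current" = "$wr_want" ]; then
      break
    fi
    sleep "$wr_interval"
  done
}
```

For example, wait_for_replicas 1 returns once the HPA reports a single replica, matching the MINPODS value shown earlier. The loop has no timeout, so stop it with Ctrl+C if the count never reaches the requested value.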