Kubernetes Horizontal Pod Autoscaler

A Horizontal Pod Autoscaler (HPA) runs a control loop much like a Vertical Pod Autoscaler (VPA), but it changes the number of pods rather than the size of each pod. It continuously monitors specified metrics, such as CPU utilization or custom metrics, for the pods it is scaling. You define a target value for the chosen metric, for example a target CPU utilization percentage. Based on the observed metrics and the target value, the HPA decides whether to increase or decrease the number of pod replicas, within the configured minimum and maximum. The resources allocated to each pod stay the same; when load rises, the number of pods grows to absorb it. If a Service selects those pods, it automatically load-balances across the new replicas without any intervention on your side.
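The scaling decision can be sketched with the replica formula given in the Kubernetes documentation, desiredReplicas = ceil(currentReplicas * currentMetricValue / targetMetricValue). The Python below is only an illustration, not the controller's code: the function name `desired_replicas` and its parameters are hypothetical, and the 10% tolerance is assumed to be the controller's default setting.

```python
import math

def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas=1, max_replicas=5, tolerance=0.1):
    """Hypothetical helper: approximate the HPA replica calculation."""
    ratio = current_value / target_value
    # Inside the tolerance band the replica count is left unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    # The result is always clamped to the configured bounds.
    return max(min_replicas, min(max_replicas, wanted))

print(desired_replicas(2, 60, 30))  # 4: CPU at twice the target doubles the pods
print(desired_replicas(2, 31, 30))  # 2: within tolerance, no change
print(desired_replicas(4, 10, 30))  # 2: load dropped, scale down
```

Rounding up means the autoscaler errs toward slightly too many pods rather than too few.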

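  • horizontal autoscaling changes the number of replicas managed by a controller (e.g. a Deployment)
  • it is performed by the horizontal pod autoscaler controller, driven by a resource of type HorizontalPodAutoscaler
  • the controller regularly checks Kubernetes pod metrics (every 15s by default in current releases, tunable with --horizontal-pod-autoscaler-sync-period; older releases used 30s)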
    • cpu/memory resources
    • custom metrics
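
For CPU and memory, a utilization target is expressed as a percentage of the pods' resource requests. A rough sketch of that calculation follows, assuming usage and requests are given in millicores. The name `average_utilization` is hypothetical, and the real controller also deals with missing metrics and pods that are not ready yet, which this sketch ignores. It divides total usage by total requests, which equals the per-pod mean when all replicas share the same request, as they do under a single Deployment template.

```python
def average_utilization(usage_millicores, request_millicores):
    """Hypothetical helper: CPU utilization (%) across pods, relative to requests."""
    if len(usage_millicores) != len(request_millicores):
        raise ValueError("one usage and one request value per pod is required")
    total_request = sum(request_millicores)
    if total_request == 0:
        # Without CPU requests a utilization percentage is undefined.
        raise ValueError("pods must declare a CPU request")
    return 100 * sum(usage_millicores) // total_request

# Three pods requesting 200m each, using 50m, 70m and 90m.
print(average_utilization([50, 70, 90], [200, 200, 200]))  # 35
```

A TARGETS value of `<unknown>` in `kubectl get hpa` typically means that no metrics have been collected yet or that the pods declare no CPU request.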
$ # configure an existing deployment
$ kubectl autoscale deployment foobar --cpu-percent 30 --min 1 --max 5
horizontalpodautoscaler.autoscaling/foobar autoscaled
 
$ kubectl get hpa
NAME     REFERENCE           TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
foobar   Deployment/foobar   <unknown>/30%   1         5         0          4s
---
# v1 supports only CPU
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: autosc
spec:
  maxReplicas: 5
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: autosc
  targetCPUUtilizationPercentage: 30
---
# v2 (stable as autoscaling/v2 since k8s 1.23) supports cpu + memory + custom
# the older autoscaling/v2beta1 schema is no longer served since k8s 1.25
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
  namespace: default
spec:
  scaleTargetRef: { apiVersion: apps/v1, kind: Deployment, name: php-apache }
  # minReplicas/maxReplicas belong to spec, not to scaleTargetRef
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource: { name: cpu, target: { type: Utilization, averageUtilization: 50 } }
    - type: Pods
      pods:
        metric: { name: packets-per-second }
        target: { type: AverageValue, averageValue: 1k }
    - type: Object
      object:
        metric: { name: requests-per-second }
        describedObject: { apiVersion: networking.k8s.io/v1, kind: Ingress, name: main-route }
        target: { type: Value, value: 10k }
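
When a manifest lists several metrics, as the one above does, the controller computes a proposed replica count for each metric and uses the largest. The sketch below illustrates that rule under simplifying assumptions: `replicas_for_metrics` is a hypothetical name, every metric is treated as a per-pod average, and the tolerance band and the Object metric's special handling are left out.

```python
import math

def replicas_for_metrics(current_replicas, metrics,
                         min_replicas=1, max_replicas=10):
    """Hypothetical helper: metrics is a list of (current_value, target_value) pairs."""
    # One proposal per metric, using the same ratio formula each time.
    proposals = [math.ceil(current_replicas * current / target)
                 for current, target in metrics]
    # The largest proposal wins, then the bounds are applied.
    wanted = max(proposals)
    return max(min_replicas, min(max_replicas, wanted))

# Illustrative numbers for 3 replicas:
# CPU at 80% against a 50% target, 900 packets/s per pod against a 1k target.
print(replicas_for_metrics(3, [(80, 50), (900, 1000)]))  # 5
```

Taking the maximum ensures that the most constrained metric drives the scale-up, even if the other metrics are below target.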
