Kubernetes Horizontal Pod Autoscaler
A Horizontal Pod Autoscaler (HPA) works much like a Vertical Pod Autoscaler (VPA), except that it changes the number of pods rather than the resources given to each pod. It continuously monitors specified metrics, such as CPU utilization or custom metrics, for the pods it is scaling. You define a target value for the chosen metric, for example a target CPU utilization percentage. Based on the observed metrics and the target value, the HPA decides whether to increase or decrease the number of pod replicas. The amount of resources allocated to each pod remains the same: when load rises, the number of pods increases to absorb it, and when load falls, the number decreases again. If a Service is associated with the pods, it automatically load balances across the replicas without any intervention on your part.
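The scaling decision described above follows the formula given in the Kubernetes documentation, desiredReplicas = ceil(currentReplicas * currentMetricValue / targetMetricValue). Below is a minimal Python sketch of it, assuming the default tolerance of 0.1. The function name and its parameters are illustrative and not part of any Kubernetes API.

```python
import math


def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Return the replica count an HPA-style controller would aim for."""
    ratio = current_value / target_value
    # Close enough to the target: keep the current replica count.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    # Never go below minReplicas or above maxReplicas.
    return max(min_replicas, min(max_replicas, wanted))


# 2 pods at 90% CPU with a 30% target: 6 wanted, capped at max 5
print(desired_replicas(2, 90, 30, max_replicas=5))  # 5
# 4 pods at 10% CPU with a 30% target: scale down
print(desired_replicas(4, 10, 30, max_replicas=5))  # 2
# 3 pods at 31% CPU: within tolerance, nothing changes
print(desired_replicas(3, 31, 30, max_replicas=5))  # 3
```

Rounding up with ceil means the controller errs on the side of one extra pod rather than running above the target.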
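- horizontal auto-scaling adjusts the number of replicas managed by a controller (Deployment, ReplicaSet, StatefulSet)
- it is performed by a controller named autoscaler, driven by a resource of type HorizontalPodAutoscaler
- the controller regularly checks (every 15s by default, set with the kube-controller-manager flag --horizontal-pod-autoscaler-sync-period; older releases used 30s) Kubernetes pod metrics:
  - cpu/memory resources
  - custom metrics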
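The periodic check can be pictured as a loop that re-evaluates the replica count once per sync period. The Python sketch below replays such a loop under simplifying assumptions: the total load stays constant, it is expressed as a percentage of a single pod's CPU request, and the scale-down stabilization window of the real controller is ignored. The `simulate` function and its parameters are hypothetical.

```python
import math


def simulate(total_load, target, replicas, min_replicas, max_replicas, ticks):
    """Replay an HPA-style loop and return the replica count after each tick."""
    history = [replicas]
    for _ in range(ticks):                   # one iteration per sync period
        utilization = total_load / replicas  # average utilization per pod
        ratio = utilization / target
        if abs(ratio - 1.0) > 0.1:           # outside the default tolerance
            replicas = math.ceil(replicas * ratio)
        replicas = max(min_replicas, min(max_replicas, replicas))
        history.append(replicas)
    return history


# Load worth 120% of one pod's request, 30% target: settles at 4 pods
print(simulate(120, 30, replicas=1, min_replicas=1, max_replicas=5, ticks=3))
# [1, 4, 4, 4]

# Load too high for the allowed range: pinned at maxReplicas
print(simulate(300, 30, replicas=1, min_replicas=1, max_replicas=5, ticks=3))
# [1, 5, 5, 5]
```

Once the average utilization sits within the tolerance band around the target, further ticks leave the replica count unchanged.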
$ # configure an existing deployment
$ kubectl autoscale deployment foobar --cpu-percent 30 --min 1 --max 5
horizontalpodautoscaler.autoscaling/foobar autoscaled
$ kubectl get hpa
NAME     REFERENCE           TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
foobar   Deployment/foobar   <unknown>/30%   1         5         0          4s
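TARGETS reads <unknown>/30% here because, 4s after creation, the autoscaler has no CPU figure yet. The same happens when the pods declare no CPU request, since utilization is expressed as a percentage of that request. The following Python sketch shows one way the column could be derived. It is a simplification: the function names and sample figures are hypothetical, and the real controller treats missing or not-yet-ready pods in more detail.

```python
def cpu_utilization_percent(pods):
    """pods: list of (usage_millicores, request_millicores), one tuple per pod
    that already has a metrics sample. request is None when no CPU request is set."""
    if not pods:
        return None                      # no samples collected yet
    if any(not request for _, request in pods):
        return None                      # utilization is undefined without a request
    total_usage = sum(usage for usage, _ in pods)
    total_request = sum(request for _, request in pods)
    return round(100 * total_usage / total_request)


def targets_column(pods, target_percent):
    """Format the value the way the TARGETS column above shows it."""
    current = cpu_utilization_percent(pods)
    shown = "<unknown>" if current is None else f"{current}%"
    return f"{shown}/{target_percent}%"


print(targets_column([], 30))                      # <unknown>/30%
print(targets_column([(45, 100), (15, 100)], 30))  # 30%/30%
print(targets_column([(50, None)], 30))            # <unknown>/30%
```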
---
# v1 supports only CPU
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: autosc
spec:
  maxReplicas: 5
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: autosc
  targetCPUUtilizationPercentage: 30
---
# v2 (alpha since k8s 1.6, stable as autoscaling/v2 since 1.23) supports cpu + memory + custom
# autoscaling/v2beta1 is no longer served since k8s 1.25, so this uses the autoscaling/v2 schema
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
  namespace: default
spec:
  scaleTargetRef: { apiVersion: apps/v1, kind: Deployment, name: php-apache }
  # minReplicas/maxReplicas belong to spec, not to scaleTargetRef
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target: { type: Utilization, averageUtilization: 50 }
  - type: Pods
    pods:
      metric: { name: packets-per-second }
      target: { type: AverageValue, averageValue: 1k }
  - type: Object
    object:
      metric: { name: requests-per-second }
      describedObject: { apiVersion: networking.k8s.io/v1, kind: Ingress, name: main-route }
      target: { type: Value, value: 10k }
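When several metrics are listed, as in the manifest above, the autoscaler computes a proposed replica count for each metric and keeps the largest. The Python sketch below illustrates this, leaving out the tolerance check and the handling of metrics that cannot be fetched. The observed values are made up for the example and the function names are hypothetical.

```python
import math


def proposal(current_replicas, observed, target):
    """Replica count suggested by a single metric."""
    return math.ceil(current_replicas * observed / target)


def combine(current_replicas, metrics, min_replicas, max_replicas):
    """metrics maps a metric name to an (observed value, target value) pair."""
    proposals = {name: proposal(current_replicas, observed, target)
                 for name, (observed, target) in metrics.items()}
    wanted = max(proposals.values())     # the most demanding metric wins
    return max(min_replicas, min(max_replicas, wanted)), proposals


# Made-up observations for 3 running replicas
observed = {
    "cpu": (80, 50),                         # percent of request
    "packets-per-second": (900, 1000),       # average per pod
    "requests-per-second": (25000, 10000),   # value reported by the Ingress
}
replicas, per_metric = combine(3, observed, min_replicas=1, max_replicas=10)
print(per_metric)  # {'cpu': 5, 'packets-per-second': 3, 'requests-per-second': 8}
print(replicas)    # 8
```

Taking the maximum ensures that no single metric is left above its target, at the cost of over-provisioning for the others.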