HPA with CPU/memory and custom latency metrics

Horizontal Pod Autoscaling on Kubernetes should start with CPU and memory, then advance to custom latency metrics so that a Rails API scales predictably under spiky workloads. With the autoscaling/v2 API, define multiple metrics so the HPA considers CPU, memory, and SLO‑aligned latency or RPS together, avoiding the blind spots that hurt tail latency at scale. Expose Rails API lat..
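
To make the multi-metric behaviour concrete, here is a minimal Python sketch of how an HPA with several metrics arrives at a replica count. It follows the documented rule: for each metric, desired replicas = ceil(current replicas × current value / target value), and the largest proposal wins. The function names, metric names, and sample numbers are hypothetical. The 0.1 tolerance is the Kubernetes default. Real HPA behaviour also accounts for pod readiness, missing metrics, and stabilization windows, which this sketch leaves out.

```python
import math


def desired_replicas(current_replicas, current_value, target_value, tolerance=0.1):
    """Replica count proposed by a single metric (simplified HPA rule)."""
    ratio = current_value / target_value
    # Within the tolerance band the HPA leaves the replica count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


def hpa_recommendation(current_replicas, metrics, min_replicas, max_replicas):
    """Take the largest per-metric proposal, clamped to [min_replicas, max_replicas].

    `metrics` maps a metric name to a (current_value, target_value) pair.
    """
    proposals = [
        desired_replicas(current_replicas, current, target)
        for current, target in metrics.values()
    ]
    return max(min_replicas, min(max_replicas, max(proposals)))


if __name__ == "__main__":
    # Hypothetical snapshot: CPU and memory look healthy, but p95 latency
    # is 50% over its target, so latency drives the scale-up.
    snapshot = {
        "cpu_utilization_pct": (50, 70),
        "memory_utilization_pct": (60, 75),
        "p95_latency_ms": (450, 300),
    }
    print(hpa_recommendation(4, snapshot, min_replicas=2, max_replicas=10))
```

In this snapshot CPU alone would propose scaling down, while the latency metric proposes scaling up. Because the largest proposal wins, adding a latency metric covers the case where resource utilization looks fine but requests are already slow.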