How to set up Load Balancing for Auto-scaled Instances

Last updated: February 9, 2026

Set up Load Balancing

By default, Porter uses a round-robin load balancing algorithm for distributing requests across multiple instances.

To modify the load balancing algorithm, add ingress annotations to your web service in the porter.yml file:

services:
  - name: ${SERVICE_NAME}
    type: web
    ingressAnnotations:
      nginx.ingress.kubernetes.io/load-balance: "ewma"  # or "round_robin"

Choose between two available algorithms:
- round_robin - Distributes requests sequentially across available instances
- ewma - Exponentially Weighted Moving Average. Routes requests based on a moving average of each instance's response time, favoring instances that have been responding faster
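
To make the difference between the two algorithms concrete, here is a minimal Python sketch of each selection strategy. It is an illustration only, not the code Porter or the NGINX ingress controller runs: the class and function names, the smoothing factor `alpha`, and the fixed per-instance latencies are all assumptions made for this example.

```python
from collections import Counter
from itertools import cycle


def round_robin(instances):
    """Return an iterator that rotates through instances in a fixed order."""
    return cycle(instances)


class EwmaBalancer:
    """Toy EWMA balancer (hypothetical): picks the lowest smoothed latency."""

    def __init__(self, instances, alpha=0.3):
        self.alpha = alpha  # weight given to the newest latency sample (assumed value)
        self.scores = {name: 0.0 for name in instances}

    def pick(self):
        # Choose the instance whose moving-average latency is currently lowest.
        return min(self.scores, key=self.scores.get)

    def record(self, name, latency_ms):
        # Standard EWMA update: blend the new sample with the previous average.
        prev = self.scores[name]
        self.scores[name] = self.alpha * latency_ms + (1 - self.alpha) * prev


def simulate(latencies, requests=10):
    """Send `requests` through both strategies and count hits per instance."""
    names = list(latencies)
    rr = round_robin(names)
    ewma = EwmaBalancer(names)
    rr_counts, ewma_counts = Counter(), Counter()
    for _ in range(requests):
        rr_counts[next(rr)] += 1
        target = ewma.pick()
        ewma.record(target, latencies[target])
        ewma_counts[target] += 1
    return rr_counts, ewma_counts


if __name__ == "__main__":
    # Instance "a" answers in 20 ms, instance "b" in 200 ms (made-up numbers).
    rr_counts, ewma_counts = simulate({"a": 20, "b": 200})
    print("round_robin:", dict(rr_counts))
    print("ewma:", dict(ewma_counts))
```

In this toy run, round_robin splits traffic evenly regardless of latency, while the EWMA strategy sends almost all requests to the faster instance. The sketch never re-probes a slow instance once its score is high, which a production balancer has to handle, so treat it as a way to build intuition rather than a model of real behavior.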

Usage

The load balancing configuration works automatically with your autoscaling settings. For example, consider a service with the following autoscaling settings:

- Minimum instances: 1
- Maximum instances: 5
- Scaling triggers: 70% CPU or memory utilization

When autoscaling creates new instances, the load balancer automatically includes them in the request distribution according to the configured algorithm.
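
The way a growing pool is folded into the rotation can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Porter's implementation: `RoundRobinPool`, its `add` method, and the instance names `web-1` through `web-3` are hypothetical and exist only for this example.

```python
class RoundRobinPool:
    """Toy round-robin pool (hypothetical) whose membership can grow at runtime."""

    def __init__(self, instances):
        self.instances = list(instances)
        self._next = 0  # running request counter used to pick the next slot

    def add(self, name):
        # Called when autoscaling brings up a new instance.
        self.instances.append(name)

    def pick(self):
        # Index into the current list, so newly added instances join the rotation.
        target = self.instances[self._next % len(self.instances)]
        self._next += 1
        return target


if __name__ == "__main__":
    pool = RoundRobinPool(["web-1"])        # start at the minimum of 1 instance
    print([pool.pick() for _ in range(2)])  # only web-1 is available

    pool.add("web-2")                       # a scale-up adds two instances
    pool.add("web-3")
    print([pool.pick() for _ in range(3)])  # each instance now receives traffic
```

No change to the load balancing annotation is needed in this model when instances are added: the selection step always reads the current instance list, which mirrors the behavior described above.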