How to set up Load Balancing for Auto-scaled Instances

Last updated: February 9, 2026

Set up Load Balancing

  1. By default, Porter uses a round-robin load balancing algorithm for distributing requests across multiple instances.

  2. To modify the load balancing algorithm, add ingress annotations to your web service in the porter.yml file:

    services:
      - name: ${SERVICE_NAME}
        type: web
        ingressAnnotations:
          nginx.ingress.kubernetes.io/load-balance: "ewma"  # or "round_robin"

  3. Choose between two available algorithms:

    • round_robin - Distributes requests sequentially across available instances

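    • ewma - (Exponentially Weighted Moving Average) Routes requests based on a weighted average of each instance's recent response times, favoring instances that are currently responding faster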
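
To make the difference between the two algorithms concrete, here is a minimal Python sketch. It is an illustration only, not the ingress controller's implementation, which may differ in details such as how candidates are sampled and how old measurements decay. The instance names, the `EwmaPicker` class, and the `alpha` smoothing factor are all hypothetical.

```python
import itertools


def round_robin(instances):
    """Return an iterator that cycles through instances in a fixed order."""
    return itertools.cycle(instances)


class EwmaPicker:
    """Simplified EWMA-style picker: prefers the lowest smoothed latency."""

    def __init__(self, instances, alpha=0.3):
        # alpha is an assumed smoothing factor; higher values weight
        # recent measurements more heavily.
        self.alpha = alpha
        self.scores = {name: 0.0 for name in instances}

    def record(self, name, latency_ms):
        """Fold a new latency measurement into the instance's moving average."""
        previous = self.scores[name]
        self.scores[name] = self.alpha * latency_ms + (1 - self.alpha) * previous

    def pick(self):
        """Choose the instance with the lowest moving-average latency."""
        return min(self.scores, key=self.scores.get)


if __name__ == "__main__":
    rr = round_robin(["web-1", "web-2", "web-3"])
    print([next(rr) for _ in range(4)])

    picker = EwmaPicker(["web-1", "web-2"])
    picker.record("web-1", 200)  # slow response
    picker.record("web-2", 50)   # fast response
    print(picker.pick())
```

In this sketch, round robin ignores how each instance is performing, while the EWMA picker shifts traffic away from an instance as soon as its smoothed latency rises above the others. That is why ewma tends to suit services whose request durations vary widely.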

Usage

The load balancing configuration works automatically with your autoscaling settings. For example, consider a service with the following autoscaling settings:

  • Minimum instances: 1

  • Maximum instances: 5

  • Scaling triggers: 70% CPU or memory utilization

When autoscaling creates new instances, the load balancer automatically includes them in the request distribution according to the configured algorithm.
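
The behavior described above can be pictured with a small Python sketch of a round-robin pool that gains an instance while serving traffic. This is a conceptual model only, assuming a hypothetical `RoundRobinPool` class and made-up instance names; in practice the load balancer tracks instances for you and no application code is needed.

```python
class RoundRobinPool:
    """Toy round-robin pool that can grow while requests are being served."""

    def __init__(self, instances):
        self.instances = list(instances)
        self._counter = 0  # total number of requests routed so far

    def add(self, name):
        """Register a newly created instance, as autoscaling would."""
        self.instances.append(name)

    def pick(self):
        """Return the next instance in rotation."""
        choice = self.instances[self._counter % len(self.instances)]
        self._counter += 1
        return choice


if __name__ == "__main__":
    pool = RoundRobinPool(["web-1"])
    print([pool.pick() for _ in range(2)])  # only one instance available

    pool.add("web-2")  # autoscaling adds a second instance
    print([pool.pick() for _ in range(2)])  # both instances now receive traffic
```

Once the second instance is added, the very next rotation includes it, with no change to the configured algorithm.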