Preparing Your Cluster for High Traffic Events

Last updated: February 6, 2026

If you're expecting a significant increase in traffic to your application, it's important to ensure your cluster and infrastructure are prepared to handle the load. This article outlines key steps to take when preparing for high-traffic events.

1. Enable and Configure Autoscaling

Ensure your applications have autoscaling enabled:

  • Set a minimum of 3 replicas for each application

  • Adjust the maximum replicas based on the expected traffic increase (e.g., if you expect 3x traffic on a baseline of 3 replicas, set the max to 9)
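
The replica arithmetic above can be sketched as a small helper. This is only an illustration: the function name `max_replicas` and its parameters are hypothetical and are not part of any platform API, and rounding up is an assumption made to avoid under-provisioning.

```python
import math


def max_replicas(baseline_replicas: int, traffic_multiplier: float,
                 min_replicas: int = 3) -> int:
    """Estimate a maximum replica count for an expected traffic increase.

    Hypothetical helper: scales the baseline replica count by the expected
    traffic multiplier, rounds up, and never goes below the minimum.
    """
    if baseline_replicas < 1 or traffic_multiplier <= 0:
        raise ValueError("baseline_replicas must be >= 1 and traffic_multiplier > 0")
    scaled = math.ceil(baseline_replicas * traffic_multiplier)
    return max(min_replicas, scaled)


# Example from the text: 3 replicas at baseline, 3x expected traffic.
print(max_replicas(3, 3))  # 9
```

Treat the result as a starting point; real load rarely scales perfectly linearly with replica count.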

2. Implement Health Checks and Graceful Shutdown

Enable health checks and graceful shutdown support for your applications to ensure safe scaling. Refer to our Zero Downtime Deployments documentation for more information.

3. Check and Increase EC2 Quota

Ensure you have sufficient EC2 quota to support your maximum node count:

  1. Open the AWS Console and go to the Service Quotas page

  2. Select your region and search for "Running On-Demand Standard (A, C, D, H, I, M, R, T, Z) instances"

  3. Verify that your quota exceeds the total vCPUs you may need: the sum, across all clusters, of (max nodes * number of vCPUs per node). This quota is measured in vCPUs, not in instance count

  4. If needed, request a quota increase (note: this can take several days to process)
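
As a rough sketch of the check in step 3, the following computes the required quota from a list of clusters. The function names and the example cluster sizes are hypothetical and chosen only for illustration; substitute your own max node counts and the per-node figure for your instance type.

```python
def required_quota(clusters: list[tuple[int, int]]) -> int:
    """Sum (max nodes * CPUs per node) across all clusters.

    Each tuple is (max_node_count, cpus_per_node). Both values are
    inputs you supply; nothing here queries AWS.
    """
    return sum(max_nodes * cpus for max_nodes, cpus in clusters)


def quota_is_sufficient(current_quota: int, clusters: list[tuple[int, int]]) -> bool:
    """Return True only if the quota strictly exceeds the requirement."""
    return current_quota > required_quota(clusters)


# Hypothetical example: one cluster of up to 20 four-CPU nodes and
# another of up to 10 eight-CPU nodes.
clusters = [(20, 4), (10, 8)]
print(required_quota(clusters))            # 160
print(quota_is_sufficient(128, clusters))  # False
```

If the check fails, request the increase early, since approval is not immediate.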

4. Increase Max Application Node Count

Adjust the maximum node count for your cluster:

  1. Navigate to Infrastructure > [Your Cluster Name] > Default node group

  2. Increase the maximum node count (e.g., to 15-20)

Heuristic for Setting Node Limits: Observe how many nodes your cluster uses during usual traffic without deployments, then set the maximum to 2x this number. This accounts for deployment scenarios where the cluster keeps your previous version running while the new one spins up, effectively doubling resource requirements.
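
The heuristic above can be written as a short calculation. This is a minimal sketch: `suggested_max_nodes` is a hypothetical helper, and using the highest observed node count as the baseline is an assumption (the heuristic only says to observe usage during usual traffic).

```python
def suggested_max_nodes(observed_node_counts: list[int],
                        deployment_factor: int = 2) -> int:
    """Suggest a maximum node count from node counts observed during
    usual traffic (with no deployment in progress).

    Assumption: the baseline is the highest observed count. The factor
    of 2 reflects the old and new versions running side by side during
    a deployment.
    """
    if not observed_node_counts:
        raise ValueError("need at least one observation")
    baseline = max(observed_node_counts)
    return baseline * deployment_factor


# Hypothetical observations taken over a normal week.
print(suggested_max_nodes([6, 7, 8, 7]))  # 16
```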

Cost Considerations: Increasing the max application nodes will not incur charges unless you scale to use those nodes. There is no downside to setting a higher upper limit if you don't reach it.

Monitoring Node Usage: You can check the current number of running nodes at the top of the Infrastructure menu to help determine your baseline usage.

Additional Considerations

When preparing for high-traffic events, plan ahead. Start these preparations well in advance, especially if you need a quota increase from AWS, which can take several days to process.

Also, note that the Network Load Balancer configured for your cluster is designed to handle millions of requests per second. In most cases, no changes are needed to the load balancer configuration.

By following these steps, you can ensure that your infrastructure is ready to handle significant increases in traffic, allowing your application to scale smoothly during high-demand periods.