Load balancers are an essential part of any scalable web architecture. They distribute traffic across multiple servers to improve performance and availability. But how many load balancers can you have before you hit diminishing returns? Here is a quick look at the factors that determine optimal load balancer capacity.
Why Use Multiple Load Balancers?
Using multiple load balancers provides several key benefits:
- Increased redundancy – If one load balancer fails, traffic can be routed to the remaining load balancers with minimal interruption.
- Scalability – Additional load balancers can be added to handle increasing traffic.
- Performance – Load can be spread across more resources to reduce individual load balancer strain.
In general, the more load balancers you have, the higher your capacity, redundancy and scalability will be. However, each additional load balancer also adds complexity and cost. So what is the optimal number?
Factors That Determine Load Balancer Capacity
There are several key factors that influence how many load balancers you need:
Traffic Volume
The amount of traffic or requests hitting your site is a major determinant of load balancer requirements. If you have huge traffic volume, you will generally need more load balancers to handle the demand.
As a rule of thumb, plan for each load balancer to handle around 1,000 requests per second. But actual capacity will depend on the size and complexity of each request.
Monitor your traffic patterns and scale load balancers up or down based on peaks and valleys.
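The 1,000 rps rule of thumb above translates into a simple sizing calculation. A minimal sketch (the function name and default are illustrative, and real capacity per balancer varies with request size and complexity):

```python
import math

def required_load_balancers(peak_rps, rps_per_lb=1000):
    """Balancers needed under the ~1,000 rps-per-balancer rule of thumb."""
    return max(1, math.ceil(peak_rps / rps_per_lb))

print(required_load_balancers(2_500))  # 3
print(required_load_balancers(400))    # 1
```

Rounding up (rather than down) ensures a traffic spike just above a multiple of 1,000 rps still has a balancer to absorb it.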
Load Balancer Size
Load balancer instance types and configurations will impact how much traffic each one can handle. Larger load balancer sizes or clusters generally have higher throughput.
For example, an AWS Network Load Balancer operates at the connection level (layer 4) and is designed to handle millions of requests per second with very low latency, while an Application Load Balancer operates at the request level (layer 7) and scales its capacity automatically as traffic grows. For self-managed load balancers such as HAProxy or NGINX, throughput is bounded by the instance size and configuration you choose.
Right size your load balancers based on your typical and peak connection rates.
Application Architecture
How your application is architected can determine load balancing needs. For example:
- Are you running a clustered architecture with multiple app servers? This may require more load balancers.
- Do you have different application components with distinct workflows? Using multiple smaller load balancers can help isolate routing.
- Are workloads isolated by region? You may want local load balancers in each region.
Understand how your app is put together so you can distribute work efficiently across load balancers.
Redundancy Requirements
More load balancers means higher redundancy and availability. To determine how much redundancy you need, look at factors like:
- How critical is your application uptime? Mission critical apps may need more backups.
- What is your traffic volume? More traffic means greater impact if a load balancer goes down.
- Do you need to maintain availability during maintenance windows or hardware failures?
If your app requires high redundancy, plan for at least 3-5 load balancers to allow for failures or upgrades.
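One common way to express this is N+M planning: run the N balancers your traffic requires plus M spares, with a floor of three as suggested above. A tiny sketch (names and defaults are illustrative):

```python
def redundant_pool_size(needed_for_traffic, spares=1, floor=3):
    """N+M redundancy: traffic requirement plus spare capacity, never below the floor."""
    return max(needed_for_traffic + spares, floor)

print(redundant_pool_size(1))            # 3 (floor applies)
print(redundant_pool_size(4, spares=2))  # 6
```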
Geographic Coverage
If you have users distributed across different geographic regions, you may want to deploy load balancers in each major location. This localizes traffic routing decisions and reduces latency.
The number of regions you need to cover will influence the load balancer count. Multi-national consumer apps often have many global load balancers.
Hybrid Environments
For hybrid environments that span both cloud and on-premise infrastructure, you may need distinct load balancers for each environment. This keeps routing logic separate.
Hybrid architecture can potentially double the number of load balancers compared to a purely cloud-based or on-premise application.
Cost Considerations
There are direct costs associated with running more load balancers, both for the instances themselves and data processed. At a certain point, you may hit diminishing returns where adding more load balancers does not provide enough performance benefit to justify the cost.
Analyze the performance you are getting vs dollars spent at different scales, and find the optimal economic balance.
General Load Balancer Count Guidelines
Taking all the factors above into consideration, here are some general guidelines for load balancer capacity planning:
| Factor | Guideline |
| --- | --- |
| Traffic volume | 1x load balancer per 1,000 requests per second (rps) |
| Redundancy | Minimum 3x load balancers for high availability |
| Environments | 2x load balancers for hybrid environments |
| Regions | 1x load balancer per geographic region |
So for example, an application with:
- 10,000 rps peak traffic
- Hybrid cloud environment
- 2 geographic regions (East and West Coast)
- High availability requirements
Would need around:
- 10 load balancers for traffic (10,000 rps / 1,000 rps per load balancer)
- 2x for hybrid environment
- 2x for geographic regions
- 3x for high availability
For a total of 10 * 2 * 2 * 3 = 120 load balancers.
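The guideline arithmetic above can be sketched as a small calculation. This treats the multipliers as the article's rules of thumb, not hard limits:

```python
import math

def estimate_load_balancers(peak_rps, hybrid=False, regions=1,
                            high_availability=False, rps_per_lb=1000):
    """Rough estimate combining the guideline multipliers from the table above."""
    count = math.ceil(peak_rps / rps_per_lb)  # traffic baseline
    if hybrid:
        count *= 2       # separate cloud and on-premise pools
    count *= regions     # one pool per geographic region
    if high_availability:
        count *= 3       # minimum 3x for high availability
    return count

# The worked example: 10,000 rps peak, hybrid, 2 regions, high availability
print(estimate_load_balancers(10_000, hybrid=True, regions=2,
                              high_availability=True))  # 120
```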
These are just guidelines – you still need to measure your actual infrastructure performance and capacity needs. But it provides a good starting point for planning your load balancer scale.
Cloud Load Balancing Services
The major cloud providers offer managed load balancing services that can automate scaling up and down of load balancers based on demand. This includes:
- AWS Elastic Load Balancing (ELB) – Automatically scales load balancer capacity in response to incoming traffic.
- Google Cloud Load Balancing – Scales up globally based on utilization.
- Azure Load Balancer – Backend pools can auto-scale using virtual machine scale sets.
These services can dynamically grow and shrink your pool of load balancers based on metrics like requests per second, network utilization, and latency. This optimizes performance and cost automatically.
For example, you may only need 10 load balancers to handle average traffic, but scale up to 100 during peaks. Cloud load balancing services can handle this fluctuation for you.
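The scale-up/scale-down behavior described above is essentially target tracking: size the pool to current demand, clamped between a minimum and maximum. A minimal sketch (the thresholds and names are illustrative, not any provider's actual API):

```python
import math

def desired_pool_size(current_rps, rps_per_lb=1000, min_size=10, max_size=100):
    """Target-tracking style: fit the pool to demand, within configured bounds."""
    return max(min_size, min(max_size, math.ceil(current_rps / rps_per_lb)))

print(desired_pool_size(8_000))    # 10 (floor at min_size)
print(desired_pool_size(75_000))   # 75
print(desired_pool_size(500_000))  # 100 (capped at max_size)
```

In practice the managed services make this decision on metrics like request count, network utilization, and latency rather than a single rps number.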
Considerations for Large Scale
Once you get beyond around 100 load balancers, some additional considerations come into play:
Load Balancer Mode
At large scale, a DNS-based load balancing approach is easiest to manage. With DNS, you simply register new load balancer IPs as you scale up. The DNS service handles distributing connections across the full set of IPs.
This is easier than a proxy or round-robin mode where you have to actively manage and update the full list of backends.
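A toy model of the DNS approach makes the appeal concrete: scaling up is just adding a record, and each lookup hands clients the next IP in rotation. This is a simplified round-robin sketch, not a real resolver:

```python
import itertools

class DnsPool:
    """Toy model of DNS round-robin: clients rotate through registered IPs."""
    def __init__(self):
        self._ips = []
        self._cycle = None

    def register(self, ip):
        # Scaling up is just adding an IP record (rotation restarts on change)
        self._ips.append(ip)
        self._cycle = itertools.cycle(self._ips)

    def resolve(self):
        # Each lookup hands out the next IP in the rotation
        return next(self._cycle)

pool = DnsPool()
for ip in ("10.0.0.1", "10.0.0.2", "10.0.0.3"):
    pool.register(ip)
print([pool.resolve() for _ in range(4)])  # ['10.0.0.1', '10.0.0.2', '10.0.0.3', '10.0.0.1']
```

Real DNS adds caching and TTLs, which is why traffic shifts gradually rather than instantly when records change.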
IP Address Management
With hundreds of load balancers, you need to automate IP address allocation. This ensures each load balancer gets a unique IP address and that IPs are recycled when load balancers are terminated.
Cloud virtual networking services like AWS VPC or Google Cloud Networking can automate IP assignments at scale.
Load Balancer Monitoring
Actively monitor load balancer metrics like connections, latency, and errors to optimize distribution. If any particular load balancer starts showing high utilization, you may need to scale out further or redistribute traffic.
Use monitoring tools like AWS CloudWatch, Datadog or New Relic to gain visibility across all load balancers.
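The "find the hot balancer" check described above is straightforward to express. A minimal sketch, assuming metrics arrive as (current connections, capacity) pairs per balancer; the names and 80% threshold are illustrative:

```python
def find_hot_balancers(metrics, utilization_threshold=0.8):
    """Return balancers whose connection utilization exceeds the threshold."""
    return [name for name, (conns, capacity) in metrics.items()
            if conns / capacity > utilization_threshold]

metrics = {
    "lb-east-1": (900, 1000),   # 90% utilized
    "lb-east-2": (400, 1000),   # 40% utilized
    "lb-west-1": (990, 1000),   # 99% utilized
}
print(find_hot_balancers(metrics))  # ['lb-east-1', 'lb-west-1']
```

In production these numbers would come from a metrics API such as CloudWatch rather than a hand-built dictionary.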
Automated Scaling
Use auto-scaling groups to automatically grow and shrink the load balancer pool based on demand. This provides hands-off management at large scale.
Most cloud providers offer auto-scaling for load balancers integrated with their load balancing services.
Global Traffic Management
Route traffic intelligently across globally distributed load balancers based on region, latency and utilization. Global traffic management systems like Amazon Route 53 Traffic Flow or NS1 can optimize this routing for you.
Conclusion
Optimizing load balancer scale is a balancing act between cost, performance, redundancy and complexity. The right number comes down to your specific traffic volumes, infrastructure and application architecture.
Plan load balancer capacity based on requests per second, geography, redundancy needs and hybrid requirements. 100+ load balancers is common for high scale apps spanning multiple regions and clouds.
Take advantage of cloud load balancer auto-scaling, global traffic management and automation tools to optimize cost and performance as you scale up.