High Throughput Internal Load Balancing in AWS Cloud
I explored different approaches to internal load balancing: what options exist, and how they behave and perform when benchmarked under high load.
Introduction
I work at a crypto exchange, where latency and throughput of the trading hot path are of utmost importance. We try to use every trick we can come up with in the AWS Cloud to get even more performance out of our systems. That said, load balancing is obviously very important. AZ-aware connections, proximity of servers inside an AZ, network hops from service to service, direct memory access, optimizing the TCP stack so that data fits in one packet… it all matters. In this post we will focus on our benchmark process regarding internal load balancing.
Load balancing approaches
The idea is not to delve too deeply into the details of each, but a high-level overview is still due.
AWS Application Load Balancer
Provided by AWS
Operates at Layer 7 (HTTP/HTTPS) and supports features like path-based routing, host-based routing, and HTTP/2
Automatically scales to handle changes in application traffic
AWS Network Load Balancer
Provided by AWS
Operates at Layer 4 (TCP/UDP) and provides extreme performance
Handles millions of requests per second while maintaining ultra-low latencies
Provides a static IP address per Availability Zone
Useful for load balancing traffic to databases, media streaming, and other use cases that require ultra-low latencies
IPVS load balancer
Not provided by AWS as a service; usually used in conjunction with Kubernetes
IPVS is a load balancing solution built into the Linux kernel
It implements transport-layer load balancing, handling TCP and UDP traffic
IPVS can direct requests to a cluster of real servers and make the services appear as virtual services on a single IP address
iptables load balancer
Not provided by AWS as a service; usually used in conjunction with Kubernetes
iptables load balancing is implemented directly in the Linux kernel, providing very high performance
The nat table in iptables is used to perform Network Address Translation (NAT). This allows iptables to redirect incoming traffic to different backend servers
Proxy load balancer
Not provided by AWS as a service
HAProxy, NGINX Plus, Traefik, etc.
Distributes incoming traffic across multiple backend servers based on predefined algorithms (e.g., round-robin, least connections)
In summary, a proxy load balancer combines the benefits of a reverse proxy (security, caching, SSL termination) with load balancing capabilities (traffic distribution, health checks, scalability) to provide a comprehensive solution for optimizing the delivery of web applications.
DNS-based load balancing
Not provided by AWS as a service
Consul is one example; there are others, but the principle is the same.
DNS load balancer provides active health checking of services
If a service fails its health check, it is removed from the service discovery results
This enables load balancing to avoid unhealthy instances
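As a minimal illustration of how this looks with Consul, assuming a local Consul agent with its default DNS port (8600) and a hypothetical service registered as web, clients simply resolve the service name and only healthy instances come back:
# Ask the local Consul agent (default DNS port 8600) for instances of the
# hypothetical "web" service; only instances passing their health checks are returned
dig @127.0.0.1 -p 8600 web.service.consul SRV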
Purpose
I aim to determine which type of load balancer is best suited for throughput in terms of requests per second, with as little loss as possible (ideally none) compared to direct client-to-server throughput.
Rules
HTTP traffic
There are multiple ways for services to talk to each other, but the majority of clients that actually consume these services in finance speak HTTP or the FIX protocol. For intra-service communication gRPC would probably be more suitable, but to give all load balancers an equal chance in the race, I chose HTTP as the protocol to benchmark. No keep-alive connections!
Static file serving
The purpose is to benchmark the load balancers, not webserver performance itself. Serving a static file (an image) is as simple as it gets for a webserver. The image gets cached, so it is served from memory with minimal delay upon receiving a request. The decision was made to serve a small, 4 kB image file, which more or less reflects the size of a common API service response.
Sweet spot
A baseline benchmark from one server to another, without any load balancer, was established as our target requests per second. We aim to hit this target, or get as close to it as possible, while introducing the various load balancers between the client machine and the webserver.
Setup
General
VPC
1 AZ (no cross-AZ traffic allowed)
c5.large instances for both client and server
Amazon Linux 2023
No additional TCP stack optimizations in the kernel, apart from the changes needed for a specific load balancer to actually work
ApacheBench (ab) as the benchmarking tool
200 concurrent connections and 1 million requests, repeated at least 10 times, taking the average requests per second as the metric we observe (see the sketch after this list)
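As a rough sketch of a single benchmark run, assuming the target (webserver or load balancer) answers on port 80 at a hypothetical address 10.0.1.100 and serves the image at a hypothetical path /image.png:
# 1 million requests over 200 concurrent connections; ab does not use
# keep-alive unless -k is passed, which matches the rules above
ab -n 1000000 -c 200 http://10.0.1.100/image.png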
AWS Application Load Balancer
A load balancer with an HTTP port 80 listener forwarding to a target group containing the webserver. An additional subnet in a second AZ was added to the ALB, since it requires at least two AZs to run, but no instances exist in that second AZ.
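For reference, this setup can be scripted with the AWS CLI; a sketch with placeholder subnet, VPC, instance and ARN values:
# Internal ALB spanning two subnets (an ALB requires at least two AZs)
aws elbv2 create-load-balancer --name bench-alb --type application \
  --scheme internal --subnets subnet-aaa111 subnet-bbb222
# HTTP target group for the webserver instance
aws elbv2 create-target-group --name bench-alb-tg --protocol HTTP --port 80 \
  --vpc-id vpc-0123456789 --target-type instance
aws elbv2 register-targets --target-group-arn <tg-arn> --targets Id=i-0webserver
# Port 80 listener forwarding to the target group
aws elbv2 create-listener --load-balancer-arn <alb-arn> --protocol HTTP --port 80 \
  --default-actions Type=forward,TargetGroupArn=<tg-arn>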
AWS Network Load Balancer
A TCP load balancer, forwarding port 80 to the target group.
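The NLB setup mirrors the ALB steps above; only the type and protocol differ (again a sketch with placeholder IDs), followed by the same register-targets and create-listener calls with TCP:
# Internal NLB; a single subnet/AZ is enough here
aws elbv2 create-load-balancer --name bench-nlb --type network \
  --scheme internal --subnets subnet-aaa111
# TCP target group for the webserver, listener on port 80 forwards to it
aws elbv2 create-target-group --name bench-nlb-tg --protocol TCP --port 80 \
  --vpc-id vpc-0123456789 --target-type instance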
IPVS
First we installed ipvsadm, loaded the kernel module, and set the kernel parameters needed to run IPVS. The whole setup is basically the same as what kube-proxy does when IPVS is chosen as the proxy mode.
# Install the IPVS userspace admin tool and load the kernel module
yum -y install ipvsadm
modprobe ip_vs
# Allow packet forwarding and binding to the non-local virtual IP
sysctl -w net.ipv4.ip_forward=1
sysctl -w net.ipv4.ip_nonlocal_bind=1
systemctl start ipvsadm
After that we created a virtual IP and added our server as a backend to it:
# Create a virtual service on the VIP with round-robin scheduling (-s rr)
ipvsadm -A -t 10.0.1.50:80 -s rr
# Add the webserver as a real server in NAT/masquerading mode (-m)
ipvsadm -a -t 10.0.1.50:80 -r 10.0.1.100:80 -m
At this point we can run the benchmark against our VIP, 10.0.1.50.
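Before benchmarking, it's worth verifying that the virtual service and real server are registered; a quick check, followed by the same ab invocation as before pointed at the VIP (using the hypothetical image path from the earlier sketch):
# List virtual services and real servers with packet/byte counters
ipvsadm -L -n --stats
# Benchmark against the virtual IP
ab -n 1000000 -c 200 http://10.0.1.50/image.png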
iptables
I made a minimal setup. I intentionally left out all the additional things Kubernetes usually does, the chain marking and the whole shebang, which would impact the benchmark even more because of the extra rule processing. In the end it comes down to this:
iptables -F
iptables -t nat -F
iptables -t nat -X
# IP forwarding must be enabled so DNAT'ed packets can be routed on to the backend
sysctl -w net.ipv4.ip_forward=1
# Forward incoming traffic on port 80 to 192.168.10.211:80
iptables -t nat -A PREROUTING -p tcp --dport 80 -j DNAT --to-destination 192.168.10.211:80
# Also redirect locally generated traffic hitting port 80 on this host
iptables -t nat -A OUTPUT -p tcp -o lo --dport 80 -j DNAT --to-destination 192.168.10.211:80
# Masquerade so return traffic flows back through this host
iptables -t nat -A POSTROUTING -p tcp -d 192.168.10.211 --dport 80 -j MASQUERADE
What this does is very similar to, or even a simplified version of, what kube-proxy does when you choose iptables as the proxy mode.
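A quick way to sanity-check the rules and confirm traffic is actually being translated is to watch the NAT table counters while the benchmark runs:
# List nat table rules with packet and byte counters
iptables -t nat -L -n -v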
Proxy server
Even though it seemed like nonsense to even try this one, since running an additional webserver, with all the processing that entails, just to forward traffic to another backend is very resource intensive and makes for a rather poor load balancer, I still decided to give it a chance. I chose HAProxy, as it is supposed to be the most suitable for these cases in general.
Here is an excerpt of the HAProxy config:
frontend myfrontend
    mode http
    bind :80
    default_backend web_servers

backend web_servers
    mode http
    balance roundrobin
    server webserver 192.168.10.211:80 check
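Assuming the config lives in the default /etc/haproxy/haproxy.cfg, it can be validated and applied with:
# Check the configuration for syntax errors, then (re)start the service
haproxy -c -f /etc/haproxy/haproxy.cfg
systemctl restart haproxy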
Benchmark results
Forenote: for obvious reasons it makes no sense to benchmark the DNS load balancing solution, since that kind of load balancing happens at the DNS resolution stage, before we even start establishing a connection to the server, and it is therefore treated as a direct client-to-server connection.
Observation
AWS Application Load Balancer
We can immediately notice the slow start in effect here. The ALB begins at about 30% of its final throughput, and it took roughly 10 minutes for it to scale up on the AWS side and perform at par.
AWS Network Load Balancer
As advertised by AWS, it is very fast to react, and at times it was even faster than a direct connection to the webserver. One could say that doesn't even make sense! But it actually does; you just don't see what is going on in the background, since it's hidden in the "Cloud". The fact is, servers are scattered around the datacenter with some number of switches between them. I would assume AWS reserves a portion of those switches and routes the network so that their services get the best latency to the servers; this doesn't mean server-to-server traffic is equally well optimized. In other words, the RTT from client to server might be longer than the RTT from client to Network Load Balancer to server. Interesting, although as seen from the graph, the difference is very minimal.
IPVS
The best of the non-AWS-native load balancers. Although it came out strong, it still fell short in the long run against the Application Load Balancer. As soon as the kernel module was loaded, it had an effect on overall network performance: it added extra processing, and that is visible in the graph. For Kubernetes, if you don't want to pay for Network Load Balancers, because you have a lot of microservices and whatnot, IPVS is the best option to go with.
iptables
Second to last. iptables processing burdens the system, and if you add even more rules, as Kubernetes usually does, you end up with cumbersome infrastructure. It might do for non-latency-sensitive, low-throughput workloads, but I don't see why anyone would use it when switching kube-proxy to IPVS proxy mode is pretty trivial.
Proxy
For obvious reasons, last. The processing needed to forward traffic is just too much additional work for the server to stay competitive. Usually HAProxy is something you would use in AWS to overcome other limitations of the native AWS load balancers. An example would be an active-standby setup, where only one server may answer at a time, but you still want a highly available setup in case it goes down.
Summary
We can conclude that the best choice for an internal load balancer is the AWS Network Load Balancer. If you don't have a lot of microservices and therefore don't need a huge number of load balancers, this is your preferred choice. It can become costly in the long run, though, as your infrastructure grows and you need more and more of them…
IPVS is the best "free" load balancer if you go with Kubernetes and let kube-proxy manage it; otherwise it can become quite a burden to manage.
A general rule of thumb: if you want to further reduce the impact of the load balancer on overall performance and throughput, use keep-alive or connection pooling. Not all programming languages support connection pooling, but HTTP keep-alive is a pretty straightforward improvement. From my testing, enabling keep-alive pretty much doubled the performance in requests per second.
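For comparison, the only change needed on the ApacheBench side is the -k flag; a sketch against the same hypothetical endpoint as before:
# Same run as before, but with HTTP keep-alive enabled (-k); in my testing
# this roughly doubled the requests per second
ab -k -n 1000000 -c 200 http://10.0.1.100/image.png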