As we and our collaborators prepare Round 9 of our Framework Benchmarks project, we had an epiphany:
With high-performance software, a single modern server processes over 1 million HTTP requests per second.
Five months ago, Google talked about load-balancing to achieve 1 million requests per second. We understand their excitement is about the performance of their load balancer [1]. Part of what we do is performance consulting, so we are routinely deep in request-per-second data, and we recognized a million requests per second as an impressive milestone.
But fast-forward to today, where we see the same response rate from a single server. We had been working with virtual servers and our modest workstations for so long that these data were a bit of a surprise.
The mind immediately begins painting a world of utter simplicity, where our applications' scores of virtual servers are rendered obsolete. Especially poignant is the reduced architectural complexity that an application can reap if its performance requirement can be satisfied by a single server. You probably still want at least two servers for resilience, but even after accounting for resilience, your architecture will likely remain simpler than one with hundreds of instances.
Our project's new hardware
For Round 9 of our benchmarks project, Peak Hosting has generously provided us with a number of Dell R720xd servers, each powered by dual Xeon E5-2660 v2 CPUs and 10-gigabit Ethernet. Loaded up with disks, these servers are around $8,000 apiece direct from Dell. Not cheap.
But check out what they can do:
This is output from wrk testing a single server running Undertow using conditions similar to Google's test (1-byte response body, no HTTP pipelining, no special request headers). 1.039 million requests per second.
Obviously there are myriad variables that make direct comparison to Google's achievement an impossibility. Nevertheless, achieving a million HTTP requests per second over a network without pipelining to a single server says something about the capacity of modern hardware.
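As a rough sanity check on the "over a network" part, the sketch below works out how many bytes per request a 10-gigabit link could carry at this rate. It assumes the link is the only constraint and ignores Ethernet and TCP framing overhead; the function name `bytes_per_request_budget` is ours, purely for illustration.

```python
def bytes_per_request_budget(link_gbps: float, requests_per_sec: float) -> float:
    """Average bytes available per request in one direction of the link."""
    link_bytes_per_sec = link_gbps * 1e9 / 8  # gigabits/s -> bytes/s
    return link_bytes_per_sec / requests_per_sec


# 10-gigabit Ethernet and the 1.039 million requests per second reported above.
budget = bytes_per_request_budget(link_gbps=10, requests_per_sec=1_039_000)
print(f"{budget:.0f} bytes per request in each direction")
```

That comes to roughly 1,200 bytes per request in each direction. A small request and a 1-byte response with ordinary headers fit within that budget, so the link itself does not rule out the figure.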
It's possible even higher numbers would be reported had we tested a purpose-built static web server such as nginx. Undertow is the lightweight Java web application server used in WildFly. It just happens to be quite quick at HTTP. Here's the code we used for this test:
In Round 9 (coming soon, we swear!), you'll be able to see the other test types on Peak's hardware alongside our i7 workstations and the EC2 instances we've tested in all previous rounds. Spoiler: I feel bad for our workstations.
Incidentally, if you think $8,000 is not cheap, you might want to run the monthly numbers on 200 virtual server instances. Yes, on-demand capacity and all the usual upsides of cloud deployments are real. But the simplified system architecture and cost advantage of high-performance options deserve some time in the sun.
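To run those monthly numbers yourself, here is a minimal sketch. The 200 instances and the $8,000 server price come from this post. The $0.10 hourly rate is a placeholder assumption, not a quoted price, so substitute your provider's actual rate. The comparison also leaves out hosting, power, bandwidth, and staff costs on both sides.

```python
HOURS_PER_MONTH = 730  # average month: 8,760 hours per year / 12


def monthly_fleet_cost(instances: int, hourly_rate_usd: float) -> float:
    """On-demand cost of running a fleet of instances for one month."""
    return instances * hourly_rate_usd * HOURS_PER_MONTH


def months_to_match(server_price_usd: float, fleet_monthly_usd: float) -> float:
    """Months of fleet spend that equal the server's purchase price."""
    return server_price_usd / fleet_monthly_usd


ASSUMED_HOURLY_RATE_USD = 0.10  # hypothetical; replace with a real quote

fleet = monthly_fleet_cost(instances=200, hourly_rate_usd=ASSUMED_HOURLY_RATE_USD)
breakeven = months_to_match(server_price_usd=8_000, fleet_monthly_usd=fleet)
print(f"Fleet: ${fleet:,.0f}/month; server price matched in {breakeven:.2f} months")
```

Under that assumed rate, the fleet costs more in a single month than the server's purchase price. The result scales linearly with the rate you plug in.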
1. Not only that, Google expressly said they were not using this exercise to demonstrate the capacity of their instances but rather to showcase their load balancer's performance. However, the scenario they created achieved massive request-per-second scale by load balancing hundreds of instances. We are simply providing a counter-point that the massive scale achieved by hundreds of instances can be trivially mimicked by a single modern server with modern tools. The capacity of a single server may not be surprising to some, but it may come as a surprise to others.