
Now in its fifth year, the TechEmpower Framework Benchmarks project has another official round of results available. Round 16 is a real treat for anyone who likes big numbers. Not just in measured results per second (several metric crap tonne), but in number of tests measured (~1830), number of framework permutations tested (~464), number of languages included (26), and total execution time of the test suite (67 hours, or 241 billion microseconds to make that sound properly enormous). Take a look at the results and marvel in the magnitude of the numbers.

Recent months have been a very exciting time for this project. Most importantly, the community has been contributing some amazing test implementations and demonstrating the fun and utility of some good-natured performance competition. More on that later. This is a longer-than-average TFB round announcement blog entry, but there is a lot to share, so bear with me.

Dockerification… Dockerifying… Docking?

After concluding Round 15, we took on the sizeable challenge of converting the full spectrum of ~460 test implementations from our home-brew quasi-sandboxed (only mostly sandboxed) configuration to a stupefying array of Docker containers. It took some time, but The Great Dockerification has yielded great benefits.

Most importantly, thanks to Dockerizing, the reproducibility and consistency of our measurements are considerably better than in previous rounds. Combined with our continuous benchmarking, we now see much lower variability between each run of the full suite.

Across the board, our sanity checking of performance metrics has indicated Docker's overhead is immeasurably minute. It's lost in the noise. And in any event, whatever overhead Docker incurs is uniformly applicable as all test implementations are required to be Dockered.

Truly, what we are doing with this project is a perfect fit for Docker. Or Docker is a perfect fit for this. Whichever. The only regret is not having done this earlier (if only someone had told us about Docker!). That and not knowing what the verb form of Docker is.

Dockerificationization.

New hardware

As we mentioned in March, we have a new physical hardware environment for Round 16. Nicknamed the “Citrine” environment, it is three homogeneous Dell R440 servers, each equipped with a Xeon Gold 5120 CPU. Characterized as entry- or mid-tier servers, these are nevertheless turning out to be impressive when paired with a 10-gigabit Ethernet switch.

Being freshly minted by Dell and Cisco, this new environment is notably quicker than equipment we have used in previous rounds. We have not produced a “difference” view between Round 15 and Round 16 because there are simply too many variables—most importantly this new hardware and Docker—to make a comparison remotely relevant. But in brief, Round 16 results are higher than Round 15 by a considerable margin.

In some cases, the throughput is so high that we have a new challenge from our old friend, “network saturation.” We last were acquainted with this adversary in Round 8, in the form of a Giant Sloar, otherwise known as one-gigabit Ethernet. Now The Destructor comes to us laughing about 10-gigabit Ethernet. But we have an idea for dealing with Gozer.

(Thanks again to Server Central for the previous hardware!)

Convergence in Plaintext and JSON serialization results

In Round 16, and in the continuous benchmarking results gathered prior to finalizing the round, we observed that the results for the Plaintext and JSON serialization tests were converging on theoretical maximums for 10-gigabit networking.

Yes, that means that there are several frameworks and platforms that—when allowed to use HTTP pipelining—can completely saturate ten gigabits per second with ~140-byte response payloads using relatively cheap commodity servers.
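To put some rough numbers on that, here is a back-of-envelope sketch. The ~140-byte figure is the approximate size of a single plaintext response; the per-response share of TCP/IP/Ethernet framing below is an assumption for illustration, since pipelining packs many responses into each segment and amortizes most per-packet cost.

```python
# Back-of-envelope estimate of the response rate needed to saturate 10 GbE.
# RESPONSE_BYTES is the approximate size of a plaintext HTTP response
# (status line + headers + "Hello, World!" body); the framing overhead is an
# illustrative assumption.

LINK_BITS_PER_SEC = 10_000_000_000        # 10-gigabit Ethernet
RESPONSE_BYTES = 140                      # approximate HTTP response size
AMORTIZED_OVERHEAD_BYTES = 10             # assumed per-response share of framing

bytes_per_response = RESPONSE_BYTES + AMORTIZED_OVERHEAD_BYTES
link_bytes_per_sec = LINK_BITS_PER_SEC / 8

max_responses_per_sec = link_bytes_per_sec / bytes_per_response
print(f"Theoretical ceiling: ~{max_responses_per_sec / 1e6:.1f} million responses/sec")
```

With these assumptions the ceiling lands in the neighborhood of eight million responses per second, which is consistent with the convergence described above: at that point the network, not the framework, is what is being measured.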

To remove the network bottleneck for future rounds, we are concocting a plan to cross the streams, in a manner of speaking: use lasers and the fiberoptic QSFP28 ports on our Cisco switch to bump the network capacity up a notch.

Expect to hear more about this as the plan develops during Round 17.

Continuous benchmarking

Introduced prior to Round 16, the continuous benchmarking platform has reached a fully realized state in the past several months. Combined with the Great Dockening, it now produces good (not perfect, but good) results automatically every 67 hours or thereabouts.

Some quick points to make here:

  • We don't expect to have perfection in the results. Perfection would imply stability of code and implementations and that is not at all what we have in mind for this project. Rather, we expect and want participants to frequently improve their frameworks and contribute new implementations. We also want the ever-increasing diversity of options for web development to be represented. So expecting perfection is incongruous with tolerating and wanting dynamism.
  • A full suite run takes 67 hours today. This fluctuates over time as implementation permutations are added (or deleted).
  • Total execution time will also increase when we add more test types in the future. And we are still considering increasing the duration of each individual benchmarking exercise (the time we run the load generator to gather a single result datum). That is the fundamental unit of time for this project, so increasing it will increase the total execution time approximately linearly (see the rough model after this list).
  • We have already seen tremendous social adoption of the continuous benchmarking results. For selfish reasons, we want to continue creating and posting official rounds such as today's Round 16 periodically. (Mostly so that we can use the opportunity to write a blog entry and generate hype!) We ask that you humor us and treat official rounds as the super interesting and meaningful events that they are.
  • Jokes aside, the continuous results are intended for contributors to the project. The official rounds are less-frequent snapshots suitable for everyone else who may find the data interesting.
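As referenced above, here is a rough model of how total run time relates to the per-measurement duration. Only the measurement count and total hours come from the figures earlier in this post; the split between load time, warmup, and per-measurement overhead is an assumption for illustration.

```python
# Rough model of total suite duration, using the Round 16 numbers from above
# (~1830 individual measurements, ~67 hours total). The breakdown below is an
# assumed split, not a measured one.

MEASUREMENTS = 1830

load_seconds = 15        # assumed duration of a single load-generation run
warmup_seconds = 15      # assumed warmup before the measured run
overhead_seconds = 100   # assumed per-measurement setup, teardown, verification

per_measurement = load_seconds + warmup_seconds + overhead_seconds
total_hours = MEASUREMENTS * per_measurement / 3600
print(f"Estimated total: ~{total_hours:.0f} hours")   # ~66 hours with these assumptions

# Doubling only the measured load duration adds MEASUREMENTS * 15 s, about 7.6
# hours, so total time grows roughly linearly with the per-measurement duration.
```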

Social media presence

As hinted above, we created a Twitter account for the TechEmpower Framework Benchmarks project: @TFBenchmarks. Don't don't @ us.

Engaging with the community this way has been especially rewarding during Round 16 because it coincided with significant performance campaigns from framework communities. Rust has blasted onto the server-side performance scene with several ultra high-performance options that are competing alongside C, C++, Go, Java, and C#.

Speaking of C#, a mainstream C# framework from a scrappy startup named Microsoft has been taking huge leaps up the charts. ASP.NET Core is not your father's ASP.NET.

Warming our hearts with performance

There is no single reason we created this project over five years ago. It was a bunch of things: frustration with slow web apps; a desire to quantify the strata of high-water marks across platforms; confirming or dispelling commonly-held hunches or prevailing wisdom about performance.

But most importantly, I think, we created the project with a hopeful mindset of “perhaps we can convince some people to invest in performance for the better of all web-app developers.”

With expectations set dutifully low from the start, we continue to be floored by statements that warm our hearts by directly or indirectly crediting this project with an impact.

When asked about this project, I (Brian) have often said that I believe that performance improvements are best made in platforms and frameworks because they have the potential to benefit the entire space of application developers using those platforms. I argue that if you raise the framework's performance ceiling, application developers get the headroom—which is a type of luxury—to develop their application more freely (rapidly, brute-force, carefully, carelessly, or somewhere in between). In large part, they can defer the mental burden of worrying about performance, and in some cases can defer that concern forever. Developers on slower platforms often have so thoroughly internalized the limitations of their platform that they don't even recognize the resulting pathologies: Slow platforms yield premature architectural complexity as the weapons of “high-scale” such as message queues, caches, job queues, worker clusters, and beyond are introduced at load levels that simply should not warrant the complexity.

So when we see developers upgrade to the latest release of their favorite platform and rejoice over a performance win, we celebrate a victory. We see a developer kicking performance worry further down the road with confidence and laughter.

I hope all of the participants in this project share in this celebration. And anyone else who cares about fast software.

On to Round 17!

Technical notes

Round 16 is composed of:

We have retired the hardware environment provided by Server Central for our Web Framework Benchmarks project. We want to sincerely thank Server Central for having provided servers from their lab environment to our project.

Their contribution allowed us to continue testing on physical hardware with 10-gigabit Ethernet. Ten-gigabit Ethernet gives the highest-performing frameworks an opportunity to shine. We were particularly impressed by Server Central's customer service and technical support, which were responsive and helpful in troubleshooting configuration issues even though we were using their servers free of charge. (And since the advent of our Continuous Benchmarking, we were essentially using the servers at full load around the clock.)

Thank you, Server Central!

New hardware for Round 16 and beyond

For Round 16 and beyond, we are happy to announce that Microsoft has provided three Dell R440 servers and a Cisco 10-gigabit switch. These three servers are homogeneous, each configured with an Intel Xeon Gold 5120 CPU (14 cores/28 threads at 2.2 GHz base/3.2 GHz turbo), 32 GB of memory, and an enterprise SSD.

If your contributed framework or platform performs best with hand-tuning based on cores, please send us a pull request to adjust the necessary parameters.
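For implementations that currently hard-code such parameters, something along these lines is one way to derive them from the machine instead. This is a hypothetical sketch; the names are illustrative and not part of the TFB tooling.

```python
# Hypothetical sketch: size a worker pool from the logical CPUs visible to the
# process instead of hard-coding a number tuned to older benchmark hardware.
import os

def worker_count(workers_per_cpu: int = 1) -> int:
    """Derive a worker count from the logical CPUs visible to this process."""
    return (os.cpu_count() or 1) * workers_per_cpu

if __name__ == "__main__":
    print(f"Starting {worker_count()} workers")
```

On the Citrine servers this picks up all 28 logical cores automatically, and it keeps working if the hardware changes again in a future round.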

These servers together compose a hardware environment we've named "Citrine" and are visible on the TFB Results Dashboard. Initial results are impressive, to say the least.

Adopting Docker for Round 16

Concurrent with the change in hardware, we are hard at work converting all test implementations and the test suite to use Docker. There are several upsides to this change, the most important being better isolation. Our past home-brew mechanisms to clean up after each framework were, at times, akin to whack-a-mole as we encountered new and fascinating ways in which software may refuse to stop after being subjected to severe levels of load.
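As a sketch of why containers make that cleanup problem tractable, consider driving the docker CLI directly (the image name below is hypothetical): force-removing the container tears down every process inside it, however the server reacts to shutdown signals.

```python
# Minimal sketch of container-based cleanup using the docker CLI. The image
# name "tfb/some-framework" is hypothetical.
import subprocess

IMAGE = "tfb/some-framework"   # hypothetical image name
NAME = "tfb-server"

def run_container():
    # Start the framework's server container in the background.
    subprocess.run(
        ["docker", "run", "--detach", "--rm", "--name", NAME, IMAGE],
        check=True,
    )

def cleanup_container():
    # "docker rm --force" stops the container (SIGKILL if necessary) and
    # removes it, replacing the old whack-a-mole hunt for stray processes.
    subprocess.run(["docker", "rm", "--force", NAME], check=False)
```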

Docker will be used uniformly across all test implementations, so any impact will be imposed on all platforms and frameworks equally. Our measurements indicate a trivial performance impact versus bare metal: less than 1%.

As you might imagine, the level of effort to convert all test implementations to Docker is not small. We are making steady progress. But we would gladly accept contributions from the community. If you would like to participate in the effort, please see GitHub issue #3296.

February 14, 2018

Framework Benchmarks Round 15

As of 2018-03-13, Azure results for Round 15 have been posted. These were not available when Round 15 was originally published.

What better day than Valentine's Day to renew one's vow to create high-performance web applications? Respecting the time of your users is a sure way to earn their love and loyalty. And the perfect start is selecting high-performance platforms and frameworks.

Results from Round 15 of the Web Framework Benchmarks project are now available! Round 15 includes results from the physical hardware environment at Server Central and cloud results from Microsoft Azure.

We ❤️ Performance

High-performance software warms our hearts like a Super Bowl ad about water or an NBC Olympics athlete biography.

But really, who doesn't love fast software? No one wants to wait for computers. There are more important things to do in life than wait for a server to respond. For programmers, few things are as rewarding as seeing delighted users, and respecting users' time is a key element of achieving that happiness.

Among the many effects of this project, one of which we are especially proud is how it encourages platforms and frameworks to be fast—to elevate the high-water marks of performance potential. When frameworks and platforms lift their performance ceiling upward, application developers enjoy the freedom and peace of mind of knowing they control their applications' performance fate. Application developers can work rapidly or methodically; they can write a quick implementation or squeeze their algorithms to economize on milliseconds; they can choose to optimize early or later. This flexibility is made possible when the framework and platform aren't boxing out the application—preemptively consuming the performance pie—leaving only scraps for the application developer. High-performance frameworks take but a small slice and give the bulk of the pie to the application developer to do with as they please.

This Valentine's Day, respect yourself as a developer, own your application's performance destiny, and fall in love with a high-performance framework. Your users will love you back.

Love from the Community

Community contributions to the project continue to amaze us. As of Round 15, we have processed nearly 2,500 pull requests and the project has over 3,000 stars on GitHub. We are honored by the community's feedback and participation.

We are routinely delighted to see the project referenced elsewhere, such as a project that monitors TCP connections and used our benchmarks to measure overhead, or the hundreds of GitHub issues discussing the project within other repositories. We love knowing others receive value from this project!

More Immediate Results for Contributors

When you are making contributions to this project, you want to see the result of your effort so you can measure and observe performance improvements. You also want/need log files in case things don't go as expected. To help accelerate the process, we have made the output of our continuous benchmarking platform available as a results dashboard. Our hardware test environment is continuously running, so new results are available every few days (at this time, a full run takes approximately 90 hours). As each run completes, a raw results.json file will be posted as well as zipped log files and direct links to log files for frameworks that encountered significant testing errors. We hope this will streamline the process of troubleshooting contributions.
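If you have downloaded a results.json from the dashboard, a small script like the following can give you a first look at its contents. The schema isn't spelled out here, so this sketch only reports what the file actually contains rather than assuming particular field names.

```python
# Quick inspection of a results.json downloaded from the results dashboard.
# No particular schema is assumed; the script just summarizes the structure.
import json
from pathlib import Path

def summarize(path: str = "results.json") -> None:
    data = json.loads(Path(path).read_text())
    print("Top-level keys:", sorted(data.keys()))
    # Drill one level into nested containers to get a feel for the structure.
    for key, value in data.items():
        if isinstance(value, dict):
            print(f"  {key}: {len(value)} entries")
        elif isinstance(value, list):
            print(f"  {key}: list of {len(value)} items")

if __name__ == "__main__":
    summarize()
```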

We used run ed713ee9 from Server Central and run a1110174 from Azure.

In Progress

We are working to update the entire suite to Ubuntu 16.04 LTS and aim to migrate to Ubuntu 18.04 LTS soon after it is available. This update will allow us to keep up with several features in both hardware and cloud environments, such as Azure's Accelerated Networking. Watch the GitHub project for more updates on this as they arrive!

Thank You!

Thank you so much to all of the contributors! Check out Round 15, and if you are a contributor to the project or just keenly interested, keep an eye on the continuous results.