By Jeffrey Papen, CEO and Founder, Peak Hosting

At Peak Hosting, we're big fans of TechEmpower's Framework Benchmarks, an open source project the company has been coordinating since early 2013. Covering a wide variety of web application frameworks, this project gives developers useful data that can help them find the framework that will provide the performance and features they need for their application.

TechEmpower's benchmarking now includes six test types, more than 120 frameworks, 290 test permutations, and results that include latency and framework overhead.

Hardware comes into play when performance is important. And TechEmpower will tell you performance is always important. The best results were derived from a real-world environment running physical hardware. As a managed hosting provider, we were able to provide the project with the same types of machines that our customers use to run their production environments.

We first contributed to TechEmpower's Framework Benchmarks in Round 9, when we set up five dedicated Dell R720 dual-Xeon E5 servers with 10 Gigabit Ethernet for the project in our data centers. High-end hardware directly correlates to high performance, and the results from Round 9 to Round 10 bear this out. According to TechEmpower's Round 10 blog post:

Competition for the top position in the JSON-serialization test within the Peak Hosting environment has heated up so much that Round 10 sees a more than 100% increase in the top performance versus Round 9 (2.2M versus 1.05M). A year ago, TechEmpower showed that one million HTTP responses per second without load balancing was easy. We're delighted that 1M is already old news.

So what does hardware have to do with this impressive round-over-round improvement? We didn’t change the hardware between Rounds 9 and 10. What did change was that between rounds, test implementation contributors realized they had hardware available to them with 40 hyperthreading cores, and they were able to optimize their code to take advantage of that performance and capacity. A bit of tweaking for high-end hardware was all that was needed to utterly smash the previous round's leaderboard.
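
To illustrate why tuning for the available hardware threads matters, here is a minimal Python sketch. It is not taken from any benchmarked framework: the function names `worker_count` and `thread_coverage` are hypothetical, and the model assumes each worker keeps at most one hardware thread busy.

```python
import os
from typing import Optional


def worker_count(configured: Optional[int] = None) -> int:
    """Use an explicit setting if given, else one worker per logical CPU."""
    if configured is not None:
        return max(1, configured)
    # os.cpu_count() reports logical CPUs (hardware threads) and may return None.
    return os.cpu_count() or 1


def thread_coverage(workers: int, hardware_threads: int) -> float:
    """Fraction of hardware threads that can be kept busy (simplified model)."""
    return min(workers, hardware_threads) / hardware_threads


# A worker count hard-coded for an 8-thread workstation, run on a 40-thread server.
print(thread_coverage(8, 40))  # 0.2

# Sizing the pool from the machine itself covers every hardware thread.
threads_here = os.cpu_count() or 1
print(thread_coverage(worker_count(), threads_here))  # 1.0
```

Under this simplified model, a configuration carried over from an 8-thread machine would leave most of a 40-thread server idle, which is the kind of gap contributors closed between rounds.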

We're pleased that we're able to provide TechEmpower and the open source community with this hardware environment—and that it's the same type of hardware our customers use every day in production environments, making the results as valuable as possible. And we will eagerly await the results of Round 11 where we anticipate more significant performance leaps!

Round 10 of the Framework Benchmarks project is now available! It has been a little less than a year since the previous round and in that time, approximately 133 contributors have made 2,835 git commits. These contributions have improved the project's toolset and added many new framework test implementations.

We retired our in-house i7-2600K hardware environment for Round 10, and we changed our Amazon EC2 environment to c3.large instances. Meanwhile, the Peak R720 dual-Xeon E5 environment with 10-gigabit Ethernet is our default view for the results rendering.

Much of the effort in the past year has been focused on improving the toolset, allowing contributors to create their own test and development environment with less effort and to optionally focus on just the frameworks or platforms of interest to them. Between Round 9 and Round 10, we saw an average of 7 commits per day.

Round      Frameworks   Framework permutations
Round 9    ~105         205 configurations
Round 10   ~125         293 configurations

View Round 10 results now.

Round 10 notes and observations

  • Competition for the top position in the JSON-serialization test within the Peak environment has heated up so much that Round 10 sees a more than 100% increase in the top performance versus Round 9 (2.2M versus 1.05M). For Round 10, Lwan has taken the crown. But we expect the other top contenders won't leave this a settled matter. A year ago, we said one million HTTP responses per second without load balancing was easy. We're delighted that 1M is already old news.
  • Compiled languages such as C, C++, Java, Scala, and Ur continue to dominate most tests, and Lua retains its unique position of standard-bearer for JIT languages by showing up within the top 10 on many test types.
  • While Go has, if anything, slightly improved since Round 9, the increased competition means Go is not in the top-ten leaderboard within the Peak environment. Go remains a strong performer in the smaller-server scenario as demonstrated by our EC2 c3.large environment.
  • During our preview cycles on Round 10, we elected to remove SQLite tests, for the time being at least. SQLite tests miss the spirit of the database tests by avoiding network communication to a secondary server (a database server), making them a bit similar to our future caching-enabled test type. The SQLite tests may return once we have the caching test type specified and implemented.
  • The 2,835 git commits since Round 9 average out to 7 commits per day. The contributors to this project have been keeping very busy! Since Round 9, 675 issues were opened and 511 issues were closed. Of those issues, 441 pull requests were created, and 321 pull requests were merged, which is roughly one PR merged per day.
  • The project is now Vagrant-compatible to ease environment setup.
  • Travis CI integration allows contributors to get a "green light" on pull requests shortly after submission. The massive breadth and test coverage represented by this project has created an inordinate load on the servers Travis provides for free use by the open source community. Going forward, we are working with Travis to more intelligently narrow our workload based on the particulars of each PR. A great big thanks to Travis for being so tolerant of the crushing load we've created.
  • If you would like to contribute to the project, we've migrated documentation to ReadTheDocs.
  • Windows support has again fallen behind. We have received a great deal of Windows help in the past, but we don't have the internal capacity to keep it current with the evolution of the project. Round 10 does not include Windows tests, but we'd very much welcome any help catching Windows up for the next round.
  • For a bit of novelty, we are presently testing our benchmarks on a Raspberry Pi 2 Model B environment. If the results are interesting, we may include this environment in Round 11.
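
The figures in the notes above can be checked with a few lines of arithmetic. The sketch below uses only numbers quoted in this post; the helper names are hypothetical.

```python
def percent_increase(old: float, new: float) -> float:
    """Relative growth from old to new, as a percentage."""
    return (new - old) / old * 100.0


def share(part: int, whole: int) -> float:
    """part as a percentage of whole."""
    return part / whole * 100.0


# Top JSON-serialization result in the Peak environment, Round 9 vs Round 10.
print(f"{percent_increase(1_050_000, 2_200_000):.1f}%")  # 109.5%

# Issues closed out of issues opened, and PRs merged out of PRs created.
print(f"{share(511, 675):.1f}%")  # 75.7%
print(f"{share(321, 441):.1f}%")  # 72.8%
```

The first line confirms the "more than 100% increase" in the top JSON result.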

Change visualization

Hamilton Turner created a Gource video that illustrates the changes between Round 9 and Round 10.

Thanks!

A huge thank-you to Hamilton Turner, whose contributions to Round 10 are legion. He even referenced the project in his Ph.D. thesis!

A continued and special thanks to Peak Hosting for providing the dedicated hardware server environment we're using for our project. In a world that seems all too content to consider physical hardware as exceptional, we're living a life of multi-Xeon luxury. It is so choice; if you have the means, I highly recommend picking some up.

As always, we also want to thank everyone who has contributed to the project. We are at 572 forks on GitHub and counting. Considering we have only recently put serious effort into making the project approachable for contributors, we're super impressed by this number.

If you have questions, comments, criticism, or would like to contribute a new test or an improvement to an existing one, please join our Google Group, visit the project at GitHub, or come chat with us in #techempower-fwbm on Freenode.

About TechEmpower

We provide web and mobile application development services and are passionate about application performance. If this sounds interesting to you, we'd love to hear from you.

The latest round of our ongoing Framework Benchmarks project is now available! Round 9 updates several test implementations and introduces a new test environment using modern server hardware.

Since the first round, we have known that the highest-performance frameworks and platforms have been network-limited by our gigabit Ethernet. For Round 9, Peak Hosting has provided a high-performance environment equipped with 10-gigabit Ethernet at one of their data centers.

With ample network bandwidth, the highest-performing frameworks are further differentiated. Additionally, the Peak test environment uses servers with 40 HT cores across two Xeon E5 processors, and the results illustrate how well each framework scales to high parallelism on a single server node.

View Round 9 results now.

Round 9 notes and observations

In previous rounds, JSON serialization results for the highest-performance frameworks on our in-house i7 hardware converged on approximately 200,000 requests per second. Although some response headers are normalized, we believe that the small variations beyond 200,000 RPS are caused by how each framework uses the available bandwidth of our gigabit Ethernet.

For example, a larger number of response headers might mean slightly fewer responses sent per second. However, this is acceptable as we want each implementation to be production-grade and typical for the framework, not tuned to include only the response headers we require. Thorough normalization of behavior is not a goal. (More importantly, we recommend against making decisions based on such small variations. 200,000 versus 20,000 is notable; 210,000 versus 200,000 is probably not.)
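
A back-of-envelope calculation shows how response size could translate into throughput on a saturated link. This is a rough sketch, not a model of the test environment: it assumes the link is the only constraint and ignores request traffic, TCP/IP and Ethernet framing, and packet-rate limits. The 20 extra header bytes are a hypothetical figure.

```python
GIGABIT_BYTES_PER_SEC = 1_000_000_000 / 8  # 1 Gbit/s expressed in bytes per second


def bytes_budget_per_response(link_bytes_per_sec: float, responses_per_sec: float) -> float:
    """Bytes available per response if the link is fully used at this rate."""
    return link_bytes_per_sec / responses_per_sec


def max_responses_per_second(link_bytes_per_sec: float, bytes_per_response: float) -> float:
    """Upper bound on responses per second for a given on-wire response size."""
    return link_bytes_per_sec / bytes_per_response


# At 200,000 responses per second, a gigabit link leaves this many bytes each.
budget = bytes_budget_per_response(GIGABIT_BYTES_PER_SEC, 200_000)
print(budget)  # 625.0

# Hypothetical: a framework that sends 20 more bytes of headers per response.
lean = max_responses_per_second(GIGABIT_BYTES_PER_SEC, budget)
chatty = max_responses_per_second(GIGABIT_BYTES_PER_SEC, budget + 20)
print(round(lean), round(chatty))  # 200000 193798
```

Under these assumptions, a few extra header bytes cost roughly 3% of the ceiling, which is the scale of variation that should not drive a decision.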

[Figure: Top JSON results on i7 versus Peak. Top ten frameworks at JSON serialization; i7 on left, Peak on right.]

With 10-gigabit Ethernet, the results from the new Peak environment show a larger spread between top-performing frameworks. Of course, other variables are at play, such as significantly more HT processor cores--40 versus 8 for our i7 tests--and a different system architecture, NUMA.

The NUMA architecture presented a variety of challenges including unexpected difficulty scaling the database servers to utilize all available CPU cores. For example, we are using MySQL 5.5 in our database tests, but later versions of MySQL are reportedly better suited for NUMA. For this reason, we may migrate all MySQL tests to a more recent version in a future round. Having reviewed the results, we plan to use differently-equipped machines for Round 10. The hardware specifications used in Round 9 were not well-optimized for the web-app use case our project exercises.

As a result of the concessions made for NUMA, the performance delta for database tests between our i7 workstations and the 40-core Xeon servers is not as pronounced as for the non-database tests. We'd like to see this situation improve in future rounds. We would be happy to receive input and advice from readers with NUMA database deployment expertise.

Similarly, we expect that some application platforms are better suited for NUMA than others. For example, platforms that use process-level concurrency scaled to a large number of processor cores quite well. For JSON serialization, node.js performance in the Peak environment is about 3.3x that of i7. Meanwhile, Go's JSON serialization performance on Peak is only 1.6x that of i7. Even more interesting: Go's database performance is slightly lower on Peak than i7. (Yes, the Go tests use all CPU cores. Putting aside scaling as CPU cores increase, Go is quicker in the JSON serialization test than node.js in both environments.)
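
One crude way to read these multipliers is to compare each speedup with the increase in HT cores (40 versus 8). The sketch below does that with the figures quoted above. It is only an illustration: the two environments also differ in CPU model, clock speed, and network, so the ratio is not a pure measure of scaling, and the function name is hypothetical.

```python
def scaling_efficiency(speedup: float, thread_ratio: float) -> float:
    """Observed speedup as a fraction of the increase in hardware threads."""
    return speedup / thread_ratio


thread_ratio = 40 / 8  # Peak HT cores divided by i7 HT cores

# JSON serialization speedups on Peak relative to i7, as quoted in the text.
for name, speedup in [("node.js", 3.3), ("Go", 1.6)]:
    print(f"{name}: {scaling_efficiency(speedup, thread_ratio):.0%}")
# node.js: 66%
# Go: 32%
```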

C++ continues to dominate virtually all tests. If you are a web developer using C++, you win this particular bragging-rights game. With 6,738,911 plaintext RPS from this single server, the remarkable 188,585 Fortunes RPS might be overlooked at first, although it may be more impressive on consideration.

For Round 9, we added validation checks that confirm each implementation is behaving as we expect prior to measuring its performance. These checks are rigid--failing implementations for improper JSON response schema, for example--but ultimately a valuable tool to ensure fairness. We've fixed up many of the implementations to pass validation, but several remain to be fixed. If your favorite framework is showing up as "did not complete" at the bottom of the charts, we'd appreciate your help in correcting that for the next round. Visit the GitHub repository if you can lend a hand.
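
As an illustration of what a rigid schema check for the JSON test might look like, here is a minimal Python sketch. It is not the project's validator: the function name is hypothetical, and the expected key and value are assumptions made for the example.

```python
import json

# Assumed expected payload for the example; the real checks may differ.
EXPECTED_KEY = "message"
EXPECTED_VALUE = "Hello, World!"


def validate_json_response(body: str) -> list:
    """Return a list of problems; an empty list means the body passes."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError as exc:
        return [f"body is not valid JSON: {exc.msg}"]

    if not isinstance(payload, dict):
        return ["top-level value must be a JSON object"]

    problems = []
    if EXPECTED_KEY not in payload:
        problems.append(f"missing required key '{EXPECTED_KEY}'")
    elif payload[EXPECTED_KEY] != EXPECTED_VALUE:
        problems.append(f"unexpected value for '{EXPECTED_KEY}'")

    # Rigid by design: keys outside the schema also fail the check.
    extra = sorted(set(payload) - {EXPECTED_KEY})
    if extra:
        problems.append(f"unexpected keys: {', '.join(extra)}")
    return problems


print(validate_json_response('{"message":"Hello, World!"}'))  # []
print(validate_json_response('{"msg":"Hello, World!"}'))
```

Running a check like this before measuring means an implementation that returns the wrong shape is reported as failing rather than credited with a fast but incorrect result.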

Busy schedules delayed Round 9. We apologize for the delay and thank you for your continued interest and patience! We welcome you to join in for Round 10.

Thanks!

As always, thank you to all of the contributors to this project, especially those who helped us address validation errors for this latest round.

An extra special thanks to Peak Hosting for providing a no-joking-around, no-holds-barred, genuine server environment. It has been a treat to see these servers set massive new records.

If you have questions, comments, criticism, or would like to contribute a new test or an improvement to an existing one, please join our Google Group or visit the project at GitHub.
