
October 31, 2013

Framework Benchmarks Round 7

Happy Halloween, fans of web development frameworks! After a several-month hiatus, Round 7 of our project measuring the performance of web application frameworks and platforms is available!

View Round 7 results

Round 7 includes many new framework test implementations contributed by the community. They are Falcore, Grizzly, HttpListener, PHPixie, Plain, Racket-WS, Start, Stream, and Treefrog. There are now a whopping 84 frameworks and over 200 individual test permutations.

Many preexisting frameworks' tests have been updated to broaden test coverage, update dependencies, and tune their implementations. To date, the project has processed 344 pull requests from the community. Thanks so much for your contributions. We are grateful for your continued interest!

View Round 7 results now.

Round 7 notes and observations

  • The Round 6 champion Undertow (the web server for WildFly) continues to impress with chart-dominating showings such as 180,000 plaintext requests per second on meager m1.large instances.
  • Thanks to community contributions, the C# tests have been dramatically improved, especially when querying the database. We also have some SQL Server tests in our i7 environment.
  • A contributor prepared scripts for running the benchmark suite on Windows Azure. Unfortunately, we have been unable to reach the author of those scripts over the past several weeks. If any Azure experts are interested in picking up that work where it stands, please visit the GitHub repository or the Google Group for the project.
  • The high-performance tier has become significantly more crowded even during this project's relatively short history. Most interesting to us is how many frameworks can easily saturate our gigabit Ethernet with the JSON serialization and plaintext tests, even with our tests' intentionally small payloads (a back-of-the-envelope sketch of that ceiling follows this list). We do not have the hardware necessary to run 10 gigabit Ethernet tests, but if you have a 10 GbE lab and are willing to run the suite, we'd love to publish the results.
  • The benchmark toolset continues to mature gradually, but a lot of room for improvement still exists. A great deal of sanity-checking remains a manual process. If you're a Python programmer and are interested in this project, let us know. We have several enhancements we'd like to make to the benchmark toolset (Python scripts), time permitting.
  • This round used a community-review model wherein project participants were able to review the preliminary results we captured in our i7 environment and submit pull requests. The model is not perfect and will need to improve with each round, but it will help reduce the amount of time we (TechEmpower) need to allocate to each round's sanity checks, meaning quicker turnaround of rounds (see how I spun that as a good thing?).
  • Starting now, we aim to be on a monthly cycle of running official rounds. This helps reduce the perceived severity of configuration problems since they can be addressed in the next run, which is only a month away.
  • We've also pushed test display names into the project's configuration, allowing contributors to assign any name they choose to their test permutations, e.g., "play-scala-anorm" and "aspnet-mvc-mono."
  • One particularly interesting anomaly is the dominance of Windows paired with Mongo on EC2 in the Updates test. The performance is only slightly lower than the same pairing on i7, whereas in most cases our i7s (2600K workstations, to be precise) and EC2 (m1.large) instances differ by a factor of seven or more. It's possible the Windows EC2 instance is running on a newer host than the Linux EC2 instance, but both are classified as m1.large.
  • Speaking of database tests, in previous rounds we had used an SSD to host the databases. Prior to finishing Round 7, that SSD failed, so Round 7 was run with ramdisk-backed databases (excluding SQL Server). This project is not a database benchmark, so we thought it would be fascinating to see the performance of the full stack when the friction of database writes is reduced to a bare minimum. As confirmed by our earlier spot checks in Round 5, database writes are roughly 20% to 30% faster across the board on a ramdisk than on the Samsung 840 Pro SSD we had been using. As expected, reads are unaffected, since the tests are designed to allow the database engine to fit the entire data set in memory.

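As a rough illustration of that network ceiling, the sketch below estimates the response rate at which very small HTTP responses saturate a gigabit link. The 160-byte response size is an assumed figure chosen for illustration only; actual response sizes vary slightly by framework.

```python
# Back-of-the-envelope estimate of when small HTTP responses saturate
# gigabit Ethernet. The 160-byte response size is an assumed figure for
# illustration; real response sizes vary by framework.
LINK_BITS_PER_SEC = 1_000_000_000   # gigabit Ethernet, ignoring framing overhead
RESPONSE_BYTES = 160                # assumed: status line + headers + "Hello, World!"

max_responses_per_sec = LINK_BITS_PER_SEC / 8 / RESPONSE_BYTES
print(f"~{max_responses_per_sec:,.0f} responses/sec before the wire becomes the bottleneck")
# Roughly 780,000 responses per second, which is why the fastest frameworks
# bunch up near a common ceiling in the plaintext and JSON tests.
```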

As always, we'd like to say thank you to all of the contributors who have added test implementations for new frameworks or improved existing implementations. Round 7 was unusually long, so we also thank everyone for their patience.

The contributors for Round 7 are numerous. In no particular order: @fernandoacorreia, @kppullin, @MalcolmEvershed, @methane, @KevinHoward, @huntc, @lucassp, @dracony, @weltermann17, @kekekeks, @fwbrasil, @treefrogframework, @yogthos, @oberhamsi, @purplefox, @yz0075, @necaris, @pdonald, @Kepinator, @DavidBadura, @zznate, @nightlyone, @jeapostrophe, @astaxie, @troytoman, @grob, @torhve, @trautonen, @stuartwdouglas, and @xaxaxa. Sincere apologies if we forgot anyone!

If you have questions, comments, criticism, or would like to contribute a new test or an improvement to an existing one, please join our Google Group or visit the project at GitHub.

About TechEmpower

We provide web and mobile application development services and are passionate about application performance. Read more about what we do.

July 2, 2013

Frameworks Round 6

July marks the fourth month of our ongoing project measuring the performance of web application frameworks and platforms. We've just posted Round 6, which includes several more framework test implementations contributed by the developer community: Beego, Dart, Hapi, Jester, Luminus, Nancy, Yaf, Plack, Play-Slick, and Undertow.

View Round 6 results

The results web site has been improved with test-type and hardware-type navigation, allowing you to share links to a specific results chart, such as Round 6, Fortunes on EC2.

By popular demand, Round 6 introduces a plaintext test that uses HTTP pipelining, implemented in 14 frameworks so far. Previously, the most fundamental test was JSON serialization of a very small object. The new plaintext test demonstrates the extremely high request routing throughput possible on very high-performance platforms.

View Round 6 results now.

Round 6 notes and observations

  • HTTP pipelining is the star of the new plaintext test. With pipelining enabled, the capacity of gigabit Ethernet is used much more efficiently than with plain HTTP keep-alive. The wrk benchmark tool reports a peak transfer rate of approximately 95 megabytes per second with pipelining and 35 megabytes per second without.
  • Tiny variations in response payload can appear as significant variations in requests per second in the new pipelining plaintext test. We caution all readers to review the source code of each test when interpreting the plaintext numbers. We will work to further normalize the implementations' response headers for Round 7.
  • The final plaintext data was captured from wrk configured to send 16 requests per pipeline. In preliminary tests, we experimented with more requests per pipeline, but doing so caused socket write buffer overflows. The wrk tool was later enhanced to allow larger write buffers, but we ultimately decided to retain a configuration of 16 requests per pipeline (a minimal sketch of pipelining follows this list).
  • All of the other tests (JSON, Fortunes, database read and write tests) are still run without HTTP pipelining. However, we spot-tested the JSON serialization test with pipelining enabled. Performance was essentially identical to the plaintext tests, confirming earlier suspicions that the impact of small JSON serialization workloads on high-performance platforms is trivial compared to the HTTP network traffic. (Incidentally, a later test will exercise larger JSON workloads.)
  • Also in response to popular demand, we ran the plaintext test with higher client-side concurrency levels than other tests. Where other tests are run at levels up to 256, the plaintext test runs at 256, 1024, 4096, and 16384 concurrency. However, most servers were already CPU or network limited at 256 concurrency, so it is not surprising that the results are fairly even across these higher concurrency levels.
  • The results web site now includes a "framework overhead" chart inspired by comments from Hacker News user goodwink. This new chart type compares frameworks versus their underlying platform (e.g., Unfiltered versus Netty; Spring versus Servlet; Rails versus Rack). The theoretical ideal is 100%, meaning the framework imparts no performance penalty compared to its platform. However, in some cases, due to custom core components in a framework or particulars of the implementation, a framework may exceed 100%. For example, in the multiple query test, Revel currently exceeds Go by a considerable margin. This seems implausible on the surface, but it's repeatable. We look forward to hearing from Revel and Go experts with explanations and pull requests to apply for Round 7.
  • If you've enjoyed this project so far, we'd like your opinion on what test to include next. Take a look at the list of test ideas we have collected to date.

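For readers unfamiliar with pipelining, the idea is that the client writes a batch of requests onto a single keep-alive connection and only then reads the responses, so the connection never sits idle waiting on a round trip between requests. The snippet below is a minimal sketch of that idea using raw sockets against a hypothetical server on localhost:8080; it illustrates the concept and is not how wrk itself is implemented.

```python
# Minimal sketch of HTTP pipelining: write a batch of requests on one
# keep-alive connection, then read the responses. Assumes a hypothetical
# server on localhost:8080; this is not wrk's implementation.
import socket

PIPELINE_DEPTH = 16  # matches the 16 requests per pipeline used in Round 6

request = (
    b"GET /plaintext HTTP/1.1\r\n"
    b"Host: localhost\r\n"
    b"Connection: keep-alive\r\n\r\n"
)

with socket.create_connection(("localhost", 8080)) as conn:
    # Send all of the requests back-to-back without waiting for responses...
    conn.sendall(request * PIPELINE_DEPTH)
    # ...then read until every response has arrived. A real client would
    # parse each response; here we only count status lines.
    received = b""
    while received.count(b"HTTP/1.1 200") < PIPELINE_DEPTH:
        chunk = conn.recv(65536)
        if not chunk:
            break
        received += chunk
    print(f"{len(received)} bytes across {PIPELINE_DEPTH} pipelined responses")
```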

Huge thanks to Will Glozer, who created a special branch of his wrk benchmarking tool with HTTP pipeline support for use in this project. Not only is wrk an excellent tool; @wg is an awesome developer whose contributions we greatly appreciate!

In addition to test implementations in new frameworks, the community continued to improve previous tests. In particular, a great deal of effort went into reviewing the ASP.NET MVC tests. Special thanks to MalcolmEvershed and kppullin for their contributions here.

Thanks to several others for many contributions (in no particular order): robfig (Revel and Go), pdonald (additional Windows and ASP.NET work), jrudolph (Spray), sirthias (Spray), oberhamsi (Ringo), methane (Python), nraychaudhuri (Play), dom96 (Jester), xaxaxa (CPPSP), JulienSchmidt (Go), moxford (Erlang), gjuric (Symfony2), stuartwdouglas (Undertow), avaly (Express and Hapi), yogthos (Luminus), vividsnow (Perl), amarsahinovic (Beego), LekisS (Kohana), cmircea (ServiceStack), bradfitz (Go), fruit (YAF).

If you have questions, comments, criticism, or would like to contribute a new test or an improvement to an existing one, please join our Google Group or visit the project at GitHub.


May 17, 2013

Frameworks Round 5

We have posted Round 5 of our ongoing project measuring the performance of web application frameworks and platforms. In this round, we're very happy to announce that a community member has contributed tests for ASP.NET running on native Windows.

View Round 5 results

We've included Windows on EC2 results as a separate set of data within the results view but caution that the results should be considered preliminary. This is Round 5, but it's only the first round with Windows results. We'd welcome any corrections for the Windows configuration to improve the results in the next round.

Additionally, we have added a new test type focused on writes, wherein a variable number of database updates are executed per request. This test thoroughly punishes the database server to a degree not seen in any of our previous tests. It is uniquely I/O-limited, whereas previous tests have been predominantly CPU-limited and, in some circumstances, network-limited.

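For the curious, the per-request work in the new test has roughly the following shape: fetch N random rows, modify one column on each, and write each row back. The sketch below uses SQLite only to keep the example self-contained and runnable, and the World table here simply mirrors the shape of the benchmark's schema (an id and a random number per row); the real implementations use the suite's MySQL, PostgreSQL, or MongoDB back ends and their own drivers.

```python
# Rough sketch of the per-request work in the database updates test: read N
# random rows, modify each, and write each one back. SQLite is used here only
# to keep the example self-contained; the real tests run against MySQL,
# PostgreSQL, or MongoDB.
import random
import sqlite3

def handle_updates_request(conn, n):
    cur = conn.cursor()
    rows = []
    for _ in range(n):
        row_id = random.randint(1, 10_000)
        cur.execute("SELECT randomNumber FROM World WHERE id = ?", (row_id,))
        (_old_value,) = cur.fetchone()
        new_value = random.randint(1, 10_000)
        cur.execute("UPDATE World SET randomNumber = ? WHERE id = ?", (new_value, row_id))
        rows.append({"id": row_id, "randomNumber": new_value})
    conn.commit()
    return rows  # serialized to JSON by the web framework in the real tests

if __name__ == "__main__":
    # Build a small in-memory World table so the sketch runs on its own.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE World (id INTEGER PRIMARY KEY, randomNumber INTEGER)")
    conn.executemany(
        "INSERT INTO World (id, randomNumber) VALUES (?, ?)",
        [(i, random.randint(1, 10_000)) for i in range(1, 10_001)],
    )
    # "20 updates" means 40 statements in total: 20 SELECTs plus 20 UPDATEs.
    print(len(handle_updates_request(conn, 20)), "rows updated")
```
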
View Round 5 results now.

Round 5 notes and observations

  • The new Windows results, while respectable, do not match the Linux tests. We suspect there is room for improvement and we would be particularly interested in community recommendations for tuning the performance of the Windows environment. Note that the Windows configuration presently retains the Linux-hosted database server—only the web/application server is running Windows Server 2012. And yes, we would like to include Microsoft SQL Server in the future. Also, a future round will include Windows results on our dedicated hardware.
  • We outfitted the early-2011 vintage i7 workstation we are using as a database server in our dedicated-hardware tests with a Samsung 840 Pro solid state disk in preparation for running our new database updates test. Previous rounds' i7 tests were run using a traditional magnetic hard drive, although those tests were read-focused and therefore the disk was scarcely involved in fulfilling requests. As a result, you'll notice the other database tests are essentially unchanged versus Round 4. Equipping the i7 workstation with a high-performance SSD has opened a large performance gulf between our physical hardware and EC2 m1.large in the database updates test.
  • We ran a bunch of spot-tests of the database updates test with various configurations, some scarcely suitable for a production server.
    • Prior to installing the SSD, the i7 hardware with traditional magnetic disks was only two to three times quicker than the EC2 instances.
    • We experimented with hosting a MySQL database on a ramdisk to observe performance with I/O operations reduced to a bare minimum. Surprisingly, performance was only about 30% higher than with the SSD.
    • MySQL with MyISAM was substantially faster at writes than with InnoDB, but the official Round 5 results use InnoDB.
  • Note that our database updates test at "20 updates" is actually executing 40 total database statements: 20 queries to fetch random rows and 20 updates to persist the same rows after one small modification each.
  • At the recommendation of the community, we have modified the HTTP request payload to include several commonplace headers, such as a complex User-Agent and some cookies (though none that would be considered session identifiers); an illustrative header set follows this list. One particularly interesting outcome of having done so: the plain Servlet implementation of the Fortunes test, which is implemented using JSP, suffers a significant performance penalty. We tentatively suspect this is related to optimistic parsing of the request's cookies.
  • The larger HTTP request payload has somewhat leveled the peak JSON response rates on our i7 hardware. The JSON i7 leaderboard has shuffled slightly as a result. The Finagle test implementation in particular was extensively modified to follow Finagle best practices, which are oriented not toward eking out the highest possible performance at all costs but toward other measures of code quality.
  • Some frameworks reacted to the new request headers in unexpected ways. Kohana, for example, refused to process requests that provided cookies it was not expecting to receive, responding with "A valid cookie salt is required. Please set Cookie::$salt." Configuration tweaks are likely required. Kohana has been temporarily hidden from view since all of its results measured 0.
  • As with previous rounds, Round 5 adds some frameworks to the test suite: Spray, RestExpress, Web::Simple, Revel, and CPPSP. It also includes some updates such as Play 2.1.2-RC1, Go 1.1 final, and Python 2.7.4. If you're a maintainer of a framework we're testing and you've released a new version, please let us know so that we can get the tests updated.
  • This round also includes Flask on PyPy, which is our first Python test running on PyPy.
  • When you apply filters to the results view, the URL should change to capture your settings; you can share specific comparisons with friends and colleagues.
  • Some other glitches affected tests in Round 5. The ASP.NET contribution included both native Windows and Linux/Mono tests, but unfortunately, the Mono tests are not yet working correctly. The Windows configuration includes the Go test, but a configuration problem prevents the database portions of the Go test from completing. On Windows, the Go database tests show 0 requests per second. The new RingoJS + Mongo test wasn't able to complete. We're optimistic these issues can be identified and resolved before the next round.

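To give a feel for the payload change described above, here is the kind of header set we now attach to every benchmark request. The specific values below are stand-ins invented for this illustration; the exact headers used by the suite are defined in the benchmark toolset in the GitHub repository.

```python
# Illustrative only: headers of the kind now sent with every benchmark
# request. These values are stand-ins for this example; the exact header set
# lives in the benchmark toolset in the GitHub repository.
EXAMPLE_HEADERS = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.5",
    "Connection": "keep-alive",
    # A "complex" User-Agent, similar in shape to a real browser's:
    "User-Agent": (
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/30.0.1599.101 Safari/537.36"
    ),
    # A few cookies, none of which act as session identifiers:
    "Cookie": "uid=12345; site_pref=blue; last_visit=2013-05-17",
}
```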

Thank you again to the developer community for the numerous contributions in this and previous rounds. The magnitude of the project is a testament to the development community's continued participation and assistance in broadening the tests, both in framework coverage and test types. We are especially delighted by the response and involvement we've observed from framework maintainers. If you are a framework maintainer, we'd love to hear from you if you have questions, recommendations, or complaints.

This round's special shout-out goes to @pdonald, the contributor of the ASP.NET tests on both Windows and Mono. Have a look at the size of those pull requests!

If you have questions, comments, or criticism, please contact us on our Google Group or at GitHub.
