Blog

July 2, 2013

Frameworks Round 6

July marks the fourth month of our ongoing project measuring the performance of web application frameworks and platforms. We've just posted Round 6, which includes ten more community-provided framework test implementations: Beego, Dart, Hapi, Jester, Luminus, Nancy, Yaf, Plack, Play-Slick, and Undertow.

View Round 6 results

The results web site has been improved with test-type and hardware-type navigation, allowing you to share links to a specific results chart, such as Round 6, Fortunes on EC2.

By popular demand, Round 6 introduces a plaintext test that uses HTTP pipelining, implemented in 14 frameworks so far. Previously, the most fundamental test was JSON serialization of a very small object. The new plaintext test demonstrates the extremely high request routing throughput possible on very high-performance platforms.
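
To make the mechanics concrete, below is a minimal sketch of what pipelining looks like at the socket level: several requests written back-to-back on one connection before any responses are read. This is a plain Java illustration, not the benchmark's actual client; the host, port, and path are placeholders.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.io.OutputStream;
    import java.net.Socket;

    public class PipelineSketch {
        public static void main(String[] args) throws Exception {
            try (Socket socket = new Socket("server.example", 8080)) {
                StringBuilder batch = new StringBuilder();
                for (int i = 0; i < 16; i++) {  // 16 requests per pipeline, as in our test
                    batch.append("GET /plaintext HTTP/1.1\r\n")
                         .append("Host: server.example\r\n");
                    if (i == 15) {
                        batch.append("Connection: close\r\n");  // lets the read loop terminate
                    }
                    batch.append("\r\n");
                }
                // Write all 16 requests before reading a single response.
                OutputStream out = socket.getOutputStream();
                out.write(batch.toString().getBytes("US-ASCII"));
                out.flush();
                // The 16 responses then arrive, in order, on the same connection.
                BufferedReader in = new BufferedReader(
                        new InputStreamReader(socket.getInputStream(), "US-ASCII"));
                for (String line; (line = in.readLine()) != null; ) {
                    System.out.println(line);
                }
            }
        }
    }

Compared to keep-alive alone, this amortizes round-trip latency across the whole batch, which is why wire utilization improves so dramatically.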

View Round 6 results now.

Round 6 notes and observations

  • HTTP pipelining is the star of the new plaintext test. With pipelining enabled, the capacity of gigabit Ethernet is used far more efficiently than with plain HTTP keep-alive. The wrk benchmark tool reports a peak transfer rate of approximately 95 megabytes per second with pipelining and 35 megabytes per second without.
  • Tiny variations in response payload can appear as significant variations in requests per second in the new pipelining plaintext test. We caution all readers to review the source code of each test when interpreting the plaintext numbers. We will work to further normalize the implementations' response headers for Round 7.
  • The final plaintext data was captured from wrk configured to send 16 requests per pipeline. In preliminary tests, we experimented with more requests per pipeline, but doing so caused socket write buffer overflows. The wrk tool was later enhanced to allow larger write buffers, but we ultimately decided to retain the 16-requests-per-pipeline configuration.
  • All of the other tests (JSON, Fortunes, database read and write tests) are still run without HTTP pipelining. However, we spot-tested the JSON serialization test with pipelining enabled. Performance was essentially identical to the plaintext test, confirming earlier suspicions that the cost of small JSON serialization workloads on high-performance platforms is trivial compared to the HTTP network traffic; a sketch of this kind of JSON handler follows this list. (Incidentally, a later test will exercise larger JSON workloads.)
  • Also in response to popular demand, we ran the plaintext test with higher client-side concurrency levels than other tests. Where other tests are run at levels up to 256, the plaintext test runs at 256, 1024, 4096, and 16384 concurrency. However, most servers were already CPU or network limited at 256 concurrency, so it is not surprising that the results are fairly even across these higher concurrency levels.
  • The results web site now includes a "framework overhead" chart inspired by comments from Hacker News user goodwink. This new chart type compares frameworks against their underlying platform (e.g., Unfiltered versus Netty; Spring versus Servlet; Rails versus Rack). The theoretical ideal is 100%, meaning the framework imparts no performance penalty compared to its platform. However, in some cases, due to custom core components or implementation particulars, a framework may exceed 100%. For example, in the multiple query test, Revel currently exceeds Go by a considerable margin. This seems implausible on the surface, but it's repeatable. We look forward to hearing from Revel and Go experts with explanations and pull requests we can apply for Round 7.
  • If you've enjoyed this project so far, we'd like your opinion on what test to include next. Take a look at the list of test ideas we have collected to date.
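
As noted above, here is a minimal sketch of the kind of handler the JSON serialization test exercises, written as a plain Servlet using Jackson. The class and field names are illustrative, not taken from the benchmark's source.

    import java.io.IOException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    import com.fasterxml.jackson.databind.ObjectMapper;

    // Serializes one tiny object per request. The workload is so small that
    // HTTP traffic, not serialization, dominates on fast platforms.
    public class JsonSketchServlet extends HttpServlet {
        private static final ObjectMapper MAPPER = new ObjectMapper();

        static final class Message {
            public String message = "Hello, World!";  // the entire payload
        }

        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws IOException {
            resp.setContentType("application/json");
            MAPPER.writeValue(resp.getOutputStream(), new Message());
        }
    }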

Thanks!

Huge thanks to Will Glozer, who created a special branch of his wrk benchmarking tool with HTTP pipeline support for use in this project. Not only is wrk an excellent tool; @wg is an awesome developer whose contributions we greatly appreciate!

In addition to test implementations in new frameworks, the community continued to improve existing tests. In particular, a great deal of effort went into reviewing the ASP.NET MVC tests. Special thanks to MalcolmEvershed and kppullin for their contributions here.

Thanks to several others for many contributions (in no particular order): robfig (Revel and Go), pdonald (additional Windows and ASP.NET work), jrudolph (Spray), sirthias (Spray), oberhamsi (Ringo), methane (Python), nraychaudhuri (Play), dom96 (Jester), xaxaxa (CPPSP), JulienSchmidt (Go), moxford (Erlang), gjuric (Symfony2), stuartwdouglas (Undertow), avaly (Express and Hapi), yogthos (Luminus), vividsnow (Perl), amarsahinovic (Beego), LekisS (Kohana), cmircea (ServiceStack), bradfitz (Go), fruit (YAF).

If you have questions, comments, criticism, or would like to contribute a new test or an improvement to an existing one, please join our Google Group or visit the project at Github.

May 17, 2013

Frameworks Round 5

We have posted Round 5 of our ongoing project measuring the performance of web application frameworks and platforms. In this round, we're very happy to announce that a community member has contributed tests for ASP.NET running on native Windows.

View Round 5 results

We've included Windows-on-EC2 results as a separate set of data within the results view but caution that they should be considered preliminary. This is Round 5, but it's only the first round with Windows results. We'd welcome any corrections to the Windows configuration to improve the results in the next round.

Additionally, we have added a new test type focused on writes, wherein a variable number of database updates are executed per request. This test thoroughly punishes the database server to a degree not seen in any of our previous tests. It is uniquely I/O limited, where previous tests have been predominantly CPU limited and, in some circumstances, network limited.
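
For the curious, the shape of the new test is roughly as follows, assuming the project's usual "World" table of 10,000 rows with id and randomNumber columns. This is a JDBC sketch with connection setup elided, not the exact code of any test implementation.

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.util.concurrent.ThreadLocalRandom;

    public final class UpdatesSketch {
        // Executes `count` query/update pairs: fetch a random row, change its
        // random number, and write it back. At count = 20 that is 40 statements.
        static void runUpdates(Connection conn, int count) throws SQLException {
            try (PreparedStatement query = conn.prepareStatement(
                     "SELECT id, randomNumber FROM World WHERE id = ?");
                 PreparedStatement update = conn.prepareStatement(
                     "UPDATE World SET randomNumber = ? WHERE id = ?")) {
                for (int i = 0; i < count; i++) {
                    int id = ThreadLocalRandom.current().nextInt(1, 10001);
                    query.setInt(1, id);
                    try (ResultSet rs = query.executeQuery()) {
                        rs.next();  // position on the fetched row
                    }
                    update.setInt(1, ThreadLocalRandom.current().nextInt(1, 10001));
                    update.setInt(2, id);
                    update.executeUpdate();  // the write that punishes the disk
                }
            }
        }
    }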

View Round 5 results now.

Round 5 notes and observations

  • The new Windows results, while respectable, do not match the Linux tests. We suspect there is room for improvement and we would be particularly interested in community recommendations for tuning the performance of the Windows environment. Note that the Windows configuration presently retains the Linux-hosted database server—only the web/application server is running Windows Server 2012. And yes, we would like to include Microsoft SQL Server in the future. Also, a future round will include Windows results on our dedicated hardware.
  • In preparation for the new database updates test, we outfitted the early-2011 i7 workstation that serves as the database server in our dedicated-hardware tests with a Samsung 840 Pro solid-state disk. Previous rounds' i7 tests were run using a traditional magnetic hard drive, but those tests were read-focused, so the disk was scarcely involved in fulfilling requests. As a result, you'll notice the other database tests are essentially unchanged versus Round 4. Equipping the i7 workstation with a high-performance SSD has opened a large performance gulf between our physical hardware and EC2 m1.large in the database updates test.
  • We ran a bunch of spot-tests of the database updates test with various configurations, some scarcely suitable for a production server.
    • Prior to installing the SSD, the i7 hardware with traditional magnetic disks was only two to three times quicker than the EC2 instances.
    • We experimented with hosting the MySQL database on a ramdisk to observe performance with I/O reduced to a bare minimum. Surprisingly, performance was only about 30% higher than with the SSD.
    • MySQL with MyISAM was substantially faster at writes than with InnoDB, but the official Round 5 results use InnoDB.
  • Note that our database updates test at "20 updates" actually executes 40 database statements per request: 20 queries to fetch random rows and 20 updates to persist the same rows after one small modification each.
  • At the recommendation of the community, we have modified the HTTP request payload to include several commonplace headers, such as a complex User-Agent and some cookies (though none that would be considered session identifiers). One particularly interesting outcome of having done so: the plain Servlet implementation of the Fortunes test, which renders its view with JSP, suffers a significant performance penalty. We tentatively suspect this is related to optimistic parsing of the request's cookies.
  • The larger HTTP request payload has somewhat normalized the peak JSON response rate on our i7 hardware. The JSON i7 leaderboard has shuffled slightly as a result. The Finagle test implementation in particular was extensively modified to follow Finagle best practices, which are oriented not toward eking out the highest performance at all costs, but toward other measures of code quality.
  • Some frameworks reacted to the new request headers in unexpected ways. Kohana, for example, refused to process requests that provided cookies it was not expecting, responding with "A valid cookie salt is required. Please set Cookie::$salt." Configuration tweaks are likely required. Kohana has been temporarily hidden from view since all of its results measured 0.
  • As with previous rounds, Round 5 adds some frameworks to the test suite: Spray, RestExpress, Web::Simple, Revel, and CPPSP. It also includes some updates such as Play 2.1.2-RC1, Go 1.1 final, and Python 2.7.4. If you're a maintainer of a framework we're testing and you've released a new version, please let us know so that we can get the tests updated.
  • This round also includes Flask on PyPy, which is our first Python test running on PyPy.
  • When you apply filters to the results view, the URL should change to capture your settings; you can share specific comparisons with friends and colleagues.
  • Some other glitches affected tests in Round 5. The ASP.NET contribution included both native Windows and Linux/Mono tests, but unfortunately, the Mono tests are not yet working correctly. The Windows configuration includes the Go test, but a configuration problem prevents the database portions from completing, so the Go database tests show 0 requests per second on Windows. The new RingoJS + Mongo test wasn't able to complete. We're optimistic these issues can be identified and resolved before the next round.

Thanks!

Thank you again to the developer community for the numerous contributions in this and previous rounds. The magnitude of the project is a testament to the development community's continued participation and assistance in broadening the tests, both in framework coverage and test types. We are especially delighted by the response and involvement we've observed from framework maintainers. If you are a framework maintainer, we'd love to hear from you if you have questions, recommendations, or complaints.

This round's special shout-out goes to @pdonald, the contributor of the ASP.NET tests on both Windows and Mono. Have a look at the size of those pull requests!

If you have questions, comments, or criticism, please contact us on our Google Group or at Github.

May 2, 2013

Frameworks Round 4

We’ve posted Round 4 of our ongoing project measuring the performance of many web application frameworks and platforms. As with previous rounds, the developer community has contributed several additional frameworks for Round 4, bringing the total to 57! This round adds Bottle (Python), Dancer (Perl), Kelp (Perl), MicroMVC (PHP), Mojolicious (Perl), Phalcon (PHP), RingoJS (JavaScript), Spark (Java), and Wai (Haskell).

View Round 4 results

To contend with the huge number of tests, we’ve added filtering to the results view, allowing you to hide frameworks that do not meet your needs. You can filter by classification (full-stack, micro-framework, or platform); language; platform; front-end server; database server; ORM classification; and implementation approach.

Additionally, we’ve added our fourth test type, “Fortunes,” which exercises server-side templates and collections. The Fortunes test is implemented in 17 frameworks, including most mainstream contenders.

View Round 4 results now.

Round 4 notes and observations

  • When viewing the data tables, you will notice that some frameworks show up more than once. Check the attribute columns at the right to compare the various test permutations. For example, in some cases a framework is tested both with and without its ORM (“full ORM” versus “Raw”).
  • Onion, a C platform contributed by Coralbits, is now included in the EC2 JSON serialization test, and the results are extremely impressive, clocking in at over 52,000 JSON responses per second.
  • Thanks to work by Brad Fitzpatrick and others in the Go community, Go 1.1 is now a champion of database connectivity. In the previous round, Go exhibited serious problems with highly concurrent database utilization. Understatement: the Round 4 numbers for Go are impressive.
  • The new Fortunes test exercises server-side templates and collections. We kept the payload small so that the test does not run into a gigabit Ethernet wall. Thanks to contributors, the test is already implemented in 17 of the frameworks. The initial results in this round are certainly interesting, but we expect later rounds to be still more interesting as additional implementations arrive. A surprising performance pain point for some implementations is escaping untrusted text. To create the plain Servlet implementation, we initially used Apache Commons’ StringEscapeUtils but found its performance lacking. Since the Servlet implementation uses JSP as its view, we replaced StringEscapeUtils with JSTL’s “out” tag (see the sketch after this list).
  • Assigning attributes such as platform, ORM type, and classification (full-stack, micro, or platform) was an interesting challenge. We are not certain we have every framework described properly, so please contact us if you think we’ve mischaracterized any framework’s attributes.
  • Django now connects to Postgres in this round, making this the first Postgres test in the suite. We plan to add more in time. Django performance is slightly improved on Postgres versus MySQL, presumably thanks to the use of a connection pool. However, the “Stripped” implementation of Django, which still uses MySQL, performs slightly better than the new Django-Postgres test.
  • We’ve dropped the line charts in this round. We felt they did not add tremendous value to the site.
  • We have not yet received the pull request with .NET tests. We hope they arrive soon so we can include them in Round 5.
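
As mentioned in the Fortunes note above, here is a minimal sketch of the programmatic escaping path whose cost surprised us. In the JSP view we rely instead on JSTL's <c:out> tag, which escapes markup by default; the input string here is illustrative.

    import org.apache.commons.lang.StringEscapeUtils;

    public class EscapeSketch {
        public static void main(String[] args) {
            // One fortune's message deliberately contains markup to verify escaping.
            String untrusted = "<script>alert(\"fortune\")</script>";
            // Commons Lang's escapeHtml proved to be a performance pain point
            // when called once per row of the Fortunes table.
            System.out.println(StringEscapeUtils.escapeHtml(untrusted));
            // prints: &lt;script&gt;alert(&quot;fortune&quot;)&lt;/script&gt;
        }
    }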

Thanks!

Once again, thank you to all of the readers who have contributed tests and feedback. A special shout-out to Skamander, whose contributions are numerous; just check the Github pull request history!

If you have questions, comments, or criticism, please contact us on our Google Group or at Github!

About TechEmpower

We provide web and mobile application development services and are passionate about application performance. Read more about what we do.