Framework Benchmarks Round 1

March 28, 2013

Nate Brady

How much does your framework choice affect performance? The answer may surprise you.

Authors’ Note: We’re using the word “framework” loosely to refer to platforms, micro-frameworks, and full-stack frameworks. We have our own personal favorites among these frameworks, but we’ve tried our best to give each a fair shot.

Show me the winners!

We know you’re curious (we were too!) so here is a chart of representative results.

Peak JSON responses per second, EC2 large

Whoa! Netty, Vert.x, and Java servlets are fast, but we were surprised how much faster they are than Ruby, Django, and friends. Before we did the benchmarks, we were guessing there might be a 4x difference. But a 40x difference between Vert.x and Ruby on Rails is staggering. And let us simply draw the curtain of charity over the Cake PHP results.

If these results were surprising to you, too, then read on so we can share our methodology and other test results. Even better, maybe you can spot a place where we mistakenly hobbled a framework and we can improve the tests. We’ve done our best, but we are not experts in most of them so help is welcome!

Motivation

Among the many factors to consider when choosing a web development framework, raw performance is easy to objectively measure. Application performance can be directly mapped to hosting dollars, and for a start-up company in its infancy, hosting costs can be a pain point. Weak performance can also cause premature scale pain, user experience degradation, and associated penalties levied by search engines.

What if building an application on one framework meant that at the very best your hardware is suitable for one tenth as much load as it would be had you chosen a different framework? The differences aren’t always that extreme, but in some cases, they might be. It’s worth knowing what you’re getting into.

Simulating production environments

For this exercise, we aimed to configure every framework according to the best practices for production deployments gleaned from documentation and popular community opinion. Our goal is to approximate a sensible production deployment as accurately as possible. For each framework, we describe the configuration approach we’ve used and cite the sources recommending that configuration.

We want it all to be as transparent as possible, so we have posted our test suites on GitHub.

Results

We ran each test on EC2 and our i7 hardware. See Environment Details below for more information.

JSON serialization test

First up is plain JSON serialization on Amazon EC2 large instances. This is repeated in the introduction, above.

The high-performance Netty platform takes a commanding lead for JSON serialization on EC2. Since Vert.x is built on Netty, it too achieved full saturation of the CPU cores and impressive numbers. In third place is plain Java Servlets running on Caucho’s Resin Servlet container. Plain Go delivers the best showing for a non-JVM framework.

We expected a fairly wide field, but we were surprised to see results that span four orders of magnitude.

Dedicated hardware

Here is the same test on our Sandy Bridge i7 hardware.

On our dedicated hardware, plain Servlets take the lead with over 210,000 requests per second. Vert.x remains strong but tapers off at higher concurrency levels despite being given eight workers, one for each HT core.

Database access test (single query)

How many requests can be handled per second if each request is fetching a random record from a data store? Starting again with EC2.

For database access tests, we considered dropping Cake to constrain our EC2 costs. This test exercises the database driver and connection pool and illustrates how well each scales with concurrency. Compojure makes a respectable showing but plain Servlets paired with the standard connection pool provided by MySQL is strongest at high concurrency. Gemini is using its built-in connection pool and lightweight ORM.

It’s worth pausing to appreciate that this shows an EC2 Large instance can query a remote MySQL instance at least 8,800 times per second, putting aside the additional work of each query being part of an HTTP request and response cycle.

Dedicated hardware

The dedicated hardware impresses us with its ability to process nearly 100,000 requests per second with one query per request. JVM frameworks are especially strong here thanks to JDBC and efficient connection pools. In this test, we suspect Vert.x is being hit very hard by its connectivity to MongoDB. We are especially interested in community feedback related to tuning these MongoDB numbers.

Database access test (multiple queries)

The following tests are all run at 256 concurrency and vary the number of database queries per request. The tests are 1, 5, 10, 15, and 20 queries per request. The 1-query samples, leftmost on the line charts, should be similar (within sampling error) of the single-query test above.

As expected, as we increase the number of queries per request, the lines converge to zero. However, looking at the 20-queries bar chart, roughly the same ranked order we’ve seen elsewhere is still in play, demonstrating the headroom afforded by higher-performance frameworks.

We were surprised by the performance of Raw PHP in this test. We suspect the PHP MySQL driver and connection pool are particularly well tuned. However, the penalty for using an ORM on PHP is severe.

Dedicated hardware

The dedicated hardware produces numbers nearly ten times greater than EC2 with the punishing 20 queries per request. Again, Raw PHP makes an extremely strong showing, but PHP with an ORM and Cake—the only PHP framework in our test—are at the opposite end of the spectrum.

How we designed the tests

This exercise aims to provide a “baseline” for performance across the variety of frameworks. By baseline we mean the starting point, from which any real-world application’s performance can only get worse. We aim to know the upper bound being set on an application’s performance per unit of hardware by each platform and framework.

But we also want to exercise some of the frameworks’ components such as its JSON serializer and data-store/database mapping. While each test boils down to a measurement of the number of requests per second that can be processed by a single server, we are exercising a sample of the components provided by modern frameworks, so we believe it’s a reasonable starting point.

For the data-connected test, we’ve deliberately constructed the tests to avoid any framework-provided caching layer. We want this test to require repeated requests to an external service (MySQL or MongoDB, for example) so that we exercise the framework’s data mapping code. Although we expect that the external service is itself caching the small number of rows our test consumes, the framework is not allowed to avoid the network transmission and data mapping portion of the work.

Not all frameworks provide components for all of the tests. For these situations, we attempted to select a popular best-of-breed option.

Each framework was tested using 2^3 to 2^8 (8, 16, 32, 64, 128, and 256) request concurrency. On EC2, WeigHTTP was configured to use two threads (one per core) and on our i7 hardware, it was configured to use eight threads (one per HT core). For each test, WeigHTTP simulated 100,000 HTTP requests with keep-alives enabled.

For each test, the framework was warmed up by running a full test prior to capturing performance numbers.

Finally, for each framework, we collected the framework’s best performance across the various concurrency levels for plotting as peak bar charts.

We used two machines for all tests, configured in the following roles:

Application server. This machine is responsible for hosting the web application exclusively. Note, however, that when community best practices specified use of a web server in front of the application container, we had the web server installed on the same machine.
Load client and database server. This machine is responsible for generating HTTP traffic to the application server using WeigHTTP and also for hosting the database server. In all of our tests, the database server (MySQL or MongoDB) used very little CPU time; and WeigHTTP was not starved of CPU resource. In the database tests, the network was being used to provide result sets to the application server and to provide HTTP responses in the opposite direction. However, even with the quickest frameworks, network utilization was lower in database tests than in the plain JSON tests, so this is unlikely to be a concern.

Ultimately, a three-machine configuration would dismiss the concern of double-duty for the second machine. However, we doubt that the results would be noticeably different.

The Tests

We ran three types of tests. Not all tests were run for all frameworks. See details below.

JSON serialization

For this test, each framework simply responds with the following object, encoded using the framework’s JSON serializer.

{"message" : "Hello, World!"}

With the content type set to application/json. If the framework provides no JSON serializer, a best-of-breed for the platform is selected. For example, on the Java platform, Jackson was used for frameworks that do not provide a serializer.

Database access (single query)

In this test, we use the ORM of choice for each framework to grab one simple object selected at random from a table containing 10,000 rows. We use the same JSON serialization tested earlier to serialize that object as JSON. Caveat: when the data store provides data as JSON in situ (such as with the MongoDB tests), no transcoding is done; the string of JSON is sent as-is.

As with JSON serialization, we’ve selected a best-of-breed ORM when the framework is agnostic. For example, we used Sequelize for the JavaScript MySQL tests.

We tested with MySQL for most frameworks, but where MongoDB is more conventional (as with node.js), we tested that instead or in addition. We also did some spot tests with PostgreSQL but have not yet captured any of those results in this effort. Preliminary results showed RPS performance about 25% lower than with MySQL. Since PostgreSQL is considered favorable from a durability perspective, we plan to include more PostgreSQL testing in the future.

Database access (multiple queries)

This test repeats the work of the single-query test with an adjustable queries-per-request parameter. Tests are run at 5, 10, 15, and 20 queries per request. Each query selects a random row from the same table exercised in the previous test with the resulting array then serialized to JSON as a response.

This test is intended to illustrate how all frameworks inevitably will converge to zero requests per second as the complexity of each request increases. Admittedly, especially at 20 queries per request, this particular test is unnaturally database heavy compared to real-world applications. Only grossly inefficient applications or uncommonly complex requests would make that many database queries per request.

Environment Details

Two Intel Sandy Bridge Core i7-2600K workstations with
8 GB memory each (early 2011 vintage) for the i7 tests
Two Amazon EC2 m1.large instances for the EC2 tests
Switched gigabit Ethernet

WeigHTTP

MySQL 5.5.29
MongoDB 2.0.5

Ruby 2.0.0-p0
Rack 1.5.1
JRuby 1.7.3
Resin 4.0.34 (for JRuby)
Rails 3.2.11
Sinatra 1.3.4

Node.js v0.10.0
Mongoose 3.5.5
Sequelize 1.6.0-beta4
Express 3.1

PHP 5.3.10
Cake 2.3.0
PHP ActiveRecord (Nightly 20121221)

Ubuntu Linux 12.04 64-bit

Passenger 3.9.5-rc3
Gunicorn 0.17.2
Apache HTTPD 2.2.22

Python 2.7.3
Django 1.4

Go 1.0.3
WebGo (master branch)

OpenJDK 1.7.0_09
Netty 4.0.0 beta 2
Caucho Resin 4.0.34 Servlet Container (GPL)
Grails 2.1.1
Play 2.1.0
Jackson 2.1.1
Spring 3.2.1
Tapestry 5.3.6
Hibernate 4.1.1
Maven 2.2.1
Vert.x 1.3.1
Wicket 6.5.0
Clojure 1.4.0
Compojure 1.1.5
Ring-JSON 0.1.2
Korma 0.3.0-RC2
Gemini 1.35

Notes

For the database tests, any framework with the suffix “raw” in its name is using its platform’s raw database connectivity without an object-relational map (ORM) of any flavor. For example, servlet-raw is using raw JDBC. All frameworks without the “raw” suffix in their name are using either the framework-provided ORM or a best-of-breed for the platform (e.g., ActiveRecord).

Code examples

You can find the full source code for all of the tests on Github. Below are the relevant portions of the code to fetch a configurable number of random database records, serialize the list of records as JSON, and then send the JSON as an HTTP response.

Framework Benchmarks Round 1

Show me the winners!

Motivation

Simulating production environments

Results

JSON serialization test

Dedicated hardware

Database access test (single query)

Dedicated hardware

Database access test (multiple queries)

Dedicated hardware

How we designed the tests

The Tests

JSON serialization

Database access (single query)

Database access (multiple queries)

Environment Details

Notes

Code examples

Cake

Compojure

Django

Express

Gemini

Grails

Node.js

PHP (Raw)

PHP (ORM)

Play

Rails

Servlet

Sinatra

Spring

Tapestry

Vert.x

Wicket

Expected questions

Conclusion

About TechEmpower