
Web Framework Benchmarks

In the following tests, we have measured the performance of several web application platforms, full-stack frameworks, and micro-frameworks (collectively, "frameworks"). For more information, read the introduction, motivation, and latest environment details.

Filters
Classification
We classify frameworks as follows:
  • Full-stack, meaning a framework that provides wide feature coverage including server-side templates, database connectivity, form processing, and so on.
  • Micro, meaning a framework that provides request routing and some simple plumbing.
  • Platform, meaning a raw server (not actually a framework at all). Good luck! You're going to need it.
Language
The principal programming language used by the framework.
Platform
The platform is the low-level software or API used to host web applications for the framework; the platform provides an implementation of the HTTP fundamentals.
Application operating system
The operating system hosting the application server and web server, if applicable.
Front-end server
The front-end server ("web server") paired with the application framework, if applicable. Some frameworks provide a built-in web server.
Database server
The database server paired with the application framework. Not every framework has tests for every database server.
Database operating system
The operating system hosting the database server.
Object-relational mapper (ORM) classification
We classify object-relational mappers as follows:
  • Full, meaning an ORM that provides wide functionality, possibly including a query language.
  • Micro, meaning a less comprehensive abstraction of the relational model.
  • Raw, meaning no ORM is used at all; the platform's raw database connectivity is used.
Implementation approach
Implementation approach describes the test's design disposition.
  • A realistic implementation approach uses the framework with most out-of-the-box functionality enabled. We consider this realistic because most applications built with the framework will leave these features enabled.
  • A stripped implementation approach removes features that are unnecessary for the particulars of this benchmark exercise. This might illuminate the marginal improvement available in fine-tuning a framework to your application's use-case.
Framework
In addition to filtering frameworks using the attributes above, you may hide frameworks one-by-one by clicking their names below.

Results

Requirements summary: single-query test

In this test, each request is processed by fetching a single row from a simple database table. That row is then serialized as a JSON response.

Example response:

HTTP/1.1 200 OK
Content-Length: 32
Content-Type: application/json; charset=UTF-8
Server: Example
Date: Wed, 17 Apr 2013 12:00:00 GMT

{"id":3217,"randomNumber":2149}

For a more detailed description of the requirements, see the Source Code and Requirements section.
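As a rough illustration of the single-query workload described above, here is a minimal Python sketch that uses an in-memory SQLite database as a stand-in. The table name `world`, the 10,000-row table size, and the choice of a random row ID are assumptions made for this example; they are not stated in the summary above, and real implementations use each framework's own database layer.

```python
import json
import random
import sqlite3

ROW_COUNT = 10_000  # assumed table size; not stated in the summary above


def make_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for the benchmark's simple table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE world (id INTEGER PRIMARY KEY, randomNumber INTEGER)"
    )
    conn.executemany(
        "INSERT INTO world (id, randomNumber) VALUES (?, ?)",
        ((i, random.randint(1, ROW_COUNT)) for i in range(1, ROW_COUNT + 1)),
    )
    return conn


def single_query(conn: sqlite3.Connection) -> bytes:
    """Fetch one row (chosen at random here) and serialize it as JSON."""
    row_id = random.randint(1, ROW_COUNT)
    row = conn.execute(
        "SELECT id, randomNumber FROM world WHERE id = ?", (row_id,)
    ).fetchone()
    body = {"id": row[0], "randomNumber": row[1]}
    # Compact separators match the whitespace-free style of the example.
    return json.dumps(body, separators=(",", ":")).encode("utf-8")
```

Calling `single_query(make_db())` returns a body shaped like the example response, with one `id` and one `randomNumber`.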

Results

Requirements summary: multiple-query test

In this test, each request is processed by fetching multiple rows from a simple database table and serializing these rows as a JSON response. The test is run multiple times: testing 1, 5, 10, 15, and 20 queries per request. All tests are run at 256 concurrency.

Example response for 10 queries:

HTTP/1.1 200 OK
Content-Length: 315
Content-Type: application/json; charset=UTF-8
Server: Example
Date: Wed, 17 Apr 2013 12:00:00 GMT

[{"id":4174,"randomNumber":331},{"id":51,"randomNumber":6544},{"id":4462,"randomNumber":952},{"id":2221,"randomNumber":532},{"id":9276,"randomNumber":3097},{"id":3056,"randomNumber":7293},{"id":6964,"randomNumber":620},{"id":675,"randomNumber":6601},{"id":8414,"randomNumber":6569},{"id":2753,"randomNumber":4065}]

For a more detailed description of the requirements, see the Source Code and Requirements section.

Results

Requirements summary: data-updates test

This test exercises database writes. Each request is processed by fetching multiple rows from a simple database table, converting the rows to in-memory objects, modifying one attribute of each object in memory, updating each associated row in the database individually, and then serializing the list of objects as a JSON response. The test is run multiple times: testing 1, 5, 10, 15, and 20 updates per request. Note that the number of statements per request is twice the number of updates since each update is paired with one query to fetch the object. All tests are run at 256 concurrency.

The response is analogous to the multiple-query test. Example response for 10 updates:

HTTP/1.1 200 OK
Content-Length: 315
Content-Type: application/json; charset=UTF-8
Server: Example
Date: Wed, 17 Apr 2013 12:00:00 GMT

[{"id":4174,"randomNumber":331},{"id":51,"randomNumber":6544},{"id":4462,"randomNumber":952},{"id":2221,"randomNumber":532},{"id":9276,"randomNumber":3097},{"id":3056,"randomNumber":7293},{"id":6964,"randomNumber":620},{"id":675,"randomNumber":6601},{"id":8414,"randomNumber":6569},{"id":2753,"randomNumber":4065}]

For a more detailed description of the requirements, see the Source Code and Requirements section.
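The fetch, modify, update, and serialize sequence described above can be sketched as follows. This is only an illustration under stated assumptions: it uses an in-memory SQLite database, a hypothetical table named `world` with an assumed 10,000 rows, and plain dictionaries in place of a framework's ORM objects. The statement counter shows why the number of statements per request is twice the number of updates.

```python
import json
import random
import sqlite3

ROW_COUNT = 10_000  # assumed table size; not stated in the summary above


def make_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for the benchmark's simple table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE world (id INTEGER PRIMARY KEY, randomNumber INTEGER)"
    )
    conn.executemany(
        "INSERT INTO world (id, randomNumber) VALUES (?, ?)",
        ((i, random.randint(1, ROW_COUNT)) for i in range(1, ROW_COUNT + 1)),
    )
    return conn


def update_rows(conn: sqlite3.Connection, updates: int) -> tuple[bytes, int]:
    """Run `updates` fetch+update pairs; return (JSON body, statement count)."""
    statements = 0
    worlds = []
    for _ in range(updates):
        row_id = random.randint(1, ROW_COUNT)
        row = conn.execute(
            "SELECT id, randomNumber FROM world WHERE id = ?", (row_id,)
        ).fetchone()
        statements += 1  # one query to fetch the object
        world = {"id": row[0], "randomNumber": row[1]}
        world["randomNumber"] = random.randint(1, ROW_COUNT)  # modify in memory
        conn.execute(
            "UPDATE world SET randomNumber = ? WHERE id = ?",
            (world["randomNumber"], world["id"]),
        )
        statements += 1  # one individual update for the same row
        worlds.append(world)
    conn.commit()
    return json.dumps(worlds, separators=(",", ":")).encode("utf-8"), statements
```

With `updates=10`, the function issues 20 statements and returns a JSON array of 10 objects, matching the shape of the example response.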

Results

Requirements summary: fortunes test

In this test, the framework's ORM is used to fetch all rows from a database table containing an unknown number of Unix fortune cookie messages (the table has 12 rows, but the code cannot have foreknowledge of the table's size). An additional fortune cookie message is inserted into the list at runtime and then the list is sorted by the message text. Finally, the list is delivered to the client using a server-side HTML template. The message text must be considered untrusted and properly escaped and the UTF-8 fortune messages must be rendered properly.

Whitespace is optional and may comply with the framework's best practices.

Example response:

HTTP/1.1 200 OK
Content-Length: 1196
Content-Type: text/html; charset=UTF-8
Server: Example
Date: Wed, 17 Apr 2013 12:00:00 GMT

<!DOCTYPE html><html><head><title>Fortunes</title></head><body><table><tr><th>id</th><th>message</th></tr><tr><td>11</td><td>&lt;script&gt;alert(&quot;This should not be displayed in a browser alert box.&quot;);&lt;/script&gt;</td></tr><tr><td>4</td><td>A bad random number generator: 1, 1, 1, 1, 1, 4.33e+67, 1, 1, 1</td></tr><tr><td>5</td><td>A computer program does what you tell it to do, not what you want it to do.</td></tr><tr><td>2</td><td>A computer scientist is someone who fixes things that aren&apos;t broken.</td></tr><tr><td>8</td><td>A list is only as strong as its weakest link. — Donald Knuth</td></tr><tr><td>0</td><td>Additional fortune added at request time.</td></tr><tr><td>3</td><td>After enough decimal places, nobody gives a damn.</td></tr><tr><td>7</td><td>Any program that runs right is obsolete.</td></tr><tr><td>10</td><td>Computers make very fast, very accurate mistakes.</td></tr><tr><td>6</td><td>Emacs is a nice operating system, but I prefer UNIX. — Tom Christaensen</td></tr><tr><td>9</td><td>Feature: A bug with seniority.</td></tr><tr><td>1</td><td>fortune: No such file or directory</td></tr><tr><td>12</td><td>フレームワークのベンチマーク</td></tr></table></body></html>

For a more detailed description of the requirements, see the Source Code and Requirements section.
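The list handling in the fortunes test (add one message at request time, sort by message text, escape, render) can be sketched in Python as below. This is a simplification under stated assumptions: rows are passed in as plain `(id, message)` tuples rather than fetched through an ORM, and an f-string stands in for a real server-side template. The function name `render_fortunes` is hypothetical.

```python
import html


def render_fortunes(rows: list[tuple[int, str]]) -> str:
    """Add the runtime fortune, sort by message text, and render escaped HTML."""
    fortunes = list(rows)  # works for any number of rows; size is not assumed
    fortunes.append((0, "Additional fortune added at request time."))
    fortunes.sort(key=lambda fortune: fortune[1])  # sort by message text
    table_rows = "".join(
        # html.escape treats the message as untrusted text.
        f"<tr><td>{fortune_id}</td><td>{html.escape(message)}</td></tr>"
        for fortune_id, message in fortunes
    )
    return (
        "<!DOCTYPE html><html><head><title>Fortunes</title></head><body><table>"
        "<tr><th>id</th><th>message</th></tr>"
        f"{table_rows}</table></body></html>"
    )
```

Python's `html.escape` writes an apostrophe as `&#x27;` rather than the `&apos;` seen in the example response; both are escaped forms of the same character. Python's default string sort compares code points, which places the lowercase "fortune: ..." message after "Feature: ..." as in the example.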

Results

Requirements summary: JSON serialization test

In this test, each response is a JSON serialization of a freshly-instantiated object that maps the key message to the value Hello, World!

Example response:

HTTP/1.1 200 OK
Content-Type: application/json; charset=UTF-8
Content-Length: 28
Server: Example
Date: Wed, 17 Apr 2013 12:00:00 GMT

{"message":"Hello, World!"}

For a more detailed description of the requirements, see the Source Code and Requirements section.
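For illustration, the JSON serialization workload might look like the following minimal WSGI callable in Python. WSGI and the name `json_app` are choices made for this sketch, not something the requirements prescribe. The point it tries to show is that the object is created inside the handler, so each request serializes a freshly instantiated object rather than a cached string.

```python
import json


def json_app(environ, start_response):
    """Minimal WSGI callable: build a new object per request and serialize it."""
    message = {"message": "Hello, World!"}  # instantiated on every request
    body = json.dumps(message, separators=(",", ":")).encode("utf-8")
    start_response(
        "200 OK",
        [
            ("Content-Type", "application/json; charset=UTF-8"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]
```

Any WSGI server can host this callable; the server is normally responsible for adding headers such as `Server` and `Date`.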

Results

Requirements summary: plaintext test

In this test, the framework responds with the simplest of responses: a "Hello, World" message rendered as plain text. The size of the response is kept small so that gigabit Ethernet is not the limiting factor for all implementations. HTTP pipelining is enabled and higher client-side concurrency levels are used for this test (see the "Data table" view).

Example response:

HTTP/1.1 200 OK
Content-Length: 15
Content-Type: text/plain; charset=UTF-8
Server: Example
Date: Wed, 17 Apr 2013 12:00:00 GMT

Hello, World!

For a more detailed description of the requirements, see the Source Code and Requirements section.
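To make the pipelining aspect concrete: with HTTP pipelining, a client writes several requests on one connection before reading any response, and the server answers them in order. The Python sketch below only builds and parses the byte streams involved; it does not open sockets. The host `localhost` and the path `/plaintext` in the usage note are hypothetical, and the parser assumes every response carries a `Content-Length` header.

```python
def build_pipeline(host: str, path: str, depth: int) -> bytes:
    """Concatenate `depth` GET requests to be sent in a single write."""
    request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n".encode("ascii")
    return request * depth


def split_responses(buffer: bytes) -> list[bytes]:
    """Split back-to-back responses into bodies using Content-Length."""
    bodies = []
    while buffer:
        head, _, rest = buffer.partition(b"\r\n\r\n")
        length = 0
        for line in head.split(b"\r\n")[1:]:  # skip the status line
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                length = int(value.strip())
        bodies.append(rest[:length])
        buffer = rest[length:]  # the next response starts right after the body
    return bodies
```

For example, `build_pipeline("localhost", "/plaintext", 16)` yields sixteen requests in one buffer, and `split_responses` recovers one body per response from the server's reply stream.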

Comments

If you have any comments about this round, please post at the Framework Benchmarks Google Group.

Results testing

Running these benchmarks in your own test environment? You can visualize the results by copying and pasting the contents of your results.json file in the text box below.


Introduction

This is a performance comparison of many web application frameworks executing fundamental tasks such as JSON serialization, database access, and server-side template composition. Each framework is operating in a realistic production configuration. Results are captured on Amazon EC2 and on physical hardware. The project is still evolving, and as it does so, the GitHub repository for the project is turning into a showcase of sorts for each framework's best practices.

Note: We're using the word "framework" loosely to refer to platforms, micro-frameworks, and full-stack frameworks.

In a March 2013 blog entry, we published the results of comparing the performance of several web application frameworks executing simple but representative tasks: serializing JSON objects and querying databases. Since then, community input has been tremendous. We—speaking now for all contributors to the project—have been regularly updating the test implementations, expanding coverage, and capturing results in semi-regular updates that we call "rounds."

Results

View the latest results from Round 9. Or check out the previous rounds.

Making improvements

We expect that all frameworks' tests could be improved with community input. For that reason, we are extremely happy to receive pull requests from fans of any framework. We would like our tests for every framework to perform optimally, so we invite you to please join in.

What's to come

Feedback has been continuous and we plan to keep updating the project in several ways, such as:

  • Coverage of more frameworks. Thanks to community contributions to date, the number of frameworks covered has already grown quite large. We're happy to add more if you submit a pull request.
  • Additional test types.
  • Enhancements to this site, such as better rendering of error information and sorting within the data tables and charts. We would also like to add side-by-side comparison of data sets (to visualize the changes between rounds or test types) and to accept additional results files from the community. How does EC2 compare to, say, Rackspace Cloud? We don't have the data now, but if you have a Rackspace Cloud account and are willing to run the full test suite, we'd like to be able to render that here.

Questions or comments

Check out the motivation and questions section for answers to some common questions.

If you have other questions, comments, recommendations, criticisms, or any other form of feedback, please post at the framework-benchmarks Google Group or contact us via e-mail.

Current and Previous Rounds

Round 9 Thanks to the contribution of a 10-gigabit testing environment by Peak Hosting, the network barrier that frustrated top-performing frameworks in previous rounds has been removed. The Dell R720xd servers in this new environment feature dual Xeon E5-2660 v2 processors and illustrate how the spectrum of frameworks scales to forty hyper-threaded processor cores.

Round 8 Six more frameworks contributed by the community take the total count to 90 frameworks and 230 permutations (variations of configuration). Meanwhile, several implementations have been updated and the highest-performance platforms jockey for the top spot on each test's charts.

Round 7 After a several-month hiatus, another large batch of frameworks has been added by the community. Even after consolidating a few, Round 7 counts 84 frameworks and over 200 test permutations! This round also was the first to use a community-review process. Future rounds will see roughly one week of preview and review by the community prior to release to the public here.

Round 6 Still more tests were contributed by the developer community, bringing the number of frameworks to 74! Round 6 also introduces a "plaintext" test type that exercises HTTP pipelining and higher client-side concurrency levels.

Round 5 The developer community comes through with the addition of ASP.NET tests ready to run on Windows. This round is the first with Windows tests, and we seek assistance from Windows experts to apply additional tuning to bring the results to parity with the Linux tests. Round 5 also introduces an "update" test type to exercise ORM and database writes.

Round 4 With 57 frameworks in the benchmark suite, we've added a filter control allowing you to narrow your view to only the frameworks you want to see. Round 4 also introduces the "Fortune" test to exercise server-side templates and collections.

Round 3 We created this stand-alone site for comparing the results data captured across many web application frameworks. Even more frameworks have been contributed by the community and the testing methodology was changed slightly thanks to enhancements to the testing tool named Wrk.

Round 2 In April, we published a follow-up blog entry named "Frameworks Round 2" where we incorporated changes suggested and contributed by the community.

Round 1 In a March 2013 blog entry, we published the results of comparing the performance of several web application frameworks executing simple but representative tasks: serializing JSON objects and querying databases. The community reaction was terrific. We are flattered by the volume of feedback. We received dozens of comments, suggestions, questions, criticisms, and most importantly, GitHub pull requests at the repository we set up for this project.

Motivation

Choosing a web application framework involves evaluation of many factors. While comparatively easy to measure, performance is frequently given little consideration. We hope to help change that. Application performance can be directly mapped to hosting dollars, and for companies both large and small, hosting costs can be a pain point. Weak performance can also cause premature and costly scale pain, user experience degradation, and penalties levied by search engines.

What if building an application on one framework meant that at the very best your hardware is suitable for one tenth as much load as it would be had you chosen a different framework? The differences aren't always that extreme, but in some cases, they might be. Especially with several modern high-performance frameworks offering respectable developer efficiency, it's worth knowing what you're getting into.

Terminology

framework
We use the word framework loosely to refer to any HTTP stack—a full-stack framework, a micro-framework, or even a web platform such as Rack, Servlet, or plain PHP.
permutation
A combination of attributes that compose a full technology stack being tested (such as node.js paired with MongoDB or node.js paired with MySQL). Some frameworks have seen many permutations contributed by the community; others only one or a few.
test type
One of the workloads we exercise, such as JSON serialization, single-query, multiple-query, fortunes, data updates, and plaintext.
test
An individual test is a measurement of the performance of a permutation's implementation of a test type. For example, a test might be measuring Wicket paired with MySQL running the single-query test type.
implementation
Sometimes called "test implementations," these are the bodies of code and configuration created to test permutations according to the requirements. These are frequently contributed by the developer community. Basically, together with the toolset, test implementations are the meat of this project.
toolset
A set of Python scripts that run our tests.
run
An execution of the benchmark toolset across the suite of test implementations, either in full or in part, in order to capture results for any purpose.
preview
A capture of data from a run used by project participants to sanity-check prior to an official round.
round
A posting of "official" results on this web site. This is mostly for ease of consumption by readers and for good-spirited, healthy competitive bragging rights. For in-depth analysis, we encourage you to examine the source code and run the tests on your own hardware.

Expected questions

We expect that you might have a bunch of questions. Here are some that we're anticipating. But please contact us if you have a question we're not dealing with here or just want to tell us we're doing it wrong.

Frameworks and configuration

  1. "You call x a framework, but it's a platform." See the terminology section above. We are using the word "framework" loosely to refer to anything found on the spectrum ranging from full-stack frameworks through micro-frameworks to platforms. If it's used to build web applications, it probably qualifies. That said, we understand that comparing a full-stack framework with platforms, or vice versa, is unusual. We feel it's valuable to be able to compare these, for example to understand the performance overhead of additional abstraction. You can use the filters in the results viewer to adjust the rows you see in the charts.
  2. "You configured framework x incorrectly, and that explains the numbers you're seeing." Whoops! Please let us know how we can fix it, or submit a GitHub pull request, so we can get it right.
  3. "Why include this Gemini framework I've never heard of?" We have included our in-house Java web framework, Gemini, in our tests. We've done so because it's of interest to us. You can consider it a stand-in for any relatively lightweight minimal-locking Java framework. While we're proud of how it performs among the well-established field, this exercise is not about Gemini.
  4. "Why don't you test framework X?" We'd love to, if we can find the time. Even better, craft the test implementation yourself and submit a GitHub pull request so we can get it in there faster!
  5. "Some frameworks use process-level concurrency; have you accounted for that?" Yes, we've attempted to use production-grade configuration settings for all frameworks, including those that rely on process-level concurrency. For the EC2 tests, for example, such frameworks are configured to utilize the two virtual cores provided on an m1.large instance. For the i7 tests, they are configured to use the eight hyper-threading cores of our hardware's i7 CPUs.
  6. "Have you enabled APC for the PHP tests?" Yes, the PHP tests run with APC and PHP-FPM on nginx.
  7. "Why are you using a (slightly) old version of framework X?" It's nothing personal! With so many frameworks we have a never-ending game of whack-a-mole. If you think an update will affect the results, please let us know (or better yet, submit a GitHub pull request) and we'll get it updated!
  8. "It's unfair and possibly even incorrect to compare X and Y!" It may be alarming at first to see the full results table, where one may evaluate frameworks vs platforms; MySQL vs Postgres; Go vs Python; ORM vs raw database connectivity; and any number of other possibly irrational comparisons. Many readers desire the ability to compare these and other permutations. If you prefer to view an unpolluted subset, you may use the filters available at the top of the results page. We believe that comparing frameworks with plausible and diverse technology stacks, despite the number of variables, is precisely the value of this project. With sufficient time and effort, we hope to continuously broaden the test permutations. But we recommend against ignoring the data on the basis of concerns about multi-variable comparisons. Read more opinion on this at Brian Hauer's personal blog.
  9. "If you are testing production deployments, why is logging disabled?" At present, we have elected to run tests with logging features disabled. Although this is not consistent with production deployments, we avoid a few complications related to logging, most notably disk capacity and consistent granularity of logging across all test implementations. In spot tests, we have not observed significant performance impact from logging when enabled. If there is strong community consensus that logging is necessary, we will reconsider this.
  10. "Tell me about the Windows configuration." We are very thankful to the community members who have contributed Windows tests. In fact, nearly the entirety of the Windows configuration has been contributed by subject-matter experts from the community. Thanks to their effort, we now have tests covering both Windows paired with Linux databases and Windows paired with Microsoft SQL Server. As with all aspects of this project, we welcome continued input and tuning by other experts. If you have advice on better tuning the Windows tests, please submit GitHub issues or pull requests.

The tests

  1. "Framework X has in-memory caching, why don't you use that?" In-memory caching, as provided by some frameworks, yields higher performance than repeatedly hitting a database, but isn't available in all frameworks, so we omitted in-memory caching from these tests. Cache tests are planned for later rounds.
  2. "What about other caching approaches, then?" Remote-memory or near-memory caching, as provided by Memcached and similar solutions, also improves performance and we would like to conduct future tests simulating a more expensive query operation versus Memcached. However, curiously, in spot tests, some frameworks paired with Memcached were conspicuously slower than other frameworks directly querying the authoritative MySQL database (recognizing, of course, that MySQL had its entire data-set in its own memory cache). For simple "get row ID n" and "get all rows" style fetches, a fast framework paired with MySQL may be faster and easier to work with than a slow framework paired with Memcached.
  3. "Why doesn't your test include more substantial algorithmic work?" Great suggestion. We hope to in the future!
  4. "What about reverse proxy options such as Varnish?" We are expressly not using reverse proxies on this project. There are other benchmark projects that evaluate the performance of reverse proxy software. This project measures the performance of web applications in any scenario where requests reach the application server. Given that objective, allowing the web application to avoid doing the work thanks to a reverse proxy would invalidate the results. If it's difficult to conceptualize the value of measuring performance beyond the reverse proxy, imagine a scenario where every response provides user-specific and varying data. It's also notable that some platforms respond with sufficient performance to potentially render a reverse proxy unnecessary.
  5. "Do all the database tests use connection pooling?" Yes, our expectation is that all tests use connection pooling.
  6. "How is each test run?" Each test is executed as follows:
    1. Restart the database servers.
    2. Start the platform and framework using their start-up mechanisms.
    3. Run a 5-second primer at 8 client-concurrency to verify that the server is in fact running. These results are not captured.
    4. Run a 15-second warmup at 256 client-concurrency to allow lazy-initialization to execute and just-in-time compilation to run. These results are not captured.
    5. Run a 15-second captured test for each of the concurrency levels (or iteration counts) exercised by the test type. Concurrency-variable test types are tested at 8, 16, 32, 64, 128, and 256 client-side concurrency. The high-concurrency plaintext test type is tested at 256, 1,024, 4,096, and 16,384 client-side concurrency.
    6. Stop the platform and framework.
  7. "Hold on, 15 seconds is not enough to gather useful data." This is a reasonable concern. But in examining the data, we have seen no evidence that the results have changed by reducing the individual test durations from 60 seconds to 15 seconds. The duration reduction was made necessary by the growing number of test permutations and a target that the full suite complete in less than one day. With additional effort, we aim to build a continuously-running test environment that will pull the latest source and begin a new run as soon as a previous run completes. When we have such an environment ready, we will be comfortable with multi-day execution times, so we plan to extend the duration of each test when that happens.
  8. "Also, a 15-second warmup is not sufficient." On the contrary, we have not yet seen evidence suggesting that any additional warmup time is beneficial to any framework. In fact, for frameworks based on JIT platforms such as the Java Virtual Machine (JVM), spot tests show that the JIT has already completed its work after just the primer, before the warmup starts: the warmup (256-concurrency) and real 256-concurrency tests yield results that are separated only by test noise. However, as with test durations, we intend to increase the duration of the warmup when we have a continuously-running test environment.
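The run procedure described under "How is each test run?" can be sketched as a simple schedule of load phases. This is not the project's actual toolset; the names `Phase`, `schedule`, and `load_seconds` are hypothetical, and only the durations and concurrency levels come from the text above. Restarting the database and starting or stopping the framework are omitted.

```python
from dataclasses import dataclass

STANDARD_LEVELS = (8, 16, 32, 64, 128, 256)   # concurrency-variable test types
PLAINTEXT_LEVELS = (256, 1024, 4096, 16384)   # high-concurrency plaintext test


@dataclass(frozen=True)
class Phase:
    name: str
    seconds: int
    concurrency: int
    captured: bool  # False for the primer and warmup, whose results are discarded


def schedule(levels: tuple[int, ...]) -> list[Phase]:
    """Primer, warmup, then one captured 15-second run per concurrency level."""
    phases = [
        Phase("primer", 5, 8, False),
        Phase("warmup", 15, 256, False),
    ]
    phases += [Phase("captured", 15, level, True) for level in levels]
    return phases


def load_seconds(levels: tuple[int, ...]) -> int:
    """Total seconds of generated load for one test, excluding start and stop."""
    return sum(phase.seconds for phase in schedule(levels))


print(load_seconds(STANDARD_LEVELS))   # 110
print(load_seconds(PLAINTEXT_LEVELS))  # 80
```

Under these numbers, a concurrency-variable test spends 110 seconds under load and a plaintext test 80 seconds, before counting database restarts and framework start-up time.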

Environment

  1. "What is Wrk?" Although many web performance tests use ApacheBench from Apache to generate HTTP requests, we now use Wrk for this project. ApacheBench remains a single-threaded tool, meaning that for higher-performance test scenarios, ApacheBench itself is a limiting factor. Wrk is a multithreaded tool that provides a similar function, allowing tests to run for a prescribed amount of time (rather than limited to a number of requests) and providing us result data including total requests completed and latency information.
  2. "Doesn't benchmarking on Amazon EC2 invalidate the results?" Our opinion is that doing so confirms precisely what we're trying to test: performance of web applications within realistic production environments. Selecting EC2 as a platform also allows the tests to be readily verified by anyone interested in doing so. However, we've also executed tests on our Core i7 (Sandy Bridge) workstations running Ubuntu as a non-virtualized comparison. Doing so confirmed our suspicion that the ranked order and relative performance across frameworks are mostly consistent between EC2 and physical hardware. That is, while the EC2 instances were slower than the physical hardware, they were slower by roughly the same proportion across the spectrum of frameworks.
  3. "Tell me about your physical hardware." For the tests we refer to as "i7" tests, we're using our office workstations. These use Intel i7-2600K processors, making them a little antiquated, to be honest. These are connected via an unmanaged low-cost gigabit Ethernet switch. In previous rounds, we used a two-machine configuration where the load-generation and database role coexisted. Although these two roles were not crowding one another out (neither role was starved for CPU time), as of Round 7, we are using a three-machine configuration for the physical hardware tests. The machine roles are:
    • Application server, which hosts the application code and web server, where applicable.
    • Database server, which hosts the common databases. Starting with Round 5, we equipped the database server with a Samsung 840 Pro SSD.
    • Load generator, which makes HTTP requests to the Application server via the Wrk load generation tool.
  4. "What is Resin? Why aren't you using Tomcat for the Java frameworks?" Resin is a Java application server. The GPL version that we used for our tests is a relatively lightweight Servlet container. We tested on Tomcat as well but ultimately dropped Tomcat from our tests because Resin was slightly faster across all Servlet-based frameworks.
  5. "Do you run any warmups before collecting results data?" Yes. See "How is each test run?" above. Every test is preceded by a warmup and a brief (several seconds) cooldown prior to gathering test data.

Results

  1. "I am about to start a new web application project; how should I interpret these results?" Most importantly, recognize that performance data should be one part of your decision-making process. High-performance web applications reduce hosting costs and improve user experience. Additionally, recognize that while we have aimed to select test types that represent workloads that are common for web applications, nothing beats conducting performance tests yourself for the specific workload of your application. In addition to performance, consider other requirements such as your language and platform preference; your invested knowledge in one or more of the frameworks we've tested; and the documentation and support provided by the framework's community. Combined with an examination of the source code, the results seen here should help you identify a platform and framework that is high-performance while still meeting your other requirements.
  2. "Why are the leaderboards for JSON Serialization and Plaintext so different on EC2 versus i7?" Put briefly, for fast frameworks on our i7 physical hardware, the limiting factor for the JSON test is our gigabit Ethernet; whereas on EC2, the limit is the CPU. Assuming proper response headers are provided, at approximately 200,000 non-pipelined and 550,000 pipelined responses per second and above, the network is saturated.
  3. "Where did earlier rounds go?" To better capture HTTP errors reported by Wrk, we have restructured the format of our results.json file. The test tool changed at Round 2 and some framework IDs were changed at Round 3. As a result, the results.json for Rounds 1 and 2 would have required manual editing and we opted to simply remove the previous rounds from this site. You can still see those rounds at our blog: Round 1, Round 2.
  4. "What does 'Did not complete' mean?" Starting with Round 9, we have added validation checks to confirm that implementations are behaving as we have specified in the requirements section of this site. An implementation that does not return the correct results, bypasses some of the requirements, or even formats the results in a manner inconsistent with the requirements will be marked as "Did not complete." We have solicited corrections from prior contributors and have attempted to address many of these, but it will take more time for all implementations to be correct. If you are a project participant and your contribution is marked as "Did not complete," please help us resolve this by contacting us at the GitHub repository. We may ultimately need a pull request from you, but we'd be happy to help you understand what specifically is triggering a validation error with your implementation.
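As a back-of-the-envelope check on the gigabit saturation figures quoted in the answer about the JSON and plaintext leaderboards, the short Python calculation below works out how many bytes per response the link can carry at those rates. It assumes gigabit Ethernet means exactly 10^9 bits per second in one direction and treats the link as the only limit; packet-rate limits and the request direction are ignored, so this is an upper bound rather than a measurement. The budget covers everything on the wire: headers, body, and TCP/IP and Ethernet framing.

```python
GIGABIT_BYTES_PER_SECOND = 1_000_000_000 // 8  # 125,000,000 bytes/s, one direction


def wire_bytes_per_response(responses_per_second: int) -> float:
    """Average bytes available per response if the link is the only limit."""
    return GIGABIT_BYTES_PER_SECOND / responses_per_second


for label, rate in (("non-pipelined", 200_000), ("pipelined", 550_000)):
    print(f"{label}: {wire_bytes_per_response(rate):.0f} bytes per response")
# non-pipelined: 625 bytes per response
# pipelined: 227 bytes per response
```

The smaller per-response budget in the pipelined case is consistent with the plaintext test keeping its response deliberately small.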

Join the conversation

Post questions, comments, criticism, and suggestions on the framework-benchmarks Google Group.

Simulating production environments

For this project, we aimed to configure every framework according to the best practices for production deployments gleaned from documentation and popular community opinion. The goal is approximating a sensible production deployment as accurately as possible. We also want this project to be as transparent as possible, so we have posted our test suites on GitHub.

Environment details

Hardware
  • Peak: Dell R720xd dual Xeon E5-2660 v2 servers with 32 GB memory; database servers equipped with SSDs in RAID; switched 10-gigabit Ethernet
  • i7: Sandy Bridge Core i7-2600K workstations with 8 GB memory (early 2011 vintage); database server equipped with Samsung 840 Pro SSD; switched gigabit Ethernet
  • EC2: Amazon EC2 m1.large instances; switched gigabit Ethernet
Ruby
Go
C / C++
PHP
Lua
Operating systems
Databases
Java / JVM
Dart
Nimrod
Web servers
Load simulator
Python
Erlang
JavaScript
Haskell
Racket
Ur
C# / .NET / Mono
Perl

GitHub repository

All of the code used to produce the comparison of web frameworks seen here can be found in the project's GitHub repository. If you have any corrections or contributions, please submit a pull request!

Forum / mailing list

Join the conversation about this project on the framework-benchmarks Google Group.

Test requirements

We invite fans of frameworks and especially authors or maintainers of frameworks to join us in expanding the coverage of this project by implementing tests and contributing to the GitHub repository. The following are specifications for each of the test types we have included to date in this project. Do not be alarmed; the specifications read as quite verbose, but that's because they are specifications. The implementations tend to be quite easy in practice.

This project is evolving and we will periodically add new test types. As new test types are added, we encourage but do not require contributors of previous implementations to implement tests for the new test types. Wholly new test implementations are also encouraged to include all test types but are not required to do so. If you have limited time, we recommend you start with the easiest test types (1, 2, 3, and 6) and then continue beyond those as time permits.

General requirements

The following requirements apply to all test types below.

  1. All test implementations should be production-grade. The particulars of this will vary by framework and platform, but the general sentiment is that the code and configuration should be suitable for a production deployment. The word should is used here because production-grade is our goal, but we don't want this to be a roadblock. If you're submitting a new test and uncertain whether your code is production-grade, submit it anyway and then solicit input from other subject-matter experts.
  2. All test implementations must disable all disk logging. For many reasons, we expect all tests will run without writing logs to disk. Most importantly, the volume of requests is sufficiently high to fill up disks even with only a single line written to disk per request. Please disable all forms of disk logging. We recommend but do not require disabling console logging as well.
  3. Specific characters and character case matter. Assume the client consuming your service's JSON responses will be using a case-sensitive language such as JavaScript. In other words, if a test specifies that a map's key is id, use id. Do not use Id or ID. This strictness is required not only because it's sensible but also because our automated validation checks are picky.

Test type 1: JSON serialization

This test exercises the framework fundamentals including keep-alive support, request routing, request header parsing, object instantiation, JSON serialization, response header generation, and request count throughput.

Requirements

  1. For each request, an object mapping the key message to Hello, World! must be instantiated.
  2. The recommended URI is /json.
  3. A JSON serializer must be used to convert the object to JSON.
  4. The response text must be {"message":"Hello, World!"}, but white-space variations are acceptable.
  5. The response content length should be approximately 28 bytes.
  6. The response content type must be set to application/json.
  7. The response headers must include either Content-Length or Transfer-Encoding.
  8. The response headers must include Server and Date.
  9. gzip compression is not permitted.
  10. Server support for HTTP Keep-Alive is strongly encouraged but not required.
  11. If HTTP Keep-Alive is enabled, no maximum Keep-Alive timeout is specified by this test.
  12. The request handler will be exercised at concurrency levels ranging from 8 to 256.
  13. The request handler will be exercised using GET requests.

Example request

GET /json HTTP/1.1
Host: server
User-Agent: Mozilla/5.0 (X11; Linux x86_64) Gecko/20130501 Firefox/30.0 AppleWebKit/600.00 Chrome/30.0.0000.0 Trident/10.0 Safari/600.00
Cookie: uid=12345678901234567890; __utma=1.1234567890.1234567890.1234567890.1234567890.12; wd=2560x1600
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Connection: keep-alive

Example response

HTTP/1.1 200 OK
Content-Type: application/json; charset=UTF-8
Content-Length: 28
Server: Example
Date: Wed, 17 Apr 2013 12:00:00 GMT

{"message":"Hello, World!"}
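
The requirements above can be sketched in a few lines. This is a minimal illustration in Python, not a reference implementation: the function name json_response and the Server value "Example" are assumptions made for the sketch, and a real test would plug the same logic into the framework's own request handler.

```python
import json
from email.utils import formatdate


def json_response():
    """Build the status, headers, and body for the JSON serialization test."""
    # Requirement 1: instantiate a new object for every request.
    payload = {"message": "Hello, World!"}
    # Requirement 3: use a real JSON serializer; compact separators
    # avoid optional white-space in the output.
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
        ("Server", "Example"),  # hypothetical server name
        ("Date", formatdate(usegmt=True)),  # HTTP-style date ending in "GMT"
    ]
    return "200 OK", headers, body
```

Computing Content-Length from the serialized bytes, rather than hard-coding it, keeps the header correct if the serializer's white-space handling changes.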

Test type 2: Single database query

This test exercises the framework's object-relational mapper (ORM), random number generator, database driver, and database connection pool.

Requirements

  1. For every request, a single row must be retrieved from the World database table.
  2. The recommended URI is /db.
  3. The schema for World is id (int, primary key) and randomNumber (int), except for MongoDB, wherein the identity column is _id, with the leading underscore.
  4. The World table is known to contain 10,000 rows.
  5. The row retrieved must be selected by its id using a random number generator (ids range from 1 to 10,000).
  6. The row should be converted to an object using an object-relational mapping (ORM) tool. Tests that do not use an ORM will be classified as "raw," meaning they use the platform's raw database connectivity.
  7. The object (or database row, if an ORM is not used) must be serialized to JSON.
  8. The response content length should be approximately 32 bytes.
  9. The response content type must be set to application/json.
  10. The response headers must include either Content-Length or Transfer-Encoding.
  11. The response headers must include Server and Date.
  12. Use of an in-memory cache of World objects or rows by the application is not permitted.
  13. Use of prepared statements for SQL database tests (e.g., for MySQL) is encouraged but not required.
  14. gzip compression is not permitted.
  15. Server support for HTTP Keep-Alive is strongly encouraged but not required.
  16. If HTTP Keep-Alive is enabled, no maximum Keep-Alive timeout is specified by this test.
  17. The request handler will be exercised at concurrency levels ranging from 8 to 256.
  18. The request handler will be exercised using GET requests.

Example request

GET /db HTTP/1.1
Host: server
User-Agent: Mozilla/5.0 (X11; Linux x86_64) Gecko/20130501 Firefox/30.0 AppleWebKit/600.00 Chrome/30.0.0000.0 Trident/10.0 Safari/600.00
Cookie: uid=12345678901234567890; __utma=1.1234567890.1234567890.1234567890.1234567890.12; wd=2560x1600
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Connection: keep-alive

Example response

HTTP/1.1 200 OK
Content-Length: 32
Content-Type: application/json; charset=UTF-8
Server: Example
Date: Wed, 17 Apr 2013 12:00:00 GMT

{"id":3217,"randomNumber":2149}
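
As a rough sketch of a "raw" (no ORM) version of this test, the Python below uses an in-memory SQLite database as a stand-in for the real database server. SQLite, the helper names make_world_db, fetch_random_world, and db_body, and the seeding values are all assumptions made for illustration; the project's tests target the database servers listed in the filters.

```python
import json
import random
import sqlite3

WORLD_ROWS = 10_000  # the World table is known to contain 10,000 rows


def make_world_db():
    """Create an in-memory stand-in for the World table (assumption: SQLite)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE World (id INTEGER PRIMARY KEY, randomNumber INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO World (id, randomNumber) VALUES (?, ?)",
        ((i, random.randint(1, WORLD_ROWS)) for i in range(1, WORLD_ROWS + 1)),
    )
    conn.commit()
    return conn


def fetch_random_world(conn):
    """Select one row by a random id in 1..10,000 using a parameterized query."""
    world_id = random.randint(1, WORLD_ROWS)
    row = conn.execute(
        "SELECT id, randomNumber FROM World WHERE id = ?", (world_id,)
    ).fetchone()
    # Key names and case must match the specification exactly.
    return {"id": row[0], "randomNumber": row[1]}


def db_body(conn):
    """Serialize a freshly queried row to JSON; nothing is cached between calls."""
    return json.dumps(fetch_random_world(conn), separators=(",", ":")).encode("utf-8")
```

Each call to db_body goes back to the database, which matches the rule against in-memory caching of World rows.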

Test type 3: Multiple database queries

This test is a variation of Test #2 and also uses the World table. Multiple rows are fetched to more dramatically punish the database driver and connection pool. At the highest queries-per-request tested (20), this test demonstrates all frameworks' convergence toward zero requests-per-second as database activity increases.

Requirements

  1. For every request, an integer query string parameter named queries must be retrieved from the request. The parameter specifies the number of database queries to execute in preparing the HTTP response (see below).
  2. The recommended URI is /queries.
  3. The queries parameter must be bounded to between 1 and 500. If the parameter is missing, is not an integer, or is an integer less than 1, the value should be interpreted as 1; if greater than 500, the value should be interpreted as 500.
  4. The request handler must retrieve a set of World objects, equal in count to the queries parameter, from the World database table.
  5. Each row must be selected randomly in the same fashion as the single database query test (Test #2 above).
  6. Since this test is designed to exercise multiple queries, each row must be selected individually by a query. It is not acceptable to retrieve all required rows using a SELECT ... WHERE id IN (...) clause.
  7. Each World object must be added to a list or array.
  8. The list or array must be serialized to JSON and sent as a response.
  9. The response content type must be set to application/json.
  10. The response headers must include either Content-Length or Transfer-Encoding.
  11. The response headers must include Server and Date.
  12. Use of an in-memory cache of World objects or rows by the application is not permitted.
  13. Use of prepared statements for SQL database tests (e.g., for MySQL) is encouraged but not required.
  14. gzip compression is not permitted.
  15. Server support for HTTP Keep-Alive is strongly encouraged but not required.
  16. If HTTP Keep-Alive is enabled, no maximum Keep-Alive timeout is specified by this test.
  17. The request handler will be exercised at a concurrency level of 256 only.
  18. The request handler will be exercised with query counts of 1, 5, 10, 15, and 20.
  19. The request handler will be exercised using GET requests.

Example request

GET /queries?queries=10 HTTP/1.1
Host: server
User-Agent: Mozilla/5.0 (X11; Linux x86_64) Gecko/20130501 Firefox/30.0 AppleWebKit/600.00 Chrome/30.0.0000.0 Trident/10.0 Safari/600.00
Cookie: uid=12345678901234567890; __utma=1.1234567890.1234567890.1234567890.1234567890.12; wd=2560x1600
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Connection: keep-alive

Example response

HTTP/1.1 200 OK
Content-Length: 315
Content-Type: application/json; charset=UTF-8
Server: Example
Date: Wed, 17 Apr 2013 12:00:00 GMT

[{"id":4174,"randomNumber":331},{"id":51,"randomNumber":6544},{"id":4462,"randomNumber":952},{"id":2221,"randomNumber":532},{"id":9276,"randomNumber":3097},{"id":3056,"randomNumber":7293},{"id":6964,"randomNumber":620},{"id":675,"randomNumber":6601},{"id":8414,"randomNumber":6569},{"id":2753,"randomNumber":4065}]
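
The bounding rule for the queries parameter is easy to get subtly wrong, so here is one way it might be written in Python. The names parse_queries and fetch_worlds are hypothetical, and fetch_one stands for whatever single-row lookup the implementation already uses for Test #2.

```python
def parse_queries(raw):
    """Clamp the queries parameter to the range 1..500.

    Missing, non-integer, or below-1 values become 1; values above 500 become 500.
    """
    try:
        count = int(raw)
    except (TypeError, ValueError):
        # TypeError covers a missing parameter (None); ValueError covers
        # empty or non-integer text such as "" or "foo".
        return 1
    return max(1, min(500, count))


def fetch_worlds(raw_queries, fetch_one):
    """Run one query per requested row and collect the results in a list.

    fetch_one is a hypothetical zero-argument callable that performs a single
    random-row lookup; rows are never combined into one IN (...) query.
    """
    return [fetch_one() for _ in range(parse_queries(raw_queries))]
```

For example, parse_queries("foo") and parse_queries("0") both yield 1, while parse_queries("501") yields 500.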

Test type 4: Fortunes

This test exercises the ORM, database connectivity, dynamic-size collections, sorting, server-side templates, XSS countermeasures, and character encoding.

Requirements

  1. The recommended URI is /fortunes.
  2. A Fortune database table contains a dozen Unix-style fortune-cookie messages.
  3. The schema for Fortune is id (int, primary key) and message (varchar), except for MongoDB, wherein the identity column is _id, with the leading underscore.
  4. Using an ORM, all Fortune objects must be fetched from the Fortune table and placed into a list data structure. Tests that do not use an ORM will be classified as "raw," meaning they use the platform's raw database connectivity.
  5. The list data structure must be dynamic-size or equivalent and should not be dimensioned using foreknowledge of the row-count of the database table.
  6. Within the scope of the request, a new Fortune object must be constructed and added to the list. This confirms that the data structure is dynamic-sized. The new fortune is not persisted to the database; it is ephemeral for the scope of the request.
  7. The new Fortune's message must be "Additional fortune added at request time."
  8. The list of Fortune objects must be sorted by the message field. No ORDER BY clause is permitted in the database query (ordering within the query would be of negligible value anyway since a newly instantiated Fortune is added to the list prior to sorting).
  9. The sorted list must be provided to a server-side template and rendered to simple HTML (see below for minimum template). The resulting HTML table displays each Fortune's id number and message text.
  10. This test does not include external assets (CSS, JavaScript); a later test type will include assets.
  11. The HTML generated by the template must be sent as a response.
  12. Be aware that the message text fields are stored as UTF-8 and one of the fortune cookie messages is in Japanese.
  13. The resulting HTML must be delivered using UTF-8 encoding.
  14. The Japanese fortune cookie message must be displayed correctly.
  15. Be aware that at least one of the message text fields includes a <script> tag.
  16. The server-side template must assume the message text cannot be trusted and must escape the message text properly.
  17. The implementation is encouraged to use best practices for templates such as layout inheritance, separate header and footer files, and so on. However, this is not required. We request that implementations do not manage assets (JavaScript, CSS, images). We are deferring asset management until we can craft a more suitable test.
  18. The response content type must be set to text/html.
  19. The response headers must include either Content-Length or Transfer-Encoding.
  20. The response headers must include Server and Date.
  21. Use of an in-memory cache of Fortune objects or rows by the application is not permitted.
  22. Use of prepared statements for SQL database tests (e.g., for MySQL) is encouraged but not required.
  23. gzip compression is not permitted.
  24. Server support for HTTP Keep-Alive is strongly encouraged but not required.
  25. If HTTP Keep-Alive is enabled, no maximum Keep-Alive timeout is specified by this test.
  26. The request handler will be exercised at concurrency levels ranging from 8 to 256.
  27. The request handler will be exercised using GET requests.

Example request

GET /fortunes HTTP/1.1
Host: server
User-Agent: Mozilla/5.0 (X11; Linux x86_64) Gecko/20130501 Firefox/30.0 AppleWebKit/600.00 Chrome/30.0.0000.0 Trident/10.0 Safari/600.00
Cookie: uid=12345678901234567890; __utma=1.1234567890.1234567890.1234567890.1234567890.12; wd=2560x1600
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Connection: keep-alive

Example response

HTTP/1.1 200 OK
Content-Length: 1196
Content-Type: text/html; charset=UTF-8
Server: Example
Date: Wed, 17 Apr 2013 12:00:00 GMT

<!DOCTYPE html><html><head><title>Fortunes</title></head><body><table><tr><th>id</th><th>message</th></tr><tr><td>11</td><td>&lt;script&gt;alert(&quot;This should not be displayed in a browser alert box.&quot;);&lt;/script&gt;</td></tr><tr><td>4</td><td>A bad random number generator: 1, 1, 1, 1, 1, 4.33e+67, 1, 1, 1</td></tr><tr><td>5</td><td>A computer program does what you tell it to do, not what you want it to do.</td></tr><tr><td>2</td><td>A computer scientist is someone who fixes things that aren&apos;t broken.</td></tr><tr><td>8</td><td>A list is only as strong as its weakest link. — Donald Knuth</td></tr><tr><td>0</td><td>Additional fortune added at request time.</td></tr><tr><td>3</td><td>After enough decimal places, nobody gives a damn.</td></tr><tr><td>7</td><td>Any program that runs right is obsolete.</td></tr><tr><td>10</td><td>Computers make very fast, very accurate mistakes.</td></tr><tr><td>6</td><td>Emacs is a nice operating system, but I prefer UNIX. — Tom Christaensen</td></tr><tr><td>9</td><td>Feature: A bug with seniority.</td></tr><tr><td>1</td><td>fortune: No such file or directory</td></tr><tr><td>12</td><td>フレームワークのベンチマーク</td></tr></table></body></html>

Minimum template

Along with the example response above, the following Mustache template illustrates the minimum requirements for the server-side template. White-space can be optionally eliminated.

<!DOCTYPE html>
<html>
<head><title>Fortunes</title></head>
<body>
<table>
<tr><th>id</th><th>message</th></tr>
{{#.}}
<tr><td>{{id}}</td><td>{{message}}</td></tr>
{{/.}}
</table>
</body>
</html>
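
The request-time steps (append the extra fortune, sort by message, escape, render as UTF-8) can be sketched as follows in Python. This is an illustration only: plain string formatting stands in for a real server-side template engine, the Fortune class and render_fortunes are hypothetical names, and the rows argument is assumed to be the (id, message) tuples already fetched from the Fortune table without an ORDER BY clause. The id of 0 for the added fortune follows the example response above.

```python
import html
from dataclasses import dataclass


@dataclass
class Fortune:
    id: int
    message: str


def render_fortunes(rows):
    """Render fetched (id, message) rows plus the request-time fortune as HTML."""
    # A plain list is dynamic-size, so no foreknowledge of the row count is needed.
    fortunes = [Fortune(i, m) for i, m in rows]
    # The extra fortune exists only for this request and is never persisted.
    fortunes.append(Fortune(0, "Additional fortune added at request time."))
    # Sort in application code, after the extra fortune has been added.
    fortunes.sort(key=lambda f: f.message)
    # Treat every message as untrusted and escape it before output.
    cells = "".join(
        f"<tr><td>{f.id}</td><td>{html.escape(f.message)}</td></tr>"
        for f in fortunes
    )
    page = (
        "<!DOCTYPE html><html><head><title>Fortunes</title></head><body><table>"
        "<tr><th>id</th><th>message</th></tr>" + cells + "</table></body></html>"
    )
    # Deliver the page as UTF-8 so the Japanese fortune displays correctly.
    return page.encode("utf-8")
```

Note that Python's html.escape writes an apostrophe as &#x27; rather than the &apos; shown in the example response; both are valid escapes of the same character in HTML, but whether the validation checks accept either form is not stated here.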

Test type 5: Database updates

This test is a variation of Test #3 that exercises the ORM's persistence of objects and the database driver's performance at running UPDATE statements or similar. The spirit of this test is to exercise a variable number of read-then-write style database operations.

Requirements

  1. The recommended URI is /updates.
  2. For every request, an integer query string parameter named queries must be retrieved from the request. The parameter specifies the number of rows to fetch and update in preparing the HTTP response (see below).
  3. The queries parameter must be bounded to between 1 and 500. If the parameter is missing, is not an integer, or is an integer less than 1, the value should be interpreted as 1; if greater than 500, the value should be interpreted as 500.
  4. The request handler must retrieve a set of World objects, equal in count to the queries parameter, from the World database table.
  5. Each row must be selected randomly using one query in the same fashion as the single database query test (Test #2 above). As with the read-only multiple-query test type (#3 above), use of IN clauses or similar means to consolidate multiple queries into one operation is not permitted.
  6. At least the randomNumber field must be read from the database result set.
  7. Each World object must have its randomNumber field updated to a new random integer between 1 and 10000.
  8. Each World object must be persisted to the database with its new randomNumber value.
  9. Use of batch updates is acceptable but not required.
  10. Use of transactions is acceptable but not required. If transactions are used, a transaction should only encapsulate a single iteration, composed of a single read and single write. Transactions should not be used to consolidate multiple iterations into a single operation.
  11. For raw tests (that is, tests without an ORM), each updated row must receive a unique new randomNumber value. It is not acceptable to change the randomNumber value of all rows to the same random number using an UPDATE ... WHERE id IN (...) clause.
  12. Each World object must be added to a list or array.
  13. The list or array must be serialized to JSON and sent as a response.
  14. The response content type must be set to application/json.
  15. The response headers must include either Content-Length or Transfer-Encoding.
  16. The response headers must include Server and Date.
  17. Use of an in-memory cache of World objects or rows by the application is not permitted.
  18. Use of prepared statements for SQL database tests (e.g., for MySQL) is encouraged but not required.
  19. gzip compression is not permitted.
  20. Server support for HTTP Keep-Alive is strongly encouraged but not required.
  21. If HTTP Keep-Alive is enabled, no maximum Keep-Alive timeout is specified by this test.
  22. The request handler will be exercised at a concurrency level of 256 only.
  23. The request handler will be exercised with query counts of 1, 5, 10, 15, and 20.
  24. The request handler will be exercised using GET requests.

Example request

GET /updates?queries=10 HTTP/1.1
Host: server
User-Agent: Mozilla/5.0 (X11; Linux x86_64) Gecko/20130501 Firefox/30.0 AppleWebKit/600.00 Chrome/30.0.0000.0 Trident/10.0 Safari/600.00
Cookie: uid=12345678901234567890; __utma=1.1234567890.1234567890.1234567890.1234567890.12; wd=2560x1600
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Connection: keep-alive

Example response

HTTP/1.1 200 OK
Content-Length: 315
Content-Type: application/json; charset=UTF-8
Server: Example
Date: Wed, 17 Apr 2013 12:00:00 GMT

[{"id":4174,"randomNumber":331},{"id":51,"randomNumber":6544},{"id":4462,"randomNumber":952},{"id":2221,"randomNumber":532},{"id":9276,"randomNumber":3097},{"id":3056,"randomNumber":7293},{"id":6964,"randomNumber":620},{"id":675,"randomNumber":6601},{"id":8414,"randomNumber":6569},{"id":2753,"randomNumber":4065}]
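
A "raw" read-then-write loop for this test might look like the Python sketch below. It assumes a sqlite3 connection whose World table already exists with the schema from Test #2; SQLite and the name update_worlds are stand-ins chosen for illustration, not what the project's implementations use. Each iteration is committed on its own, so a transaction never spans more than one read and one write.

```python
import random
import sqlite3

WORLD_ROWS = 10_000  # ids and new random numbers both range from 1 to 10,000


def update_worlds(conn, count):
    """Read and then update `count` random World rows, one row at a time."""
    worlds = []
    for _ in range(count):
        world_id = random.randint(1, WORLD_ROWS)
        # One SELECT per row; no IN (...) consolidation.
        row = conn.execute(
            "SELECT id, randomNumber FROM World WHERE id = ?", (world_id,)
        ).fetchone()
        _previous = row[1]  # randomNumber must actually be read from the result set
        # Each row receives its own independently drawn value.
        new_value = random.randint(1, WORLD_ROWS)
        conn.execute(
            "UPDATE World SET randomNumber = ? WHERE id = ?", (new_value, row[0])
        )
        conn.commit()  # the transaction covers a single read-then-write iteration
        worlds.append({"id": row[0], "randomNumber": new_value})
    return worlds
```

The returned list is what would then be serialized to JSON and sent as the response body.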

Test type 6: Plaintext

This test is an exercise of the request-routing fundamentals only, designed to demonstrate the capacity of high-performance platforms in particular. The response payload is still small, meaning good performance is still necessary in order to saturate the gigabit Ethernet of the test environment.

Requirements

  1. The recommended URI is /plaintext.
  2. The response content type must be set to text/plain.
  3. The response body must be Hello, World!.
  4. This test is not intended to exercise the allocation of memory or instantiation of objects. Therefore it is acceptable but not required to re-use a single buffer for the response text (Hello, World!). However, the response must be fully composed from this buffer and its headers within the scope of each request; it is not acceptable to store the entire payload of the response, headers inclusive, as a pre-rendered buffer.
  5. The response headers must include either Content-Length or Transfer-Encoding.
  6. The response headers must include Server and Date.
  7. gzip compression is not permitted.
  8. Server support for HTTP Keep-Alive is strongly encouraged but not required.
  9. Server support for HTTP pipelining is strongly encouraged but not required.
  10. If HTTP Keep-Alive is enabled, no maximum Keep-Alive timeout is specified by this test.
  11. The request handler will be exercised at concurrency levels of 256, 1,024, 4,096, and 16,384.
  12. The request handler will be exercised using GET requests.

Example request

GET /plaintext HTTP/1.1
Host: server
User-Agent: Mozilla/5.0 (X11; Linux x86_64) Gecko/20130501 Firefox/30.0 AppleWebKit/600.00 Chrome/30.0.0000.0 Trident/10.0 Safari/600.00
Cookie: uid=12345678901234567890; __utma=1.1234567890.1234567890.1234567890.1234567890.12; wd=2560x1600
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Connection: keep-alive

Example response

HTTP/1.1 200 OK
Content-Length: 15
Content-Type: text/plain; charset=UTF-8
Server: Example
Date: Wed, 17 Apr 2013 12:00:00 GMT

Hello, World!
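
The distinction between re-using the body buffer and pre-rendering the whole response can be illustrated with a short Python sketch. The names BODY and plaintext_response and the Server value "Example" are assumptions for the illustration; a platform-level implementation would write these bytes to a socket rather than return them.

```python
from email.utils import formatdate

# Re-using a single buffer for the body text is explicitly acceptable.
BODY = b"Hello, World!"


def plaintext_response():
    """Compose the full response for one request; only the body is shared."""
    # The headers, including the current Date, are built on every call,
    # so the complete payload is never stored as a pre-rendered buffer.
    head = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: text/plain\r\n"
        f"Content-Length: {len(BODY)}\r\n"
        "Server: Example\r\n"  # hypothetical server name
        f"Date: {formatdate(usegmt=True)}\r\n"
        "\r\n"
    ).encode("ascii")
    return head + BODY
```

Caching head + BODY as one constant and sending it unchanged for every request would be the pre-rendered approach that the requirements rule out.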