Blog

March 28, 2013

Frameworks Round 1

How much does your framework choice affect performance? The answer may surprise you.

Authors' Note: We're using the word "framework" loosely to refer to platforms, micro-frameworks, and full-stack frameworks. We have our own personal favorites among these frameworks, but we've tried our best to give each a fair shot. See our Expected Questions section at the end for details.

Show me the winners!

We know you're curious (we were too!) so here is a chart of representative results.

Whoa! Netty, Vert.x, and Java servlets are fast, but we were surprised how much faster they are than Ruby, Django, and friends. Before we did the benchmarks, we were guessing there might be a 4x difference. But a 40x difference between Vert.x and Ruby on Rails is staggering. And let us simply draw the curtain of charity over the CakePHP results.

If these results were surprising to you, too, then read on so we can share our methodology and other test results. Even better, maybe you can spot a place where we mistakenly hobbled a framework and we can improve the tests. We've done our best, but we are not experts in most of them so help is welcome!

Motivation

Among the many factors to consider when choosing a web development framework, raw performance is easy to objectively measure. Application performance can be directly mapped to hosting dollars, and for a start-up company in its infancy, hosting costs can be a pain point. Weak performance can also cause premature scale pain, user experience degradation, and associated penalties levied by search engines.

What if building an application on one framework meant that at the very best your hardware is suitable for one tenth as much load as it would be had you chosen a different framework? The differences aren't always that extreme, but in some cases, they might be. It's worth knowing what you're getting into.

Simulating production environments

For this exercise, we aimed to configure every framework according to the best practices for production deployments gleaned from documentation and popular community opinion. Our goal is to approximate a sensible production deployment as accurately as possible. For each framework, we describe the configuration approach we've used and cite the sources recommending that configuration.

We want it all to be as transparent as possible, so we have posted our test suites on GitHub.

Results

We ran each test on EC2 and our i7 hardware. See Environment Details below for more information.

JSON serialization test

First up is plain JSON serialization on Amazon EC2 large instances. This is repeated in the introduction, above.

The high-performance Netty platform takes a commanding lead for JSON serialization on EC2. Since Vert.x is built on Netty, it too achieved full saturation of the CPU cores and impressive numbers. In third place is plain Java Servlets running on Caucho's Resin Servlet container. Plain Go delivers the best showing for a non-JVM framework.
We expected a fairly wide field, but we were surprised to see results that span four orders of magnitude.

Dedicated hardware

Here is the same test on our Sandy Bridge i7 hardware.

On our dedicated hardware, plain Servlets take the lead with over 210,000 requests per second. Vert.x remains strong but tapers off at higher concurrency levels despite being given eight workers, one for each HT core.

Database access test (single query)

How many requests can be handled per second if each request is fetching a random record from a data store? Starting again with EC2.

For database access tests, we considered dropping Cake to constrain our EC2 costs. This test exercises the database driver and connection pool and illustrates how well each scales with concurrency. Compojure makes a respectable showing but plain Servlets paired with the standard connection pool provided by MySQL is strongest at high concurrency. Gemini is using its built-in connection pool and lightweight ORM.
It's worth pausing to appreciate that this shows an EC2 Large instance can query a remote MySQL instance at least 8,800 times per second, putting aside the additional work of each query being part of an HTTP request and response cycle.

Dedicated hardware

The dedicated hardware impresses us with its ability to process nearly 100,000 requests per second with one query per request. JVM frameworks are especially strong here thanks to JDBC and efficient connection pools. In this test, we suspect Vert.x is being hit very hard by its connectivity to MongoDB. We are especially interested in community feedback related to tuning these MongoDB numbers.

Database access test (multiple queries)

The following tests are all run at 256 concurrency and vary the number of database queries per request. The tests are 1, 5, 10, 15, and 20 queries per request. The 1-query samples, leftmost on the line charts, should be similar (within sampling error) to the single-query test above.

As expected, as we increase the number of queries per request, the lines converge to zero. However, looking at the 20-queries bar chart, roughly the same ranked order we've seen elsewhere is still in play, demonstrating the headroom afforded by higher-performance frameworks.
We were surprised by the performance of Raw PHP in this test. We suspect the PHP MySQL driver and connection pool are particularly well tuned. However, the penalty for using an ORM on PHP is severe.

Dedicated hardware

The dedicated hardware produces numbers nearly ten times greater than EC2 with the punishing 20 queries per request. Again, Raw PHP makes an extremely strong showing, but PHP with an ORM and Cake—the only PHP framework in our test—are at the opposite end of the spectrum.

How we designed the tests

This exercise aims to provide a "baseline" for performance across the variety of frameworks. By baseline we mean the starting point, from which any real-world application's performance can only get worse. We aim to know the upper bound being set on an application's performance per unit of hardware by each platform and framework.

But we also want to exercise some of each framework's components, such as its JSON serializer and data-store/database mapping. While each test boils down to a measurement of the number of requests per second that can be processed by a single server, we are exercising a sample of the components provided by modern frameworks, so we believe it's a reasonable starting point.

For the data-connected test, we've deliberately constructed the tests to avoid any framework-provided caching layer. We want this test to require repeated requests to an external service (MySQL or MongoDB, for example) so that we exercise the framework's data mapping code. Although we expect that the external service is itself caching the small number of rows our test consumes, the framework is not allowed to avoid the network transmission and data mapping portion of the work.

Not all frameworks provide components for all of the tests. For these situations, we attempted to select a popular best-of-breed option.

Each framework was tested using 2^3 to 2^8 (8, 16, 32, 64, 128, and 256) request concurrency. On EC2, WeigHTTP was configured to use two threads (one per core) and on our i7 hardware, it was configured to use eight threads (one per HT core). For each test, WeigHTTP simulated 100,000 HTTP requests with keep-alives enabled.

For each test, the framework was warmed up by running a full test prior to capturing performance numbers.

Finally, for each framework, we collected the framework's best performance across the various concurrency levels for plotting as peak bar charts.
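
The concurrency sweep and peak selection described above can be sketched roughly as follows. This is only an illustration, not the harness we used: the class name, method names, and the sample numbers in main are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: names and sample numbers are hypothetical,
// not taken from the actual benchmark harness.
final class PeakSelection {

    // Concurrency levels 2^3 through 2^8: 8, 16, 32, 64, 128, 256.
    static List<Integer> concurrencyLevels() {
        List<Integer> levels = new ArrayList<>();
        for (int exponent = 3; exponent <= 8; exponent++) {
            levels.add(1 << exponent);
        }
        return levels;
    }

    // Returns the highest requests-per-second value across all levels,
    // which is the number a peak bar chart would plot.
    static double peak(Map<Integer, Double> rpsByConcurrency) {
        double best = 0.0;
        for (double rps : rpsByConcurrency.values()) {
            best = Math.max(best, rps);
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up results for one framework, keyed by concurrency level.
        double[] madeUpRps = {900.0, 1500.0, 2100.0, 2400.0, 2300.0, 2200.0};
        List<Integer> levels = concurrencyLevels();
        Map<Integer, Double> sample = new LinkedHashMap<>();
        for (int i = 0; i < levels.size(); i++) {
            sample.put(levels.get(i), madeUpRps[i]);
        }
        System.out.println("Levels: " + levels);
        System.out.println("Peak RPS: " + peak(sample));
    }
}
```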

We used two machines for all tests, configured in the following roles:

  • Application server. This machine is responsible for hosting the web application exclusively. Note, however, that when community best practices specified use of a web server in front of the application container, we had the web server installed on the same machine.
  • Load client and database server. This machine is responsible for generating HTTP traffic to the application server using WeigHTTP and also for hosting the database server. In all of our tests, the database server (MySQL or MongoDB) used very little CPU time; and WeigHTTP was not starved of CPU resource. In the database tests, the network was being used to provide result sets to the application server and to provide HTTP responses in the opposite direction. However, even with the quickest frameworks, network utilization was lower in database tests than in the plain JSON tests, so this is unlikely to be a concern.

Ultimately, a three-machine configuration would dismiss the concern of double-duty for the second machine. However, we doubt that the results would be noticeably different.

The Tests

We ran three types of tests. Not all tests were run for all frameworks. See details below.

JSON serialization

For this test, each framework simply responds with the following object, encoded using the framework's JSON serializer.

{"message" : "Hello, World!"}

The response's content type is set to application/json. If the framework provides no JSON serializer, a best-of-breed for the platform is selected. For example, on the Java platform, Jackson was used for frameworks that do not provide a serializer.

Database access (single query)

In this test, we use the ORM of choice for each framework to grab one simple object selected at random from a table containing 10,000 rows. We use the same JSON serialization tested earlier to serialize that object as JSON. Caveat: when the data store provides data as JSON in situ (such as with the MongoDB tests), no transcoding is done; the string of JSON is sent as-is.

As with JSON serialization, we've selected a best-of-breed ORM when the framework is agnostic. For example, we used Sequelize for the JavaScript MySQL tests.

We tested with MySQL for most frameworks, but where MongoDB is more conventional (as with node.js), we tested that instead or in addition. We also did some spot tests with PostgreSQL but have not yet captured any of those results in this effort. Preliminary results showed RPS performance about 25% lower than with MySQL. Since PostgreSQL is considered favorable from a durability perspective, we plan to include more PostgreSQL testing in the future.

Database access (multiple queries)

This test repeats the work of the single-query test with an adjustable queries-per-request parameter. Tests are run at 1, 5, 10, 15, and 20 queries per request. Each query selects a random row from the same table exercised in the previous test with the resulting array then serialized to JSON as a response.

This test is intended to illustrate how all frameworks inevitably will converge to zero requests per second as the complexity of each request increases. Admittedly, especially at 20 queries per request, this particular test is unnaturally database heavy compared to real-world applications. Only grossly inefficient applications or uncommonly complex requests would make that many database queries per request.

Environment Details

Hardware
  • Two Intel Sandy Bridge Core i7-2600K workstations with 8 GB memory each (early 2011 vintage) for the i7 tests
  • Two Amazon EC2 m1.large instances for the EC2 tests
  • Switched gigabit Ethernet
Load simulator
Databases
Ruby
JavaScript
PHP
Operating system
Web servers
Python
Go
Java / JVM

Notes

  • For the database tests, any framework with the suffix "raw" in its name is using its platform's raw database connectivity without an object-relational map (ORM) of any flavor. For example, servlet-raw is using raw JDBC. All frameworks without the "raw" suffix in their name are using either the framework-provided ORM or a best-of-breed for the platform (e.g., ActiveRecord).

Code examples

You can find the full source code for all of the tests on Github. Below are the relevant portions of the code to fetch a configurable number of random database records, serialize the list of records as JSON, and then send the JSON as an HTTP response.

Cake

public function index() {
    $query_count = $this->request->query('queries');
    if ($query_count == null) {
        $query_count = 1;
    }

    $arr = array();
    for ($i = 0; $i < $query_count; $i++) {
        $id = mt_rand(1, 10000);
        $world = $this->World->find('first', array('conditions' => array('id' => $id)));
        $arr[] = array("id" => $world['World']['id'], "randomNumber" => $world['World']['randomNumber']);
    }

    $this->set('worlds', $arr);
    $this->set('_serialize', array('worlds'));
}

Compojure

(defn get-world []
  (let [id (inc (rand-int 9999))] ; Num between 1 and 10,000
    (select world
            (fields :id :randomNumber)
            (where {:id id }))))

(defn run-queries [queries]
  (vec ; Return as a vector
   (flatten ; Make it a list of maps
    (take queries ; Number of queries to run
          (repeatedly get-world)))))

Django

def db(request):
    queries = int(request.GET.get('queries', 1))
    worlds = []
    for i in range(queries):
        worlds.append(World.objects.get(id=random.randint(1, 10000)))
    return HttpResponse(serializers.serialize("json", worlds), mimetype="application/json")

Express

app.get('/mongoose', function(req, res) {
    var queries = req.query.queries || 1,
        worlds = [],
        queryFunctions = [];

    for (var i = 1; i <= queries; i++ ) {
        queryFunctions.push(function(callback) {
            MWorld.findOne({ id: (Math.floor(Math.random() * 10000) + 1 )})
                .exec(function (err, world) {
                    worlds.push(world);
                    callback(null, 'success');
                });
        });
    }

    async.parallel(queryFunctions, function(err, results) {
        res.send(worlds);
    });
});

Gemini

@PathSegment
public boolean db() {
    final Random random = ThreadLocalRandom.current();
    final int queries = context().getInt("queries", 1, 1, 500);
    final World[] worlds = new World[queries];

    for (int i = 0; i < queries; i++) {
        worlds[i] = store.get(World.class, random.nextInt(DB_ROWS) + 1);
    }

    return json(worlds);
}

Grails

def db() {
    def random = ThreadLocalRandom.current();
    def queries = params.queries ? params.int('queries') : 1
    def worlds = []

    for (int i = 0; i < queries; i++) {
        worlds.add(World.read(random.nextInt(10000) + 1));
    }
    render worlds as JSON
}

Node.js

if (path === '/mongoose') {
    var queries = 1,
        worlds = [],
        queryFunctions = [],
        values = url.parse(req.url, true);

    if (values.query.queries) {
        queries = values.query.queries;
    }

    res.writeHead(200, {'Content-Type': 'application/json; charset=UTF-8'});

    for (var i = 1; i <= queries; i++ ) {
        queryFunctions.push(function(callback) {
            MWorld.findOne({ id: (Math.floor(Math.random() * 10000) + 1 )})
                .exec(function (err, world) {
                    worlds.push(world);
                    callback(null, 'success');
                });
        });
    }

    async.parallel(queryFunctions, function(err, results) {
        res.end(JSON.stringify(worlds));
    });
}

PHP (Raw)

$query_count = 1;
if (!empty($_GET)) {
    $query_count = $_GET["queries"];
}

$arr = array();
$statement = $pdo->prepare("SELECT * FROM World WHERE id = :id");

for ($i = 0; $i < $query_count; $i++) {
    $id = mt_rand(1, 10000);
    $statement->bindValue(':id', $id, PDO::PARAM_INT);
    $statement->execute();
    $row = $statement->fetch(PDO::FETCH_ASSOC);
    $arr[] = array("id" => $id, "randomNumber" => $row['randomNumber']);
}

echo json_encode($arr);

PHP (ORM)

$query_count = 1;
if (!empty($_GET)) {
    $query_count = $_GET["queries"];
}

$arr = array();

for ($i = 0; $i < $query_count; $i++) {
    $id = mt_rand(1, 10000);
    $world = World::find_by_id($id);
    $arr[] = $world->to_json();
}

echo json_encode($arr);

Play

public static Result db(Integer queries) {
    final Random random = ThreadLocalRandom.current();
    final World[] worlds = new World[queries];

    for (int i = 0; i < queries; i++) {
        worlds[i] = World.find.byId((long)(random.nextInt(DB_ROWS) + 1));
    }
    return ok(Json.toJson(worlds));
}

Rails

def db
  queries = params[:queries] || 1

  results = []
  (1..queries.to_i).each do
    results << World.find(Random.rand(10000) + 1)
  end

  render :json => results
end

Servlet

res.setHeader(HEADER_CONTENT_TYPE, CONTENT_TYPE_JSON);
final DataSource source = mysqlDataSource;
int count = 1;
try {
    count = Integer.parseInt(req.getParameter("queries"));
} catch (NumberFormatException nfexc) {
}
final World[] worlds = new World[count];
final Random random = ThreadLocalRandom.current();

try (Connection conn = source.getConnection()) {
    try (PreparedStatement statement = conn.prepareStatement(DB_QUERY,
            ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
        for (int i = 0; i < count; i++) {
            final int id = random.nextInt(DB_ROWS) + 1;
            statement.setInt(1, id);
            try (ResultSet results = statement.executeQuery()) {
                if (results.next()) {
                    worlds[i] = new World(id, results.getInt("randomNumber"));
                }
            }
        }
    }
} catch (SQLException sqlex) {
    System.err.println("SQL Exception: " + sqlex);
}

try {
    mapper.writeValue(res.getOutputStream(), worlds);
} catch (IOException ioe) {
}

Sinatra

get '/db' do
  queries = params[:queries] || 1

  results = []
  (1..queries.to_i).each do
    results << World.find(Random.rand(10000) + 1)
  end

  results.to_json
end

Spring

@RequestMapping(value = "/db")
public Object index(HttpServletRequest request, HttpServletResponse response, Integer queries) {
    if (queries == null) {
        queries = 1;
    }
    final World[] worlds = new World[queries];
    final Random random = ThreadLocalRandom.current();
    final Session session = HibernateUtil.getSessionFactory()
        .openSession();

    for(int i = 0; i < queries; i++) {
        worlds[i] = (World)session.byId(World.class)
            .load(random.nextInt(DB_ROWS) + 1);
    }
    session.close();

    try {
        new MappingJackson2HttpMessageConverter().write(
            worlds, MediaType.APPLICATION_JSON, new ServletServerHttpResponse(response));
    } catch (IOException e) {
    }
    return null;
}

Tapestry

StreamResponse onActivate() {
    int queries = 1;
    String qString = this.request.getParameter("queries");
    if (qString != null) {
        queries = Integer.parseInt(qString);
    }
    if (queries <= 0) {
        queries = 1;
    }
    final World[] worlds = new World[queries];
    final Random rand = ThreadLocalRandom.current();

    for (int i = 0; i < queries; i++) {
        worlds[i] = (World)session.get(World.class, new Integer(rand.nextInt(DB_ROWS) + 1));
    }

    String response = "";
    try {
        response = HelloDB.mapper.writeValueAsString(worlds);
    } catch (IOException ex) {
    }
    return new TextStreamResponse("application/json", response);
}

Vert.x

private void handleDb(final HttpServerRequest req) {
    int queriesParam = 1;
    try {
        queriesParam = Integer.parseInt(req.params().get("queries"));
    } catch(Exception e) {
    }

    final DbHandler dbh = new DbHandler(req, queriesParam);
    final Random random = ThreadLocalRandom.current();

    for (int i = 0; i < queriesParam; i++) {
        this.getVertx().eventBus().send(
            "hello.persistor",
            new JsonObject()
                .putString("action", "findone")
                .putString("collection", "world")
                .putObject("matcher", new JsonObject().putNumber("id", (random.nextInt(10000) + 1))),
            dbh);
    }
}

class DbHandler implements Handler<Message<JsonObject>> {
    private final HttpServerRequest req;
    private final int queries;
    private final List<Object> worlds = new CopyOnWriteArrayList<>();

    public DbHandler(HttpServerRequest request, int queriesParam) {
        this.req = request;
        this.queries = queriesParam;
    }

    @Override
    public void handle(Message<JsonObject> reply) {
        final JsonObject body = reply.body;

        if ("ok".equals(body.getString("status"))) {
            this.worlds.add(body.getObject("result"));
        }

        if (this.worlds.size() == this.queries) {
            try {
                final String result = mapper.writeValueAsString(worlds);
                final int contentLength = result
                    .getBytes(StandardCharsets.UTF_8).length;
                this.req.response.putHeader("Content-Type", "application/json; charset=UTF-8");
                this.req.response.putHeader("Content-Length", contentLength);
                this.req.response.write(result);
                this.req.response.end();
            } catch (IOException e) {
                req.response.statusCode = 500;
                req.response.end();
            }
        }
    }
}

Wicket

protected ResourceResponse newResourceResponse(Attributes attributes) {
    final int queries = attributes.getRequest().getQueryParameters()
        .getParameterValue("queries").toInt(1);
    final World[] worlds = new World[queries];
    final Random random = ThreadLocalRandom.current();

    final ResourceResponse response = new ResourceResponse();
    response.setContentType("application/json");

    response.setWriteCallback(new WriteCallback() {
        public void writeData(Attributes attributes) {
            final Session session = HibernateUtil.getSessionFactory()
                .openSession();

            for (int i = 0; i < queries; i++) {
                worlds[i] = (World)session.byId(World.class)
                    .load(random.nextInt(DB_ROWS) + 1);
            }
            session.close();

            try {
                attributes.getResponse().write(HelloDbResponse.mapper
                    .writeValueAsString(worlds));
            } catch (IOException ex) {
            }
        }
    });
    return response;
}

Expected questions

We expect that you might have a bunch of questions. Here are some that we're anticipating. But please contact us if you have a question we're not dealing with here or just want to tell us we're doing it wrong.

  1. "You configured framework x incorrectly, and that explains the numbers you're seeing." Whoops! Please let us know how we can fix it, or submit a Github pull request, so we can get it right.
  2. "Why WeigHTTP?" Although many web performance tests use ApacheBench from Apache to generate HTTP requests, we have opted to use WeigHTTP from the lighttpd team. ApacheBench remains a single-threaded tool, meaning that for higher-performance test scenarios, ApacheBench itself is a limiting factor. WeigHTTP is essentially a multithreaded clone of ApacheBench. If you have a recommendation for an even better benchmarking tool, please let us know.
  3. "Doesn't benchmarking on Amazon EC2 invalidate the results?" Our opinion is that doing so confirms precisely what we're trying to test: performance of web applications within realistic production environments. Selecting EC2 as a platform also allows the tests to be readily verified by anyone interested in doing so. However, we've also executed tests on our Core i7 (Sandy Bridge) workstations running Ubuntu 12.04 as a non-virtualized sanity check. Doing so confirmed our suspicion that the ranked order and relative performance across frameworks are mostly consistent between EC2 and physical hardware. That is, while the EC2 instances were slower than the physical hardware, they were slower by roughly the same proportion across the spectrum of frameworks.
  4. "Why include this Gemini framework I've never heard of?" We have included our in-house Java web framework, Gemini, in our tests. We've done so because it's of interest to us. You can consider it a stand-in for any relatively lightweight minimal-locking Java framework. While we're proud of how it performs among the well-established field, this exercise is not about Gemini. We routinely use other frameworks on client projects and we want this data to inform our recommendations for new projects.
  5. "Why is JRuby performance all over the map?" During the evolution of this project, in some test runs, JRuby would slightly edge out traditional Ruby, and in some cases, with the same test code, the opposite would be true. We also don't have an explanation for the weak performance of Sinatra on JRuby, which is no better than Rails. Ultimately we're not sure about the discrepancy. Hopefully an expert in JRuby can help us here.
  6. "Framework X has in-memory caching, why don't you use that?" In-memory caching, as provided by Gemini and some other frameworks, yields higher performance than repeatedly hitting a database, but isn't available in all frameworks, so we omitted in-memory caching from these tests.
  7. "What about other caching approaches, then?" Remote-memory or near-memory caching, as provided by Memcached and similar solutions, also improves performance and we would like to conduct future tests simulating a more expensive query operation versus Memcached. However, curiously, in spot tests, some frameworks paired with Memcached were conspicuously slower than other frameworks directly querying the authoritative MySQL database (recognizing, of course, that MySQL had its entire data-set in its own memory cache). For simple "get row ID n" and "get all rows" style fetches, a fast framework paired with MySQL may be faster and easier to work with versus a slow framework paired with Memcached.
  8. "Do all the database tests use connection pooling?" Sadly Django provides no connection pooling and in fact closes and re-opens a connection for every request. All the other tests use pooling.
  9. "What is Resin? Why aren't you using Tomcat for the Java frameworks?" Resin is a Java application server. The GPL version that we used for our tests is a relatively lightweight Servlet container. Although we recommend Caucho Resin for Java deployments, in our tests, we found Tomcat to be easier to configure. We ultimately dropped Tomcat from our tests because Resin was slightly faster across all frameworks.
  10. "Why don't you test framework X?" We'd love to, if we can find the time. Even better, craft the test yourself and submit a Github pull request so we can get it in there faster!
  11. "Why doesn't your test include more substantial algorithmic work, or building an HTML response with a server-side template?" Great suggestion. We hope to in the future!
  12. "Why are you using a (slightly) old version of framework X?" It's nothing personal! We tried to keep everything fully up-to-date, but with so many frameworks it became a never-ending game of whack-a-mole. If you think an update will affect the results, please let us know (or submit a Github pull request) and we'll get it updated!

Conclusion

Let go of your technology prejudices.

We think it is important to know about as many good tools as possible to help make the best choices you can. Hopefully we've helped with one aspect of that.

Thanks for sticking with us through all of this! We had fun putting these tests together, and experienced some genuine surprises with the results. Hopefully others find it interesting too. Please let us know what you think or submit Github pull requests to help us out.

About TechEmpower

We provide web and mobile application development services and are passionate about application performance. Read more about what we do.

March 26, 2013

Everything about Java 8

The following post is a comprehensive summary of the developer-facing changes coming in Java 8. As of March 18, 2014, Java 8 is now generally available.

I used preview builds of IntelliJ for my IDE. It had the best support for the Java 8 language features at the time I went looking. You can find those builds here: IntelliJIDEA EAP.

Interface improvements

Interfaces can now define static methods. For instance, a naturalOrder method was added to java.util.Comparator:

public static <T extends Comparable<? super T>>
Comparator<T> naturalOrder() {
    return (Comparator<T>)
        Comparators.NaturalOrderComparator.INSTANCE;
}

A common scenario in Java libraries is, for some interface Foo, there would be a companion utility class Foos with static methods for generating or working with Foo instances. Now that static methods can exist on interfaces, in many cases the Foos utility class can go away (or be made package-private), with its public methods going on the interface instead.
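
As a minimal sketch of that pattern, here is an interface that carries its own factory methods. The Shape interface and its methods are hypothetical, invented for illustration; they are not part of the JDK.

```java
// Hypothetical example: Shape and its factory methods are invented here
// to illustrate the pattern; they are not part of the JDK.
interface Shape {
    double area();

    // Before Java 8, these factories would have lived in a companion
    // "Shapes" utility class. Shape has a single abstract method, so a
    // lambda can implement it.
    static Shape circle(double radius) {
        return () -> Math.PI * radius * radius;
    }

    static Shape square(double side) {
        return () -> side * side;
    }
}

final class StaticInterfaceMethodDemo {
    public static void main(String[] args) {
        // Static interface methods are called on the interface itself.
        System.out.println(Shape.square(3).area());
        System.out.println(Shape.circle(1).area());
    }
}
```

Note that static interface methods are not inherited by implementing classes, so callers always go through the interface name (Shape.square), never through an instance.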

More importantly, interfaces can now define default methods. For instance, a forEach method was added to java.lang.Iterable:

public default void forEach(Consumer<? super T> action) {
    Objects.requireNonNull(action);
    for (T t : this) {
        action.accept(t);
    }
}

In the past it was essentially impossible for Java libraries to add methods to interfaces. Adding a method to an interface would mean breaking all existing code that implements the interface. Now, as long as a sensible default implementation of a method can be provided, library maintainers can add methods to these interfaces.
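
To make that concrete, here is a small sketch of a library interface gaining a method without breaking an existing implementation. Greeter and PoliteGreeter are hypothetical names used only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical library interface. Imagine greet() shipped in version 1
// and greetAll() was added in a later release as a default method.
interface Greeter {
    String greet(String name);

    // Added later. Existing implementations keep compiling because
    // they inherit this body.
    default List<String> greetAll(List<String> names) {
        List<String> greetings = new ArrayList<>();
        for (String name : names) {
            greetings.add(greet(name));
        }
        return greetings;
    }
}

// Written against "version 1": it knows nothing about greetAll().
final class PoliteGreeter implements Greeter {
    @Override
    public String greet(String name) {
        return "Hello, " + name + "!";
    }
}

final class DefaultMethodDemo {
    public static void main(String[] args) {
        Greeter greeter = new PoliteGreeter();
        // Uses the inherited default implementation.
        System.out.println(greeter.greetAll(Arrays.asList("Ada", "Linus")));
    }
}
```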

In Java 8, a large number of default methods have been added to core JDK interfaces. I'll discuss many of them later.

Why can't default methods override equals, hashCode, and toString?

An interface cannot provide a default implementation for any of the methods of the Object class. In particular, this means one cannot provide a default implementation for equals, hashCode, or toString from within an interface.

This seems odd at first, given that some interfaces actually define their equals behavior in documentation. The List interface is an example. So, why not allow this?

Brian Goetz gave four reasons in a lengthy response on the Project Lambda mailing list. I'll only describe one here, because that one was enough to convince me:

It would become more difficult to reason about when a default method is invoked. Right now it's simple: if a class implements a method, that always wins over a default implementation. Since all instances of interfaces are Objects, all instances of interfaces have non-default implementations of equals/hashCode/toString already. Therefore, a default version of these on an interface is always useless, and it may as well not compile.

For further reading, see this explanation written by Brian Goetz: response to "Allow default methods to override Object's methods"
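
The "class wins" rule can be seen in a short sketch. The types below are hypothetical and exist only to illustrate the rule.

```java
// Hypothetical types for illustration.
interface Named {
    default String describe() {
        return "unnamed";
    }

    // Uncommenting the following line would be a compile-time error,
    // because toString is a public method of java.lang.Object:
    // default String toString() { return "Named"; }
}

// Declares nothing itself, so it inherits the default describe().
final class Anonymous implements Named {
}

// The class's own method always wins over the interface default.
final class Person implements Named {
    @Override
    public String describe() {
        return "a person";
    }
}

final class ClassWinsDemo {
    public static void main(String[] args) {
        System.out.println(new Anonymous().describe()); // prints "unnamed"
        System.out.println(new Person().describe());    // prints "a person"
    }
}
```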

Functional interfaces

A core concept introduced in Java 8 is that of a "functional interface". An interface is a functional interface if it defines exactly one abstract method. For instance, java.lang.Runnable is a functional interface because it only defines one abstract method:

public abstract void run();

Note that the "abstract" modifier is implied because the method lacks a body. It is not necessary to specify the "abstract" modifier, as this code does, in order to qualify as a functional interface.

Default methods are not abstract, so a functional interface can define as many default methods as it likes.

A new annotation, @FunctionalInterface, has been introduced. It can be placed on an interface to declare the intention of it being a functional interface. It will cause the interface to refuse to compile unless you've managed to make it a functional interface. It's sort of like @Override in this way; it declares intention and doesn't allow you to use it incorrectly.
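
Here is a minimal sketch of the annotation in use. Transformer is a hypothetical interface made up for this example, and the lambda syntax it uses is covered in the next section.

```java
// Hypothetical interface for illustration.
@FunctionalInterface
interface Transformer {
    // The single abstract method.
    String transform(String input);

    // Default methods do not count toward the one-abstract-method limit.
    default Transformer twice() {
        return input -> transform(transform(input));
    }

    // Adding a second abstract method, such as the one below, would make
    // the compiler reject the @FunctionalInterface annotation:
    // String transformAgain(String input);
}

final class FunctionalInterfaceDemo {
    public static void main(String[] args) {
        Transformer exclaim = s -> s + "!";
        System.out.println(exclaim.twice().transform("Hi")); // prints "Hi!!"
    }
}
```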

Lambdas

An extremely valuable property of functional interfaces is that they can be instantiated using lambdas. Here are a few examples of lambdas:

Comma-separated list of inputs with specified types on the left, a block with a return on the right:

(int x, int y) -> { return x + y; }

Comma-separated list of inputs with inferred types on the left, a return value on the right:

(x, y) -> x + y

Single parameter with inferred type on the left, a return value on the right:

x -> x * x

No inputs on left (official name: "burger arrow"), return value on the right:

() -> x

Single parameter with inferred type on the left, a block with no return (void return) on the right:

x -> { System.out.println(x); }

Static method reference:

String::valueOf

Non-static method reference:

Object::toString

Capturing method reference:

x::toString

Constructor reference:

ArrayList::new

You can think of method reference forms as shorthand for the other lambda forms.

Method reference    Equivalent lambda expression
String::valueOf     x -> String.valueOf(x)
Object::toString    x -> x.toString()
x::toString         () -> x.toString()
ArrayList::new      () -> new ArrayList<>()
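
The equivalences above can be checked side by side in a small sketch. The class name and the choice of target types (Function and Supplier from java.util.function) are assumptions made for this example; other functional interfaces with matching shapes would work as well.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;

final class MethodReferenceDemo {
    // Runs each method reference next to its lambda equivalent and
    // records both results as "ref lambda".
    static List<String> run() {
        List<String> out = new ArrayList<>();

        // Static method reference.
        Function<Object, String> staticRef = String::valueOf;
        Function<Object, String> staticLambda = x -> String.valueOf(x);
        out.add(staticRef.apply(42) + " " + staticLambda.apply(42));

        // Non-static method reference: the receiver is the argument.
        Function<Object, String> unboundRef = Object::toString;
        Function<Object, String> unboundLambda = x -> x.toString();
        out.add(unboundRef.apply(42) + " " + unboundLambda.apply(42));

        // Capturing method reference: the receiver x is fixed up front.
        Object x = 42;
        Supplier<String> boundRef = x::toString;
        Supplier<String> boundLambda = () -> x.toString();
        out.add(boundRef.get() + " " + boundLambda.get());

        // Constructor reference.
        Supplier<List<String>> ctorRef = ArrayList::new;
        Supplier<List<String>> ctorLambda = () -> new ArrayList<>();
        out.add(ctorRef.get() + " " + ctorLambda.get());

        return out;
    }

    public static void main(String[] args) {
        for (String line : run()) {
            System.out.println(line);
        }
    }
}
```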

Of course, methods in Java can be overloaded. Classes can have multiple methods with the same name but different parameters. The same goes for constructors. ArrayList::new could refer to any of ArrayList's three constructors. The one it resolves to depends on which functional interface it's being used for.
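
As a sketch of that resolution, the same ArrayList::new expression can be assigned to three different functional interfaces, and each picks a different constructor. The class and constant names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.function.Function;
import java.util.function.IntFunction;
import java.util.function.Supplier;

final class ConstructorReferenceDemo {
    // No arguments: resolves to ArrayList().
    static final Supplier<List<String>> EMPTY = ArrayList::new;

    // One int argument: resolves to ArrayList(int initialCapacity).
    static final IntFunction<List<String>> WITH_CAPACITY = ArrayList::new;

    // One Collection argument: resolves to ArrayList(Collection).
    static final Function<Collection<String>, List<String>> COPY_OF = ArrayList::new;

    public static void main(String[] args) {
        System.out.println(EMPTY.get());                          // []
        System.out.println(WITH_CAPACITY.apply(16));              // [] (capacity is not size)
        System.out.println(COPY_OF.apply(Arrays.asList("a", "b"))); // [a, b]
    }
}
```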

A lambda is compatible with a given functional interface when their "shapes" match. By "shapes", I'm referring to the types of the inputs, outputs, and declared checked exceptions.

To give a couple of concrete, valid examples:

Comparator<String> c = (a, b) -> Integer.compare(a.length(),
                                                 b.length());

A Comparator<String>'s compare method takes two strings as input, and returns an int. That's consistent with the lambda on the right, so this assignment is valid.

Runnable r = () -> { System.out.println("Running!"); };

A Runnable's run method takes no arguments and does not have a return value. That's consistent with the lambda on the right, so this assignment is valid.

The checked exceptions (if present) in the abstract method's signature matter too. The lambda can only throw a checked exception if the functional interface declares that exception in its signature.
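Here is a short sketch of that rule using two interfaces that already exist in the JDK. Callable.call() is declared with "throws Exception" and Runnable.run() is not. The demo class name is made up.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

class CheckedExceptionShapeDemo {
    // Callable.call() declares "throws Exception", so this lambda may throw IOException.
    static final Callable<String> FAILING = () -> {
        throw new IOException("disk on fire");
    };

    // Runnable.run() declares no checked exceptions, so the compiler
    // would reject the equivalent assignment:
    // Runnable r = () -> { throw new IOException("disk on fire"); };

    static String describe() {
        try {
            return FAILING.call();
        } catch (Exception e) {
            return e.getClass().getSimpleName();
        }
    }
}
```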

Capturing versus non-capturing lambdas

Lambdas are said to be "capturing" if they access a non-static variable or object that was defined outside of the lambda body. For example, this lambda captures the variable x:

int x = 5;
return y -> x + y;

In order for this lambda declaration to be valid, the variables it captures must be "effectively final". So, either they must be marked with the final modifier, or they must not be modified after they're assigned.

Whether a lambda is capturing or not has implications for performance. A non-capturing lambda is generally going to be more efficient than a capturing one. Although this is not defined in any specifications (as far as I know), and you shouldn't count on it for a program's correctness, a non-capturing lambda only needs to be evaluated once. From then on, it will return an identical instance. Capturing lambdas need to be evaluated every time they're encountered, and currently that performs much like instantiating a new instance of an anonymous class.

What lambdas don't do

There are a few features that lambdas don't provide, which you should keep in mind. They were considered for Java 8 but were not included, for simplicity and due to time constraints.

Non-final variable capture - If a variable is assigned a new value, it can't be used within a lambda. The "final" keyword is not required, but the variable must be "effectively final" (discussed earlier). This code does not compile:

int count = 0;
List<String> strings = Arrays.asList("a", "b", "c");
strings.forEach(s -> {
    count++; // error: can't modify the value of count
});
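Two common ways around this restriction are sketched below, with made-up class and method names. The first captures a mutable holder instead of a mutable variable; the second avoids the mutation entirely by asking the stream API (covered later in this post) for the answer.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class CounterWorkaroundDemo {
    static int countWithHolder(List<String> strings) {
        // The reference is effectively final; only the holder's contents change.
        AtomicInteger count = new AtomicInteger();
        strings.forEach(s -> count.incrementAndGet());
        return count.get();
    }

    static long countWithStream(List<String> strings) {
        // Often simpler: let a terminal operation produce the value.
        return strings.stream().count();
    }
}
```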

Exception transparency - If a checked exception may be thrown from inside a lambda, the functional interface must also declare that the checked exception can be thrown. The exception is not propagated to the containing method. This code does not compile:

void appendAll(Iterable<String> values, Appendable out)
        throws IOException { // doesn't help with the error
    values.forEach(s -> {
        out.append(s); // error: can't throw IOException here
                       // Consumer.accept(T) doesn't allow it
    });
}

There are ways to work around this, where you can define your own functional interface that extends Consumer and sneaks the IOException through as a RuntimeException. I tried this out in code and found it to be too confusing to be worthwhile.
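For the curious, one possible shape of that workaround is sketched here. IOConsumer and acceptThrows are hypothetical names, nothing like them ships with the JDK, and I've used UncheckedIOException as the RuntimeException doing the sneaking. Judge for yourself whether it's worth the confusion.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.Consumer;

// Hypothetical interface; not part of the JDK.
@FunctionalInterface
interface IOConsumer<T> extends Consumer<T> {
    void acceptThrows(T t) throws IOException;   // the single abstract method

    @Override
    default void accept(T t) {
        try {
            acceptThrows(t);
        } catch (IOException e) {
            throw new UncheckedIOException(e);   // tunnel through forEach
        }
    }
}

class ExceptionTunnelDemo {
    static void appendAll(Iterable<String> values, Appendable out) throws IOException {
        try {
            // The cast tells the compiler which functional interface the lambda targets.
            values.forEach((IOConsumer<String>) s -> out.append(s));
        } catch (UncheckedIOException e) {
            throw e.getCause();                  // unwrap back to the original IOException
        }
    }
}
```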

Control flow (break, early return) - In the forEach examples above, a traditional continue is possible by placing a "return;" statement within the lambda. However, there is no way to break out of the loop or return a value as the result of the containing method from within the lambda. For example:

final String secret = "foo";
boolean containsSecret(Iterable<String> values) {
    values.forEach(s -> {
        if (secret.equals(s)) {
            ??? // want to end the loop and return true, but can't
        }
    });
}
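In practice, the usual answer is to stop using forEach for this kind of search and reach for a short-circuiting stream operation instead (streams are covered later in this post). A minimal sketch, with an invented class name:

```java
import java.util.stream.StreamSupport;

class ContainsSecretDemo {
    static final String SECRET = "foo";

    static boolean containsSecret(Iterable<String> values) {
        // anyMatch stops at the first match, which stands in for
        // "end the loop and return true".
        return StreamSupport.stream(values.spliterator(), false)
                            .anyMatch(SECRET::equals);
    }
}
```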

For further reading about these issues, see this explanation written by Brian Goetz: response to "Checked exceptions within Block<T>"

Why abstract classes can't be instantiated using a lambda

An abstract class, even if it declares only one abstract method, cannot be instantiated with a lambda.

Two examples of classes with one abstract method are Ordering and CacheLoader from the Guava library. Wouldn't it be nice to be able to declare instances of them using lambdas like this?

Ordering<String> order = (a, b) -> ...;
CacheLoader<String, String> loader = (key) -> ...;

The most common argument against this was that it would add to the difficulty of reading a lambda. Instantiating an abstract class in this way could lead to execution of hidden code: that in the constructor of the abstract class.

Another reason is that it throws out possible optimizations for lambdas. In the future, it may be the case that lambdas are not evaluated into object instances. Letting users declare abstract classes with lambdas would prevent optimizations like this.

Besides, there's an easy workaround. Actually, the two example classes from Guava already demonstrate this workaround. Add factory methods to convert from a lambda to an instance:

Ordering<String> order = Ordering.from((a, b) -> ...);
CacheLoader<String, String> loader =
    CacheLoader.from((key) -> ...);
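If you own an abstract class like this, adding such a factory method yourself is straightforward. Below is a sketch with a hypothetical Loader class, loosely modeled on the shape of CacheLoader; it is not Guava's actual implementation.

```java
import java.util.function.Function;

// Hypothetical abstract class with a single abstract method.
abstract class Loader<K, V> {
    abstract V load(K key);

    // Factory method: accepts a lambda, returns an instance of the abstract class.
    static <K, V> Loader<K, V> from(Function<K, V> function) {
        return new Loader<K, V>() {
            @Override
            V load(K key) {
                return function.apply(key);
            }
        };
    }
}

class FactoryWorkaroundDemo {
    static final Loader<String, Integer> LENGTH = Loader.from(String::length);
}
```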

For further reading, see this explanation written by Brian Goetz: response to "Allow lambdas to implement abstract classes"

java.util.function

Package summary: java.util.function

As demonstrated earlier with Comparator and Runnable, interfaces already defined in the JDK that happen to be functional interfaces are compatible with lambdas. The same goes for any functional interfaces defined in your own code or in third party libraries.

But there are certain forms of functional interfaces that are widely, commonly useful, which did not exist previously in the JDK. A large number of these interfaces have been added to the new java.util.function package. Here are a few:

  • Function<T, R> - take a T as input, return an R as output
  • Predicate<T> - take a T as input, return a boolean as output
  • Consumer<T> - take a T as input, perform some action and don't return anything
  • Supplier<T> - with nothing as input, return a T
  • BinaryOperator<T> - take two T's as input, return one T as output, useful for "reduce" operations

Primitive specializations for most of these exist as well. They're provided in int, long, and double forms. For instance:

  • IntConsumer - take an int as input, perform some action and don't return anything

These exist for performance reasons, to avoid boxing and unboxing when the inputs or outputs are primitives.
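Here is a quick tour of the interfaces listed above in action. It's only a sketch; the class name and the values are arbitrary.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.IntConsumer;
import java.util.function.Predicate;
import java.util.function.Supplier;

class FunctionPackageDemo {
    static List<String> run() {
        List<String> log = new ArrayList<>();

        Function<String, Integer> length = String::length;
        Predicate<String> isEmpty = String::isEmpty;
        Consumer<String> record = log::add;
        Supplier<String> greeting = () -> "hello";
        BinaryOperator<Integer> add = (a, b) -> a + b;
        IntConsumer recordInt = i -> log.add("int:" + i);   // takes a primitive int

        record.accept(greeting.get());                       // logs "hello"
        record.accept(String.valueOf(length.apply("abc")));  // logs "3"
        record.accept(String.valueOf(isEmpty.test("")));     // logs "true"
        record.accept(String.valueOf(add.apply(2, 3)));      // logs "5"
        recordInt.accept(7);                                 // logs "int:7"
        return log;
    }
}
```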

java.util.stream

Package summary: java.util.stream

The new java.util.stream package provides utilities "to support functional-style operations on streams of values" (quoting the javadoc). Probably the most common way to obtain a stream will be from a collection:

Stream<T> stream = collection.stream();

A stream is something like an iterator. The values "flow past" (analogy to a stream of water) and then they're gone. A stream can only be traversed once, then it's used up. Streams may also be infinite.

Streams can be sequential or parallel. They start off as one and may be switched to the other using stream.sequential() or stream.parallel(). The actions of a sequential stream occur in serial fashion on one thread. The actions of a parallel stream may be happening all at once on multiple threads.

So, what do you do with a stream? Here is the example given in the package javadocs:

int sumOfWeights = blocks.stream().filter(b -> b.getColor() == RED)
                                  .mapToInt(b -> b.getWeight())
                                  .sum();

Note: The above code makes use of a primitive stream, and a sum() method is only available on primitive streams. There will be more detail on primitive streams shortly.

A stream provides a fluent API for transforming values and performing some action on the results. Stream operations are either "intermediate" or "terminal".

  • Intermediate - An intermediate operation keeps the stream open and allows further operations to follow. The filter and mapToInt methods in the example above are intermediate operations. The return type of these methods is Stream (or one of its primitive variants); they return a stream to allow chaining of more operations.
  • Terminal - A terminal operation must be the final operation invoked on a stream. Once a terminal operation is invoked, the stream is "consumed" and is no longer usable. The sum method in the example above is a terminal operation.

Usually, dealing with a stream will involve these steps:

  1. Obtain a stream from some source.
  2. Perform one or more intermediate operations.
  3. Perform one terminal operation.

It's likely that you'll want to perform all those steps within one method. That way, you know the properties of the source and the stream and can ensure that it's used properly. You probably don't want to accept arbitrary Stream<T> instances as input to your method because they may have properties you're ill-equipped to deal with, such as being parallel or infinite.

There are a couple more general properties of stream operations to consider:

  • Stateful - A stateful operation imposes some new property on the stream, such as uniqueness of elements, or a maximum number of elements, or ensuring that the elements are consumed in sorted fashion. These are typically more expensive than stateless intermediate operations.
  • Short-circuiting - A short-circuiting operation potentially allows processing of a stream to stop early without examining all the elements. This is an especially desirable property when dealing with infinite streams; if none of the operations being invoked on a stream are short-circuiting, then the code may never terminate.
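To illustrate short-circuiting on an infinite stream, here is a small sketch. Stream.iterate produces an endless sequence, and it's the limit and anyMatch calls that let each pipeline finish. The class and method names are invented.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ShortCircuitDemo {
    static List<Integer> firstPowersOfTwo(int n) {
        // Stream.iterate produces an infinite stream: 1, 2, 4, 8, ...
        return Stream.iterate(1, x -> x * 2)
                     .limit(n)                        // stop after n elements
                     .collect(Collectors.toList());
    }

    static boolean hasPowerOfTwoAbove(int threshold) {
        // anyMatch is short-circuiting too: it returns once a match is seen.
        return Stream.iterate(1, x -> x * 2)
                     .anyMatch(x -> x > threshold);
    }
}
```

Remove the limit call from the first method and the collect would never return.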

Here are short, general descriptions for each Stream method. See the javadocs for more thorough explanations, including each overloaded form of the operation.

Intermediate operations:

  • filter - Exclude all elements that don't match a Predicate.
  • map - Perform a one-to-one transformation of elements using a Function.
  • flatMap - Transform each element into zero or more elements by way of another Stream.
  • peek - Perform some action on each element as it is encountered. Primarily useful for debugging.
  • distinct - Exclude all duplicate elements according to their .equals behavior. This is a stateful operation.
  • sorted - Ensure that stream elements in subsequent operations are encountered according to the order imposed by a Comparator. This is a stateful operation.
  • limit - Ensure that subsequent operations only see up to a maximum number of elements. This is a stateful, short-circuiting operation.
  • skip - Ensure that subsequent operations do not see the first n elements. This is a stateful operation.

Terminal operations:

  • forEach - Perform some action for each element in the stream.
  • toArray - Dump the elements in the stream to an array.
  • reduce - Combine the stream elements into one using a BinaryOperator.
  • collect - Dump the elements in the stream into some container, such as a Collection or Map.
  • min - Find the minimum element of the stream according to a Comparator.
  • max - Find the maximum element of the stream according to a Comparator.
  • count - Find the number of elements in the stream.
  • anyMatch - Find out whether at least one of the elements in the stream matches a Predicate. This is a short-circuiting operation.
  • allMatch - Find out whether every element in the stream matches a Predicate. This is a short-circuiting operation.
  • noneMatch - Find out whether zero elements in the stream match a Predicate. This is a short-circuiting operation.
  • findFirst - Find the first element in the stream. This is a short-circuiting operation.
  • findAny - Find any element in the stream, which may be cheaper than findFirst for some streams. This is a short-circuiting operation.

As noted in the javadocs, intermediate operations are lazy. Only a terminal operation will start the processing of stream elements. At that point, no matter how many intermediate operations were included, the elements are then consumed in (usually, but not quite always) a single pass. (Stateful operations such as sorted() and distinct() may require a second pass over the elements.)
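The laziness is easy to observe. In this sketch (names are made up), the peek action records nothing until the terminal forEach runs, and then each element travels through the whole pipeline before the next one starts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class LazinessDemo {
    static List<String> trace() {
        List<String> events = new ArrayList<>();

        Stream<String> pipeline = Arrays.asList("a", "b").stream()
            .peek(s -> events.add("peek " + s))
            .map(String::toUpperCase);
        events.add("pipeline built");   // nothing has been processed yet

        pipeline.forEach(s -> events.add("forEach " + s));
        return events;
        // → [pipeline built, peek a, forEach A, peek b, forEach B]
    }
}
```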

Streams try their best to do as little work as possible. There are micro-optimizations such as eliding a sorted() operation when it can determine the elements are already in order. In operations that include limit(x) or substream(x,y), a stream can sometimes avoid performing intermediate map operations on the elements it knows aren't necessary to determine the result. I'm not going to be able to do the implementation justice here; it's clever in lots of small but significant ways, and it's still improving.

Returning to the concept of parallel streams, it's important to note that parallelism is not free. It's not free from a performance standpoint, and you can't simply swap out a sequential stream for a parallel one and expect the results to be identical without further thought. There are properties to consider about your stream, its operations, and the destination for its data before you can (or should) parallelize a stream. For instance: Does encounter order matter to me? Are my functions stateless? Is my stream large enough and are my operations complex enough to make parallelism worthwhile?

There are primitive-specialized versions of Stream for ints, longs, and doubles: IntStream, LongStream, and DoubleStream.

One can convert back and forth between an object stream and a primitive stream using the primitive-specialized map and flatMap functions, among others. To give a few contrived examples:

List<String> strings = Arrays.asList("a", "b", "c");
strings.stream()                    // Stream<String>
       .mapToInt(String::length)    // IntStream
       .asLongStream()              // LongStream
       .mapToDouble(x -> x / 10.0)  // DoubleStream
       .boxed()                     // Stream<Double>
       .mapToLong(x -> 1L)          // LongStream
       .mapToObj(x -> "")           // Stream<String>
       ...

The primitive streams also provide methods for obtaining basic numeric statistics about the stream as a data structure. You can find the count, sum, min, max, and mean of the elements all from one terminal operation.
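For example, IntStream.summaryStatistics() is one such terminal operation. A minimal sketch with arbitrary values and an invented class name:

```java
import java.util.IntSummaryStatistics;
import java.util.stream.IntStream;

class SummaryStatsDemo {
    static IntSummaryStatistics stats() {
        // One pass over the elements gathers count, sum, min, max, and average.
        return IntStream.of(3, 1, 4, 1, 5).summaryStatistics();
    }
}
```

The result exposes getCount(), getSum(), getMin(), getMax(), and getAverage().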

There are no primitive versions for the rest of the primitive types because that would have required an unacceptable amount of bloat in the JDK. IntStream, LongStream, and DoubleStream were deemed useful enough to include, and streams of other numeric primitives can be represented using these three via widening primitive conversion.

One of the most confusing, intricate, and useful terminal stream operations is collect. It introduces a new interface called Collector. This interface is somewhat difficult to understand, but fortunately there is a Collectors utility class for generating all sorts of useful Collectors. For example:

List<String> strings = values.stream()
                             .filter(...)
                             .map(...)
                             .collect(Collectors.toList());

If you want to put your stream elements into a Collection, Map, or String, then Collectors probably has what you need. It's definitely worthwhile to browse through the javadoc of that class.
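Here is a sketch of the String and Map cases using three of the Collectors factory methods. The word list and class name are arbitrary.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class CollectorsDemo {
    static final List<String> WORDS = Arrays.asList("apple", "fig", "kiwi", "pear");

    static String joined() {
        // delimiter, prefix, suffix
        return WORDS.stream().collect(Collectors.joining(", ", "[", "]"));
    }

    static Map<Integer, List<String>> byLength() {
        return WORDS.stream().collect(Collectors.groupingBy(String::length));
    }

    static Map<String, Integer> lengths() {
        // key function, value function; duplicate keys would throw
        return WORDS.stream().collect(Collectors.toMap(w -> w, String::length));
    }
}
```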

Generic type inference improvements

Summary of proposal: JEP 101: Generalized Target-Type Inference

This was an effort to improve the ability of the compiler to determine generic types where it was previously unable to. There were many cases in previous versions of Java where the compiler could not figure out the generic types for a method in the context of nested or chained method invocations, even when it seemed "obvious" to the programmer. Those situations required the programmer to explicitly specify a "type witness". It's a feature of generics that surprisingly few Java programmers know about (I'm saying this based on personal interactions and reading StackOverflow questions). It looks like this:

// In Java 7:
foo(Utility.<Type>bar());
Utility.<Type>foo().bar();

Without the type witnesses, the compiler might fill in <Object> as the generic type, and the code would fail to compile if a more specific type was required instead.

Java 8 improves this situation tremendously. In many more cases, it can figure out a more specific generic type based on the context.

// In Java 8:
foo(Utility.bar());
Utility.foo().bar();

This one is still a work in progress, so I'm not sure how many of the examples listed in the proposal will actually be included for Java 8. Hopefully it's all of them.
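One case that is commonly cited, a generic method call used as a method argument, looks like this in concrete terms. It's a sketch with invented names, and it only demonstrates the argument-position case, not the chained one.

```java
import java.util.Collections;
import java.util.List;

class InferenceDemo {
    static int total(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    static int totalOfNothing() {
        // Java 7 inferred List<Object> for the argument and rejected this call;
        // it needed the type witness: total(Collections.<Integer>emptyList()).
        // Java 8 infers List<Integer> from the parameter type.
        return total(Collections.emptyList());
    }
}
```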

java.time

Package summary: java.time

The new date/time API in Java 8 is contained in the java.time package. If you're familiar with Joda Time, it will be really easy to pick up. Actually, I think it's so well-designed that even people who have never heard of Joda Time should find it easy to pick up.

Almost everything in the API is immutable, including the value types and the formatters. No more worrying about exposing Date fields or dealing with thread-local date formatters.

The intermingling with the legacy date/time API is minimal. It was a clean break:

The new API prefers enums over integer constants for things like months and days of the week.

So, what's in it? The package-level javadocs do an excellent job of explaining the additional types. I'll give a brief rundown of some noteworthy parts.

Extremely useful value types:

Less useful value types:

Other useful types:

  • DateTimeFormatter - for converting datetime objects to strings
  • ChronoUnit - for figuring out the amount of time between two points, e.g. ChronoUnit.DAYS.between(t1, t2)
  • TemporalAdjuster - e.g. date.with(TemporalAdjusters.firstDayOfMonth())
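A short sketch tying those three together with LocalDate. The date is arbitrary and the class name is made up. Note that in the released JDK 8 the adjuster factory methods live on the TemporalAdjusters class (plural).

```java
import java.time.LocalDate;
import java.time.Month;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;
import java.time.temporal.TemporalAdjusters;

class JavaTimeDemo {
    // An arbitrary example date; note the Month enum instead of an int constant.
    static final LocalDate DATE = LocalDate.of(2013, Month.MARCH, 28);

    static String formatted() {
        return DATE.format(DateTimeFormatter.ISO_LOCAL_DATE);   // "2013-03-28"
    }

    static long daysSinceStartOfMonth() {
        // with(...) returns a new LocalDate; DATE itself is immutable.
        LocalDate first = DATE.with(TemporalAdjusters.firstDayOfMonth());
        return ChronoUnit.DAYS.between(first, DATE);             // 27
    }
}
```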

The new value types are, for the most part, supported by JDBC. There are minor exceptions, such as ZonedDateTime which has no counterpart in SQL.

Collections API additions

The fact that interfaces can define default methods allowed the JDK authors to make a large number of additions to the collection API interfaces. Default implementations for these are provided on all the core interfaces, and more efficient or well-behaved overridden implementations were added to all the concrete classes, where applicable.

Here's a list of the new methods:

Also, Iterator.remove() now has a default, throwing implementation, which makes it slightly easier to define unmodifiable iterators.

Collection.stream() and Collection.parallelStream() are the main gateways into the stream API. There are other ways to generate streams, but those are going to be the most common by far.

The addition of List.sort(Comparator) is fantastic. Previously, the way to sort an ArrayList was this:

Collections.sort(list, comparator);

That code, which was your only option in Java 7, was frustratingly inefficient. It would dump the list into an array, sort the array, then use a ListIterator to insert the array contents into the list in new positions.

The default implementation of List.sort(Comparator) still does this, but concrete implementing classes are free to optimize. For instance, ArrayList.sort invokes Arrays.sort on the ArrayList's internal array. CopyOnWriteArrayList does the same.

Performance isn't the only potential gain from these new methods. They can have more desirable semantics, too. For instance, sorting a Collections.synchronizedList() is an atomic operation using list.sort. You can iterate over all its elements as an atomic operation using list.forEach. Previously this was not possible.
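In code, the atomic sort is just a single method call on the synchronized wrapper. A minimal sketch with an invented class name:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class ListSortDemo {
    static List<String> sortedByLength() {
        List<String> list = Collections.synchronizedList(
            new ArrayList<>(Arrays.asList("ccc", "a", "bb")));
        // One call on the wrapper: the whole sort runs while holding its lock.
        list.sort(Comparator.comparingInt(String::length));
        return list;
    }
}
```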

Map.computeIfAbsent makes working with multimap-like structures easier:

// Index strings by length:
Map<Integer, List<String>> map = new HashMap<>();
for (String s : strings) {
    map.computeIfAbsent(s.length(),
                        key -> new ArrayList<String>())
       .add(s);
}

// Although in this case the stream API may be a better choice:
Map<Integer, List<String>> map = strings.stream()
    .collect(Collectors.groupingBy(String::length));

Concurrency API additions

ForkJoinPool.commonPool() is the structure that handles all parallel stream operations. It is intended as an easy, good way to obtain a ForkJoinPool/ExecutorService/Executor when you need one.

ConcurrentHashMap<K, V> was completely rewritten. Internally it looks nothing like the version that was in Java 7. Externally it's mostly the same, except it has a large number of bulk operation methods: many forms of reduce, search, and forEach.

ConcurrentHashMap.newKeySet() provides a concurrent java.util.Set implementation. It is essentially another way of writing Collections.newSetFromMap(new ConcurrentHashMap<T, Boolean>()).

StampedLock is a new lock implementation that can probably replace ReentrantReadWriteLock in most cases. It performs better than RRWL when used as a plain read-write lock. It also provides an API for "optimistic reads", where you obtain a weak, cheap version of a read lock, do the read operation, then check afterwards if your lock was invalidated by a write. There's more detail about this class and its performance in a set of slides put together by Heinz Kabutz (starting about half-way through the set of slides): "Phaser and StampedLock Presentation"
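The optimistic read pattern looks roughly like this. It's a sketch along the lines of the example in the StampedLock javadoc; the StampedPoint class is made up for illustration.

```java
import java.util.concurrent.locks.StampedLock;

class StampedPoint {
    private final StampedLock lock = new StampedLock();
    private double x, y;

    void move(double dx, double dy) {
        long stamp = lock.writeLock();
        try {
            x += dx;
            y += dy;
        } finally {
            lock.unlockWrite(stamp);
        }
    }

    double distanceFromOrigin() {
        long stamp = lock.tryOptimisticRead();   // cheap, non-blocking
        double curX = x, curY = y;               // read into locals
        if (!lock.validate(stamp)) {             // a write intervened: fall back
            stamp = lock.readLock();             // to a real read lock
            try {
                curX = x;
                curY = y;
            } finally {
                lock.unlockRead(stamp);
            }
        }
        return Math.sqrt(curX * curX + curY * curY);
    }
}
```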

CompletableFuture<T> is a nice implementation of the Future interface that provides a ton of methods for performing (and chaining together) asynchronous tasks. It relies on functional interfaces heavily; lambdas are a big reason this class was worth adding. If you are currently using Guava's Future utilities, such as Futures, ListenableFuture, and SettableFuture, you may want to check out CompletableFuture as a potential replacement.
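A small taste of the chaining style, as a sketch with arbitrary values and an invented class name. Two tasks run asynchronously, their results are combined, and the combined value is transformed.

```java
import java.util.concurrent.CompletableFuture;

class CompletableFutureDemo {
    static String compute() {
        CompletableFuture<Integer> price = CompletableFuture.supplyAsync(() -> 40);
        CompletableFuture<Integer> shipping = CompletableFuture.supplyAsync(() -> 2);

        return price.thenCombine(shipping, Integer::sum)   // runs once both complete
                    .thenApply(total -> "Total: " + total)
                    .join();                               // block for the result
    }
}
```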

IO/NIO API additions

Most of these additions give you ways to obtain java.util.stream.Stream from files and InputStreams. They're a bit different from the streams you obtain from regular collections though. For one, they may throw UncheckedIOException. Also, they are instances of streams where using the stream.close() method is necessary. Streams implement AutoCloseable and can therefore be used in try-with-resources statements. Streams also have an onClose(Runnable) intermediate operation that I didn't list in the earlier section about streams. It allows you to attach handlers to a stream that execute when it is closed. Here is an example:

// Print the lines in a file, then "done"
try (Stream<String> lines = Files.lines(path, UTF_8)) {
    lines.onClose(() -> System.out.println("done"))
         .forEach(System.out::println);
}

Reflection and annotation changes

Annotations are allowed in more places, e.g. List<@Nullable String>. The biggest impact of this is likely to be for static analysis tools such as Sonar and FindBugs.

This JSR 308 website does a better job of explaining the motivation for these changes than I could possibly do: "Type Annotations (JSR 308) and the Checker Framework"

Nashorn JavaScript Engine

Summary of proposal: JEP 174: Nashorn JavaScript Engine

I did not experiment with Nashorn so I know very little beyond what's described in the proposal above. Short version: It's the successor to Rhino. Rhino is old and a little bit slow, and the developers decided they'd be better off starting from scratch.

Other miscellaneous additions to java.lang, java.util, and elsewhere

There is too much there to talk about, but I'll pick out a few noteworthy items.

ThreadLocal.withInitial(Supplier<T>) makes declaring thread-local variables with initial values much nicer. Previously you would supply an initial value like this:

ThreadLocal<List<String>> strings =
    new ThreadLocal<List<String>>() {
        @Override
        protected List<String> initialValue() {
             return new ArrayList<>();
        }
    };

Now it's like this:

ThreadLocal<List<String>> strings =
    ThreadLocal.withInitial(ArrayList::new);

Optional<T> appears in the stream API as the return value for methods like min/max, findFirst/Any, and some forms of reduce. It's used because there might not be any elements in the stream, and it provides a fluent API for handling the "some result" versus "no result" cases. You can provide a default value, throw an exception, or execute some action only if the result exists.
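Here is a sketch of the "default value" and "throw an exception" cases, using max as the source of the Optional. The class and method names are invented.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

class OptionalDemo {
    static String longestOrDefault(List<String> words) {
        Optional<String> longest =
            words.stream().max(Comparator.comparingInt(String::length));
        return longest.orElse("(none)");   // default when the stream was empty
    }

    static String longestOrThrow(List<String> words) {
        return words.stream()
                    .max(Comparator.comparingInt(String::length))
                    .orElseThrow(() -> new IllegalArgumentException("no words"));
    }
}
```

The third case, running an action only when a result exists, is covered by ifPresent(Consumer).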

It's very, very similar to Guava's Optional class. It's nothing at all like Option in Scala, nor is it trying to be, and the name similarity there is purely coincidental.

Aside: it's interesting that Java 8's Optional and Guava's Optional ended up being so similar, despite the absurd amount of debate that occurred over its addition to both libraries.

"FYI.... Optional was the cause of possibly the single greatest conflagration on the internal Java libraries discussion lists ever."

Kevin Bourrillion in response to "Some new Guava classes targeted for release 10"

"On a purely practical note, the discussions surrounding Optional have exceeded its design budget by several orders of magnitude."

Brian Goetz in response to "Optional require(s) NonNull"

StringJoiner and String.join(...) are long, long overdue. They are so long overdue that the vast majority of Java developers likely have already written or have found utilities for joining strings, but it is nice for the JDK to finally provide this itself. Everyone has encountered situations where joining strings is required, and it is a Good Thing™ that we can now express that through a standard API that every Java developer (eventually) will know.
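For reference, here is what the two look like in use. A minimal sketch with an invented class name:

```java
import java.util.Arrays;
import java.util.StringJoiner;

class JoinDemo {
    static String simple() {
        return String.join(", ", Arrays.asList("a", "b", "c"));   // "a, b, c"
    }

    static String bracketed() {
        // StringJoiner also supports a prefix and a suffix.
        StringJoiner joiner = new StringJoiner(", ", "[", "]");
        joiner.add("a").add("b").add("c");
        return joiner.toString();                                  // "[a, b, c]"
    }
}
```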

Comparator provides some very nice new methods for doing chained comparisons and field-based comparisons. For example:

people.sort(
    Comparator.comparing(Person::getLastName)
        .thenComparing(Person::getFirstName)
        .thenComparing(
            Person::getEmailAddress,
            Comparator.nullsLast(CASE_INSENSITIVE_ORDER)));

These additions provide good, readable shorthand for complex sorts. Many of the use cases served by Guava's ComparisonChain and Ordering utility classes are now served by these JDK additions. And for what it's worth, I think the JDK versions read better than the functionally-equivalent versions expressed in Guava-ese.

More?

There are lots of various small bug fixes and performance improvements that were not covered in this post. But they are appreciated too!

This post was intended to cover every single language-level and API-level change coming in Java 8. If any were missed, it was an error that should be corrected. Please let me know if you discover an omission. You can contact me via e-mail.