|
| 1 | +--- |
| 2 | +layout: blog-detail |
| 3 | +post-type: blog |
| 4 | +by: Julien Richard-Foy |
| 5 | +title: On Performance of the New Collections |
| 6 | +--- |
| 7 | + |
| 8 | +In a [previous blog post](/blog/2017/11/28/view-based-collections.html), I explained |
| 9 | +how [Scala 2.13’s new collections](http://www.scala-lang.org/blog/2017/02/28/collections-rework.html) |
| 10 | +have been designed so that the default implementations of transformation operations work |
| 11 | +with both strict and non-strict types of collections. In essence, we abstract over |
| 12 | +the evaluation mode (strict or non-strict) of concrete collection types. |
| 13 | + |
| 14 | +After we published that blog post, the community |
| 15 | +[raised concerns](https://www.reddit.com/r/scala/comments/7g52cy/let_them_be_lazy/dqgol36/) |
| 16 | +about possible performance implications of having more levels of abstraction than before. |
| 17 | + |
| 18 | +This blog article gives more information about the overhead of the |
| 19 | +collections’ view-based design and our solution to remove that |
| 20 | +overhead. |
| 21 | + |
| 22 | +For reference, the source code of the new collections is available in |
| 23 | +[this GitHub repository](https://github.com/scala/collection-strawman). |
| 24 | + |
| 25 | +## Overhead Of View-Based Implementations |
| 26 | + |
| 27 | +Let’s be clear: the view-based implementations are in general slower than their |
| 28 | +builder-based versions. How much slower exactly varies with the type of collection |
| 29 | +(e.g. `List`, `Vector`, `Set`), the operation (e.g. `map`, `flatMap`, `filter`) |
| 30 | +and the number of elements in the collection. In my benchmark on `Vector`, on |
| 31 | +the `map`, `filter` and `flatMap` operations, with sizes ranging from 1 element to |
| 32 | +7 million elements, I measured an average slowdown of 25%. |
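
To make the two strategies concrete, here is a minimal sketch of `map` written both ways for `Vector`. It assumes the standard library API as released in Scala 2.13 (`view`, `to`, `Vector.newBuilder`), which differs slightly from the strawman code discussed in this article; `MapStrategies`, `mapViaView` and `mapViaBuilder` are hypothetical names used only for illustration.

~~~ scala
object MapStrategies {

  // View-based (assumed 2.13 API): describe the mapping lazily,
  // then copy the resulting view into a new Vector.
  def mapViaView[A, B](xs: Vector[A])(f: A => B): Vector[B] =
    xs.view.map(f).to(Vector)

  // Builder-based: apply `f` to each element and push the result
  // directly into a builder, without any intermediate view.
  def mapViaBuilder[A, B](xs: Vector[A])(f: A => B): Vector[B] = {
    val b = Vector.newBuilder[B]
    val it = xs.iterator
    while (it.hasNext) {
      b += f(it.next())
    }
    b.result()
  }

}
~~~

Both functions return the same `Vector`. The view-based one goes through an extra layer (the intermediate view and its iterator), which is the kind of indirection the slowdown above is attributed to.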
| 33 | + |
| 34 | +## How To Fix That Performance Regression? |
| 35 | + |
| 36 | +Our solution is simply to go back to builder-based implementations for strict collections: we |
| 37 | +override the default view-based implementations with more efficient builder-based |
| 38 | +ones. We actually end up with the same implementations as in the old collections. |
| 39 | + |
| 40 | +In practice these implementations are factored out in traits that can be mixed |
| 41 | +into concrete collection types. Such trait names are always prefixed with |
| 42 | +`StrictOptimized`. For instance, here is an excerpt of the `StrictOptimizedIterableOps` |
| 43 | +trait: |
| 44 | + |
| 45 | +~~~ scala |
| 46 | +trait StrictOptimizedIterableOps[+A, +CC[_], +C] extends IterableOps[A, CC, C] { |
| 47 | + |
| 48 | + override def map[B](f: A => B): CC[B] = { |
| 49 | + val b = iterableFactory.newBuilder[B]() |
| 50 | + val it = iterator() |
| 51 | + while (it.hasNext) { |
| 52 | + b += f(it.next()) |
| 53 | + } |
| 54 | + b.result() |
| 55 | + } |
| 56 | + |
| 57 | +} |
| 58 | +~~~ |
| 59 | + |
| 60 | +Then, to implement the `Vector` collection, we just mix in such a trait: |
| 61 | + |
| 62 | +~~~ scala |
| 63 | +trait Vector[+A] extends IndexedSeq[A] |
| 64 | + with IndexedSeqOps[A, Vector, Vector[A]] |
| 65 | + with StrictOptimizedSeqOps[A, Vector, Vector[A]] |
| 66 | +~~~ |
| 67 | + |
| 68 | +Here we use `StrictOptimizedSeqOps`, which is a specialization of `StrictOptimizedIterableOps` |
| 69 | +for `Seq` collections. |
| 70 | + |
| 71 | +## Is The View-Based Design Worth It? |
| 72 | + |
| 73 | +In my previous article I explained that a drawback of the old builder-based design was that, |
| 74 | +on non-strict collections (e.g. `Stream` or `View`), we had to carefully override all the |
| 75 | +default implementations of transformation operations to make them non-strict. |
| 76 | + |
| 77 | +Now it seems that the situation is just reversed: the default implementations work well |
| 78 | +with non-strict collections, but we have to override them in strict collections. |
| 79 | + |
| 80 | +So, is the new design worth it? To answer this question I will quote a comment posted |
| 81 | +by Stefan Zeiger [here](https://www.reddit.com/r/scala/comments/7g52cy/let_them_be_lazy/dqixt8d/): |
| 82 | + |
| 83 | +> The lazy-by-default approach is mostly beneficial when you're implementing lazy |
| 84 | +> collections because you don't have to override pretty much everything or get |
| 85 | +> incorrect semantics. The reverse risk is smaller: If you don't override a lazy |
| 86 | +> implementation for a strict collection type you only suffer a small performance |
| 87 | +> impact but it's still correct. |
| 88 | +
|
| 89 | +In short: implementations are **correct first** in the new design but you might want to |
| 90 | +override them for performance reasons on strict collections. |
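
As a small illustration of what "correct semantics" means for a non-strict collection, the sketch below counts how many times the mapping function is called. It assumes the `View` and `List` types of the released Scala 2.13 standard library; `LazinessCheck` and its counters are hypothetical names.

~~~ scala
object LazinessCheck {
  // Counts calls made by a transformation on a non-strict collection.
  var viewCalls = 0
  // Counts calls made by the same transformation on a strict collection.
  var listCalls = 0

  val source = List(1, 2, 3)

  // Mapping a view must not evaluate anything yet.
  val mappedView = source.view.map { x => viewCalls += 1; x * 2 }
  val viewCallsBeforeForcing = viewCalls

  // Mapping a strict List evaluates every element immediately.
  val mappedList = source.map { x => listCalls += 1; x * 2 }
  val listCallsAfterMap = listCalls

  // Forcing the view finally applies the function to each element.
  val forced = mappedView.toList
}
~~~

If a non-strict collection inherited a builder-based `map`, `viewCallsBeforeForcing` would no longer be zero: that is the incorrect semantics the quote refers to. A strict collection that inherits the view-based `map` still produces the right elements, only more slowly.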
| 91 | + |
| 92 | +## Performance Comparison With 2.12’s Collections |
| 93 | + |
| 94 | +Speaking of performance, how fast are the new collections compared to the old ones? |
| 95 | + |
| 96 | +Again, the answer depends on the type of collection, the operation and the number of elements. |
| 97 | +My `Vector` benchmarks show a 20% speedup on average: |
| 98 | + |
| 99 | +*(Charts: execution time of the `filter`, `map` and `flatMap` operations on `Vector`, plotted against the number of elements.)* |
| 104 | + |
| 105 | +These charts show the execution time (vertically) of the `filter`, `map` and `flatMap` |
| 106 | +operations as a function of the number of elements (horizontally). Note that both |
| 107 | +axes use a logarithmic scale. The blue line shows the performance of the old `Vector`, |
| 108 | +the green line shows the performance of the new `Vector` if it used only view-based |
| 109 | +implementations, and the red line shows the actual performance of the new `Vector` |
| 110 | +(with strict optimized implementations). Benchmark source code and numbers can be found |
| 111 | +[here](https://gist.github.com/julienrf/f1cb2b062cd9783a35e2f35778959c76). |
| 112 | + |
| 113 | +Since operation implementations end up being the same, why do we get better performance |
| 114 | +at all? Well, these numbers are specific to `Vector`, and are due to the fact that |
| 115 | +we inlined a few critical methods more aggressively. I don’t expect the new collections |
| 116 | +to be *always* 20% faster than the old collections. However, there is no reason for |
| 117 | +them to be slower since the execution path, when calling an operation, can be made |
| 118 | +exactly the same as in the old collections. |
| 119 | + |
| 120 | +## Conclusion |
| 121 | + |
| 122 | +This article studied the performance of the new collections. I’ve reported that, in my `Vector` |
| 123 | +benchmarks, view-based operation implementations are about 25% slower than builder-based ones, |
| 124 | +and I’ve explained how we restored builder-based implementations on strict collections. |
| 125 | + |
| 126 | +I expect the new collections to be as fast as, or slightly faster than, the previous collections. |
| 127 | +Indeed, we took advantage of the rewrite to apply some more optimizations here and |
| 128 | +there. |
| 129 | + |
| 130 | +More significant performance improvements can be achieved by using different |
| 131 | +data structures. For instance, we recently |
| 132 | +[merged](https://github.com/scala/collection-strawman/pull/342) |
| 133 | +a completely new implementation of immutable `Set` and `Map` based on [compressed |
| 134 | +hash-array mapped prefix-trees](https://michael.steindorfer.name/publications/oopsla15.pdf). |
| 135 | +This data structure has a smaller memory footprint than the old `HashSet` and `HashMap`, |
| 136 | +and some operations are an order of magnitude faster. |