Commit 2d86bdc

Add blog article about the new collections performance
1 parent ea6565e commit 2d86bdc

4 files changed: 136 additions, 0 deletions
@@ -0,0 +1,136 @@
---
layout: blog-detail
post-type: blog
by: Julien Richard-Foy
title: On Performance of the New Collections
---

In a [previous blog post](/blog/2017/11/28/view-based-collections.html), I explained
how [Scala 2.13’s new collections](http://www.scala-lang.org/blog/2017/02/28/collections-rework.html)
have been designed so that the default implementations of transformation operations work
with both strict and non-strict types of collections. In essence, we abstract over
the evaluation mode (strict or non-strict) of concrete collection types.

After we published that blog post, the community
[raised concerns](https://www.reddit.com/r/scala/comments/7g52cy/let_them_be_lazy/dqgol36/)
about possible performance implications of having more levels of abstraction than before.

This blog article gives more information about the overhead of the
collections’ view-based design and our solution to remove that
overhead.

For reference, the source code of the new collections is available in
[this GitHub repository](https://github.com/scala/collection-strawman).

## Overhead Of View-Based Implementations

Let’s be clear: the view-based implementations are in general slower than their
builder-based versions. How much slower exactly varies with the type of collection
(e.g. `List`, `Vector`, `Set`), the operation (e.g. `map`, `flatMap`, `filter`)
and the number of elements in the collection. In my benchmark on `Vector`, covering
the `map`, `filter` and `flatMap` operations with sizes ranging from 1 element to
7 million elements, I measured an average slowdown of 25%.
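
To make the two strategies concrete, here is a minimal sketch written against the released Scala 2.13 standard library rather than the strawman code. The names `MapStrategies`, `mapViaView` and `mapViaBuilder` are hypothetical, and this is not the benchmark code: it only illustrates the extra view layer that a view-based `map` goes through before the strict result is built.

```scala
object MapStrategies {

  // View-based strategy: describe the transformation as a lazy view,
  // then build the strict result from that view.
  def mapViaView[A, B](xs: Vector[A])(f: A => B): Vector[B] =
    Vector.from(xs.view.map(f))

  // Builder-based strategy: iterate once and push each transformed
  // element straight into a builder.
  def mapViaBuilder[A, B](xs: Vector[A])(f: A => B): Vector[B] = {
    val b = Vector.newBuilder[B]
    val it = xs.iterator
    while (it.hasNext) {
      b += f(it.next())
    }
    b.result()
  }
}
```

Both functions return the same `Vector`; only the path taken to build it differs.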

## How To Fix That Performance Regression?

Our solution is simply to go back to builder-based implementations for strict collections: we
override the default view-based implementations with more efficient builder-based
ones. We actually end up with the same implementations as in the old collections.

In practice these implementations are factored out in traits that can be mixed
into concrete collection types. Such trait names are always prefixed with
`StrictOptimized`. For instance, here is an excerpt of the `StrictOptimizedIterableOps`
trait:

~~~ scala
trait StrictOptimizedIterableOps[+A, +CC[_], +C] extends IterableOps[A, CC, C] {

  // Builder-based override of the default view-based `map`
  override def map[B](f: A => B): CC[B] = {
    val b = iterableFactory.newBuilder[B]()
    val it = iterator()
    // Apply `f` to each element and append the result to the builder
    while (it.hasNext) {
      b += f(it.next())
    }
    b.result()
  }

}
~~~

Then, to implement the `Vector` collection, we just mix in such a trait:

~~~ scala
trait Vector[+A] extends IndexedSeq[A]
  with IndexedSeqOps[A, Vector, Vector[A]]
  with StrictOptimizedSeqOps[A, Vector, Vector[A]]
~~~

Here we use `StrictOptimizedSeqOps`, which is a specialization of `StrictOptimizedIterableOps`
for `Seq` collections.
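
The mix-in pattern can also be illustrated with a toy model that does not depend on the strawman code. Everything in this sketch (`MixinSketch`, `MiniOps`, `StrictOptimizedMiniOps`, `MiniSeq`, `StrictMiniSeq`) is hypothetical and far simpler than the real traits. It only assumes the Scala 2.13 standard library for `View.fromIteratorProvider`, `Vector.from` and `Vector.newBuilder`.

```scala
import scala.collection.View

object MixinSketch {

  // Default operations, defined once in terms of a lazy view.
  trait MiniOps[+A] {
    def iterator: Iterator[A]

    def map[B](f: A => B): MiniSeq[B] =
      MiniSeq.from(View.fromIteratorProvider(() => iterator).map(f))
  }

  // Strict optimization, factored out in a trait that can be mixed in.
  trait StrictOptimizedMiniOps[+A] extends MiniOps[A] {
    override def map[B](f: A => B): MiniSeq[B] = {
      val b = Vector.newBuilder[B]
      val it = iterator
      while (it.hasNext) {
        b += f(it.next())
      }
      new StrictMiniSeq(b.result())
    }
  }

  // A collection that relies only on the default implementations.
  class MiniSeq[+A](val elems: Vector[A]) extends MiniOps[A] {
    def iterator: Iterator[A] = elems.iterator
  }

  // The same collection with the builder-based override mixed in.
  class StrictMiniSeq[+A](underlying: Vector[A])
    extends MiniSeq[A](underlying)
    with StrictOptimizedMiniOps[A]

  object MiniSeq {
    def from[A](source: IterableOnce[A]): MiniSeq[A] =
      new MiniSeq(Vector.from(source))
  }
}
```

In this toy model, `new MixinSketch.StrictMiniSeq(Vector(1, 2, 3)).map(_ * 2)` goes through the builder-based override, while the same call on a plain `MiniSeq` goes through the view-based default. Both produce the same elements.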

## Is The View-Based Design Worth It?

In my previous article I explained that a drawback of the old builder-based design was that,
on non-strict collections (e.g. `Stream` or `View`), we had to carefully override all the
default implementations of transformation operations to make them non-strict.

Now it seems that the situation is just reversed: the default implementations work well
with non-strict collections, but we have to override them in strict collections.

So, is the new design worth it? To answer this question I will quote a comment posted
by Stefan Zeiger [here](https://www.reddit.com/r/scala/comments/7g52cy/let_them_be_lazy/dqixt8d/):

> The lazy-by-default approach is mostly beneficial when you're implementing lazy
> collections because you don't have to override pretty much everything or get
> incorrect semantics. The reverse risk is smaller: If you don't override a lazy
> implementation for a strict collection type you only suffer a small performance
> impact but it's still correct.

In short: implementations are **correct first** in the new design but you might want to
override them for performance reasons on strict collections.
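
The difference in semantics can be observed by counting how often the mapping function runs. The sketch below assumes the released Scala 2.13 standard library, where `LazyList` is the lazy sequence type. The names `EvaluationModes`, `lazyMap` and `strictMap` are hypothetical.

```scala
object EvaluationModes {

  // Maps over a lazy collection and reports how many times the function
  // had run by the time `map` returned, together with the mapped collection.
  def lazyMap(): (Int, LazyList[Int]) = {
    var calls = 0
    val mapped = LazyList(1, 2, 3).map { x => calls += 1; x * 10 }
    (calls, mapped) // `calls` is read before any element is forced
  }

  // Same measurement on a strict collection.
  def strictMap(): (Int, Vector[Int]) = {
    var calls = 0
    val mapped = Vector(1, 2, 3).map { x => calls += 1; x * 10 }
    (calls, mapped) // every element has already been transformed
  }
}
```

On the lazy collection the function has not run at all when `map` returns, whereas on `Vector` it has run once per element. A strict implementation applied to a lazy collection would break the first property; a lazy implementation applied to a strict collection only costs some performance.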

## Performance Comparison With 2.12’s Collections

Speaking of performance, how do the new collections compare to the old ones?

Again, the answer depends on the type of collection, the operations and the number of elements.
My `Vector` benchmarks show a 20% speedup on average:

![](/resources/img/new-collections-performance-filter.png)

![](/resources/img/new-collections-performance-map.png)

![](/resources/img/new-collections-performance-flatMap.png)

These charts show the execution time (vertically) of the `filter`, `map` and `flatMap`
operations, according to the number of elements (horizontally). Note that scales are
logarithmic on both axes. The blue line shows the performance of the old `Vector`,
the green line shows the performance of the new `Vector` if it used only view-based
implementations, and the red line shows the actual performance of the new `Vector`
(with strict optimized implementations). Benchmark source code and numbers can be found
[here](https://gist.github.com/julienrf/f1cb2b062cd9783a35e2f35778959c76).

Since operation implementations end up being the same, why do we get better performance
at all? Well, these numbers are specific to `Vector`, and are due to the fact that
we more aggressively inlined a few critical methods. I don’t expect the new collections
to be *always* 20% faster than the old collections. However, there is no reason for
them to be slower since the execution path, when calling an operation, can be made
exactly the same as in the old collections.

## Conclusion

This article studied the performance of the new collections. I’ve reported that, in my
`Vector` benchmarks, view-based operation implementations are about 25% slower than
builder-based implementations, and I’ve explained how we restored builder-based
implementations on strict collections.

I expect the new collections to be as fast as, or slightly faster than, the previous collections.
Indeed, we took advantage of the rewrite to apply some more optimizations here and
there.

More significant performance improvements can be achieved by using different
data structures. For instance, we recently
[merged](https://github.com/scala/collection-strawman/pull/342)
a completely new implementation of immutable `Set` and `Map` based on [compressed
hash-array mapped prefix-trees](https://michael.steindorfer.name/publications/oopsla15.pdf).
This data structure has a smaller memory footprint than the old `HashSet` and `HashMap`,
and some operations are an order of magnitude faster.