---
layout: blog-detail
post-type: blog
by: Julien Richard-Foy
title: On Performance of the New Collections
---

In a [previous blog post](/blog/2017/11/28/view-based-collections.html), I explained
how [Scala 2.13’s new collections](http://www.scala-lang.org/blog/2017/02/28/collections-rework.html)
have been designed so that the default implementations of transformation operations work
with both strict and non-strict types of collections. In essence, we abstract over
the evaluation mode (strict or non-strict) of concrete collection types.
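To make this idea more concrete, here is a minimal, self-contained sketch of what a
view-based default implementation can look like. It is not the actual code of the new
collections (the names `MiniOps` and `fromLazySource` are made up for the illustration);
it only shows how a generic `map` can avoid committing to an evaluation mode:

~~~ scala
// Illustrative sketch only — not the real IterableOps hierarchy
trait MiniOps[+A, +CC[_]] {

  def iterator(): Iterator[A]

  // Each concrete collection decides how to materialize a lazy source of
  // elements: a strict collection copies them right away, a non-strict one
  // simply keeps the source around until it is traversed
  def fromLazySource[B](source: () => Iterator[B]): CC[B]

  // Default implementation of `map`: no builder, no evaluation, just a
  // description of the transformation, which is correct for both strict
  // and non-strict collections
  def map[B](f: A => B): CC[B] =
    fromLazySource(() => iterator().map(f))
}
~~~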

After we published that blog post, the community
[raised concerns](https://www.reddit.com/r/scala/comments/7g52cy/let_them_be_lazy/dqgol36/)
about possible performance implications of having more levels of abstraction than before.

This blog article gives more information about the overhead of the
collections’ view-based design and our solution to remove that
overhead. The short version, developed in the rest of this post: the view-based
defaults are correct first, strict collections override them with builder-based
implementations, and in the benchmarks below the new collections end up as fast as
or slightly faster than the old ones.

For reference, the source code of the new collections is available in
[this GitHub repository](https://github.com/scala/collection-strawman).

## Overhead Of View-Based Implementations

Let’s be clear: the view-based implementations are in general slower than their
builder-based versions. How much slower exactly varies with the type of collection
(e.g. `List`, `Vector`, `Set`), the operation (e.g. `map`, `flatMap`, `filter`)
and the number of elements in the collection. In my benchmarks on `Vector`, for
the `map`, `filter` and `flatMap` operations with 1 to 7 million elements, I measured
an average slowdown of 25%.
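Benchmarks of this kind are typically written with JMH (through the sbt-jmh plugin);
the snippet below is only a simplified, hypothetical sketch of such a benchmark, written
against the familiar `Vector` API so that it is self-contained, not the exact code
behind the numbers above:

~~~ scala
// Hypothetical benchmark sketch, assuming the sbt-jmh plugin is available
import java.util.concurrent.TimeUnit

import org.openjdk.jmh.annotations._
import org.openjdk.jmh.infra.Blackhole

@State(Scope.Benchmark)
@BenchmarkMode(Array(Mode.AverageTime))
@OutputTimeUnit(TimeUnit.MILLISECONDS)
class VectorBenchmark {

  // Collection sizes in the range discussed above
  @Param(Array("1000000", "7000000"))
  var size: Int = _

  var xs: Vector[Int] = _

  @Setup(Level.Trial)
  def setup(): Unit = xs = Vector.tabulate(size)(i => i)

  @Benchmark
  def map(bh: Blackhole): Unit = bh.consume(xs.map(_ + 1))

  @Benchmark
  def filter(bh: Blackhole): Unit = bh.consume(xs.filter(_ % 2 == 0))
}
~~~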

## How To Fix That Performance Regression?

Our solution is simply to go back to builder-based implementations for strict collections: we
override the default view-based implementations with more efficient builder-based
ones. We actually end up with the same implementations as in the old collections.

In practice, these implementations are factored out into traits that can be mixed
into concrete collection types. The names of such traits are always prefixed with
`StrictOptimized`. For instance, here is an excerpt of the `StrictOptimizedIterableOps`
trait:

~~~ scala
trait StrictOptimizedIterableOps[+A, +CC[_], +C] extends IterableOps[A, CC, C] {

  // Eager, builder-based override of the default (view-based) `map`
  override def map[B](f: A => B): CC[B] = {
    val b = iterableFactory.newBuilder[B]()
    val it = iterator()
    while (it.hasNext) {
      b += f(it.next())
    }
    b.result()
  }

}
~~~

Then, to implement the `Vector` collection, we just mix in the above trait:

~~~ scala
trait Vector[+A] extends IndexedSeq[A]
  with IndexedSeqOps[A, Vector, Vector[A]]
  with StrictOptimizedSeqOps[A, Vector, Vector[A]]
~~~

Here we use `StrictOptimizedSeqOps`, which is a specialization of `StrictOptimizedIterableOps`
for `Seq` collections.
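From the user’s point of view, nothing changes for strict collections: transforming a
`Vector` still eagerly produces a `Vector`, through the overridden, builder-based `map`
shown above (the example below uses the familiar `Vector` syntax, which is unchanged):

~~~ scala
// Evaluated immediately, exactly as with the old collections
val squares: Vector[Int] = Vector(1, 2, 3).map(x => x * x)  // Vector(1, 4, 9)
~~~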

## Is The View-Based Design Worth It?

In my previous article, I explained a drawback of the old builder-based design.
On non-strict collections (e.g. `Stream` or `View`), we had to carefully override all the
default implementations of transformation operations to make them non-strict.

Now it seems that the situation is just reversed: the default implementations work well
with non-strict collections, but we have to override them in strict collections.

So, is the new design worth it? To answer this question, I will quote a comment posted
by Stefan Zeiger [here](https://www.reddit.com/r/scala/comments/7g52cy/let_them_be_lazy/dqixt8d/):

> The lazy-by-default approach is mostly beneficial when you're implementing lazy
> collections because you don't have to override pretty much everything or get
> incorrect semantics. The reverse risk is smaller: If you don't override a lazy
> implementation for a strict collection type you only suffer a small performance
> impact but it's still correct.

In short, implementations are **correct first** in the new design, but you might want to
override them for performance reasons on strict collections.
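For example, a lazy collection such as `View` gets correct, non-evaluating transformation
operations for free from the defaults. The following hypothetical snippet illustrates this
(it is written against the standard `scala.collection.View` API; the strawman at the time
uses slightly different names, such as `iterator()`):

~~~ scala
import scala.collection.View

var evaluated = 0
val view = View(1, 2, 3).map { x => evaluated += 1; x * 2 }
assert(evaluated == 0)      // the default, view-based `map` has evaluated nothing yet
val forced = view.toList    // traversing the view finally applies the function
assert(forced == List(2, 4, 6))
assert(evaluated == 3)
~~~

With the old builder-based defaults, getting this behaviour required overriding every
transformation operation on every lazy collection; forgetting one silently produced a
strict, and therefore semantically incorrect, result.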

## Performance Comparison With 2.12’s Collections

Speaking of performance, how do the new collections compare to the old ones?

Again, the answer depends on the type of collection, the operations and the number of elements.
My `Vector` benchmarks show a 35% speedup on average:

![Benchmark of Vector.filter](/resources/img/new-collections-performance-filter.png)

![Benchmark of Vector.map](/resources/img/new-collections-performance-map.png)

![Benchmark of Vector.flatMap](/resources/img/new-collections-performance-flatMap.png)

These charts show the execution time (vertically) of the `filter`, `map` and `flatMap`
operations, according to the number of elements (horizontally). Note that scales are
logarithmic on both axes. The blue line shows the performance of the old `Vector`,
the green line shows the performance of the new `Vector` if it used only view-based
implementations, and the red line shows the actual performance of the new `Vector`
(with strict optimized implementations). Benchmark source code and numbers can be found
[here](https://gist.github.com/julienrf/f1cb2b062cd9783a35e2f35778959c76).

Since operation implementations end up being the same, why do we get better performance
at all? Well, these numbers are specific to `Vector` and the tested operations; they
are due to the fact that we inlined a few critical methods more aggressively.
I don’t expect the new collections to be *always* faster than the old collections.
However, there is no reason for them to be slower, since the execution path, when
calling an operation, can be made the same as in the old collections.

## Conclusion

This article studied the performance of the new collections. I’ve reported that view-based
operation implementations are about 25% slower than builder-based implementations,
and I’ve explained how we restored builder-based implementations on strict collections.
Last but not least, I’ve shown that defaulting to the view-based implementations does
make sense for the sake of correctness: implementations are **correct first**, and strict
collections only override them for performance.

I expect the new collections to be equally fast or slightly faster than the previous collections.
Indeed, we took advantage of the rewrite to apply some more optimizations here and
there.

More significant performance improvements can be achieved by using different
data structures. For instance, we recently
[merged](https://github.com/scala/collection-strawman/pull/342)
a completely new implementation of immutable `Set` and `Map` based on [compressed
hash-array mapped prefix-trees](https://michael.steindorfer.name/publications/oopsla15.pdf).
This data structure has a smaller memory footprint than the old `HashSet` and `HashMap`,
and some operations can be an order of magnitude faster (e.g. `==` is up to 7x faster).