Commit 596bea4

WIP blog article about the new collections
1 parent 210dd53 commit 596bea4

7 files changed, 199 additions and 0 deletions

---
layout: blog-detail
post-type: blog
by: Julien Richard-Foy
title: Scala 2.13’s Collections
---

One more article about the standard collections, really? Indeed: over the last
18 months a lot of work has gone into the collections, and we’ve published
several blog articles and given several talks to explain the various changes and
the challenges we were facing. This article attempts to summarize **what is going
to change from an end-user perspective**.

If you’ve thoroughly followed our previous blog posts and talks, you might
not learn much from this article. Otherwise, this is the perfect opportunity
to catch up on the topic in a few minutes!

The next section presents the internal changes in the collections implementation
that might have some visible impact on the surface. Then, I will show why I think
that the removal of `CanBuildFrom` makes the API more beginner-friendly. Next, I
will introduce some new operations available in the collections. Finally, I
will mention the main deprecations, the motivation behind them, and their
recommended replacements.

## Under The Hood: A Cleaner Ground

![iceberg](/resources/img/blog/iceberg.jpeg)

The most important change in the new collections framework is that transformation
operations (such as `map` or `filter`) are now implemented in a way that works with both
strict collections (such as `List`) and non-strict collections (such as `Stream`).
This was not the case before: the previous
implementations were strict and had to be overridden by non-strict collection types.
You can find more details about that in
[this blog post](/blog/2017/11/28/view-based-collections.html).

The good news is that the new design is more **correct**, in the sense that you can
now implement custom non-strict collection types without having to
re-implement a ton of operations. Another benefit is that transformation
operations defined outside of the collections (like in the
[cvogt/scala-extensions](https://github.com/cvogt/scala-extensions) project)
now work with non-strict collections (such as `View` or `Stream`).

Speaking of non-strict collections, the `View` type has been redesigned and
views should behave in a more predictable way. Also, `Stream` has been
deprecated in favor of `LazyList` (see the last section).
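
To make the strict versus non-strict distinction concrete, here is a minimal sketch, assuming Scala 2.13. The `ViewDemo` object and its counter are hypothetical names used only for illustration. It counts how often the function passed to `map` is applied on a `View`.

```scala
object ViewDemo {
  // Counts how many times the mapping function has been applied.
  var applications = 0

  // Mapping over a view only describes the transformation; nothing runs yet.
  val doubled = List(1, 2, 3).view.map { x => applications += 1; x * 2 }
  val afterMap = applications

  // Converting the view to a strict List applies the function to each element.
  val firstRun = doubled.toList
  val afterFirstRun = applications

  // A view does not cache its elements: traversing it again re-applies the function.
  val secondRun = doubled.toList
  val afterSecondRun = applications
}
```

Under these assumptions the counter stays at zero until the view is traversed, and it grows again on every further traversal.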

## Life Without `CanBuildFrom`

I think the most visible change for end-users is that transformation operations
don’t use `CanBuildFrom` anymore. I believe this will be quite visible despite our previous
efforts to *hide* `CanBuildFrom` from the API documentation of the current collections.
Indeed, if you take a look at the
[current `List` API](/api/2.12.6/scala/collection/immutable/List.html), the signature
shown for the `map` operation does not mention `CanBuildFrom`:

![there is no CanBuildFrom](/resources/img/blog/scaladoc-list-map.png)

However, if you use this operation in your code, then your IDE reveals its actual signature:

![what is That](/resources/img/blog/ij-list-map.png)

As you can see, the type signature shown in the API documentation has been “simplified”
to make it more approachable, but I believe that this probably introduces more
confusion for users, especially when you look at the
[`TreeMap[A, B]` API](/api/2.12.6/scala/collection/immutable/TreeMap.html):

![wtf](/resources/img/blog/scaladoc-treemap-map.png)

This type signature makes no sense: the result type cannot be `TreeMap[B]` since
`TreeMap` takes *two* type parameters (the type of keys and the type
of values). Also, the function `f` actually takes a *key-value pair* as its parameter,
not just a key (as incorrectly indicated by the type `A`).

`CanBuildFrom` was used for good reasons. In particular, the type `That` shown
in the above screenshot was *computed* according to the type of the source
collection and the type of elements of the new collection. The case of `TreeMap`
is compelling: if you transform your key-value pairs into other key-value
pairs for which the type of keys has an implicit `Ordering` instance, then `map`
returns a `TreeMap`, but if there is no such `Ordering` instance then the best
collection type that can be returned is `Map`. And if you transform the key-value
pairs into something that is not even a pair, then the best collection type
that can be returned is `Iterable`. These three cases were supported by
a single operation implementation, and `CanBuildFrom` was used to abstract over
the various possible return types.

In the new collections we wanted to have simpler type signatures, so that we
can show their actual form in the API documentation and so that the auto-completion
provided by IDEs is not scary. We achieved that by using overloading, as
explained in more detail in
[this blog article](/blog/2017/05/30/tribulations-canbuildfrom.html).

In practice, this means that the new `TreeMap` has three overloads of the
`map` operation:

![](/resources/img/blog/scaladoc-new-treemap-map.png)

These type signatures are the actual ones, and they essentially translate
“into types” what I’ve written above about the possible result types of `map`
according to the type of elements returned by the transformation function `f`.
I believe that the new API is simpler to understand.
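
As an illustration, here is a small sketch of two of these cases, assuming Scala 2.13. The `TreeMapMapDemo` object and its sample data are hypothetical. The type annotations are there only to make the statically selected result type visible.

```scala
import scala.collection.immutable.TreeMap

object TreeMapMapDemo {
  val ages = TreeMap("alice" -> 30, "bob" -> 25)

  // The function returns pairs whose key type (Int) has an implicit Ordering:
  // the result is still a TreeMap, now sorted by the new keys.
  val swapped: TreeMap[Int, String] =
    ages.map { case (name, age) => (age, name) }

  // The function returns something that is not a pair:
  // the best type that can be returned is Iterable.
  val labels: Iterable[String] =
    ages.map { case (name, age) => s"$name is $age" }
}
```

Neither call mentions an implicit builder: the result type is determined by which overload of `map` applies.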

## New And Noteworthy

We have introduced a few new operations. The following sections
present some of them.

### `groupMap`

A common pattern with the current collections is to use `groupBy`
followed by `mapValues` to transform the groups. For instance,
this is how we can index the names of a collection of users by
their age:

~~~ scala
case class User(name: String, age: Int)

def namesByAge(users: Seq[User]): Map[Int, Seq[String]] =
  users.groupBy(_.age).mapValues(users => users.map(_.name))
~~~

There is a subtlety in this code. The static return type is `Map`,
but the `Map` implementation actually returned is lazy and evaluates
its elements each time it is traversed (i.e. the `users => users.map(_.name)`
function is evaluated each time the `Map` is traversed).

In the new collections the return type of `mapValues` is a `MapView` instead
of a `Map`, to clearly indicate that its contents are evaluated each time it
is traversed.
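
The following sketch shows this re-evaluation, assuming Scala 2.13 and going through `view.mapValues`, which is an assumption about the final API rather than something stated above. The `MapValuesDemo` object and its counter are hypothetical.

```scala
object MapValuesDemo {
  // Counts how many times the function passed to mapValues has been applied.
  var calls = 0

  val byAge = Map(30 -> Seq("Alice", "Carol"), 25 -> Seq("Bob"))

  // Building the MapView does not apply the function.
  val counts = byAge.view.mapValues { names => calls += 1; names.size }
  val callsAfterMapValues = calls

  // Each lookup on the view applies the function again.
  val first = counts(30)
  val second = counts(30)
  val callsAfterTwoLookups = calls

  // Copying the view into a strict Map evaluates the values up front;
  // later lookups on the strict Map no longer call the function.
  val strict: Map[Int, Int] = counts.toMap
  val callsAfterToMap = calls
  val third = strict(30)
  val callsAfterStrictLookup = calls
}
```

Calling `toMap` on the view is one way to get back a strict `Map` when the values should be computed only once.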

Furthermore, we have introduced an operation named `groupMap`
that both groups elements and transforms the groups. The above code
can be rewritten as follows to take advantage of `groupMap`:

~~~ scala
def namesByAge(users: Seq[User]): Map[Int, Seq[String]] =
  users.groupMap(_.age)(_.name)
~~~

The returned `Map` is strict: it eagerly evaluates its elements
once. Also, the fact that it is implemented as a single operation
makes it possible to apply some optimizations that make it
~1.3x faster than the version that uses `mapValues`.
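
Here is a short usage sketch of `groupMap`, assuming Scala 2.13. The `GroupMapDemo` wrapper and the sample users are made up for illustration.

```scala
object GroupMapDemo {
  case class User(name: String, age: Int)

  // Groups users by age and keeps only their names, in one pass.
  def namesByAge(users: Seq[User]): Map[Int, Seq[String]] =
    users.groupMap(_.age)(_.name)

  val users = Seq(User("Alice", 30), User("Bob", 25), User("Carol", 30))

  // Names keep their original relative order within each group.
  val result = namesByAge(users)
}
```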

### `InPlace` Transformation Operations

Mutable collections have a few new operations for transforming
their elements in place: instead of returning a new collection (like
`map` and `filter` do), they mutate the source collection. These
operations are suffixed with `InPlace`. For instance, to remove
users whose names start with the letter `J` from a buffer and then
increment the age of the remaining users, one can now write:

~~~ scala
val users = ArrayBuffer(…)
users
  .filterInPlace(user => !user.name.startsWith("J"))
  .mapInPlace(user => user.copy(age = user.age + 1))
~~~
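
A self-contained variant of this example might look as follows, assuming Scala 2.13. The `InPlaceDemo` object and the three sample users are hypothetical and only stand in for the elided buffer contents.

```scala
import scala.collection.mutable.ArrayBuffer

object InPlaceDemo {
  case class User(name: String, age: Int)

  // Hypothetical sample data.
  val users = ArrayBuffer(User("Julien", 30), User("Alice", 25), User("Jane", 41))

  // Both operations mutate `users` and return the buffer itself,
  // which is what makes the calls chainable.
  val same = users
    .filterInPlace(user => !user.name.startsWith("J"))
    .mapInPlace(user => user.copy(age = user.age + 1))
}
```

After the chain has run, `users` itself has changed and `same` refers to the very same buffer; no new collection has been allocated for the result.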

## Deprecations For Less Confusion

A consequence of cleaning and simplifying the collections framework
is that several types or operations have been deprecated.

### `Iterable` Is The Top Collection Type

TBD

### `LazyList` Is Preferred Over `Stream`

`Stream` is deprecated in favor of `LazyList`. As its name suggests,
a `LazyList` is a linked list whose elements are lazily evaluated. An
important semantic difference from `Stream` is that in `LazyList` both
the head and the tail are lazy, whereas in `Stream` only the tail is lazy.
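
As a rough illustration of the lazy head, here is a sketch assuming Scala 2.13. The `LazyListDemo` object and its counter are hypothetical names.

```scala
object LazyListDemo {
  // Counts how many elements have actually been computed.
  var evaluations = 0

  // Mapping a LazyList computes nothing yet, not even the head.
  val doubled = LazyList(1, 2, 3).map { x => evaluations += 1; x * 2 }
  val afterMap = evaluations

  // Asking for the head computes the first element only.
  val head = doubled.head
  val afterHead = evaluations

  // Elements are memoized: the first one is not recomputed here,
  // and the third one is never needed.
  val firstTwo = doubled.take(2).toList
  val afterFirstTwo = evaluations
}
```

With `Stream`, by contrast, the head is strict, so the equivalent `map` call would have applied the function to the first element immediately.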

### Insertion And Removal Operations Are Not Available On Generic Collections

In the current framework, the `scala.collection.Map` type has `+` and `-` operations
to add and remove entries. The semantics of these operations is to return a new collection
with the added or removed entries, without changing the source collection.

These operations are then inherited by the mutable branch of the collections. But the mutable
collection types also introduce their own insertion and removal operations, namely `+=` and `-=`,
which modify the source collection in place. This means that the `scala.collection.mutable.Map` type
has `+` and `+=`, as well as `-` and `-=`.

Having all these operations can be handy in some cases but can also introduce confusion. If you want
to use `+` or `-`, then you probably wanted to use an immutable collection type in the first place…
Another example is the `updated` operation, which is available on mutable `Map` but returns a new
collection instead of modifying the source.

We think that by deprecating these insertion and removal operations on generic collection
types, and by having distinct operations in the `mutable` and `immutable` branches, we make
the situation clearer.
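
The intended split can be sketched as follows, assuming Scala 2.13. The `MapOpsDemo` object and its sample entries are hypothetical.

```scala
import scala.collection.{immutable, mutable}

object MapOpsDemo {
  // Immutable branch: `+` and `-` return a new map and leave the source untouched.
  val base = immutable.Map("a" -> 1)
  val grown = base + ("b" -> 2)
  val shrunk = grown - "a"

  // Mutable branch: `+=` and `-=` modify the map in place.
  val counters = mutable.Map("a" -> 1)
  counters += ("b" -> 2)
  counters -= "a"
}
```

Each branch then has one obvious way to add or remove an entry.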

## Summary

TBD