Commit a7f5fb6

Add blogpost: Tribulations of CanBuildFrom
1 parent d07b589 commit a7f5fb6

2 files changed

+215
-0
lines changed

Lines changed: 193 additions & 0 deletions
@@ -0,0 +1,193 @@
---
layout: blog
post-type: blog
by: Julien Richard-Foy
title: Tribulations of CanBuildFrom
---

[`CanBuildFrom`](/api/2.12.2/scala/collection/generic/CanBuildFrom.html) is probably the most
infamous abstraction of the current collections. It is mainly criticised for producing scary type
signatures.

Our ongoing [collections redesign](https://github.com/scala/collection-strawman) is an opportunity
to try alternative designs. This blog post explains the (many!) problems solved by `CanBuildFrom`
and the alternative solutions implemented in the new collections.

## Transforming the elements of a collection

It’s useful to think of `String` as a collection of `Char` elements: you can then use
the common collection operations like `++`, `find`, etc. on `String` values.

However, the `map` method is challenging because it
transforms the `Char` elements into something that might or might not be a `Char`.
What, then, should be the return type of the `map` method on `String` values? Ideally,
we want to get back a `String` if we transform each `Char` into another `Char`, but we
want to get some `Seq[B]` if we transform each `Char` into a different type `B`. And this
is the way it currently works:

~~~
Welcome to Scala 2.12.2 (OpenJDK 64-Bit Server VM, Java 1.8.0_131).
Type in expressions for evaluation. Or try :help.

scala> "foo".map(c => c.toInt)
res1: scala.collection.immutable.IndexedSeq[Int] = Vector(102, 111, 111)

scala> "foo".map(c => c.toUpper)
res2: String = FOO
~~~

This feature is not limited to the `map` method: `flatMap`, `collect`, `concat` and a few
others work the same way. Moreover, `String` is not the only
collection type that needs this feature: [`BitSet`](/api/2.12.2/index.html?search=bitset)
and [`Map`](/api/2.12.2/index.html?search=map) are other examples.

The current collections rely on `CanBuildFrom` to implement this feature. The `map`
method is defined as follows:

~~~ scala
def map[B, That](f: Char => B)(implicit bf: CanBuildFrom[String, B, That]): That
~~~

When the implicit `CanBuildFrom` parameter is resolved, it fixes the return type `That`.
The resolution is driven by the actual `B` type: if `B` is `Char` then `That` is fixed
to `String`, otherwise it is `immutable.IndexedSeq[B]`.

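To make the mechanism concrete, here is a minimal, self-contained sketch of a return type driven by implicit resolution. It deliberately does not use the real `CanBuildFrom`: the names `CanBuild`, `LowPriorityCanBuild` and `Str` are hypothetical and exist only for this illustration, and the library organises its instances differently.

```scala
// Toy stand-in for CanBuildFrom (hypothetical, not the library's definition).
trait CanBuild[From, Elem, To] {
  def build(elems: Seq[Elem]): To
}

// Fallback instance: it has lower priority because it lives in a parent trait.
trait LowPriorityCanBuild {
  implicit def seqCanBuild[B]: CanBuild[String, B, IndexedSeq[B]] =
    new CanBuild[String, B, IndexedSeq[B]] {
      def build(elems: Seq[B]): IndexedSeq[B] = elems.toIndexedSeq
    }
}

object CanBuild extends LowPriorityCanBuild {
  // Preferred instance when the element type is Char.
  implicit val charCanBuild: CanBuild[String, Char, String] =
    new CanBuild[String, Char, String] {
      def build(elems: Seq[Char]): String = elems.mkString
    }
}

// Hypothetical wrapper playing the role of String.
final class Str(val underlying: String) {
  // `That` is left open here; the implicit instance that gets resolved fixes it.
  def map[B, That](f: Char => B)(implicit cb: CanBuild[String, B, That]): That =
    cb.build(underlying.iterator.map(f).toVector)
}
```

With these definitions, `new Str("foo").map(_.toUpper)` should be typed as `String`, while `new Str("foo").map(_.toInt)` should be typed as `IndexedSeq[Int]`, because only the fallback instance matches when `B` is `Int`.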
The drawback of this solution is that the type signature of the `map` method looks cryptic.

In the new design we solve this problem by defining two overloads of the `map`
method: one that handles `Char` to `Char` transformations, and one that handles other
transformations. The type signatures of these `map` methods are straightforward:

~~~ scala
def map(f: Char => Char): String
def map[B](f: Char => B): Seq[B]
~~~

Then, if you call `map` with a function that returns a `Char`, the first overload is
selected and you get a `String`. Otherwise, the second overload is selected and you
get a `Seq[B]`.

Thus, we got rid of the cryptic method signatures while still supporting the feature
of returning a different type of result according to the type of the transformation function.

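The overload-based approach can be tried out with a small wrapper class. This is only a sketch: `Text` is a hypothetical name chosen for the example, and the method bodies are one possible implementation rather than the library's.

```scala
// Hypothetical wrapper around String, used only for this illustration.
final class Text(val underlying: String) {
  // Char-to-Char transformations keep the String shape.
  def map(f: Char => Char): String = {
    val sb = new StringBuilder
    underlying.foreach(c => sb.append(f(c)))
    sb.toString
  }

  // Any other result type falls back to a sequence.
  def map[B](f: Char => B): Seq[B] =
    underlying.iterator.map(f).toVector
}
```

Both overloads take a function whose parameter type is `Char`, so the compiler can still infer the parameter type of a lambda such as `c => c.toUpper`. When both overloads apply, the non-generic one is the more specific and is selected.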
## Kind polymorphism

The collections are hierarchically organized. Essentially, the most generic collection
is `Iterable[A]`, and then we have three main kinds of collections: `Seq[A]`, `Set[A]`
and `Map[K, V]`.

![](/resources/img/blog/collections-hierarchy.svg)

It is worth noting that `Map[K, V]` takes two type parameters (`K` and `V`) whereas the
other collection types take only one type parameter. This makes it difficult to
generically define, at the level of `Iterable[A]`, operations that will
return a `Map[K, V]` when specialized.

For instance, consider again the case of the `map` method. We want to generically define
it on `Iterable[A]`, but which return type should we use? When this method is
inherited by `List[A]` we want its return type to be `List[B]`, but when
it is inherited by `HashMap[K, V]`, we want its return type to be `HashMap[L, W]`.
It is clear that we want to abstract over the type constructor of the concrete collections,
but the difficulty is that they don’t always take the same number of type parameters.

That’s a second problem solved by `CanBuildFrom` in the current collections.
Look again at the type signature of the (generic) `map` method on `Iterable[A]`:

~~~ scala
def map[B, That](f: A => B)(implicit bf: CanBuildFrom[Repr, B, That]): That
~~~

The return type `That` is inferred from the resolved `CanBuildFrom` instance at the call site.
Both the `Repr` and `B` types actually drive the implicit resolution: when `Repr` is `List[_]`
the parameter `That` is fixed to `List[B]`, and when `Repr` is `HashMap[_, _]` and `B` is a
tuple `(K, V)` then `That` is fixed to `HashMap[K, V]`.

In the new design we solve this problem by defining two “branches” in the hierarchy:

- `IterableOps` for collections whose type constructor takes one parameter,
- `MapOps` for collections whose type constructor takes two parameters.

Here is a simplified version of `IterableOps`:

~~~ scala
trait IterableOps[A, CC[_]] {
  def map[B](f: A => B): CC[B]
}
~~~

The `CC` type parameter stands for *C*ollection type *C*onstructor. Then, the `List[A]`
concrete collection extends `IterableOps[A, List]` to set its correct self-type constructor.

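As an illustration of how a concrete collection fixes `CC`, here is a minimal sketch. `Bag` is a hypothetical collection invented for this example; it stands in for `List` so that the snippet does not depend on the real hierarchy.

```scala
import scala.language.higherKinds // only needed on Scala 2.12 and earlier

trait IterableOps[A, CC[_]] {
  def map[B](f: A => B): CC[B]
}

// Hypothetical one-parameter collection: it passes its own type constructor as CC.
final case class Bag[A](items: List[A]) extends IterableOps[A, Bag] {
  def map[B](f: A => B): Bag[B] = Bag(items.map(f))
}
```

Because `CC` is instantiated to `Bag`, calling `map` on a `Bag[Int]` with an `Int => String` function is typed as `Bag[String]`, with no implicit parameter involved.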
Similarly, here is a simplified version of `MapOps`:

~~~ scala
trait MapOps[K, V, CC[_, _]] extends IterableOps[(K, V), Iterable] {
  def map[L, W](f: ((K, V)) => (L, W)): CC[L, W]
}
~~~

And then the `HashMap[K, V]` concrete collection extends `MapOps[K, V, HashMap]` to set
its correct self-type constructor. Note that `MapOps` extends `IterableOps`: consequently it
inherits its `map` method, which will be selected when the transformation function
passed to `map` does not return a tuple.

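The interplay between the two `map` overloads can be sketched as follows. `Dict` and `MapLike` are hypothetical names introduced for this example. The upper bound on `CC` is an addition of this sketch, not part of the simplified trait above: it is assumed to be needed so that the two `map` methods keep distinct signatures after erasure.

```scala
import scala.language.higherKinds // only needed on Scala 2.12 and earlier

trait IterableOps[A, CC[_]] {
  def map[B](f: A => B): CC[B]
}

// Marker used only as an upper bound (an assumption of this sketch): it makes
// CC[L, W] erase to MapLike rather than Object, which keeps the two `map`
// overloads distinct on the JVM.
trait MapLike

trait MapOps[K, V, CC[_, _] <: MapLike]
    extends IterableOps[(K, V), Iterable] with MapLike {
  def map[L, W](f: ((K, V)) => (L, W)): CC[L, W]
}

// Hypothetical two-parameter collection backed by a standard immutable Map.
final case class Dict[K, V](entries: Map[K, V]) extends MapOps[K, V, Dict] {
  // Inherited overload: the function does not return a pair.
  def map[B](f: ((K, V)) => B): Iterable[B] = entries.toList.map(f)

  // Map-specific overload: the function returns a pair, so we can rebuild a Dict.
  def map[L, W](f: ((K, V)) => (L, W)): Dict[L, W] =
    Dict(entries.toList.map(f).toMap)
}
```

Mapping a `Dict[Int, String]` with `kv => (kv._2, kv._1)` should select the second overload and give a `Dict[String, Int]`, whereas `kv => kv._1` only fits the first overload and gives an `Iterable[Int]`.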
## Sorted collections

The third challenge is about sorted collections (like `TreeSet` and `TreeMap`, for instance).
These collections define their order of iteration according to an ordering relationship for the
type of their elements.

As a consequence, when you transform the type of the elements (e.g. by using the now-familiar
`map` method), an implicit ordering instance for the new type of elements has to be available.

With `CanBuildFrom`, the solution relies (again) on the implicit resolution mechanism:
the implicit `CanBuildFrom[TreeSet[_], X, TreeSet[X]]` instance is available for some
type `X` only if an implicit `Ordering[X]` instance is also available.

In the new design we solve this problem by introducing a new branch in the hierarchy.
This one defines transformation operations that require an ordering instance for the element
type of the resulting collection:

~~~ scala
trait SortedIterableOps[A, CC[_]] {
  def map[B : Ordering](f: A => B): CC[B]
}
~~~

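To see the context bound at work, here is a small sketch built on that trait. `SortedBag` is a hypothetical collection invented for this example (it simply keeps a sorted `List`); it is not how `TreeSet` is implemented.

```scala
import scala.language.higherKinds // only needed on Scala 2.12 and earlier

trait SortedIterableOps[A, CC[_]] {
  def map[B: Ordering](f: A => B): CC[B]
}

// Hypothetical sorted collection: elements are kept in ascending order.
final class SortedBag[A] private (val items: List[A])
    extends SortedIterableOps[A, SortedBag] {
  // The context bound supplies the ordering needed to re-sort the result.
  def map[B: Ordering](f: A => B): SortedBag[B] =
    SortedBag.from(items.map(f))
}

object SortedBag {
  // The only way to create a SortedBag: it sorts the elements on the way in.
  def from[A: Ordering](elems: Iterable[A]): SortedBag[A] =
    new SortedBag[A](elems.toList.sorted)
}
```

Mapping to a type that has no implicit `Ordering` in scope is rejected at compile time, which is the same guarantee that the conditional `CanBuildFrom` instance provided, now visible directly in the signature.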
However, as mentioned in the previous section, we need to also abstract over the kind of the
type constructor of the concrete collections. Consequently we have in total four branches:

kind        | not sorted  | sorted
------------|-------------|-------------------
`CC[_]`     |`IterableOps`|`SortedIterableOps`
`CC[_, _]`  |`MapOps`     |`SortedMapOps`

In summary, instead of having one `map` method that supports all the use cases described in
this section and the previous ones, we specialized the hierarchy to have overloads of
the `map` method, each one supporting a specific use case. The benefit is that the type
signatures immediately tell you the story: you don’t have to look at the actual
implicit resolution to know the result you will get from calling `map`.

## Implicit builders

In the current collections, the fact that `CanBuildFrom` instances are available in the
implicit scope is useful to implement, separately from the collections, generic operations
that work with any collection type.

Examples of use cases are:

- [`Future.traverse`](https://github.com/scala/scala/blob/92ffe04070f25452b8d48ee7fbced587ddafbf6d/src/library/scala/concurrent/Future.scala#L822-L840)
- type-driven builders (e.g. in [play-json](https://github.com/playframework/play-json/blob/8642c485c79e32263b7bef5f991abb486523b3ef/play-json/shared/src/main/scala/Reads.scala#L144-L170), or [slick](https://github.com/slick/slick/blob/51e14f2756ed29b8c92a24b0ae24f2acd0b85c6f/slick/src/main/scala/slick/jdbc/PositionedResult.scala#L150-L154))
- extension methods (e.g. in [scala-extensions](https://github.com/cvogt/scala-extensions/blob/master/src/main/scala/collection.scala#L14-L28))

In the new design we are still experimenting with solutions to support these features. So far
the decision is to not put implicit builders in the collections implementation. We might
provide them as an optional dependency instead, but it seems that most of these use cases
could be supported even without implicit builders: you could just use an existing collection
instance and navigate through its companion object (providing the builder), or you could just
use the companion object directly to get a builder.

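The last option, getting a builder directly from a companion object, can be sketched as follows. `BuilderSketch` and `mapInto` are hypothetical names for this example. The snippet only relies on `scala.collection.mutable.Builder` and on the `newBuilder` methods of the standard companions, and it is not the design the new collections settled on.

```scala
import scala.collection.mutable

object BuilderSketch {
  // Hypothetical generic operation: the caller passes the builder explicitly
  // instead of relying on an implicit CanBuildFrom instance.
  def mapInto[A, B, C](source: Iterable[A], builder: mutable.Builder[B, C])(f: A => B): C = {
    source.foreach(a => builder += f(a))
    builder.result()
  }
}
```

For example, `BuilderSketch.mapInto(List(1, 2, 3), Vector.newBuilder[String])(_.toString)` produces a `Vector[String]`, and passing `Set.newBuilder[Int]` instead produces a `Set[Int]`. The target collection type is chosen by the builder argument rather than by implicit resolution.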
## Summary

In this article we have reviewed the features built on top of `CanBuildFrom` and explained
the design decisions we made for the new collections to support these features without `CanBuildFrom`.
Lines changed: 22 additions & 0 deletions