From 9ff18e4d8b4c9663271c9032a28c2b3109c9ec50 Mon Sep 17 00:00:00 2001 From: Luc Henninger Date: Mon, 3 Oct 2022 19:16:18 +0200 Subject: [PATCH 1/7] Add code tabs for collections-2.13/arrays --- _overviews/collections-2.13/arrays.md | 224 +++++++++++++++++++------- 1 file changed, 169 insertions(+), 55 deletions(-) diff --git a/_overviews/collections-2.13/arrays.md b/_overviews/collections-2.13/arrays.md index b06b4c9361..b772397e23 100644 --- a/_overviews/collections-2.13/arrays.md +++ b/_overviews/collections-2.13/arrays.md @@ -14,23 +14,35 @@ permalink: /overviews/collections-2.13/:title.html [Array](https://www.scala-lang.org/api/{{ site.scala-version }}/scala/Array.html) is a special kind of collection in Scala. On the one hand, Scala arrays correspond one-to-one to Java arrays. That is, a Scala array `Array[Int]` is represented as a Java `int[]`, an `Array[Double]` is represented as a Java `double[]` and a `Array[String]` is represented as a Java `String[]`. But at the same time, Scala arrays offer much more than their Java analogues. First, Scala arrays can be _generic_. That is, you can have an `Array[T]`, where `T` is a type parameter or abstract type. Second, Scala arrays are compatible with Scala sequences - you can pass an `Array[T]` where a `Seq[T]` is required. Finally, Scala arrays also support all sequence operations. 
Here's an example of this in action: - scala> val a1 = Array(1, 2, 3) - a1: Array[Int] = Array(1, 2, 3) - scala> val a2 = a1 map (_ * 3) - a2: Array[Int] = Array(3, 6, 9) - scala> val a3 = a2 filter (_ % 2 != 0) - a3: Array[Int] = Array(3, 9) - scala> a3.reverse - res0: Array[Int] = Array(9, 3) +{% tabs arrays_1 %} +{% tab 'Scala 2 and 3' for=arrays_1 %} +``` +scala> val a1 = Array(1, 2, 3) +a1: Array[Int] = Array(1, 2, 3) +scala> val a2 = a1 map (_ * 3) +a2: Array[Int] = Array(3, 6, 9) +scala> val a3 = a2 filter (_ % 2 != 0) +a3: Array[Int] = Array(3, 9) +scala> a3.reverse +res0: Array[Int] = Array(9, 3) +``` +{% endtab %} +{% endtabs %} Given that Scala arrays are represented just like Java arrays, how can these additional features be supported in Scala? The Scala array implementation makes systematic use of implicit conversions. In Scala, an array does not pretend to _be_ a sequence. It can't really be that because the data type representation of a native array is not a subtype of `Seq`. Instead there is an implicit "wrapping" conversion between arrays and instances of class `scala.collection.mutable.ArraySeq`, which is a subclass of `Seq`. Here you see it in action: - scala> val seq: collection.Seq[Int] = a1 - seq: scala.collection.Seq[Int] = ArraySeq(1, 2, 3) - scala> val a4: Array[Int] = seq.toArray - a4: Array[Int] = Array(1, 2, 3) - scala> a1 eq a4 - res1: Boolean = false +{% tabs arrays_2 %} +{% tab 'Scala 2 and 3' for=arrays_2 %} +``` +scala> val seq: collection.Seq[Int] = a1 +seq: scala.collection.Seq[Int] = ArraySeq(1, 2, 3) +scala> val a4: Array[Int] = seq.toArray +a4: Array[Int] = Array(1, 2, 3) +scala> a1 eq a4 +res1: Boolean = false +``` +{% endtab %} +{% endtabs %} The interaction above demonstrates that arrays are compatible with sequences, because there's an implicit conversion from arrays to `ArraySeq`s. To go the other way, from an `ArraySeq` to an `Array`, you can use the `toArray` method defined in `Iterable`. 
The last REPL line above shows that wrapping and then unwrapping with `toArray` produces a copy of the original array. @@ -38,82 +50,184 @@ There is yet another implicit conversion that gets applied to arrays. This conve The difference between the two implicit conversions on arrays is shown in the next REPL dialogue: - scala> val seq: collection.Seq[Int] = a1 - seq: scala.collection.Seq[Int] = ArraySeq(1, 2, 3) - scala> seq.reverse - res2: scala.collection.Seq[Int] = ArraySeq(3, 2, 1) - scala> val ops: collection.ArrayOps[Int] = a1 - ops: scala.collection.ArrayOps[Int] = scala.collection.ArrayOps@2d7df55 - scala> ops.reverse - res3: Array[Int] = Array(3, 2, 1) +{% tabs arrays_3 %} +{% tab 'Scala 2 and 3' for=arrays_3 %} +``` +scala> val seq: collection.Seq[Int] = a1 +seq: scala.collection.Seq[Int] = ArraySeq(1, 2, 3) +scala> seq.reverse +res2: scala.collection.Seq[Int] = ArraySeq(3, 2, 1) +scala> val ops: collection.ArrayOps[Int] = a1 +ops: scala.collection.ArrayOps[Int] = scala.collection.ArrayOps@2d7df55 +scala> ops.reverse +res3: Array[Int] = Array(3, 2, 1) +``` +{% endtab %} +{% endtabs %} You see that calling reverse on `seq`, which is an `ArraySeq`, will give again a `ArraySeq`. That's logical, because arrayseqs are `Seqs`, and calling reverse on any `Seq` will give again a `Seq`. On the other hand, calling reverse on the ops value of class `ArrayOps` will give an `Array`, not a `Seq`. The `ArrayOps` example above was quite artificial, intended only to show the difference to `ArraySeq`. Normally, you'd never define a value of class `ArrayOps`. You'd just call a `Seq` method on an array: - scala> a1.reverse - res4: Array[Int] = Array(3, 2, 1) +{% tabs arrays_4 %} +{% tab 'Scala 2 and 3' for=arrays_4 %} +``` +scala> a1.reverse +res4: Array[Int] = Array(3, 2, 1) +``` +{% endtab %} +{% endtabs %} The `ArrayOps` object gets inserted automatically by the implicit conversion. 
So the line above is equivalent to - scala> intArrayOps(a1).reverse - res5: Array[Int] = Array(3, 2, 1) +{% tabs arrays_5 %} +{% tab 'Scala 2 and 3' for=arrays_5 %} +``` +scala> intArrayOps(a1).reverse +res5: Array[Int] = Array(3, 2, 1) +``` +{% endtab %} +{% endtabs %} where `intArrayOps` is the implicit conversion that was inserted previously. This raises the question of how the compiler picked `intArrayOps` over the other implicit conversion to `ArraySeq` in the line above. After all, both conversions map an array to a type that supports a reverse method, which is what the input specified. The answer to that question is that the two implicit conversions are prioritized. The `ArrayOps` conversion has a higher priority than the `ArraySeq` conversion. The first is defined in the `Predef` object whereas the second is defined in a class `scala.LowPriorityImplicits`, which is inherited by `Predef`. Implicits in subclasses and subobjects take precedence over implicits in base classes. So if both conversions are applicable, the one in `Predef` is chosen. A very similar scheme works for strings. So now you know how arrays can be compatible with sequences and how they can support all sequence operations. What about genericity? In Java, you cannot write a `T[]` where `T` is a type parameter. How then is Scala's `Array[T]` represented? In fact a generic array like `Array[T]` could be at run-time any of Java's eight primitive array types `byte[]`, `short[]`, `char[]`, `int[]`, `long[]`, `float[]`, `double[]`, `boolean[]`, or it could be an array of objects. The only common run-time type encompassing all of these types is `AnyRef` (or, equivalently `java.lang.Object`), so that's the type to which the Scala compiler maps `Array[T]`. At run-time, when an element of an array of type `Array[T]` is accessed or updated there is a sequence of type tests that determine the actual array type, followed by the correct array operation on the Java array. 
These type tests slow down array operations somewhat. You can expect accesses to generic arrays to be three to four times slower than accesses to primitive or object arrays. This means that if you need maximal performance, you should prefer concrete to generic arrays. Representing the generic array type is not enough, however, there must also be a way to create generic arrays. This is an even harder problem, which requires a little of help from you. To illustrate the issue, consider the following attempt to write a generic method that creates an array. - // this is wrong! - def evenElems[T](xs: Vector[T]): Array[T] = { - val arr = new Array[T]((xs.length + 1) / 2) - for (i <- 0 until xs.length by 2) - arr(i / 2) = xs(i) - arr - } +{% tabs arrays_6 class=tabs-scala-version %} +{% tab 'Scala 2' for=arrays_6 %} +```scala mdoc:fail +// this is wrong! +def evenElems[T](xs: Vector[T]): Array[T] = { + val arr = new Array[T]((xs.length + 1) / 2) + for (i <- 0 until xs.length by 2) + arr(i / 2) = xs(i) + arr +} +``` +{% endtab %} +{% tab 'Scala 3' for=arrays_6 %} +```scala +// this is wrong! +def evenElems[T](xs: Vector[T]): Array[T] = + val arr = new Array[T]((xs.length + 1) / 2) + for i <- 0 until xs.length by 2 do + arr(i / 2) = xs(i) + arr +``` +{% endtab %} +{% endtabs %} The `evenElems` method returns a new array that consist of all elements of the argument vector `xs` which are at even positions in the vector. The first line of the body of `evenElems` creates the result array, which has the same element type as the argument. So depending on the actual type parameter for `T`, this could be an `Array[Int]`, or an `Array[Boolean]`, or an array of some other primitive types in Java, or an array of some reference type. But these types have all different runtime representations, so how is the Scala runtime going to pick the correct one? 
In fact, it can't do that based on the information it is given, because the actual type that corresponds to the type parameter `T` is erased at runtime. That's why you will get the following error message if you compile the code above: - error: cannot find class manifest for element type T - val arr = new Array[T]((arr.length + 1) / 2) - ^ +{% tabs arrays_7 class=tabs-scala-version %} +{% tab 'Scala 2' for=arrays_7 %} +``` +error: cannot find class manifest for element type T + val arr = new Array[T]((arr.length + 1) / 2) + ^ +``` +{% endtab %} +{% tab 'Scala 3' for=arrays_7 %} +``` +-- Error: ---------------------------------------------------------------------- +3 | val arr = new Array[T]((xs.length + 1) / 2) + | ^ + | No ClassTag available for T +``` +{% endtab %} +{% endtabs %} What's required here is that you help the compiler out by providing some runtime hint what the actual type parameter of `evenElems` is. This runtime hint takes the form of a class manifest of type `scala.reflect.ClassTag`. A class manifest is a type descriptor object which describes what the top-level class of a type is. Alternatively to class manifests there are also full manifests of type `scala.reflect.Manifest`, which describe all aspects of a type. But for array creation, only class manifests are needed. The Scala compiler will construct class manifests automatically if you instruct it to do so. "Instructing" means that you demand a class manifest as an implicit parameter, like this: - def evenElems[T](xs: Vector[T])(implicit m: ClassTag[T]): Array[T] = ... +{% tabs arrays_8 class=tabs-scala-version %} +{% tab 'Scala 2' for=arrays_8 %} +``` +def evenElems[T](xs: Vector[T])(implicit m: ClassTag[T]): Array[T] = ... +``` +{% endtab %} +{% tab 'Scala 3' for=arrays_8 %} +``` +def evenElems[T](xs: Vector[T])(using m: ClassTag[T]): Array[T] = ... 
+``` +{% endtab %} +{% endtabs %} Using an alternative and shorter syntax, you can also demand that the type comes with a class manifest by using a context bound. This means following the type with a colon and the class name `ClassTag`, like this: - import scala.reflect.ClassTag - // this works - def evenElems[T: ClassTag](xs: Vector[T]): Array[T] = { - val arr = new Array[T]((xs.length + 1) / 2) - for (i <- 0 until xs.length by 2) - arr(i / 2) = xs(i) - arr - } +{% tabs arrays_9 class=tabs-scala-version %} +{% tab 'Scala 2' for=arrays_9 %} +```scala +import scala.reflect.ClassTag +// this works +def evenElems[T: ClassTag](xs: Vector[T]): Array[T] = { + val arr = new Array[T]((xs.length + 1) / 2) + for (i <- 0 until xs.length by 2) + arr(i / 2) = xs(i) + arr +} +``` +{% endtab %} +{% tab 'Scala 3' for=arrays_9 %} +```scala +import scala.reflect.ClassTag +// this works +def evenElems[T: ClassTag](xs: Vector[T]): Array[T] = + val arr = new Array[T]((xs.length + 1) / 2) + for i <- 0 until xs.length by 2 do + arr(i / 2) = xs(i) + arr +``` +{% endtab %} +{% endtabs %} The two revised versions of `evenElems` mean exactly the same. What happens in either case is that when the `Array[T]` is constructed, the compiler will look for a class manifest for the type parameter T, that is, it will look for an implicit value of type `ClassTag[T]`. If such a value is found, the manifest is used to construct the right kind of array. Otherwise, you'll see an error message like the one above. Here is some REPL interaction that uses the `evenElems` method. 
- scala> evenElems(Vector(1, 2, 3, 4, 5)) - res6: Array[Int] = Array(1, 3, 5) - scala> evenElems(Vector("this", "is", "a", "test", "run")) - res7: Array[java.lang.String] = Array(this, a, run) +{% tabs arrays_10 %} +{% tab 'Scala 2 and 3' for=arrays_10 %} +``` +scala> evenElems(Vector(1, 2, 3, 4, 5)) +res6: Array[Int] = Array(1, 3, 5) +scala> evenElems(Vector("this", "is", "a", "test", "run")) +res7: Array[java.lang.String] = Array(this, a, run) +``` +{% endtab %} +{% endtabs %} In both cases, the Scala compiler automatically constructed a class manifest for the element type (first, `Int`, then `String`) and passed it to the implicit parameter of the `evenElems` method. The compiler can do that for all concrete types, but not if the argument is itself another type parameter without its class manifest. For instance, the following fails: - scala> def wrap[U](xs: Vector[U]) = evenElems(xs) - :6: error: No ClassTag available for U. - def wrap[U](xs: Vector[U]) = evenElems(xs) - ^ +{% tabs arrays_11 class=tabs-scala-version %} +{% tab 'Scala 2' for=arrays_11 %} +``` +scala> def wrap[U](xs: Vector[U]) = evenElems(xs) +:6: error: No ClassTag available for U. + def wrap[U](xs: Vector[U]) = evenElems(xs) + ^ +``` +{% endtab %} +{% tab 'Scala 3' for=arrays_11 %} +``` +-- Error: ---------------------------------------------------------------------- +6 |def wrap[U](xs: Vector[U]) = evenElems(xs) + | ^ + | No ClassTag available for U +``` +{% endtab %} +{% endtabs %} What happened here is that the `evenElems` demands a class manifest for the type parameter `U`, but none was found. The solution in this case is, of course, to demand another implicit class manifest for `U`. 
So the following works: - scala> def wrap[U: ClassTag](xs: Vector[U]) = evenElems(xs) - wrap: [U](xs: Vector[U])(implicit evidence$1: scala.reflect.ClassTag[U])Array[U] +{% tabs arrays_12 %} +{% tab 'Scala 2 and 3' for=arrays_12 %} +``` +scala> def wrap[U: ClassTag](xs: Vector[U]) = evenElems(xs) +wrap: [U](xs: Vector[U])(implicit evidence$1: scala.reflect.ClassTag[U])Array[U] +``` +{% endtab %} +{% endtabs %} This example also shows that the context bound in the definition of `U` is just a shorthand for an implicit parameter named here `evidence$1` of type `ClassTag[U]`. From 4e2d59ed19dad35f71d01030c106b43d440be682 Mon Sep 17 00:00:00 2001 From: Luc Henninger Date: Mon, 3 Oct 2022 19:18:48 +0200 Subject: [PATCH 2/7] Add code tabs for collections-2.13/strings --- _overviews/collections-2.13/strings.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/_overviews/collections-2.13/strings.md b/_overviews/collections-2.13/strings.md index 485410df49..0fb78d0dc4 100644 --- a/_overviews/collections-2.13/strings.md +++ b/_overviews/collections-2.13/strings.md @@ -14,6 +14,9 @@ permalink: /overviews/collections-2.13/:title.html Like arrays, strings are not directly sequences, but they can be converted to them, and they also support all sequence operations on strings. Here are some examples of operations you can invoke on strings. +{% tabs strings_1 %} +{% tab 'Scala 2 and 3' for=strings_1 %} + scala> val str = "hello" str: java.lang.String = hello scala> str.reverse @@ -27,4 +30,7 @@ Like arrays, strings are not directly sequences, but they can be converted to th scala> val s: Seq[Char] = str s: Seq[Char] = hello +{% endtab %} +{% endtabs %} + These operations are supported by two implicit conversions. The first, low-priority conversion maps a `String` to a `WrappedString`, which is a subclass of `immutable.IndexedSeq`, This conversion got applied in the last line above where a string got converted into a Seq. 
The other, high-priority conversion maps a string to a `StringOps` object, which adds all methods on immutable sequences to strings. This conversion was implicitly inserted in the method calls of `reverse`, `map`, `drop`, and `slice` in the example above. From 56228979de1aebd03437254dd5d4aa1976614269 Mon Sep 17 00:00:00 2001 From: Luc Henninger Date: Mon, 3 Oct 2022 19:22:14 +0200 Subject: [PATCH 3/7] Add code tabs for collections-2.13/equality --- _overviews/collections-2.13/equality.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/_overviews/collections-2.13/equality.md b/_overviews/collections-2.13/equality.md index 3f6d249f83..b73c5cabca 100644 --- a/_overviews/collections-2.13/equality.md +++ b/_overviews/collections-2.13/equality.md @@ -16,6 +16,9 @@ The collection libraries have a uniform approach to equality and hashing. The id It does not matter for the equality check whether a collection is mutable or immutable. For a mutable collection one simply considers its current elements at the time the equality test is performed. This means that a mutable collection might be equal to different collections at different times, depending on what elements are added or removed. This is a potential trap when using a mutable collection as a key in a hashmap. Example: +{% tabs equality_1 %} +{% tab 'Scala 2 and 3' for=equality_1 %} + scala> import collection.mutable.{HashMap, ArrayBuffer} import collection.mutable.{HashMap, ArrayBuffer} scala> val buf = ArrayBuffer(1, 2, 3) @@ -31,4 +34,7 @@ It does not matter for the equality check whether a collection is mutable or imm java.util.NoSuchElementException: key not found: ArrayBuffer(2, 2, 3) +{% endtab %} +{% endtabs %} + In this example, the selection in the last line will most likely fail because the hash-code of the array `buf` has changed in the second-to-last line. Therefore, the hash-code-based lookup will look at a different place than the one where `buf` was stored. 
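The hash-map trap described in the `equality.md` text above can be reproduced without an exception by using `get`, which returns an `Option`. The following is a minimal sketch, not part of the patch: the object name `MutableKeyDemo` and its two helper methods are hypothetical names chosen for illustration. It also shows one possible way around the trap, namely using an immutable collection such as `Vector` as the key.

```scala
import scala.collection.mutable.{ArrayBuffer, HashMap}

// Hypothetical demo object; the names are illustrative only.
object MutableKeyDemo {

  // Looks up a mutable key before and after the key itself is mutated.
  def lookupAroundMutation(): (Option[Int], Option[Int]) = {
    val buf = ArrayBuffer(1, 2, 3)
    val map = HashMap(buf -> 3)

    val before = map.get(buf)   // the key is found: hash code is unchanged so far

    buf(0) += 1                 // mutating the key changes its elements, and
                                // therefore (most likely) its hash code

    val after = map.get(buf)    // lookup now probes with the new hash code,
                                // so the stored entry is most likely missed
    (before, after)
  }

  // An immutable key cannot change after insertion, so lookups stay stable.
  def lookupWithImmutableKey(): Option[Int] = {
    val map = HashMap(Vector(1, 2, 3) -> 3)
    map.get(Vector(1, 2, 3))    // structurally equal key, same hash code
  }

  def main(args: Array[String]): Unit = {
    val (before, after) = lookupAroundMutation()
    println(s"before mutation: $before")
    println(s"after mutation:  $after")
    println(s"immutable key:   ${lookupWithImmutableKey()}")
  }
}
```

The design point is that equality and hashing of collections depend on their current elements, so any key whose elements can change after insertion is fragile. Converting the buffer to an immutable collection before using it as a key avoids the problem.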
From 9bbb3c3e1448bda7ec529c2c4b952dcb6a74d1b9 Mon Sep 17 00:00:00 2001 From: Luc Henninger Date: Mon, 3 Oct 2022 21:36:18 +0200 Subject: [PATCH 4/7] Add code tabs for collections-2.13/views --- _overviews/collections-2.13/views.md | 100 ++++++++++++++++++++++++++- 1 file changed, 97 insertions(+), 3 deletions(-) diff --git a/_overviews/collections-2.13/views.md b/_overviews/collections-2.13/views.md index edf0dde6b1..6f32795a56 100644 --- a/_overviews/collections-2.13/views.md +++ b/_overviews/collections-2.13/views.md @@ -18,9 +18,21 @@ There are two principal ways to implement transformers. One is _strict_, that is As an example of a non-strict transformer consider the following implementation of a lazy map operation: - def lazyMap[T, U](coll: Iterable[T], f: T => U) = new Iterable[U] { - def iterator = coll.iterator map f - } +{% tabs views_1 class=tabs-scala-version %} +{% tab 'Scala 2' for=views_1 %} +```scala mdoc +def lazyMap[T, U](coll: Iterable[T], f: T => U) = new Iterable[U] { + def iterator = coll.iterator map f +} +``` +{% endtab %} +{% tab 'Scala 3' for=views_1 %} +```scala +def lazyMap[T, U](coll: Iterable[T], f: T => U) = new Iterable[U]: + def iterator = coll.iterator map f +``` +{% endtab %} +{% endtabs %} Note that `lazyMap` constructs a new `Iterable` without stepping through all elements of the given collection `coll`. The given function `f` is instead applied to the elements of the new collection's `iterator` as they are demanded. @@ -30,6 +42,9 @@ To go from a collection to its view, you can use the `view` method on the collec Let's see an example. Say you have a vector of Ints over which you want to map two functions in succession: +{% tabs views_2 %} +{% tab 'Scala 2 and 3' for=views_2 %} + scala> val v = Vector(1 to 10: _*) v: scala.collection.immutable.Vector[Int] = Vector(1, 2, 3, 4, 5, 6, 7, 8, 9, 10) @@ -37,36 +52,69 @@ Let's see an example. 
Say you have a vector of Ints over which you want to map t res5: scala.collection.immutable.Vector[Int] = Vector(4, 6, 8, 10, 12, 14, 16, 18, 20, 22) +{% endtab %} +{% endtabs %} + In the last statement, the expression `v map (_ + 1)` constructs a new vector which is then transformed into a third vector by the second call to `map (_ * 2)`. In many situations, constructing the intermediate result from the first call to map is a bit wasteful. In the example above, it would be faster to do a single map with the composition of the two functions `(_ + 1)` and `(_ * 2)`. If you have the two functions available in the same place you can do this by hand. But quite often, successive transformations of a data structure are done in different program modules. Fusing those transformations would then undermine modularity. A more general way to avoid the intermediate results is by turning the vector first into a view, then applying all transformations to the view, and finally forcing the view to a vector: +{% tabs views_3 %} +{% tab 'Scala 2 and 3' for=views_3 %} + scala> (v.view map (_ + 1) map (_ * 2)).to(Vector) res12: scala.collection.immutable.Vector[Int] = Vector(4, 6, 8, 10, 12, 14, 16, 18, 20, 22) +{% endtab %} +{% endtabs %} + Let's do this sequence of operations again, one by one: +{% tabs views_4 %} +{% tab 'Scala 2 and 3' for=views_4 %} + scala> val vv = v.view vv: scala.collection.IndexedSeqView[Int] = IndexedSeqView() +{% endtab %} +{% endtabs %} + The application `v.view` gives you an `IndexedSeqView[Int]`, i.e. a lazily evaluated `IndexedSeq[Int]`. Like with `LazyList`, the `toString` operation of views does not force the view elements, that’s why the content of `vv` is shown as `IndexedSeqView()`. 
Applying the first `map` to the view gives: +{% tabs views_5 %} +{% tab 'Scala 2 and 3' for=views_5 %} + scala> vv map (_ + 1) res13: scala.collection.IndexedSeqView[Int] = IndexedSeqView() +{% endtab %} +{% endtabs %} + The result of the `map` is another `IndexedSeqView[Int]` value. This is in essence a wrapper that *records* the fact that a `map` with function `(_ + 1)` needs to be applied on the vector `v`. It does not apply that map until the view is forced, however. Let's now apply the second `map` to the last result. +{% tabs views_6 %} +{% tab 'Scala 2 and 3' for=views_6 %} + scala> res13 map (_ * 2) res14: scala.collection.IndexedSeqView[Int] = IndexedSeqView() +{% endtab %} +{% endtabs %} + Finally, forcing the last result gives: +{% tabs views_7 %} +{% tab 'Scala 2 and 3' for=views_7 %} + scala> res14.to(Vector) res15: scala.collection.immutable.Vector[Int] = Vector(4, 6, 8, 10, 12, 14, 16, 18, 20, 22) +{% endtab %} +{% endtabs %} + Both stored functions get applied as part of the execution of the `to` operation and a new vector is constructed. That way, no intermediate data structure is needed. In general, transformation operations applied to views never build a new data structure, and accessing the elements of a view @@ -84,33 +132,79 @@ These operations are documented as “always forcing the collection elements”. The main reason for using views is performance. You have seen that by switching a collection to a view the construction of intermediate results can be avoided. These savings can be quite important. As another example, consider the problem of finding the first palindrome in a list of words. A palindrome is a word which reads backwards the same as forwards. 
Here are the necessary definitions: +{% tabs views_8 %} +{% tab 'Scala 2 and 3' for=views_8 %} + def isPalindrome(x: String) = x == x.reverse def findPalindrome(s: Seq[String]) = s find isPalindrome +{% endtab %} +{% endtabs %} + Now, assume you have a very long sequence words, and you want to find a palindrome in the first million words of that sequence. Can you re-use the definition of `findPalindrome`? Of course, you could write: +{% tabs views_9 %} +{% tab 'Scala 2 and 3' for=views_9 %} + findPalindrome(words take 1000000) +{% endtab %} +{% endtabs %} + This nicely separates the two aspects of taking the first million words of a sequence and finding a palindrome in it. But the downside is that it always constructs an intermediary sequence consisting of one million words, even if the first word of that sequence is already a palindrome. So potentially, 999'999 words are copied into the intermediary result without being inspected at all afterwards. Many programmers would give up here and write their own specialized version of finding palindromes in some given prefix of an argument sequence. But with views, you don't have to. Simply write: +{% tabs views_10 %} +{% tab 'Scala 2 and 3' for=views_10 %} + findPalindrome(words.view take 1000000) +{% endtab %} +{% endtabs %} + This has the same nice separation of concerns, but instead of a sequence of a million elements it will only construct a single lightweight view object. This way, you do not need to choose between performance and modularity. After having seen all these nifty uses of views you might wonder why have strict collections at all? One reason is that performance comparisons do not always favor lazy over strict collections. For smaller collection sizes the added overhead of forming and applying closures in views is often greater than the gain from avoiding the intermediary data structures. 
A probably more important reason is that evaluation in views can be very confusing if the delayed operations have side effects. Here's an example which bit a few users of versions of Scala before 2.8. In these versions the `Range` type was lazy, so it behaved in effect like a view. People were trying to create a number of actors like this: +{% tabs views_11 class=tabs-scala-version %} +{% tab 'Scala 2' for=views_11 %} + val actors = for (i <- 1 to 10) yield actor { ... } +{% endtab %} +{% tab 'Scala 3' for=views_11 %} + + val actors = for i <- 1 to 10 yield actor { ... } + +{% endtab %} +{% endtabs %} + They were surprised that none of the actors was executing afterwards, even though the actor method should create and start an actor from the code that's enclosed in the braces following it. To explain why nothing happened, remember that the for expression above is equivalent to an application of map: +{% tabs views_12 %} +{% tab 'Scala 2 and 3' for=views_12 %} + val actors = (1 to 10) map (i => actor { ... }) +{% endtab %} +{% endtabs %} + Since previously the range produced by `(1 to 10)` behaved like a view, the result of the map was again a view. That is, no element was computed, and, consequently, no actor was created! Actors would have been created by forcing the range of the whole expression, but it's far from obvious that this is what was required to make the actors do their work. To avoid surprises like this, the current Scala collections library has more regular rules. All collections except lazy lists and views are strict. The only way to go from a strict to a lazy collection is via the `view` method. The only way to go back is via `to`. So the `actors` definition above would now behave as expected in that it would create and start 10 actors. 
To get back the surprising previous behavior, you'd have to add an explicit `view` method call: +{% tabs views_13 class=tabs-scala-version %} +{% tab 'Scala 2' for=views_13 %} + val actors = for (i <- (1 to 10).view) yield actor { ... } +{% endtab %} +{% tab 'Scala 3' for=views_13 %} + + val actors = for i <- (1 to 10).view yield actor { ... } + +{% endtab %} +{% endtabs %} + In summary, views are a powerful tool to reconcile concerns of efficiency with concerns of modularity. But in order not to be entangled in aspects of delayed evaluation, you should restrict views to purely functional code where collection transformations do not have side effects. What's best avoided is a mixture of views and operations that create new collections while also having side effects. From 6dd6ba0575d67c78be1e1f50b293c39b9ea35eb9 Mon Sep 17 00:00:00 2001 From: Luc Henninger Date: Mon, 3 Oct 2022 22:17:31 +0200 Subject: [PATCH 5/7] Add code tabs for collections-2.13/iterators --- _overviews/collections-2.13/views.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_overviews/collections-2.13/views.md b/_overviews/collections-2.13/views.md index 6f32795a56..d065f1e4d3 100644 --- a/_overviews/collections-2.13/views.md +++ b/_overviews/collections-2.13/views.md @@ -20,7 +20,7 @@ As an example of a non-strict transformer consider the following implementation {% tabs views_1 class=tabs-scala-version %} {% tab 'Scala 2' for=views_1 %} -```scala mdoc +```scala def lazyMap[T, U](coll: Iterable[T], f: T => U) = new Iterable[U] { def iterator = coll.iterator map f } From 3c72a45a1a30382571ec98a3f01807589944b3ee Mon Sep 17 00:00:00 2001 From: Luc Henninger Date: Mon, 3 Oct 2022 22:56:22 +0200 Subject: [PATCH 6/7] Add code tabs for collections-2.13/views --- _overviews/collections-2.13/views.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/_overviews/collections-2.13/views.md b/_overviews/collections-2.13/views.md index d065f1e4d3..7f6fa3c337 100644 
--- a/_overviews/collections-2.13/views.md +++ b/_overviews/collections-2.13/views.md @@ -20,16 +20,16 @@ As an example of a non-strict transformer consider the following implementation {% tabs views_1 class=tabs-scala-version %} {% tab 'Scala 2' for=views_1 %} -```scala -def lazyMap[T, U](coll: Iterable[T], f: T => U) = new Iterable[U] { - def iterator = coll.iterator map f +```scala mdoc +def lazyMap[T, U](iter: Iterable[T], f: T => U) = new Iterable[U] { + def iterator = iter.iterator map f } ``` {% endtab %} {% tab 'Scala 3' for=views_1 %} ```scala -def lazyMap[T, U](coll: Iterable[T], f: T => U) = new Iterable[U]: - def iterator = coll.iterator map f +def lazyMap[T, U](iter: Iterable[T], f: T => U) = new Iterable[U]: + def iterator = iter.iterator map f ``` {% endtab %} {% endtabs %} From 6ce28d69fad5efca804b8b7de0185154762b80a9 Mon Sep 17 00:00:00 2001 From: Jamie Thompson Date: Tue, 4 Oct 2022 16:06:30 +0200 Subject: [PATCH 7/7] format code examples --- _overviews/collections-2.13/arrays.md | 71 ++++++++------- _overviews/collections-2.13/equality.md | 33 ++++--- _overviews/collections-2.13/strings.md | 31 ++++--- _overviews/collections-2.13/views.md | 113 +++++++++++++++--------- 4 files changed, 152 insertions(+), 96 deletions(-) diff --git a/_overviews/collections-2.13/arrays.md b/_overviews/collections-2.13/arrays.md index b772397e23..32f9fb0584 100644 --- a/_overviews/collections-2.13/arrays.md +++ b/_overviews/collections-2.13/arrays.md @@ -16,15 +16,18 @@ permalink: /overviews/collections-2.13/:title.html {% tabs arrays_1 %} {% tab 'Scala 2 and 3' for=arrays_1 %} -``` +```scala scala> val a1 = Array(1, 2, 3) -a1: Array[Int] = Array(1, 2, 3) -scala> val a2 = a1 map (_ * 3) -a2: Array[Int] = Array(3, 6, 9) -scala> val a3 = a2 filter (_ % 2 != 0) -a3: Array[Int] = Array(3, 9) +val a1: Array[Int] = Array(1, 2, 3) + +scala> val a2 = a1.map(_ * 3) +val a2: Array[Int] = Array(3, 6, 9) + +scala> val a3 = a2.filter(_ % 2 != 0) +val a3: Array[Int] = Array(3, 9) 
+ scala> a3.reverse -res0: Array[Int] = Array(9, 3) +val res0: Array[Int] = Array(9, 3) ``` {% endtab %} {% endtabs %} @@ -33,13 +36,15 @@ Given that Scala arrays are represented just like Java arrays, how can these add {% tabs arrays_2 %} {% tab 'Scala 2 and 3' for=arrays_2 %} -``` +```scala scala> val seq: collection.Seq[Int] = a1 -seq: scala.collection.Seq[Int] = ArraySeq(1, 2, 3) +val seq: scala.collection.Seq[Int] = ArraySeq(1, 2, 3) + scala> val a4: Array[Int] = seq.toArray -a4: Array[Int] = Array(1, 2, 3) +val a4: Array[Int] = Array(1, 2, 3) + scala> a1 eq a4 -res1: Boolean = false +val res1: Boolean = false ``` {% endtab %} {% endtabs %} @@ -52,15 +57,18 @@ The difference between the two implicit conversions on arrays is shown in the ne {% tabs arrays_3 %} {% tab 'Scala 2 and 3' for=arrays_3 %} -``` +```scala scala> val seq: collection.Seq[Int] = a1 -seq: scala.collection.Seq[Int] = ArraySeq(1, 2, 3) +val seq: scala.collection.Seq[Int] = ArraySeq(1, 2, 3) + scala> seq.reverse -res2: scala.collection.Seq[Int] = ArraySeq(3, 2, 1) +val res2: scala.collection.Seq[Int] = ArraySeq(3, 2, 1) + scala> val ops: collection.ArrayOps[Int] = a1 -ops: scala.collection.ArrayOps[Int] = scala.collection.ArrayOps@2d7df55 +val ops: scala.collection.ArrayOps[Int] = scala.collection.ArrayOps@2d7df55 + scala> ops.reverse -res3: Array[Int] = Array(3, 2, 1) +val res3: Array[Int] = Array(3, 2, 1) ``` {% endtab %} {% endtabs %} @@ -71,9 +79,9 @@ The `ArrayOps` example above was quite artificial, intended only to show the dif {% tabs arrays_4 %} {% tab 'Scala 2 and 3' for=arrays_4 %} -``` +```scala scala> a1.reverse -res4: Array[Int] = Array(3, 2, 1) +val res4: Array[Int] = Array(3, 2, 1) ``` {% endtab %} {% endtabs %} @@ -82,9 +90,9 @@ The `ArrayOps` object gets inserted automatically by the implicit conversion. 
So {% tabs arrays_5 %} {% tab 'Scala 2 and 3' for=arrays_5 %} -``` +```scala scala> intArrayOps(a1).reverse -res5: Array[Int] = Array(3, 2, 1) +val res5: Array[Int] = Array(3, 2, 1) ``` {% endtab %} {% endtabs %} @@ -121,14 +129,14 @@ The `evenElems` method returns a new array that consist of all elements of the a {% tabs arrays_7 class=tabs-scala-version %} {% tab 'Scala 2' for=arrays_7 %} -``` +```scala error: cannot find class manifest for element type T val arr = new Array[T]((arr.length + 1) / 2) ^ ``` {% endtab %} {% tab 'Scala 3' for=arrays_7 %} -``` +```scala -- Error: ---------------------------------------------------------------------- 3 | val arr = new Array[T]((xs.length + 1) / 2) | ^ @@ -143,12 +151,12 @@ The Scala compiler will construct class manifests automatically if you instruct {% tabs arrays_8 class=tabs-scala-version %} {% tab 'Scala 2' for=arrays_8 %} -``` +```scala def evenElems[T](xs: Vector[T])(implicit m: ClassTag[T]): Array[T] = ... ``` {% endtab %} {% tab 'Scala 3' for=arrays_8 %} -``` +```scala def evenElems[T](xs: Vector[T])(using m: ClassTag[T]): Array[T] = ... ``` {% endtab %} @@ -188,11 +196,12 @@ Here is some REPL interaction that uses the `evenElems` method. {% tabs arrays_10 %} {% tab 'Scala 2 and 3' for=arrays_10 %} -``` +```scala scala> evenElems(Vector(1, 2, 3, 4, 5)) -res6: Array[Int] = Array(1, 3, 5) +val res6: Array[Int] = Array(1, 3, 5) + scala> evenElems(Vector("this", "is", "a", "test", "run")) -res7: Array[java.lang.String] = Array(this, a, run) +val res7: Array[java.lang.String] = Array(this, a, run) ``` {% endtab %} {% endtabs %} @@ -201,7 +210,7 @@ In both cases, the Scala compiler automatically constructed a class manifest for {% tabs arrays_11 class=tabs-scala-version %} {% tab 'Scala 2' for=arrays_11 %} -``` +```scala scala> def wrap[U](xs: Vector[U]) = evenElems(xs) :6: error: No ClassTag available for U. 
def wrap[U](xs: Vector[U]) = evenElems(xs) @@ -209,7 +218,7 @@ scala> def wrap[U](xs: Vector[U]) = evenElems(xs) ``` {% endtab %} {% tab 'Scala 3' for=arrays_11 %} -``` +```scala -- Error: ---------------------------------------------------------------------- 6 |def wrap[U](xs: Vector[U]) = evenElems(xs) | ^ @@ -222,9 +231,9 @@ What happened here is that the `evenElems` demands a class manifest for the type {% tabs arrays_12 %} {% tab 'Scala 2 and 3' for=arrays_12 %} -``` +```scala scala> def wrap[U: ClassTag](xs: Vector[U]) = evenElems(xs) -wrap: [U](xs: Vector[U])(implicit evidence$1: scala.reflect.ClassTag[U])Array[U] +def wrap[U](xs: Vector[U])(implicit evidence$1: scala.reflect.ClassTag[U]): Array[U] ``` {% endtab %} {% endtabs %} diff --git a/_overviews/collections-2.13/equality.md b/_overviews/collections-2.13/equality.md index b73c5cabca..7fa334c8d9 100644 --- a/_overviews/collections-2.13/equality.md +++ b/_overviews/collections-2.13/equality.md @@ -19,20 +19,27 @@ It does not matter for the equality check whether a collection is mutable or imm {% tabs equality_1 %} {% tab 'Scala 2 and 3' for=equality_1 %} - scala> import collection.mutable.{HashMap, ArrayBuffer} - import collection.mutable.{HashMap, ArrayBuffer} - scala> val buf = ArrayBuffer(1, 2, 3) - buf: scala.collection.mutable.ArrayBuffer[Int] = - ArrayBuffer(1, 2, 3) - scala> val map = HashMap(buf -> 3) - map: scala.collection.mutable.HashMap[scala.collection. - mutable.ArrayBuffer[Int],Int] = Map((ArrayBuffer(1, 2, 3),3)) - scala> map(buf) - res13: Int = 3 - scala> buf(0) += 1 - scala> map(buf) - java.util.NoSuchElementException: key not found: +```scala +scala> import collection.mutable.{HashMap, ArrayBuffer} +import collection.mutable.{HashMap, ArrayBuffer} + +scala> val buf = ArrayBuffer(1, 2, 3) +val buf: scala.collection.mutable.ArrayBuffer[Int] = + ArrayBuffer(1, 2, 3) + +scala> val map = HashMap(buf -> 3) +val map: scala.collection.mutable.HashMap[scala.collection. 
+ mutable.ArrayBuffer[Int],Int] = Map((ArrayBuffer(1, 2, 3),3)) + +scala> map(buf) +val res13: Int = 3 + +scala> buf(0) += 1 + +scala> map(buf) + java.util.NoSuchElementException: key not found: ArrayBuffer(2, 2, 3) +``` {% endtab %} {% endtabs %} diff --git a/_overviews/collections-2.13/strings.md b/_overviews/collections-2.13/strings.md index 0fb78d0dc4..aebe244304 100644 --- a/_overviews/collections-2.13/strings.md +++ b/_overviews/collections-2.13/strings.md @@ -17,18 +17,25 @@ Like arrays, strings are not directly sequences, but they can be converted to th {% tabs strings_1 %} {% tab 'Scala 2 and 3' for=strings_1 %} - scala> val str = "hello" - str: java.lang.String = hello - scala> str.reverse - res6: String = olleh - scala> str.map(_.toUpper) - res7: String = HELLO - scala> str drop 3 - res8: String = lo - scala> str.slice(1, 4) - res9: String = ell - scala> val s: Seq[Char] = str - s: Seq[Char] = hello +```scala +scala> val str = "hello" +val str: java.lang.String = hello + +scala> str.reverse +val res6: String = olleh + +scala> str.map(_.toUpper) +val res7: String = HELLO + +scala> str.drop(3) +val res8: String = lo + +scala> str.slice(1, 4) +val res9: String = ell + +scala> val s: Seq[Char] = str +val s: Seq[Char] = hello +``` {% endtab %} {% endtabs %} diff --git a/_overviews/collections-2.13/views.md b/_overviews/collections-2.13/views.md index 7f6fa3c337..2fa15860c3 100644 --- a/_overviews/collections-2.13/views.md +++ b/_overviews/collections-2.13/views.md @@ -22,14 +22,14 @@ As an example of a non-strict transformer consider the following implementation {% tab 'Scala 2' for=views_1 %} ```scala mdoc def lazyMap[T, U](iter: Iterable[T], f: T => U) = new Iterable[U] { - def iterator = iter.iterator map f + def iterator = iter.iterator.map(f) } ``` {% endtab %} {% tab 'Scala 3' for=views_1 %} ```scala def lazyMap[T, U](iter: Iterable[T], f: T => U) = new Iterable[U]: - def iterator = iter.iterator map f + def iterator = iter.iterator.map(f) ``` {% endtab 
%} {% endtabs %} @@ -42,15 +42,31 @@ To go from a collection to its view, you can use the `view` method on the collec Let's see an example. Say you have a vector of Ints over which you want to map two functions in succession: -{% tabs views_2 %} -{% tab 'Scala 2 and 3' for=views_2 %} +{% tabs views_2 class=tabs-scala-version %} +{% tab 'Scala 2' for=views_2 %} - scala> val v = Vector(1 to 10: _*) - v: scala.collection.immutable.Vector[Int] = - Vector(1, 2, 3, 4, 5, 6, 7, 8, 9, 10) - scala> v map (_ + 1) map (_ * 2) - res5: scala.collection.immutable.Vector[Int] = - Vector(4, 6, 8, 10, 12, 14, 16, 18, 20, 22) +```scala +scala> val v = Vector(1 to 10: _*) +val v: scala.collection.immutable.Vector[Int] = + Vector(1, 2, 3, 4, 5, 6, 7, 8, 9, 10) + +scala> v.map(_ + 1).map(_ * 2) +val res5: scala.collection.immutable.Vector[Int] = + Vector(4, 6, 8, 10, 12, 14, 16, 18, 20, 22) +``` + +{% endtab %} +{% tab 'Scala 3' for=views_2 %} + +```scala +scala> val v = Vector((1 to 10)*) +val v: scala.collection.immutable.Vector[Int] = + Vector(1, 2, 3, 4, 5, 6, 7, 8, 9, 10) + +scala> v.map(_ + 1).map(_ * 2) +val res5: scala.collection.immutable.Vector[Int] = + Vector(4, 6, 8, 10, 12, 14, 16, 18, 20, 22) +``` {% endtab %} {% endtabs %} @@ -60,9 +76,11 @@ In the last statement, the expression `v map (_ + 1)` constructs a new vector wh {% tabs views_3 %} {% tab 'Scala 2 and 3' for=views_3 %} - scala> (v.view map (_ + 1) map (_ * 2)).to(Vector) - res12: scala.collection.immutable.Vector[Int] = - Vector(4, 6, 8, 10, 12, 14, 16, 18, 20, 22) +```scala +scala> val w = v.view.map(_ + 1).map(_ * 2).to(Vector) +val w: scala.collection.immutable.Vector[Int] = + Vector(4, 6, 8, 10, 12, 14, 16, 18, 20, 22) +``` {% endtab %} {% endtabs %} @@ -72,8 +90,10 @@ Let's do this sequence of operations again, one by one: {% tabs views_4 %} {% tab 'Scala 2 and 3' for=views_4 %} - scala> val vv = v.view - vv: scala.collection.IndexedSeqView[Int] = IndexedSeqView() +```scala +scala> val vv = v.view +val vv: 
scala.collection.IndexedSeqView[Int] = IndexedSeqView() +``` {% endtab %} {% endtabs %} @@ -86,9 +106,10 @@ Applying the first `map` to the view gives: {% tabs views_5 %} {% tab 'Scala 2 and 3' for=views_5 %} - scala> vv map (_ + 1) - res13: scala.collection.IndexedSeqView[Int] = IndexedSeqView() - +```scala +scala> vv.map(_ + 1) +val res13: scala.collection.IndexedSeqView[Int] = IndexedSeqView() +``` {% endtab %} {% endtabs %} @@ -97,8 +118,10 @@ The result of the `map` is another `IndexedSeqView[Int]` value. This is in essen {% tabs views_6 %} {% tab 'Scala 2 and 3' for=views_6 %} - scala> res13 map (_ * 2) - res14: scala.collection.IndexedSeqView[Int] = IndexedSeqView() +```scala +scala> res13.map(_ * 2) +val res14: scala.collection.IndexedSeqView[Int] = IndexedSeqView() +``` {% endtab %} {% endtabs %} @@ -108,9 +131,11 @@ Finally, forcing the last result gives: {% tabs views_7 %} {% tab 'Scala 2 and 3' for=views_7 %} - scala> res14.to(Vector) - res15: scala.collection.immutable.Vector[Int] = - Vector(4, 6, 8, 10, 12, 14, 16, 18, 20, 22) +```scala +scala> res14.to(Vector) +val res15: scala.collection.immutable.Vector[Int] = + Vector(4, 6, 8, 10, 12, 14, 16, 18, 20, 22) +``` {% endtab %} {% endtabs %} @@ -135,8 +160,10 @@ The main reason for using views is performance. 
You have seen that by switching {% tabs views_8 %} {% tab 'Scala 2 and 3' for=views_8 %} - def isPalindrome(x: String) = x == x.reverse - def findPalindrome(s: Seq[String]) = s find isPalindrome +```scala +def isPalindrome(x: String) = x == x.reverse +def findPalindrome(s: Seq[String]) = s.find(isPalindrome) +``` {% endtab %} {% endtabs %} @@ -145,9 +172,9 @@ Now, assume you have a very long sequence words, and you want to find a palindro {% tabs views_9 %} {% tab 'Scala 2 and 3' for=views_9 %} - - findPalindrome(words take 1000000) - +```scala +val palindromes = findPalindrome(words.take(1000000)) +``` {% endtab %} {% endtabs %} @@ -155,9 +182,9 @@ This nicely separates the two aspects of taking the first million words of a seq {% tabs views_10 %} {% tab 'Scala 2 and 3' for=views_10 %} - - findPalindrome(words.view take 1000000) - +```scala +val palindromes = findPalindrome(words.view.take(1000000)) +``` {% endtab %} {% endtabs %} @@ -169,14 +196,14 @@ Here's an example which bit a few users of versions of Scala before 2.8. In thes {% tabs views_11 class=tabs-scala-version %} {% tab 'Scala 2' for=views_11 %} - - val actors = for (i <- 1 to 10) yield actor { ... } - +```scala +val actors = for (i <- 1 to 10) yield actor { ... } +``` {% endtab %} {% tab 'Scala 3' for=views_11 %} - - val actors = for i <- 1 to 10 yield actor { ... } - +```scala +val actors = for i <- 1 to 10 yield actor { ... } +``` {% endtab %} {% endtabs %} @@ -185,7 +212,9 @@ They were surprised that none of the actors was executing afterwards, even thoug {% tabs views_12 %} {% tab 'Scala 2 and 3' for=views_12 %} - val actors = (1 to 10) map (i => actor { ... }) +```scala +val actors = (1 to 10).map(i => actor { ... }) +``` {% endtab %} {% endtabs %} @@ -197,12 +226,16 @@ To avoid surprises like this, the current Scala collections library has more reg {% tabs views_13 class=tabs-scala-version %} {% tab 'Scala 2' for=views_13 %} - val actors = for (i <- (1 to 10).view) yield actor { ... 
} +```scala +val actors = for (i <- (1 to 10).view) yield actor { ... } +``` {% endtab %} {% tab 'Scala 3' for=views_13 %} - val actors = for i <- (1 to 10).view yield actor { ... } +```scala +val actors = for i <- (1 to 10).view yield actor { ... } +``` {% endtab %} {% endtabs %}
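Reviewer note on the `views.md` hunks: the reformatted examples all rely on a view deferring its `map` until something forces it. The following is a minimal sketch of that behaviour, not part of the patch. The names `ViewLazinessSketch`, `strictFind` and `lazyFind` are hypothetical and chosen only for illustration. It counts how often the mapping function runs when `find` follows a strict `map` versus a `map` on a view.

```scala
// Hypothetical sketch: none of these names appear in the patched documentation.
object ViewLazinessSketch {

  // Strict pipeline: map builds a full intermediate Vector before find starts.
  // Returns the first match and the number of times the mapping function ran.
  def strictFind(v: Vector[Int]): (Option[Int], Int) = {
    var calls = 0
    val hit = v.map { x => calls += 1; x + 1 }.find(_ > 3)
    (hit, calls)
  }

  // Lazy pipeline: the view applies the mapping function only to the
  // elements that find actually visits, then stops at the first match.
  def lazyFind(v: Vector[Int]): (Option[Int], Int) = {
    var calls = 0
    val hit = v.view.map { x => calls += 1; x + 1 }.find(_ > 3)
    (hit, calls)
  }

  def main(args: Array[String]): Unit = {
    val v = (1 to 10).toVector
    println(s"strict: ${strictFind(v)}") // mapping function runs for all 10 elements
    println(s"lazy:   ${lazyFind(v)}")   // mapping function runs for the first 3 only
  }
}
```

Both pipelines return the same element; only the amount of work differs, which is the point the `findPalindrome(words.view.take(1000000))` example in the patch makes at a larger scale.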