Commit bea1d39

Added docs and benchmark results.
1 parent 8ab0468 commit bea1d39

9 files changed: +3883 -35 lines changed

README.md

Lines changed: 56 additions & 0 deletions
@@ -139,6 +139,31 @@ are used for collections, so this is best done to gather the results of an expen
Finally, there is a Java class, `ScalaStreamer`, that has a series of `from` methods that can be used to
obtain Java 8 Streams from Scala collections from within Java.

#### Performance Considerations

For sequential operations, Scala's `iterator` almost always equals or exceeds the performance of a Java 8 stream. Thus,
one should favor `iterator` (and its richer set of operations) over `seqStream` for general use. However, long
chains of processing of primitive types can sometimes benefit from the manually specialized methods in `DoubleStream`,
`IntStream`, and `LongStream`.

Note that although `iterator` typically has superior performance in a sequential context, the advantage is modest
(usually less than 50% higher throughput for `iterator`).
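
As a minimal sketch of the two sequential styles compared above, the same reduction can be written once with `iterator` and once with `seqStream`. This assumes Scala 2.12 or later (so that lambdas convert to Java functional interfaces) and `scala-java8-compat` on the classpath. The object and method names are illustrative only, and the sketch relies on `seqStream` producing a primitive `IntStream` for an `Array[Int]`:

```scala
import scala.compat.java8.StreamConverters._

// Hypothetical demo object; the names are not part of the library.
object SequentialSum {
  val xs: Array[Int] = Array.tabulate(1000)(i => i)

  // Scala iterator: the usual choice for sequential work.
  def viaIterator: Long =
    xs.iterator.filter(_ % 2 == 0).map(_.toLong).sum

  // Java 8 stream: assumes Array[Int] converts to a primitive IntStream,
  // so filter and sum use the specialized, unboxed methods.
  def viaStream: Long = {
    val s: java.util.stream.IntStream = xs.seqStream
    s.filter(_ % 2 == 0).asLongStream.sum
  }
}
```

Both methods compute the same value; which one is faster for a given workload is something to measure rather than assume.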

For parallel operations, `parStream` and even `seqStream.parallel` meet or exceed the performance of Scala parallel
collection methods (invoked with `.par`). The difference can be substantial, especially for small collections. When
a Scala (parallel) collection is the ultimate result, however, Scala parallel collections can sometimes have an
advantage, because the collection can in some cases be built in parallel.

Because the wrappers are chosen based on the static type of the collection, parallelization through Java 8 Streams can
be inefficient in cases where Scala parallel collections are efficient. For example, a collection typed as `Seq[String]`
might have linear access like `List`, so it is treated that way even if it is actually a `WrappedArray[String]` that
could be parallelized efficiently. The `parStream` method is only available when the static type is known to be compatible
with rapid parallel operation; `seqStream` can be parallelized by using `.parallel`, but may or may not be efficient.
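
The effect of the static type can be sketched as follows, again assuming Scala 2.12 or later and `scala-java8-compat` on the classpath. `StaticTypeDemo` and its methods are hypothetical names. With a `Vector`, `parStream` is available directly. With a value typed only as `Seq`, the fallback is `seqStream.parallel`, which compiles whatever the runtime class is but may split the work poorly:

```scala
import scala.compat.java8.StreamConverters._

// Hypothetical demo object; the names are not part of the library.
object StaticTypeDemo {
  // Static type Vector[String]: known to split well, so parStream is offered.
  def countLong(v: Vector[String]): Long =
    v.parStream.filter((w: String) => w.length > 3).count

  // Static type Seq[String]: only seqStream is assumed to be available,
  // so parallelism is requested on the resulting Java stream instead.
  def countLongSeq(s: Seq[String]): Long =
    s.seqStream.parallel.filter((w: String) => w.length > 3).count
}
```

Both methods return the same count for the same elements; only the way the work is divided differs.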

If the operations available on Java 8 Streams are sufficient, the collection type is known statically with enough precision
to enable `parStream`, and an `Accumulator` or non-collection type is an acceptable result, Java 8 Streams will essentially
always outperform the Scala parallel collections.

#### Scala Usage Example

```scala
@@ -158,6 +183,37 @@ object Test {
}
```

#### Using Java 8 Streams with Scala Function Converters

Scala can emit Java SAM (single abstract method) types for lambda expressions that are arguments to methods that take
a Java SAM rather than a Scala Function. However, it can be convenient to restrict the SAM interface to interactions
with Java code (including Java 8 Streams) rather than having it propagate throughout Scala code.

Using Java 8 Stream converters together with function converters allows one to accomplish this with only a modest
amount of fuss.

Example:

```scala
import scala.compat.java8.FunctionConverters._
import scala.compat.java8.StreamConverters._

def mapToSortedString[A](xs: Vector[A], f: A => String, sep: String) =
  xs.parStream.                   // Creates java.util.stream.Stream[A]
    map[String](f.asJava).sorted. // Maps A to String and sorts (in parallel)
    toArray.mkString(sep)         // Back to an Array to use Scala's mkString
```

Note that explicit creation of a new lambda will tend to lead to improved type inference and at least equal
performance:

```scala
def mapToSortedString[A](xs: Vector[A], f: A => String, sep: String) =
  xs.parStream.
    map[String](a => f(a)).sorted. // Explicit lambda creates a SAM wrapper for f
    toArray.mkString(sep)
```

#### Java Usage Example

```java

benchmark/README.md

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
# Benchmark suite for Java 8 Streams compatibility layer

This project is intended to support semi-manual benchmarking of the Java 8 streams compatibility layer in Scala collections.

Because the benchmarking is **very computationally expensive**, it should be done occasionally, not automatically.

## Code generation step

1. Run `sbt console`

2. If the `JmhBench.scala` file already exists, delete it.

3. Enter `bench.codegen.Generate.jmhBench()` to generate the `JmhBench.scala` file.

## Benchmarking step

1. Make sure your terminal has plenty of lines of scrollback. (A couple thousand should do.)

2. Run `sbt`

3. Enter `jmh:run -i 5 -wi 3 -f5`. Wait overnight.

4. Copy the final block of output from the terminal window, starting just before the line that contains `[info] # Run complete. Total time:` (so that the line itself is included) and continuing to the end.

5. Save that in the file `results/jmhbench.log`.
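
If the full sbt output was captured to a file rather than copied from scrollback, steps 4 and 5 could be approximated with a small helper like the one below. This is only a sketch: `TrimJmhLog` is a hypothetical name, the input path is supplied by the caller, and it assumes the wanted block starts exactly at the last line containing the `Run complete` marker.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}
import scala.collection.JavaConverters._

// Hypothetical helper; not part of the benchmark project.
object TrimJmhLog {
  val Marker = "[info] # Run complete. Total time:"

  /** Lines from the last marker line to the end, or empty if the marker is absent. */
  def summary(lines: Seq[String]): Seq[String] = {
    val start = lines.lastIndexWhere(_.contains(Marker))
    if (start < 0) Seq.empty else lines.drop(start)
  }

  def main(args: Array[String]): Unit = {
    val in  = Paths.get(args(0))                   // assumed: file holding the full sbt output
    val out = Paths.get("results", "jmhbench.log") // target named in step 5
    val all = Files.readAllLines(in, StandardCharsets.UTF_8).asScala.toVector
    Files.createDirectories(out.getParent)
    Files.write(out, summary(all).asJava, StandardCharsets.UTF_8)
  }
}
```

Keeping `summary` separate from the file handling makes the trimming rule easy to check on a few sample lines before trusting it with an overnight run's output.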

## Comparison step

1. Run `sbt console`

2. Enter `bench.examine.SpeedReports()`

3. Look at the ASCII art results showing speed comparisons.

benchmark/project/plugins.sbt

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-addSbtPlugin("pl.project13.scala" % "sbt-jmh" % "0.2.4")
+addSbtPlugin("pl.project13.scala" % "sbt-jmh" % "0.2.5")
