SIP-59 - Multiple assignments #73

Merged (2 commits) on Aug 19, 2024
331 changes: 331 additions & 0 deletions content/multiple-assignments.md
@@ -0,0 +1,331 @@
---
layout: sip
permalink: /sips/:title.html
stage: implementation
status: waiting-for-implementation
presip-thread: https://contributors.scala-lang.org/t/pre-sip-multiple-assignments/6425
title: SIP-59 - Multiple Assignments
---

**By: Dimi Racordon**

## History

| Date | Version |
|---------------|--------------------|
| Jan 17th 2024 | Initial Draft |

## Summary

This proposal discusses the syntax and semantics of a construct to assign multiple variables with a single expression.
This feature would simplify the implementation of operations expressed in terms of relationships between multiple variables, such as [`std::swap`](https://en.cppreference.com/w/cpp/algorithm/swap) in C++.

## Motivation

Algorithms sometimes need to assign multiple variables "at once".
For example, let's consider the Fibonacci sequence:

```scala
class FibonacciIterator() extends Iterator[Int]:

  private var a: Int = 0
  private var b: Int = 1

  def hasNext = true
  def next() =
    val r = a
    val n = a + b
    a = b
    b = n
    r
```

The same iterator could be rewritten more concisely if we could assign multiple variables at once.
For example, we can write the following in Swift:

```swift
struct FibonacciIterator: IteratorProtocol {

  private var a: Int = 0
  private var b: Int = 1
  init() {}

  mutating func next() -> Int? {
    defer { (a, b) = (b, a + b) }
    return a
  }

}
```

Though the differences may seem frivolous at first glance, they are in fact important.
If we look at a formal definition of the Fibonacci sequence (e.g., on [Wikipedia](https://en.wikipedia.org/wiki/Fibonacci_sequence)), we might see something like:

> The Fibonacci sequence is given by *F(n) = F(n-1) + F(n-2)* where *F(0) = 0* and *F(1) = 1*.

Although this declarative description says nothing about an evaluation order, it becomes a concern in our Scala implementation as we must encode the relationship into multiple operational steps.
This decomposition offers opportunities to get things wrong:

```scala
def next() =
  val r = a
  a = b
  b = a + b // invalid semantics, the value of `a` changed "too early"
  r
```

In contrast, our Swift implementation can remain closer to the formal definition and is therefore more legible and less error-prone.
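
For comparison, the closest that current Scala gets is to stage the new values in a tuple pattern before assigning them. The following is only a sketch of that workaround, not part of the proposal; the class name `StagedFibonacciIterator` and the temporaries `na` and `nb` are made up for illustration.

```scala
// Hypothetical name; a sketch of the workaround available in current Scala.
class StagedFibonacciIterator extends Iterator[Int]:

  private var a: Int = 0
  private var b: Int = 1

  def hasNext = true
  def next() =
    val r = a
    // Evaluate both new values before either variable is overwritten.
    val (na, nb) = (b, a + b)
    a = na
    b = nb
    r
```

The intermediate names avoid the ordering bug, but the single relationship is still spread over three statements.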

Multiple assignments show up in many general-purpose algorithms (e.g., insertion sort, partition, min-max element, ...).
But perhaps the most fundamental one is `swap`, which consists of exchanging two values.

We often swap values that are stored in some collection.
In this particular case, all is well in Scala because we can ask the collection to swap elements at given positions:

```scala
import scala.collection.mutable

extension [T](self: mutable.ArrayBuffer[T])
  def swapAt(i: Int, j: Int) =
    val t = self(i)
    self(i) = self(j)
    self(j) = t

val a = mutable.ArrayBuffer(1, 2, 3)
a.swapAt(0, 2)
println(a) // ArrayBuffer(3, 2, 1)
```

Sadly, one can't implement a generic swap method that wouldn't rely on the ability to index a container.
The only way to express this operation in Scala is to "inline" the pattern implemented by `swapAt` every time we need to swap two values.

Having to rewrite this boilerplate is unfortunate.
Here is an example in a realistic algorithm:

```scala
extension [T](self: Seq[T])(using Ordering[T])
  def minMaxElements: Option[(T, T)] =
    import math.Ordering.Implicits.infixOrderingOps

    // Return None for collections smaller than 2 elements.
    var i = self.iterator
    if (!i.hasNext) { return None }
    var l = i.next()
    if (!i.hasNext) { return None }
    var h = i.next()

    // Confirm the initial bounds.
    if (h < l) { val t = l; l = h; h = t }

    // Process the remaining elements.
    def loop(): Option[(T, T)] =
      if (i.hasNext) {
        val n = i.next()
        if (n < l) { l = n } else if (n > h) { h = n }
        loop()
      } else {
        Some((l, h))
      }
    loop()
```

*Note: implementation shamelessly copied from [swift-algorithms](https://github.com/apple/swift-algorithms/blob/main/Sources/Algorithms/MinMax.swift).*

The swap occurs in the middle of the method with the sequence of expressions `val t = l; l = h; h = t`.
To borrow from the words of Edsger Dijkstra [1, Chapter 11]:

> [that] is cumbersome and ugly compared with the [multiple] assignment.

While `swap` is a very common operation, it's only an instance of a more general class of operations that are expressed in terms of relationships between multiple variables.
The definition of the Fibonacci sequence is another example.

## Proposed solution

The proposed solution is to add a language construct to assign multiple variables in a single expression.
Using this construct, swapping two values can be written as follows:

```scala
var a = 2
var b = 4
(a, b) = (b, a)
println(s"$a$b") // 42
```

The above Fibonacci iterator can be rewritten as follows:

```scala
class FibonacciIterator() extends Iterator[Int]:

  private var a: Int = 0
  private var b: Int = 1

  def hasNext = true
  def next() =
    val r = a
    (a, b) = (b, a + b)
    r
```

Multiple assignments also alleviate the need for a swap method on collections, as the same idiomatic pattern can be reused to exchange elements at given indices:

```scala
val a = mutable.ArrayBuffer(1, 2, 3)
(a(0), a(2)) = (a(2), a(0))
println(a) // ArrayBuffer(3, 2, 1)
```
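
To make the intended effect concrete, here is a hand-written equivalent of the element swap in current Scala. It is a sketch under the assumption that the RHS is fully evaluated before any element is updated; the function name `swapFirstAndLast` and the temporary `rhs` are hypothetical.

```scala
import scala.collection.mutable

// Hypothetical helper: spells out `(a(0), a(2)) = (a(2), a(0))` by hand.
def swapFirstAndLast(): mutable.ArrayBuffer[Int] =
  val a = mutable.ArrayBuffer(1, 2, 3)
  // Evaluate the whole RHS first, so both reads see the old elements.
  val rhs = (a(2), a(0))
  a(0) = rhs._1
  a(2) = rhs._2
  a
```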

### Specification

A multiple assignment is an expression of the form `AssignTarget ‘=’ Expr` where:

```
AssignTarget ::= ‘(’ AssignTargetNode {‘,’ AssignTargetNode} ‘)’
AssignTargetNode ::= Expr | AssignTarget
```

An assignment target describes a structural pattern that can only be matched by a compatible composition of tuples.
For example, the following program is legal.

```scala
def f: (Boolean, Int) = (true, 42)
val a = mutable.ArrayBuffer(1, 2, 3)
def b = a
var x = false

(x, a(0)) = (false, 1337)
(x, a(1)) = f
((x, a(1)), b(2)) = (f, 9000)
(x) = Tuple1(false)
```

**Review comment on `(x, a(0)) = (false, 1337)`** (@lihaoyi, Contributor, Jan 19, 2024):

> What are the limits of the `a(...)` syntax?
>
> - `(x, a("key")) = (false, "value")`: non-integer literals?
> - `(x, a(i)) = (false, 1337)`, given `val i = 10`: integer non-literals?
> - `(x, a(i * 137)) = (false, 1337)`: non-trivial expressions?
> - `(x, a(1, 2)) = (false, 1337)`: multi-dimensional `update` assignment?
> - `(x, a(foo.bar.qux(1))) = (false, 1337)`: looking up an array by index and using it in the assignment?
>
> From what I understand, all these examples should just work, right?

**Reply** (@kyouko-taiga, Contributor Author, Jan 19, 2024):

> The proposal does not suggest any restriction should apply. If `a(x) = rhs` is legal Scala, then `(a(x), b) = (rhs, 2)` is also legal no matter what kind of expression is substituted for `x`.

**Review comment on `(x, a(1)) = f`** (@bishabosha, Member, Jan 19, 2024):

> Would it perhaps be simpler if it were only a syntactic "tuple literal" expression on the left and right? Then you can check for arity match in the parser, and perform rewrites in desugar before typechecking. Otherwise here you must introduce pattern match extraction on `f`, which is more complex.

**Reply** (@lihaoyi, Contributor, Jan 19, 2024):

> I think allowing tuple-typed values on the RHS is not truly necessary, but it's definitely a nice-to-have, so even if it adds a bit more implementation complexity I think we should try to do it.
>
> Having to manually unpack tuples to say `(x, a(1)) = (f._1, f._2)` just sounds awfully verbose and counter-intuitive, especially since `(f._1, f._2)` is meant to be the same as `f` in most other contexts.

**Review comment on `((x, a(1)), b(2)) = (f, 9000)`** (Contributor):

> This "nested tuple only" structure here is a bit odd IMO.
>
> We don't really have anywhere else in the Scala language where we have "nested tuples only". Scala has things like:
>
> - Patterns, which can be nested tuples or extractors
> - Expressions, which can be tuples, or function calls
> - Some places where we have flat lists only, e.g. `val a, b, c = Value`
>
> I wonder if it's possible to generalize this a bit to accept arbitrary patterns? At least then this will "look like" a pattern, so it's similar to other things in Scala? Or if not, e.g. due to syntactic conflict, we should document it in the Alternatives section.

**Review comment on `(x) = Tuple1(false)`** (Contributor):

> Should we just prohibit `Tuple1` assignments? I feel like there are sufficiently few places where people actually want a `Tuple1` that we do not need to support it. We can also add it later if it turns out to be really necessary, so it might be good to keep our options open for the initial proposal.

**Reply** (Member):

> I agree. Also, when it's "really needed", one can always write `._1` on the right-hand side instead, with the same evaluation order.

A mismatch between the structure of a multiple assignment's target and the result of its RHS is a type error.
It cannot be detected during parsing because at this stage the compiler would not be able to determine the shape of an arbitrary expression's result.
For example, all multiple assignments in the following program are ill-typed:

```scala
def f: (Boolean, Int) = (true, 42)
val a = mutable.ArrayBuffer(1, 2, 3)
def b = a
var x = false

(a(1), x) = f // type mismatch
(x, a(1), b(2)) = (f, 9000) // structural mismatch
(x) = false // structural mismatch
(x) = (1, 2) // structural mismatch
```
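
The structural part of this check can be modeled independently of types. The sketch below is an illustration only and says nothing about how a compiler would implement it; `Target`, `Shape`, and `structurallyMatches` are hypothetical names that do not come from the proposal.

```scala
// Hypothetical model of the shapes involved; none of these names come from the proposal.
enum Target:
  case Leaf                          // an assignable expression such as `x` or `a(1)`
  case Node(children: List[Target])  // a parenthesized list of targets

enum Shape:
  case Value                         // a non-tuple result
  case Tuple(elements: List[Shape])  // a tuple result

// A leaf accepts any shape (its type is checked separately); a node needs a
// tuple of the same arity whose elements match pairwise.
def structurallyMatches(target: Target, shape: Shape): Boolean =
  (target, shape) match
    case (Target.Leaf, _) => true
    case (Target.Node(ts), Shape.Tuple(ss)) =>
      ts.length == ss.length &&
        ts.zip(ss).forall((t, s) => structurallyMatches(t, s))
    case (Target.Node(_), Shape.Value) => false
```

In this model, `(a(1), x) = f` passes the structural check and is rejected by ordinary type checking of the leaves instead, whereas the other three assignments fail on shape alone.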

Likewise, `(x) = Tuple1(false)` is _not_ equivalent to `x = Tuple1(false)`.
The former is a multiple assignment while the latter is a regular assignment, as described by the [current grammar](https://docs.scala-lang.org/scala3/reference/syntax.html) (see `Expr1`).
Though this distinction is subtle, multiple assignments involving unary tuples should be rare.
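
The same distinction already exists on the expression side in current Scala, which may help build intuition. In this small sketch the names `grouped` and `unary` are chosen for illustration.

```scala
// Parentheses around a single expression only group it: this is a Boolean.
val grouped: Boolean = (false)

// A unary tuple has to be built explicitly.
val unary: Tuple1[Boolean] = Tuple1(false)
```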

The operational semantics of multiple assignments (aka concurrent assignments) have been studied extensively in scientific literature (e.g., [1, 2]).
A first intuition is that the most desirable semantics can be achieved by fully evaluating the RHS of the assignment before assigning any expression in the LHS [1].
However, additional considerations must be given w.r.t. the independence of the variables on the LHS to guarantee deterministic results.
For example, consider the following expression:

```scala
(x, x) = (1, 2)
```

While one may conclude that such an expression should be an error [1], it is in general difficult to guarantee value independence in a language with pervasive reference semantics.
Further, it is desirable to write expressions of the form `(a(0), a(2)) = (a(2), a(0))`, as shown in the previous section.
Another complication is that multiple assignments should uphold the general left-to-right evaluation semantics of the Scala language.
For example, `a.b = c` requires `a` to be evaluated _before_ `c`.
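
This ordering can be observed in current Scala by logging side effects. The sketch below is illustrative; `Box`, `evaluationOrder`, and the logging helpers are hypothetical names.

```scala
import scala.collection.mutable

// Hypothetical class with one assignable field.
final class Box(var b: Int)

// Returns the order in which the operands of `a.b = c` are evaluated.
def evaluationOrder(): List[String] =
  val log = mutable.ListBuffer.empty[String]
  val box = Box(0)
  def a: Box = { log += "a"; box }
  def c: Int = { log += "c"; 1 }
  a.b = c // the receiver `a` is evaluated before the right-hand side `c`
  log.toList
```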

Note that regular assignments desugar to function calls (e.g., `a(b) = c` is sugar for `a.update(b, c)`).
One property of these desugarings is that the RHS is always the last expression being evaluated before the method performing the assignment is called.
Given this observation, we address the abovementioned issues by defining the following algorithm:

1. Traverse the LHS structure in inorder and for each leaf:
   - Evaluate each outermost subexpression to its value
   - Form a closure capturing these values and accepting a single argument to perform the desugared assignment
   - Associate that closure to the leaf
2. Compute the value of the RHS, which forms a tree
3. Traverse the LHS and RHS structures pairwise in inorder and for each leaf:
   - Apply the closure formerly associated to the LHS on RHS value

For instance, consider the following definitions.

```scala
def f: (Boolean, Int) = (true, 42)
val a = mutable.ArrayBuffer(1, 2, 3)
def b = a
var x = false
```

The evaluation of the expression `((x, a(a(0))), b(2)) = (f, 9000)` is as follows:

1. form a closure `f0 = (rhs) => x_=(rhs)`
2. evaluate `a(0)`; result is `1`
3. form a closure `f1 = (rhs) => a.update(1, rhs)`
4. evaluate `b`; result is `a`
5. evaluate `2`
6. form a closure `f2 = (rhs) => a.update(2, rhs)`
7. evaluate `(f, 9000)`; result is `((true, 42), 9000)`
8. evaluate `f0(true)`
9. evaluate `f1(42)`
10. evaluate `f2(9000)`

After the assignment, `x == true` and `a == List(1, 42, 9000)`.
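
The ten steps can be replayed by hand in current Scala. This is a sketch that mirrors the description with explicit closures, not a statement about generated code; the function name `workedExample` and the temporaries `i1`, `b2`, `i2`, and `rhs` are invented here.

```scala
import scala.collection.mutable

// Hypothetical replay of `((x, a(a(0))), b(2)) = (f, 9000)` with explicit closures.
def workedExample(): (Boolean, mutable.ArrayBuffer[Int]) =
  def f: (Boolean, Int) = (true, 42)
  val a = mutable.ArrayBuffer(1, 2, 3)
  def b = a
  var x = false

  // Steps 1-6: evaluate LHS subexpressions left to right, one closure per leaf.
  val f0 = (rhs: Boolean) => { x = rhs }
  val i1 = a(0)                               // 1
  val f1 = (rhs: Int) => a.update(i1, rhs)
  val b2 = b                                  // the same buffer as `a`
  val i2 = 2
  val f2 = (rhs: Int) => b2.update(i2, rhs)

  // Step 7: evaluate the RHS as a whole.
  val rhs = (f, 9000)

  // Steps 8-10: apply the closures to the matching RHS leaves, in order.
  f0(rhs._1._1)
  f1(rhs._1._2)
  f2(rhs._2)
  (x, a)
```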

The compiler is allowed to ignore this procedure and generate different code for optimization purposes as long as it can guarantee that such a change is not observable.
For example, given two local variables `x` and `y`, their assignments in `(x, y) = (1, 2)` can be reordered or even performed in parallel.
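
As an illustration of such an equivalent lowering, the same example can be written with plain temporaries and no closures. This is a sketch assuming the evaluation order described above; `loweredWithoutClosures`, `i1`, `b2`, and `rhs` are hypothetical names.

```scala
import scala.collection.mutable

// Hypothetical closure-free lowering of `((x, a(a(0))), b(2)) = (f, 9000)`.
def loweredWithoutClosures(): (Boolean, mutable.ArrayBuffer[Int]) =
  def f: (Boolean, Int) = (true, 42)
  val a = mutable.ArrayBuffer(1, 2, 3)
  def b = a
  var x = false

  // LHS subexpressions first, left to right.
  val i1 = a(0)
  val b2 = b
  // Then the RHS.
  val rhs = (f, 9000)
  // Finally the assignments, in LHS order.
  x = rhs._1._1
  a(i1) = rhs._1._2
  b2(2) = rhs._2
  (x, a)
```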

**Review comment on lines +269 to +270** (Member):

> Here the committee would prefer stronger wording: that there be a guarantee that no closure is actually allocated in the implementation. The closures are only mentioned to make the behavior clear.


### Compatibility

This proposal is purely additive and has no backward binary or TASTy compatibility consequences.
The semantics of the proposed new construct is fully expressible in terms of desugaring into current syntax, interpreted with current semantics.

The proposed syntax is not currently legal Scala.
Therefore no currently existing program could be interpreted with different semantics using a newer compiler version supporting multiple assignments.

### Other concerns

One understandable concern about the proposed syntax is that a multiple assignment looks like pattern matching, yet it has different semantics.
For example:

```scala
val (a(x), b) = (true, "!") // 1

(a(x), b) = (true, "!") // 2
```

If `a` is an instance of a type with a companion extractor object, the two lines above have completely different semantics.
The first declares two local bindings `x` and `b`, applying pattern matching to determine their value from the tuple `(true, "!")`.
The second assigns the values `true` and `"!"` to `a(x)` and `b`, respectively.

Though possibly surprising, the difference in behavior is easy to explain.
The first line applies pattern matching because it starts with `val`.
The second doesn't because it involves no pattern matching introducer.
Further, note that a similar situation can already be reproduced in current Scala:

```scala
val a(x) = true // 1

a(x) = true // 2
```
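
Both readings can be made to compile side by side in current Scala. The sketch below assumes a hypothetical object `a` that offers an extractor and an `update` method; its behavior is invented purely to make the two lines observable.

```scala
// Hypothetical object offering both an extractor and an `update` method.
object a:
  private var slots = Map.empty[Int, Boolean]
  // Used by `val a(x) = ...`; returning `Some` keeps the pattern irrefutable.
  def unapply(v: Boolean): Some[Int] = Some(if v then 1 else 0)
  // Called by the assignment `a(i) = v`.
  def update(i: Int, v: Boolean): Unit =
    slots = slots.updated(i, v)
  def apply(i: Int): Boolean = slots(i)

def patternVersusAssignment(): (Int, Boolean) =
  val a(x) = true // pattern matching: binds `x` through `a.unapply(true)`
  a(x) = true     // assignment: calls `a.update(x, true)`
  (x, a(x))
```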

## Alternatives

The current proposal supports arbitrary tree structures on the LHS of the assignment.
A simpler alternative would be to only support flat sequences, allowing the syntax to dispense with parentheses.

```scala
a, b = b, a
```

While this approach is more lightweight, the reduced expressiveness inhibits potentially interesting use cases.
Further, consistently using tuple syntax on both sides of the `=` operator clearly distinguishes regular and multiple assignments.

## Related work

A Pre-SIP discussion took place prior to this proposal (see [here](https://contributors.scala-lang.org/t/pre-sip-multiple-assignments/6425/1)).

Multiple assignments are present in many contemporary languages.
This proposal already illustrated them in Swift, but they are also commonly used in Python.
Multiple assignments have also been studied extensively in scientific literature (e.g., [1, 2]).

## FAQ

## References

1. Edsger W. Dijkstra: A Discipline of Programming. Prentice-Hall 1976, ISBN 013215871X
2. Ralph-Johan Back, Joakim von Wright: Refinement Calculus - A Systematic Introduction. Graduate Texts in Computer Science, Springer 1998, ISBN 978-0-387-98417-9