Skip to content

Commit 68543ff

Browse files
committed
Initial work on documentation
1 parent 127083b commit 68543ff

File tree

1 file changed

+275
-0
lines changed

1 file changed

+275
-0
lines changed
Lines changed: 275 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,275 @@
1+
---
2+
layout: doc-page
3+
title: "Safe Initialization"
4+
---
5+
6+
Dotty implements experimental safe initialization check, which can be enabled by the compiler option `-Ycheck-init`.
7+
8+
## A Quick Glance
9+
10+
To get a feel of how it works, we first show several examples below.
11+
12+
### Parent-Child Interaction
13+
14+
Given the following code snippet:
15+
16+
``` Scala
17+
abstract class AbstractFile {
18+
def name: String
19+
val extension: String = Path.extension(name)
20+
}
21+
22+
class RemoteFile(url: String) extends AbstractFile {
23+
val localFile: String = url.hashCode + ".tmp" // error: usge of `localFile` before it's initialized
24+
def name: String = localFile
25+
}
26+
```
27+
28+
The checker will report:
29+
30+
``` scala
31+
-- Warning: tests/init/neg/AbstractFile.scala:7:4 ------------------------------
32+
7 | val localFile: String = url.hashCode + ".tmp" // error
33+
| ^
34+
| Access non-initialized field value localFile. Calling trace:
35+
| -> val extension: String = name.substring(4) [ AbstractFile.scala:3 ]
36+
| -> def name: String = localFile [ AbstractFile.scala:8 ]
37+
```
38+
39+
### Inner-Outer Interaction
40+
41+
Given the code below:
42+
43+
``` scala
44+
object Trees {
45+
class ValDef { counter += 1 }
46+
class EmptyValDef extends ValDef
47+
val theEmptyValDef = new EmptyValDef
48+
private var counter = 0 // error
49+
}
50+
```
51+
52+
The checker will report:
53+
54+
``` scala
55+
-- Warning: tests/init/neg/trees.scala:5:14 ------------------------------------
56+
5 | private var counter = 0 // error
57+
| ^
58+
| Access non-initialized field variable counter. Calling trace:
59+
| -> val theEmptyValDef = new EmptyValDef [ trees.scala:4 ]
60+
| -> class EmptyValDef extends ValDef [ trees.scala:3 ]
61+
| -> class ValDef { counter += 1 } [ trees.scala:2 ]
62+
```
63+
64+
### Functions
65+
66+
Given the code below:
67+
68+
``` scala
69+
abstract class Parent {
70+
val f: () => String = () => this.message
71+
def message: String
72+
}
73+
class Child extends Parent {
74+
val a = f()
75+
val b = "hello" // error
76+
def message: String = b
77+
}
78+
```
79+
80+
The checker reports:
81+
82+
``` scala
83+
-- Warning: tests/init/neg/features-high-order.scala:7:6 -----------------------
84+
7 | val b = "hello" // error
85+
| ^
86+
|Access non-initialized field value b. Calling trace:
87+
| -> val a = f() [ features-high-order.scala:6 ]
88+
| -> val f: () => String = () => this.message [ features-high-order.scala:2 ]
89+
| -> def message: String = b [ features-high-order.scala:8 ]
90+
```
91+
92+
## Design Goals
93+
94+
We establish the following design goals:
95+
96+
- __Safe__: checking always terminates, and is sound for common and reasonable usage (over-approximation)
97+
- __Fast__: instant feedback
98+
- __Expressive__: support common and reasonable initialization patterns
99+
- __Friendly__: simple rules, minimal syntactic overhead, informative error messages
100+
- __Modular__: modular checking, no analysis beyond project boundary
101+
- __Simple__: no changes to core type system, explainable by a simple theory
102+
103+
By _reasonable usage_, we include the following use cases (but not restricted to them):
104+
105+
- Access fields on `this` and outer `this` during initialization
106+
- Call methods on `this` and outer `this` during initialization
107+
- Instantiate inner class and call methods on such instances during initialization
108+
- Capture fields in functions
109+
110+
## Principles and Rules
111+
112+
To achieve the goals, we uphold two fundamental principles:
113+
_stackability_ and _monotonicity_.
114+
115+
Stackability means that objects are initialized in stack order: if the
116+
object `b` is created during the initialization of object `a`, then
117+
`b` should become transitively initialized before or at the same time as `a`.
118+
Scala enforces this property in syntax, except for the language
119+
feature below:
120+
121+
``` scala
122+
var x: T = _
123+
```
124+
125+
Monotonicity means that the initialization status of an object should
126+
not go backward: initialized fields continue to be initialized, a
127+
field points to an initialized object may not later point to an
128+
object under initialization. It means the following code will be rejected:
129+
130+
``` scala
131+
trait Reporter { def report(msg: String): Unit }
132+
class FileReporter(ctx: Context) extends Reporter {
133+
ctx.typer.reporter = this // ctx now reaches an uninitialized object
134+
val file: File = new File("report.txt")
135+
def report(msg: String) = file.write(msg)
136+
}
137+
```
138+
139+
In the code above, suppose `ctx` points to a transitively initialized
140+
object. Now the assignment at line 3 makes `this`, which is not fully
141+
initialized, reachable from `ctx`. This makes field usage dangerous,
142+
as it may indirectly reach uninitialized fields.
143+
144+
Monotonicity is based on a well-known technique called _heap monotonic
145+
typestate_ to ensure soundness in the presence of aliasing
146+
[1]. Otherwise, either soundness will be compromised or we have to
147+
disallow the usage of already initialized fields.
148+
149+
150+
With the established principles and design goals, following rules are imposed:
151+
152+
1. In an assignment `o.x = e`, the expression `e` may only point to transitively initialized objects.
153+
154+
This is how monotonicity is enforced in the system. Note that in an
155+
initialization `val f: T = e`, the expression `e` may point to an object
156+
under initialization. This requires a distinction between mutation and
157+
initialization in syntax in order to enforce different rules. The Scala
158+
syntax makes it easy to distinguish initialization from mutation.
159+
160+
2. References to objects under initialization may not be passed as arguments to method calls or constructors.
161+
162+
Escape of `this` in the constructor is commonly regarded as an
163+
anti-pattern, and it's rarely used in practice. This rule is simple
164+
for the programmer to reason about initialization and it simplifies
165+
implementation.
166+
167+
3. Local definitions may only refer to transitively initialized objects.
168+
169+
It means that in a local definition `val x: T = e`, the expression `e` may
170+
only evaluate to transitively initialized objects. The same goes for local
171+
lazy variables and methods. This rule is again motivated for simplicity in
172+
reasoning about initialization: programmers may safely assume that all local
173+
definitions only point to transitively initialized objects.
174+
175+
## Modularity
176+
177+
For modularity, we forbid subtle initialization interaction beyond project
178+
boundaries. For example, the following code passes the check when the two
179+
classes are defined in the same project:
180+
181+
```Scala
182+
class Base {
183+
private val map: mutable.Map[Int, String] = mutable.Map.empty
184+
def enter(k: Int, v: String) = map(k) = v
185+
}
186+
class Child extends Base {
187+
enter(1, "one")
188+
enter(2, "two")
189+
}
190+
```
191+
192+
However, when the class `Base` and `Child` are defined in two different
193+
projects, the check will emit a warning for the calls to `enter` in the class
194+
`Child`. This restricts subtle initialization within project boundaries,
195+
and avoids accidental violation of contracts across library versions.
196+
197+
We impose the following rules to enforce modularity:
198+
199+
4. A class or trait that may be extended in another project should not
200+
call virtual methods on `this` in its template/mixin evaluation,
201+
directly or indirectly.
202+
203+
5. The method call `o.m(args)` is forbidden if `o` is not transitively
204+
initialized and the target of `m` is defined in an external project.
205+
206+
6. The expression `new p.C(args)` is forbidden, if `p` is not transitively
207+
initialized and `C` is defined in an external project.
208+
209+
Theoretically, we may analyze across project boundaries based on tasty. However,
210+
from our experience with Dotty community projects, most subtle initialization
211+
patterns are restricted in the same project. We think it is good to first impose
212+
more strict rules. The feedback from the community is welcome.
213+
214+
## Theory
215+
216+
The theory is based on type-and-effect systems [1]. We introduce two concepts,
217+
_effects_ and _potentials_:
218+
219+
```
220+
π = C.this | Warm[C, π] | π.f | π.m | C.super[D] | Cold | Fun(Π, Φ)
221+
ϕ = π↑ | π.f! | π.m!
222+
```
223+
224+
Potentials (π) represent values that are possibly under initialization.
225+
226+
- `C.this`: current object
227+
- `Warm[C, π]`: an object of type `C` where all its fields are assigned, and the potential for `this` of its enclosing class is `π`.
228+
- `π.f`: the potential of the field `f` in the potential `π`
229+
- `π.m`: the potential of the field `f` in the potential `π`
230+
- `C.super[D]`: essentially the current object, used for virtual method resolution
231+
- `Cold`: an object with unknown initialization status
232+
- `Fun(Π, Φ)`: a function, when called produce effects Π and return potentials Φ.
233+
234+
Effects are triggered from potentials:
235+
236+
- `π↑`: leak of the potential `π`
237+
- `π.f!`: access field `f` on the potential `π`
238+
- `π.m!`: call the method `m` on the potential `π`
239+
240+
To ensure that the checking always terminate and for better
241+
performance, we restrict the length of potentials to be finite (by
242+
default 2). If the potential is too long, the checker stops
243+
tracking it by checking that the potential is actually transitively
244+
initialized.
245+
246+
For an expression `e`, it may be summarized by the pair `(Π, Φ)`,
247+
which means evaluation of `e` may produce the effects Φ and return the
248+
potentials Π. Each field and method is associated with such a pair.
249+
250+
The checking treats the templates of concrete classes as entry points.
251+
It maintains the set of initialized fields as initialization
252+
progresses, and check that only initialized fields are accessed during
253+
the initialization and there is no leaking of values under initialization.
254+
Virtual method calls on `this` is not a problem,
255+
as they can always be resolved statically.
256+
257+
More details can be found in a forthcoming paper.
258+
259+
## Back Doors
260+
261+
Occasionally you may want to suppress warnings reported by the
262+
checker. You can either write `e: @unchecked` to tell the checker to
263+
skip checking for the expression `e`, or you may use the old trick:
264+
mark some fields as lazy.
265+
266+
## Caveats
267+
268+
The system cannot handle static fields, nor does it provide safety
269+
guarantee when extending Java or Scala 2 classes. Calling methods of
270+
Java or Scala 2 is always safe.
271+
272+
## References
273+
274+
- Fähndrich, M. and Leino, K.R.M., 2003, July. _Heap monotonic typestates_. In International Workshop on Aliasing, Confinement and Ownership in object-oriented programming (IWACO).
275+
- Lucassen, J.M. and Gifford, D.K., 1988, January. _Polymorphic effect systems_. In Proceedings of the 15th ACM SIGPLAN-SIGACT symposium on Principles of programming languages (pp. 47-57). ACM.

0 commit comments

Comments
 (0)