|
| 1 | +--- |
| 2 | +layout: doc-page |
| 3 | +title: "Safe Initialization" |
| 4 | +--- |
| 5 | + |
| 6 | +Dotty implements experimental safe initialization check, which can be enabled by the compiler option `-Ycheck-init`. |
| 7 | + |
| 8 | +## A Quick Glance |
| 9 | + |
| 10 | +To get a feel of how it works, we first show several examples below. |
| 11 | + |
| 12 | +### Parent-Child Interaction |
| 13 | + |
| 14 | +Given the following code snippet: |
| 15 | + |
| 16 | +``` Scala |
| 17 | +abstract class AbstractFile { |
| 18 | + def name: String |
| 19 | + val extension: String = Path.extension(name) |
| 20 | +} |
| 21 | + |
| 22 | +class RemoteFile(url: String) extends AbstractFile { |
| 23 | + val localFile: String = url.hashCode + ".tmp" // error: usge of `localFile` before it's initialized |
| 24 | + def name: String = localFile |
| 25 | +} |
| 26 | +``` |
| 27 | + |
| 28 | +The checker will report: |
| 29 | + |
| 30 | +``` scala |
| 31 | +-- Warning: tests/init/neg/AbstractFile.scala:7:4 ------------------------------ |
| 32 | +7 | val localFile: String = url.hashCode + ".tmp" // error |
| 33 | + | ^ |
| 34 | + | Access non-initialized field value localFile. Calling trace: |
| 35 | + | -> val extension: String = name.substring(4) [ AbstractFile.scala:3 ] |
| 36 | + | -> def name: String = localFile [ AbstractFile.scala:8 ] |
| 37 | +``` |
| 38 | + |
| 39 | +### Inner-Outer Interaction |
| 40 | + |
| 41 | +Given the code below: |
| 42 | + |
| 43 | +``` scala |
| 44 | +object Trees { |
| 45 | + class ValDef { counter += 1 } |
| 46 | + class EmptyValDef extends ValDef |
| 47 | + val theEmptyValDef = new EmptyValDef |
| 48 | + private var counter = 0 // error |
| 49 | +} |
| 50 | +``` |
| 51 | + |
| 52 | +The checker will report: |
| 53 | + |
| 54 | +``` scala |
| 55 | +-- Warning: tests/init/neg/trees.scala:5:14 ------------------------------------ |
| 56 | +5 | private var counter = 0 // error |
| 57 | + | ^ |
| 58 | + | Access non-initialized field variable counter. Calling trace: |
| 59 | + | -> val theEmptyValDef = new EmptyValDef [ trees.scala:4 ] |
| 60 | + | -> class EmptyValDef extends ValDef [ trees.scala:3 ] |
| 61 | + | -> class ValDef { counter += 1 } [ trees.scala:2 ] |
| 62 | +``` |
| 63 | + |
| 64 | +### Functions |
| 65 | + |
| 66 | +Given the code below: |
| 67 | + |
| 68 | +``` scala |
| 69 | +abstract class Parent { |
| 70 | + val f: () => String = () => this.message |
| 71 | + def message: String |
| 72 | +} |
| 73 | +class Child extends Parent { |
| 74 | + val a = f() |
| 75 | + val b = "hello" // error |
| 76 | + def message: String = b |
| 77 | +} |
| 78 | +``` |
| 79 | + |
| 80 | +The checker reports: |
| 81 | + |
| 82 | +``` scala |
| 83 | +-- Warning: tests/init/neg/features-high-order.scala:7:6 ----------------------- |
| 84 | +7 | val b = "hello" // error |
| 85 | + | ^ |
| 86 | + |Access non-initialized field value b. Calling trace: |
| 87 | + | -> val a = f() [ features-high-order.scala:6 ] |
| 88 | + | -> val f: () => String = () => this.message [ features-high-order.scala:2 ] |
| 89 | + | -> def message: String = b [ features-high-order.scala:8 ] |
| 90 | +``` |
| 91 | + |
| 92 | +## Design Goals |
| 93 | + |
| 94 | +We establish the following design goals: |
| 95 | + |
| 96 | +- __Safe__: checking always terminates, and is sound for common and reasonable usage (over-approximation) |
| 97 | +- __Fast__: instant feedback |
| 98 | +- __Expressive__: support common and reasonable initialization patterns |
| 99 | +- __Friendly__: simple rules, minimal syntactic overhead, informative error messages |
| 100 | +- __Modular__: modular checking, no analysis beyond project boundary |
| 101 | +- __Simple__: no changes to core type system, explainable by a simple theory |
| 102 | + |
| 103 | +By _reasonable usage_, we include the following use cases (but not restricted to them): |
| 104 | + |
| 105 | +- Access fields on `this` and outer `this` during initialization |
| 106 | +- Call methods on `this` and outer `this` during initialization |
| 107 | +- Instantiate inner class and call methods on such instances during initialization |
| 108 | +- Capture fields in functions |
| 109 | + |
| 110 | +## Principles and Rules |
| 111 | + |
| 112 | +To achieve the goals, we uphold two fundamental principles: |
| 113 | +_stackability_ and _monotonicity_. |
| 114 | + |
| 115 | +Stackability means that objects are initialized in stack order: if the |
| 116 | +object `b` is created during the initialization of object `a`, then |
| 117 | +`b` should become transitively initialized before or at the same time as `a`. |
| 118 | +Scala enforces this property in syntax, except for the language |
| 119 | +feature below: |
| 120 | + |
| 121 | +``` scala |
| 122 | +var x: T = _ |
| 123 | +``` |
| 124 | + |
| 125 | +Monotonicity means that the initialization status of an object should |
| 126 | +not go backward: initialized fields continue to be initialized, a |
| 127 | +field points to an initialized object may not later point to an |
| 128 | +object under initialization. It means the following code will be rejected: |
| 129 | + |
| 130 | +``` scala |
| 131 | +trait Reporter { def report(msg: String): Unit } |
| 132 | +class FileReporter(ctx: Context) extends Reporter { |
| 133 | + ctx.typer.reporter = this // ctx now reaches an uninitialized object |
| 134 | + val file: File = new File("report.txt") |
| 135 | + def report(msg: String) = file.write(msg) |
| 136 | +} |
| 137 | +``` |
| 138 | + |
| 139 | +In the code above, suppose `ctx` points to a transitively initialized |
| 140 | +object. Now the assignment at line 3 makes `this`, which is not fully |
| 141 | +initialized, reachable from `ctx`. This makes field usage dangerous, |
| 142 | +as it may indirectly reach uninitialized fields. |
| 143 | + |
| 144 | +Monotonicity is based on a well-known technique called _heap monotonic |
| 145 | +typestate_ to ensure soundness in the presence of aliasing |
| 146 | +[1]. Otherwise, either soundness will be compromised or we have to |
| 147 | +disallow the usage of already initialized fields. |
| 148 | + |
| 149 | + |
| 150 | +With the established principles and design goals, following rules are imposed: |
| 151 | + |
| 152 | +1. In an assignment `o.x = e`, the expression `e` may only point to transitively initialized objects. |
| 153 | + |
| 154 | + This is how monotonicity is enforced in the system. Note that in an |
| 155 | + initialization `val f: T = e`, the expression `e` may point to an object |
| 156 | + under initialization. This requires a distinction between mutation and |
| 157 | + initialization in syntax in order to enforce different rules. The Scala |
| 158 | + syntax makes it easy to distinguish initialization from mutation. |
| 159 | + |
| 160 | +2. References to objects under initialization may not be passed as arguments to method calls or constructors. |
| 161 | + |
| 162 | + Escape of `this` in the constructor is commonly regarded as an |
| 163 | + anti-pattern, and it's rarely used in practice. This rule is simple |
| 164 | + for the programmer to reason about initialization and it simplifies |
| 165 | + implementation. |
| 166 | + |
| 167 | +3. Local definitions may only refer to transitively initialized objects. |
| 168 | + |
| 169 | + It means that in a local definition `val x: T = e`, the expression `e` may |
| 170 | + only evaluate to transitively initialized objects. The same goes for local |
| 171 | + lazy variables and methods. This rule is again motivated for simplicity in |
| 172 | + reasoning about initialization: programmers may safely assume that all local |
| 173 | + definitions only point to transitively initialized objects. |
| 174 | + |
| 175 | +## Modularity |
| 176 | + |
| 177 | +For modularity, we forbid subtle initialization interaction beyond project |
| 178 | +boundaries. For example, the following code passes the check when the two |
| 179 | +classes are defined in the same project: |
| 180 | + |
| 181 | +```Scala |
| 182 | +class Base { |
| 183 | + private val map: mutable.Map[Int, String] = mutable.Map.empty |
| 184 | + def enter(k: Int, v: String) = map(k) = v |
| 185 | +} |
| 186 | +class Child extends Base { |
| 187 | + enter(1, "one") |
| 188 | + enter(2, "two") |
| 189 | +} |
| 190 | +``` |
| 191 | + |
| 192 | +However, when the class `Base` and `Child` are defined in two different |
| 193 | +projects, the check will emit a warning for the calls to `enter` in the class |
| 194 | +`Child`. This restricts subtle initialization within project boundaries, |
| 195 | +and avoids accidental violation of contracts across library versions. |
| 196 | + |
| 197 | +We impose the following rules to enforce modularity: |
| 198 | + |
| 199 | +4. A class or trait that may be extended in another project should not |
| 200 | + call virtual methods on `this` in its template/mixin evaluation, |
| 201 | + directly or indirectly. |
| 202 | + |
| 203 | +5. The method call `o.m(args)` is forbidden if `o` is not transitively |
| 204 | + initialized and the target of `m` is defined in an external project. |
| 205 | + |
| 206 | +6. The expression `new p.C(args)` is forbidden, if `p` is not transitively |
| 207 | + initialized and `C` is defined in an external project. |
| 208 | + |
| 209 | +Theoretically, we may analyze across project boundaries based on tasty. However, |
| 210 | +from our experience with Dotty community projects, most subtle initialization |
| 211 | +patterns are restricted in the same project. We think it is good to first impose |
| 212 | +more strict rules. The feedback from the community is welcome. |
| 213 | + |
| 214 | +## Theory |
| 215 | + |
| 216 | +The theory is based on type-and-effect systems [1]. We introduce two concepts, |
| 217 | +_effects_ and _potentials_: |
| 218 | + |
| 219 | +``` |
| 220 | +π = C.this | Warm[C, π] | π.f | π.m | C.super[D] | Cold | Fun(Π, Φ) |
| 221 | +ϕ = π↑ | π.f! | π.m! |
| 222 | +``` |
| 223 | + |
| 224 | +Potentials (π) represent values that are possibly under initialization. |
| 225 | + |
| 226 | +- `C.this`: current object |
| 227 | +- `Warm[C, π]`: an object of type `C` where all its fields are assigned, and the potential for `this` of its enclosing class is `π`. |
| 228 | +- `π.f`: the potential of the field `f` in the potential `π` |
| 229 | +- `π.m`: the potential of the field `f` in the potential `π` |
| 230 | +- `C.super[D]`: essentially the current object, used for virtual method resolution |
| 231 | +- `Cold`: an object with unknown initialization status |
| 232 | +- `Fun(Π, Φ)`: a function, when called produce effects Π and return potentials Φ. |
| 233 | + |
| 234 | +Effects are triggered from potentials: |
| 235 | + |
| 236 | +- `π↑`: leak of the potential `π` |
| 237 | +- `π.f!`: access field `f` on the potential `π` |
| 238 | +- `π.m!`: call the method `m` on the potential `π` |
| 239 | + |
| 240 | +To ensure that the checking always terminate and for better |
| 241 | +performance, we restrict the length of potentials to be finite (by |
| 242 | +default 2). If the potential is too long, the checker stops |
| 243 | +tracking it by checking that the potential is actually transitively |
| 244 | +initialized. |
| 245 | + |
| 246 | +For an expression `e`, it may be summarized by the pair `(Π, Φ)`, |
| 247 | +which means evaluation of `e` may produce the effects Φ and return the |
| 248 | +potentials Π. Each field and method is associated with such a pair. |
| 249 | + |
| 250 | +The checking treats the templates of concrete classes as entry points. |
| 251 | +It maintains the set of initialized fields as initialization |
| 252 | +progresses, and check that only initialized fields are accessed during |
| 253 | +the initialization and there is no leaking of values under initialization. |
| 254 | +Virtual method calls on `this` is not a problem, |
| 255 | +as they can always be resolved statically. |
| 256 | + |
| 257 | +More details can be found in a forthcoming paper. |
| 258 | + |
| 259 | +## Back Doors |
| 260 | + |
| 261 | +Occasionally you may want to suppress warnings reported by the |
| 262 | +checker. You can either write `e: @unchecked` to tell the checker to |
| 263 | +skip checking for the expression `e`, or you may use the old trick: |
| 264 | +mark some fields as lazy. |
| 265 | + |
| 266 | +## Caveats |
| 267 | + |
| 268 | +The system cannot handle static fields, nor does it provide safety |
| 269 | +guarantee when extending Java or Scala 2 classes. Calling methods of |
| 270 | +Java or Scala 2 is always safe. |
| 271 | + |
| 272 | +## References |
| 273 | + |
| 274 | +- Fähndrich, M. and Leino, K.R.M., 2003, July. _Heap monotonic typestates_. In International Workshop on Aliasing, Confinement and Ownership in object-oriented programming (IWACO). |
| 275 | +- Lucassen, J.M. and Gifford, D.K., 1988, January. _Polymorphic effect systems_. In Proceedings of the 15th ACM SIGPLAN-SIGACT symposium on Principles of programming languages (pp. 47-57). ACM. |
0 commit comments