Skip to content

Commit 22dd9cf

Browse files
committed
Merge pull request #397 from odersky/master
Blog post on DOT
2 parents 31ebf30 + a5a87bc commit 22dd9cf

File tree

1 file changed

+145
-0
lines changed

1 file changed

+145
-0
lines changed
Lines changed: 145 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,145 @@
1+
---
2+
layout: blog
3+
post-type: blog
4+
by: Martin Odersky
5+
title: The Essence of Scala
6+
---
7+
8+
What do you get if you boil Scala on a slow flame and wait until all
9+
incidental features evaporate and only the most concentrated essence
10+
remains? After doing this for 8 years we believe we have the answer:
11+
it's DOT, the calculus of dependent object types, that underlies Scala.
12+
13+
A [paper on DOT](http://infoscience.epfl.ch/record/215280) will be
14+
presented in April at [Wadlerfest](http://events.inf.ed.ac.uk/wf2016),
15+
an event celebrating Phil Wadler's 60th birthday. There's also a prior
16+
technical report ([From F to DOT](http://arxiv.org/abs/1510.05216))
17+
by Tiark Rompf and Nada Amin describing a slightly different version
18+
of the calculus. Each paper describes a proof of type soundness that
19+
has been machine-checked for correctness.
20+
21+
## The DOT calculus
22+
23+
A calculus is a kind of mini-language that is small enough to be
24+
studied formally. Translated to Scala notation, the language covered
25+
by DOT is described by the following abstract grammar:
26+
27+
Value v = (x: T) => t Function
28+
new { x: T => ds } Object
29+
30+
Definition d = def a = t Method definition
31+
type A = T Type
32+
33+
Term t = v Value
34+
x Variable
35+
t1(t2) Application
36+
t.a Selection
37+
{ val x = t1; t2 } Local definition
38+
39+
Type T = Any Top type
40+
Nothing Bottom type
41+
x.A Selection
42+
(x: T1) => T2 Function
43+
{ def a: T } Method declaration
44+
{ type T >: T1 <: T2 } Type declaration
45+
T1 & T2 Intersection
46+
{ x => T } Recursion
47+
48+
The grammar uses several kinds of names:
49+
50+
x for (immutable) variables
51+
a for (parameterless) methods
52+
A for types
53+
54+
The full calculus adds to this syntax formal _typing rules_ that
55+
assign types `T` to terms `t` and formal _evaluation rules_ that
56+
describe how a program is evaluated. The following _type soundness_
57+
property was shown with a mechanized, (i.e. machine-checked) proof:
58+
59+
> If a term `t` has type `T`, and the evaluation of `t` terminates, then
60+
the result of the evaluation will be a value `v` of type `T`.
61+
62+
## Difficulties
63+
64+
Formulating the precise soundness theorem and proving it was unexpectedly hard,
65+
because it uncovered some technical challenges that had not been
66+
studied in depth before. In DOT - as well as in many programming languages -
67+
you can have conflicting definitions. For instance you might have an abstract
68+
type declaration in a base class with two conflicting aliases in subclasses:
69+
70+
trait Base { type A }
71+
trait Sub1 extends Base { type A = String }
72+
trait Sub2 extends Base { type A = Int }
73+
trait Bad extends Sub1 with Sub2
74+
75+
Now, if you combine `Sub1` and `Sub2` in trait `Bad` you get a conflict,
76+
since the type `A` is supposed to be equal to both `String` and `Int`. If you do
77+
not detect the conflict and assume the equalities at face value you
78+
get `String = A = Int`, hence by transitivity `String = Int`! Once you
79+
are that far, you can of course engineer all sorts of situations where
80+
a program will typecheck but cause a wrong execution at runtime. In
81+
other words, type soundness is violated.
82+
83+
Now, the problem is that one cannot always detect these
84+
inconsistencies, at least not by a local analysis that does not need
85+
to look at the whole program. What's worse, once you have an
86+
inconsistent set of definitions you can use these definitions to
87+
"prove" their own consistency - much like a mathematical theory that
88+
assumes `true = false` can "prove" every proposition including its own
89+
correctness.
90+
91+
The crucial reason why type soundness still holds is this: If one
92+
compares `T` with an alias, one does so always relative to some _path_
93+
`x` that refers to the object containing `T`. So it's really `x.T =
94+
Int`. Now, we can show that during evaluation every such path refers
95+
to some object that was created with a `new`, and that, furthermore,
96+
every such object has consistent type definitions. The tricky bit is
97+
to carefully distinguish between the full typing rules, which allow
98+
inconsistencies, and the typing rules arising from runtime values,
99+
which do not.
100+
101+
## Why is This Important?
102+
103+
There are at least four reasons why insights obtained in the DOT
104+
project are important.
105+
106+
1. They give us a well-founded explanation of _nominal typing_.
107+
Nominal typing means that a type is distinguished from others
108+
simply by having a different name.
109+
For instance, given two trait definitions
110+
111+
trait A extends AnyRef { def f: Int }
112+
trait B extends AnyRef { def f: Int }
113+
114+
we consider `A` and `B` to be different types, even though both
115+
traits have the same parents and both define the same members.
116+
The opposite of
117+
nominal typing is structural typing, which treats types
118+
that have the same structure as being the same. Most programming
119+
languages are at least in part nominal whereas most formal type systems,
120+
including DOT, are structural. But the abstract types in DOT
121+
provide a way to express nominal types such as classes and traits.
122+
The Wadlerfest paper contains examples that show how
123+
one can express classes for standard types such as `Boolean` and `List` in DOT.
124+
125+
2. They give us a stable basis on which we can study richer languages
126+
that resemble Scala more closely. For instance, we can encode
127+
type parameters as type members of objects in DOT. This encoding
128+
can give us a better understanding of the interactions of
129+
subtyping and generics. It can explain why variance rules
130+
are the way they are and what the precise typing rules for
131+
wildcard parameters `[_ <: T]`, `[_ >: T]` should be.
132+
133+
3. DOT also provides a blueprint for Scala compilation. The new Scala
134+
compiler _dotty_ has internal data structures that closely resemble DOT.
135+
In particular, type parameters are immediately mapped to type members,
136+
in the way we propose to encode them also in the calculus.
137+
138+
4. Finally, the proof principles explored in the DOT work give us guidelines
139+
to assess and treat other possible soundness issues. We now know much
140+
better what conditions must be fulfilled to ensure type soundness.
141+
This lets us put other constructs of the Scala language to the test,
142+
either to increase our confidence that they are indeed sound, or
143+
to show that they are unsound. In my next blog I will
144+
present some of the issues we have discovered through that exercise.
145+

0 commit comments

Comments
 (0)