Commit ab84cfb

Merge pull request #4848 from abeln/deep-dive-1-notes
Add meeting notes for "Dotty Internals 1: Trees & Symbols" talk
2 parents 946ba3f + 177bad5 commit ab84cfb

3 files changed

+158
-1
lines changed

Lines changed: 155 additions & 0 deletions
@@ -0,0 +1,155 @@
1+
---
2+
layout: doc-page
3+
title: Dotty Internals 1: Trees & Symbols (Meeting Notes)
4+
---
5+
6+
These are meeting notes for the [Dotty Internals 1: Trees & Symbols](https://www.youtube.com/watch?v=yYd-zuDd3S8) talk by [Dmitry Petrashko](http://twitter.com/darkdimius) on Mar 21, 2017.
7+
8+
# Entry point
9+
`dotc/Compiler.scala`
10+
11+
The entry point to the compiler contains the list of phases and their order.
12+
13+
# Phases
14+
15+
Some phases are executed independently, but others (miniphases) are grouped for efficiency.
16+
See the paper "[Miniphases: Compilation using Modular and Efficient Tree Transformation](https://infoscience.epfl.ch/record/228518/files/paper.pdf)" for details.
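
As a rough illustration of why grouping helps, the sketch below compares running two node-level transformations as separate traversals with running them fused into one traversal. The `Tree`, `MiniPhase`, `runSeparately` and `runFused` names are hypothetical stand-ins invented for this example, not the real `dotc` API, and the real fusion machinery described in the paper is considerably more involved.

```scala
object MiniphaseSketch {
  // Toy tree; not the real dotc Tree hierarchy.
  sealed trait Tree
  final case class Lit(value: Int) extends Tree
  final case class Add(lhs: Tree, rhs: Tree) extends Tree

  // Hypothetical miniphase: rewrites one node whose children are already transformed.
  trait MiniPhase { def transformNode(t: Tree): Tree }

  // Counts node visits so the two strategies can be compared.
  var visits = 0

  // One bottom-up traversal that applies `f` at every node.
  def traverse(t: Tree)(f: Tree => Tree): Tree = {
    visits += 1
    t match {
      case Add(l, r) => f(Add(traverse(l)(f), traverse(r)(f)))
      case leaf      => f(leaf)
    }
  }

  // Unfused: every phase walks the whole tree on its own.
  def runSeparately(phases: List[MiniPhase], t: Tree): Tree =
    phases.foldLeft(t)((acc, p) => traverse(acc)(p.transformNode))

  // Fused: a single walk applies all phases, in order, at each node.
  def runFused(phases: List[MiniPhase], t: Tree): Tree =
    traverse(t)(node => phases.foldLeft(node)((acc, p) => p.transformNode(acc)))

  // Doubles every literal.
  val doubleLits: MiniPhase = new MiniPhase {
    def transformNode(t: Tree): Tree = t match {
      case Lit(n) => Lit(n * 2)
      case other  => other
    }
  }

  // Folds an addition of two literals into one literal.
  val foldConstants: MiniPhase = new MiniPhase {
    def transformNode(t: Tree): Tree = t match {
      case Add(Lit(a), Lit(b)) => Lit(a + b)
      case other               => other
    }
  }
}
```

For the tree `Add(Add(Lit(1), Lit(2)), Lit(3))`, both strategies produce the same result here, but the unfused run visits each of the five nodes twice while the fused run visits each once. Fused and unfused runs are not equivalent for arbitrary transformations; that they agree in this case is a property of the two toy phases chosen.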
17+
18+
# Trees
19+
`dotc/ast/Trees.scala`
20+
21+
Trees represent code written by the user (e.g. methods, classes, expressions). There are two kinds of trees: untyped and typed.
22+
23+
Unlike many other compilers (but like `scalac`), Dotty doesn't use multiple intermediate representations (IRs) during the compilation pipeline. Instead, it uses trees in all phases.
24+
25+
Dotty trees are immutable, so they can be shared.
26+
27+
## Untyped trees
28+
`dotc/ast/untpd.scala`
29+
30+
These are the trees as output by the parser.
31+
32+
Some trees only exist as untyped: e.g. `WhileDo` and `ForDo`. These are desugared by the typechecker.
33+
34+
## Typed trees
35+
`dotc/ast/tpd.scala`
36+
37+
Typed trees contain not only the user-written code, but also semantic information (the types) about the code.
38+
39+
## Notes on some tree types
40+
41+
* `RefTree`: trees that refer to something. There are multiple subtypes
42+
  - `Ident`: by-name reference
43+
  - `Select`: select (e.g. a field) from another tree (e.g. `a.foo` is represented as `Select(Ident(a), foo)`)
44+
* `This`: the this pointer
45+
* `Apply`: function application: e.g. `a.foo(1, 2)(3, 4)` becomes `Apply(Apply(Select(Ident(a), foo), List(1, 2)), List(3, 4))`
46+
* `TypeApply`: type application: given `def foo[T](a: T) = ???`, the call `foo[Int](1)` becomes `Apply(TypeApply(Ident(foo), List(Int)), List(1))`
47+
* `Literal`: constants (e.g. integer constant 1)
48+
* `Typed`: type ascription (e.g. for widening, as in `(1: Any)`)
49+
* `NamedArg`: named arguments (can appear out-of-order in untyped trees, but will appear in-order in typed ones)
50+
* `Assign`: assignment. The node has a `lhs` and a `rhs`, but the `lhs` can be arbitrarily complicated (e.g. `(new C).f = 0`).
51+
* `If`: the condition in an if-expression can be arbitrarily complex (e.g. it can contain class definitions)
52+
* `Closure`: the free variables are stored in the `env` field, but are only accessible "around" the `LambdaLift` phase.
53+
* `Match` and `CaseDef`: pattern-matching trees. The `pat` field in `CaseDef` (the pattern) is, in turn, populated with a subset of trees like `Bind` and `Unapply`.
54+
* `Return`: return from a method. If the `from` field is empty, then we return from the closest enclosing method.
55+
  The `expr` field should have a type that matches the return type of the method, but the `Return` node itself has type bottom.
56+
* `TypeTree`: tree representing a type (e.g. for `TypeApply`).
57+
* `AndType`, `OrType`, etc.: these are other trees that represent types that can be written by the user. These are a strict subset of all types, since
58+
some types *cannot* be written by the user.
59+
* `ValDef`: defines fields or local variables. To differentiate between the two cases, we can look at the denotation.
60+
  The `preRhs` field is lazy because sometimes we want to "load" a definition without knowing what's on the rhs (for example, to look up its type).
61+
* `DefDef`: method definition.
62+
* `TypeDef`: type definition. Both `type A = ???` and `class A {}` are represented with a `TypeDef`. To differentiate between the two, look at the type of the node (better), or in the case of classes there should be a `Template` node in the rhs.
63+
* `Template`: describes the "body" of a class, including inheritance information and constructor. The `constr` field will be populated only after the `Constructors` phase; before that the constructor lives in the `preBody` field.
64+
* `Thicket`: allows us to return multiple trees when a single one is expected. This kind of tree is not user-visible.
65+
For example, `transformDefDef` in `LabelDefs` takes in a `DefDef` and needs to be able to sometimes break up the method into multiple methods, which are then returned as a single tree (via a `Thicket`). If we return a thicket in a location where multiple trees are expected, the compiler will flatten them, but if only one tree is expected (for example, in the constructor field of a class), then the compiler will throw.
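
To make the nesting of `Apply`, `TypeApply` and `Select` concrete, and to show how a `Thicket` is spliced where a list of trees is expected, here is a small self-contained model. The case classes below only borrow the names used in these notes; they are simplified stand-ins (string names, integer literals, no types or positions), and `show`, `flatten` and `expectSingle` are hypothetical helpers, not functions from `Trees.scala`.

```scala
object TreeShapeSketch {
  // Simplified stand-ins for the node kinds named in the notes; not the real dotc classes.
  sealed trait Tree
  final case class Ident(name: String) extends Tree
  final case class Select(qualifier: Tree, name: String) extends Tree
  final case class Literal(value: Int) extends Tree
  final case class TypeTree(tpe: String) extends Tree
  final case class Apply(fun: Tree, args: List[Tree]) extends Tree
  final case class TypeApply(fun: Tree, targs: List[Tree]) extends Tree
  final case class Thicket(trees: List[Tree]) extends Tree

  // a.foo(1, 2)(3, 4): one Apply per argument list, nested from the inside out.
  val curriedCall: Tree =
    Apply(
      Apply(Select(Ident("a"), "foo"), List(Literal(1), Literal(2))),
      List(Literal(3), Literal(4)))

  // foo[Int](1): the type application sits underneath the value application.
  val genericCall: Tree =
    Apply(TypeApply(Ident("foo"), List(TypeTree("Int"))), List(Literal(1)))

  // Renders a tree back to source-like text.
  def show(t: Tree): String = t match {
    case Ident(n)         => n
    case Select(q, n)     => s"${show(q)}.$n"
    case Literal(v)       => v.toString
    case TypeTree(tp)     => tp
    case Apply(f, as)     => show(f) + as.map(show).mkString("(", ", ", ")")
    case TypeApply(f, ts) => show(f) + ts.map(show).mkString("[", ", ", "]")
    case Thicket(ts)      => ts.map(show).mkString("; ")
  }

  // Where a list of trees is expected, thickets are spliced in place.
  def flatten(trees: List[Tree]): List[Tree] = trees.flatMap {
    case Thicket(ts) => flatten(ts)
    case t           => List(t)
  }

  // Where exactly one tree is expected, a thicket is rejected.
  def expectSingle(t: Tree): Tree = t match {
    case Thicket(_) =>
      throw new IllegalArgumentException("thicket where a single tree is expected")
    case other => other
  }
}
```

Rendering `curriedCall` with `show` gives back `a.foo(1, 2)(3, 4)`, which is a quick way to check that the nesting order of the two `Apply` nodes matches the description above.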
66+
67+
### ThisTree
68+
69+
Tree classes have a `ThisTree` type field which is used to implement functionality that's common for *all* trees while returning
70+
a specific tree type. See `withType` in the `Tree` base class, for an example.
71+
72+
Additionally, both `Tree` and `ThisTree` are polymorphic so they can represent both untyped and typed trees.
73+
74+
For example, `withType` has signature `def withType(tpe: Type)(implicit ctx: Context): ThisTree[Type]`.
75+
This means that `withType` can return the most-specific tree type for the current tree, while at the same time guaranteeing that
76+
the returned tree will be typed.
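
The pattern can be sketched outside the compiler as follows. This is a simplified model, not the code in `Trees.scala`: the type parameter is a plain phantom marker (`Unit` for untyped, `Type` for typed), there is no `Context`, and the clone-and-cast implementation of `withType` is an assumption made for the example.

```scala
object ThisTreeSketch {
  final case class Type(name: String)

  // T is a phantom marker: Unit for untyped trees, Type for typed ones (a simplification).
  abstract class Tree[T] extends java.lang.Cloneable {
    // Each concrete node class fixes this to itself.
    type ThisTree[X] <: Tree[X]

    private var myTpe: Option[Type] = None
    def tpeOpt: Option[Type] = myTpe

    // Written once for all trees, yet returns the caller's own node class,
    // now marked as typed.
    def withType(tpe: Type): ThisTree[Type] = {
      val copy = super.clone().asInstanceOf[Tree[Type]]
      copy.myTpe = Some(tpe)
      copy.asInstanceOf[ThisTree[Type]]
    }
  }

  final class Ident[T](val name: String) extends Tree[T] {
    type ThisTree[X] = Ident[X]
  }

  final class Literal[T](val value: Int) extends Tree[T] {
    type ThisTree[X] = Literal[X]
  }
}
```

With these definitions, calling `withType` on an `Ident[Unit]` has static type `Ident[Type]`, so members specific to `Ident` (such as `name`) remain available on the result without a cast at the call site.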
77+
78+
## Creating trees
79+
80+
You should use the creation methods in `untpd.scala` and `tpd.scala` to instantiate tree objects (as opposed to
81+
creating them directly using the case classes in `Trees.scala`).
82+
83+
## Meaning of trees
84+
85+
In general, the best way to know what a tree represents is to look at its type or denotation; pattern matching
86+
on the structure of a tree is error-prone.
87+
88+
## Errors
89+
`dotc/typer/ErrorReporting.scala`
90+
91+
Sometimes there's an error during compilation, but we want to continue compiling (as opposed to failing outright), to
92+
uncover additional errors.
93+
94+
In cases where a tree is expected but there's an error, we can use the `errorTree` methods in `ErrorReporting` to create
95+
placeholder trees that explicitly mark the presence of errors.
96+
97+
Similarly, there exist `ErrorType` and `ErrorSymbol` classes.
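
The idea of continuing after an error can be illustrated with a toy checker. For brevity the sketch models the placeholder at the type level only (an `ErrorType` value) rather than building placeholder trees, and the `Reporter` class and `typeOf` function are hypothetical names for this example, not the API of `ErrorReporting.scala`.

```scala
object ErrorTreeSketch {
  sealed trait Type
  case object IntType extends Type
  case object BoolType extends Type
  // Placeholder that marks "an error was already reported here".
  case object ErrorType extends Type

  sealed trait Tree
  final case class IntLit(value: Int) extends Tree
  final case class BoolLit(value: Boolean) extends Tree
  final case class Plus(lhs: Tree, rhs: Tree) extends Tree

  // Hypothetical reporter: records the message and hands back the placeholder.
  final class Reporter {
    private val buf = scala.collection.mutable.ListBuffer.empty[String]
    def error(msg: String): Type = { buf += msg; ErrorType }
    def errors: List[String] = buf.toList
  }

  // Keeps checking after an error so that independent errors are all found.
  def typeOf(t: Tree, r: Reporter): Type = t match {
    case IntLit(_)  => IntType
    case BoolLit(_) => BoolType
    case Plus(l, rhs) =>
      (typeOf(l, r), typeOf(rhs, r)) match {
        case (IntType, IntType)              => IntType
        case (ErrorType, _) | (_, ErrorType) => ErrorType // already reported below
        case (lt, rt)                        => r.error(s"cannot add $lt and $rt")
      }
  }
}
```

Checking `Plus(Plus(IntLit(1), BoolLit(true)), Plus(BoolLit(false), IntLit(2)))` reports both inner mismatches in one run, and the outer `Plus` does not add a third, cascading error because it sees the placeholder.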
98+
99+
## Assignment
100+
101+
The closest thing in Dotty to what a programming language like C calls an "l-value" is a `RefTree` (so an `Ident` or a `Select`).
102+
However, keep in mind that arbitrarily complex expressions can appear in the lhs of an assignment: e.g.
103+
```
104+
trait T {
105+
var s = 0
106+
}
107+
{
108+
class T2 extends T
109+
while (true) 1
110+
new Bla
111+
}.s = 10
112+
```
113+
Another caveat: before typechecking, there can be some trees where the lhs isn't a `RefTree`, e.g. `(a, b) = (3, 4)`.
114+
115+
# Symbols
116+
`dotc/core/Symbols.scala`
117+
118+
Symbols are references to definitions (e.g. of variables, fields, classes). Symbols can be used to refer to definitions for which we don't have ASTs (for example, from the Java standard library).
119+
120+
`NoSymbol` is used to indicate the lack of a symbol.
121+
122+
Symbols uniquely identify definitions, but they don't say what the definitions *mean*. To understand the meaning of a symbol
123+
we need to look at its *denotation* (specifically for symbols, a `SymDenotation`).
124+
125+
Symbols can not only represent terms, but also types (hence the `isTerm`/`isType` methods in the `Symbol` class).
126+
127+
## ClassSymbol
128+
129+
`ClassSymbol` represents either a `class`, a `trait`, or an `object`. For example, an object
130+
```
131+
object O {
132+
val s = 1
133+
}
134+
```
135+
is represented (after `Typer`) as
136+
```
137+
class O$ { this: O.type =>
138+
val s = 1
139+
}
140+
val O = new O$
141+
```
142+
where we have a type symbol for `class O$` and a term symbol for `val O`. Notice the use of the selftype `O.type` to indicate that `this` has a singleton type.
143+
144+
## SymDenotation
145+
`dotc/core/SymDenotations.scala`
146+
147+
Symbols contain `SymDenotation`s. The denotation, in turn, refers to:
148+
149+
* the source symbol (so the linkage is cyclic)
150+
* the "owner" of the symbol:
151+
  - if the symbol is a variable, the owner is the enclosing method
152+
  - if it's a field, the owner is the enclosing class
153+
  - if it's a class, then the owner is the enclosing class
154+
* a set of flags that contain semantic information about the definition (e.g. whether it's a trait or mutable). Flags are defined in `Flags.scala`.
155+
* the type of the definition (through the `info` method)
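
The relationships in the list above can be modelled in a few lines. Everything here is a simplified assumption for illustration: the `Flag` values, the string-valued `info`, the `newSymbol` helper and the `ownerChain` function are invented for this sketch and do not reproduce `Symbols.scala`, `SymDenotations.scala` or `Flags.scala`.

```scala
object SymbolSketch {
  // Hypothetical flags; the real ones are defined in Flags.scala.
  sealed trait Flag
  case object Mutable extends Flag
  case object Trait extends Flag

  // A symbol identifies a definition; its meaning lives in the denotation.
  final class Symbol(val name: String) {
    var denot: SymDenotation = null // filled in by newSymbol
  }

  // The denotation points back at its symbol, so the linkage is cyclic.
  final class SymDenotation(
      val symbol: Symbol,
      val owner: Option[Symbol], // None stands in for "no owner" in this toy model
      val flags: Set[Flag],
      val info: String)          // a string stands in for the real type

  // Creates a symbol together with its denotation and ties the two together.
  def newSymbol(name: String, owner: Option[Symbol], flags: Set[Flag], info: String): Symbol = {
    val sym = new Symbol(name)
    sym.denot = new SymDenotation(sym, owner, flags, info)
    sym
  }

  // Walks outwards through the owners: variable, then method, then class.
  def ownerChain(sym: Symbol): List[String] = sym.denot.owner match {
    case Some(o) => sym.name :: ownerChain(o)
    case None    => List(sym.name)
  }
}
```

For a local variable `x` inside a method `m` of a class `C`, `ownerChain` yields `x`, `m`, `C`, which mirrors the owner rules listed above for variables and for members of a class.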

docs/docs/resources/talks.md

Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@ transformations and more.
1717

1818
Deep Dive with Dotty
1919
--------------------
20-
- (Mar 21, 2017) [Dotty Internals 1: Trees & Symbols](https://www.youtube.com/watch?v=yYd-zuDd3S8) by [Dmitry Petrashko](http://twitter.com/darkdimius).
20+
- (Mar 21, 2017) [Dotty Internals 1: Trees & Symbols](https://www.youtube.com/watch?v=yYd-zuDd3S8) by [Dmitry Petrashko](http://twitter.com/darkdimius) [\[meeting notes\]](http://dotty.epfl.ch/docs/internals/dotty-internals-1-notes.html).
2121
This is a recorded meeting between EPFL and Waterloo, where we introduce first notions inside Dotty: Trees and Symbols.
2222

2323
- (Mar 21, 2017) [Dotty Internals 2: Types](https://www.youtube.com/watch?v=3gmLIYlGbKc) by [Martin Odersky](http://twitter.com/odersky) and [Dmitry Petrashko](http://twitter.com/darkdimius).

docs/sidebar.yml

Lines changed: 2 additions & 0 deletions
@@ -139,6 +139,8 @@ sidebar:
139139
url: docs/internals/syntax.html
140140
- title: Type System
141141
url: docs/internals/type-system.html
142+
- title: "Dotty Internals 1: Trees & Symbols (Meeting Notes)"
143+
url: docs/internals/dotty-internals-1-notes.html
142144
- title: Resources
143145
subsection:
144146
- title: Talks
