| 1 | +# Dotc's Overall Structure |
| 2 | + |
| 3 | +The compiler code is found in package [dotty.tools](https://github.com/lampepfl/dotty/tree/master/src/dotty/tools). It spans the |
| 4 | +following three sub-packages: |
| 5 | + |
| 6 | + backend Compiler backends (currently for JVM and JS) |
| 7 | + dotc The main compiler |
| 8 | + io Helper modules for file access and classpath handling. |
| 9 | + |
| 10 | +The [dotc](https://github.com/lampepfl/dotty/tree/master/src/dotty/tools/dotc) |
| 11 | +package contains some main classes that can be run as separate |
| 12 | +programs. The most important one is class |
| 13 | +[Main](https://github.com/lampepfl/dotty/blob/master/src/dotty/tools/dotc/Main.scala). |
| 14 | +`Main` inherits from |
| 15 | +[Driver](https://github.com/lampepfl/dotty/blob/master/src/dotty/tools/dotc/Driver.scala) which |
| 16 | +contains the highest level functions for starting a compiler and processing some sources. |
| 17 | +`Driver` in turn is based on two other high-level classes, |
| 18 | +[Compiler](https://github.com/lampepfl/dotty/blob/master/src/dotty/tools/dotc/Compiler.scala) and |
| 19 | +[Run](https://github.com/lampepfl/dotty/blob/master/src/dotty/tools/dotc/Run.scala). |
| 20 | + |
| 21 | +## Package Structure |
| 22 | + |
| 23 | +Most functionality of `dotc` is implemented in subpackages of `dotc`. Here's a list of sub-packages |
| 24 | +and their focus. |
| 25 | + |
| 26 | + ast Abstract syntax trees. |
| 27 | + config Compiler configuration, settings, platform-specific definitions. |
| 28 | + core Core data structures and operations, with specific subpackages for: |
| 29 | + |
| 30 | + core.classfile Reading of Java classfiles into core data structures |
| 31 | + core.tasty Reading and writing of TASTY files to/from core data structures |
| 32 | + core.unpickleScala2 Reading of Scala2 symbol information into core data structures |
| 33 | + |
| 34 | + parsing Scanner and parser |
| 35 | + printing Pretty-printing trees, types and other data |
| 36 | + repl The interactive REPL |
| 37 | + reporting Reporting of error messages, warnings and other info. |
| 38 | + rewrite Helpers for rewriting Scala 2's constructs into dotty's. |
| 39 | + transform Miniphases and helpers for tree transformations. |
| 40 | + typer Type-checking and other frontend phases |
| 41 | + util General purpose utility classes and modules. |
| 42 | + |
| 43 | +## Contexts |
| 44 | + |
| 45 | +`dotc` has almost no global state (the only significant bit of global state is the name table, |
| 46 | +which is used to hash strings into unique names). Instead, all essential bits of information that |
| 47 | +can vary over a compiler run are collected in a |
| 48 | +[Context](https://github.com/lampepfl/dotty/blob/master/src/dotty/tools/dotc/core/Contexts.scala). |
| 49 | +Most methods in `dotc` take a `Context` value as an implicit parameter. |
| 50 | + |
| 51 | +Contexts give a convenient way to customize values in some part of the |
| 52 | +call-graph. For example, to run some compiler function `f` at a given |
| 53 | +phase `phase`, we invoke `f` with an explicit context argument, like |
| 54 | +this: |
| 55 | + |
| 56 | + f(/*normal args*/)(ctx.withPhase(phase)) |
| 57 | + |
| 58 | +This assumes that `f` is defined in the way most compiler functions are: |
| 59 | + |
| 60 | + def f(/*normal parameters*/)(implicit ctx: Context) ... |
| 61 | + |
| 62 | +Compiler code follows the convention that all implicit `Context` |
| 63 | +parameters are named `ctx`. This is important to avoid implicit |
| 64 | +ambiguities in the case where nested methods each contain a `Context` |
| 65 | +parameter. The common name then ensures that the implicit parameters |
| 66 | +properly shadow each other. |
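
The calling pattern and the naming convention can be tried out on a toy model. The sketch below is an illustration only: `ContextSketch`, its `Phase` and `Context` case classes, and the `describe`, `outer` and `inner` methods are hypothetical stand-ins, not the real definitions in `Contexts.scala`. It assumes nothing about the real `Context` beyond the `withPhase` call shown above.

```scala
// Toy model only: these are simplified stand-ins, not dotc's real classes.
object ContextSketch {
  final case class Phase(name: String)

  final case class Context(phase: Phase) {
    // Returns a copy of this context that differs only in its phase.
    def withPhase(p: Phase): Context = copy(phase = p)
  }

  // Defined the way most compiler functions are: normal parameters first,
  // then an implicit Context named `ctx`.
  def describe(what: String)(implicit ctx: Context): String =
    s"$what at ${ctx.phase.name}"

  def outer()(implicit ctx: Context): List[String] = {
    // The inner `ctx` shadows the outer one because both have the same name,
    // so the implicit argument of `describe` inside `inner` is unambiguous.
    def inner()(implicit ctx: Context): String = describe("inner")

    List(
      describe("outer"),                                      // implicit ctx is passed
      describe("explicit")(ctx.withPhase(Phase("erasure"))),  // explicit override
      inner()(ctx.withPhase(Phase("mixin")))                  // inner sees only its own ctx
    )
  }
}
```

Calling `ContextSketch.outer()` with a context whose phase is named `typer` yields `List("outer at typer", "explicit at erasure", "inner at mixin")`. If the parameter of `inner` had a different name, both contexts would be eligible inside `inner`, which is the ambiguity the naming convention avoids.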
| 67 | + |
| 68 | +Sometimes we want to make sure that implicit contexts are not captured |
| 69 | +in closures or other long-lived objects, be it because we want to |
| 70 | +enforce that nested methods each get their own implicit context, or |
| 71 | +because we want to avoid a space leak in the case where a closure can |
| 72 | +survive several compiler runs. A typical case is a completer for a |
| 73 | +symbol representing an external class, which produces the attributes |
| 74 | +of the symbol on demand, and which might never be invoked. In that |
| 75 | +case we follow the convention that any context parameter is explicit, |
| 76 | +not implicit, so we can track where it is used, and that it has a name |
| 77 | +different from `ctx`. A common choice is `ictx`, short for "initialization |
| 78 | +context". |
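
One way to read this second convention is sketched below, under loud assumptions: `CompleterSketch`, its `Context` with a `runId` field, `Completer` and `mkCompleter` are all hypothetical names invented for this illustration and do not mirror dotc's actual completer classes. The point is only that the creation-time context is passed explicitly under the name `ictx`, so every use of it is visible, and that the long-lived object keeps just the data it needs instead of the whole context.

```scala
// Hypothetical illustration; not dotc's real completer API.
object CompleterSketch {
  final case class Context(runId: Int)

  // A long-lived object that may be invoked much later, or never.
  // It stores only the run id, not the context it was created with.
  final class Completer(val createdInRun: Int) {
    // The context that is current at completion time is passed in then.
    def complete(name: String)(implicit ctx: Context): String =
      s"$name (created in run $createdInRun, completed in run ${ctx.runId})"
  }

  // The context parameter is explicit and named `ictx`, not `ctx`,
  // so it cannot be picked up implicitly by accident.
  def mkCompleter(ictx: Context): Completer = new Completer(ictx.runId)
}
```

Because `ictx` is an ordinary parameter, a search for its name shows every place the initialization context is used, and nothing created by `mkCompleter` keeps the old context alive across runs.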
| 79 | + |
| 80 | +With these two conventions in place, it has turned out that implicit |
| 81 | +contexts work amazingly well as a device for dependency injection and |
| 82 | +bulk parameterization. There is of course always the danger that |
| 83 | +an unexpected implicit will be passed, but in practice this has not turned out to |
| 84 | +be much of a problem. |
| 85 | + |
| 86 | +## Compiler Phases |
| 87 | + |
| 88 | +Seen from a temporal perspective, the `dotc` compiler consists of a list of phases. |
| 89 | +The current list of phases is specified in class [Compiler](https://github.com/lampepfl/dotty/blob/master/src/dotty/tools/dotc/Compiler.scala) as follows: |
| 90 | + |
| 91 | +```scala |
| 92 | + def phases: List[List[Phase]] = List( |
| 93 | + List(new FrontEnd), // Compiler frontend: scanner, parser, namer, typer |
| 94 | + List(new PostTyper), // Additional checks and cleanups after type checking |
| 95 | + List(new Pickler), // Generate TASTY info |
| 96 | + List(new FirstTransform, // Some transformations to put trees into a canonical form |
| 97 | + new CheckReentrant), // Internal use only: Check that compiled program has no data races involving global vars |
| 98 | + List(new RefChecks, // Various checks mostly related to abstract members and overriding |
| 99 | + new CheckStatic, // Check restrictions that apply to @static members |
| 100 | + new ElimRepeated, // Rewrite vararg parameters and arguments |
| 101 | + new NormalizeFlags, // Rewrite some definition flags |
| 102 | + new ExtensionMethods, // Expand methods of value classes with extension methods |
| 103 | + new ExpandSAMs, // Expand single abstract method closures to anonymous classes |
| 104 | + new TailRec, // Rewrite tail recursion to loops |
| 105 | + new LiftTry, // Put try expressions that might execute on non-empty stacks into their own methods |
| 106 | + new ClassOf), // Expand `Predef.classOf` calls. |
| 107 | + List(new PatternMatcher, // Compile pattern matches |
| 108 | + new ExplicitOuter, // Add accessors to outer classes from nested ones. |
| 109 | + new ExplicitSelf, // Make references to non-trivial self types explicit as casts |
| 110 | + new CrossCastAnd, // Normalize selections involving intersection types. |
| 111 | + new Splitter), // Expand selections involving union types into conditionals |
| 112 | + List(new VCInlineMethods, // Inlines calls to value class methods |
| 113 | + new SeqLiterals, // Express vararg arguments as arrays |
| 114 | + new InterceptedMethods, // Special handling of `==`, `!=`, `getClass` methods |
| 115 | + new Getters, // Replace non-private vals and vars with getter defs (fields are added later) |
| 116 | + new ElimByName, // Expand by-name parameters and arguments |
| 117 | + new AugmentScala2Traits, // Expand traits defined in Scala 2.11 to simulate old-style rewritings |
| 118 | + new ResolveSuper), // Implement super accessors and add forwarders to trait methods |
| 119 | + List(new Erasure), // Rewrite types to JVM model, erasing all type parameters, abstract types and refinements. |
| 120 | + List(new ElimErasedValueType, // Expand erased value types to their underlying implementation types |
| 121 | + new VCElideAllocations, // Peep-hole optimization to eliminate unnecessary value class allocations |
| 122 | + new Mixin, // Expand trait fields and trait initializers |
| 123 | + new LazyVals, // Expand lazy vals |
| 124 | + new Memoize, // Add private fields to getters and setters |
| 125 | + new LinkScala2ImplClasses, // Forward calls to the implementation classes of traits defined by Scala 2.11 |
| 126 | + new NonLocalReturns, // Expand non-local returns |
| 127 | + new CapturedVars, // Represent vars captured by closures as heap objects |
| 128 | + new Constructors, // Collect initialization code in primary constructors |
| 129 | + // Note: constructors changes decls in transformTemplate, no InfoTransformers should be added after it |
| 130 | + new FunctionalInterfaces,// Rewrites closures to implement @specialized types of Functions. |
| 131 | + new GetClass), // Rewrites getClass calls on primitive types. |
| 132 | + List(new LambdaLift, // Lifts out nested functions to class scope, storing free variables in environments |
| 133 | + // Note: in this mini-phase block scopes are incorrect. No phases that rely on scopes should be here |
| 134 | + new ElimStaticThis, // Replace `this` references to static objects by global identifiers |
| 135 | + new Flatten, // Lift all inner classes to package scope |
| 136 | + new RestoreScopes), // Repair scopes rendered invalid by moving definitions in prior phases of the group |
| 137 | + List(new ExpandPrivate, // Widen private definitions accessed from nested classes |
| 138 | + new CollectEntryPoints, // Find classes with main methods |
| 139 | + new LabelDefs), // Converts calls to labels to jumps |
| 140 | + List(new GenSJSIR), // Generate .js code |
| 141 | + List(new GenBCode) // Generate JVM bytecode |
| 142 | + ) |
| 143 | +``` |
| 144 | + |
| 145 | +Note that phases are grouped, so the `phases` method is of type |
| 146 | +`List[List[Phase]]`. The idea is that all phases in a group are |
| 147 | +*fused* into a single tree traversal. That way, phases can be kept |
| 148 | +small (most phases perform a single function) without requiring an |
| 149 | +excessive number of tree traversals (which are costly, because they |
| 150 | +generally have poor cache locality). |
| 151 | + |
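
The effect of fusing a group can be illustrated with a small model. This is a sketch under stated assumptions, not the compiler's actual mini-phase machinery: the tree type, the `MiniPhase` trait, the two example phases, the `Counter` and both `run` methods are hypothetical names made up for the example. It assumes that a fused group applies every phase of the group to a node, in order, once the node's children have been transformed.

```scala
// Toy model of phase fusion; all names here are illustrative assumptions.
object FusionSketch {
  sealed trait Tree
  final case class Num(value: Int) extends Tree
  final case class Neg(arg: Tree) extends Tree
  final case class Add(lhs: Tree, rhs: Tree) extends Tree

  // A mini-phase rewrites a single node whose children are already transformed.
  trait MiniPhase { def transformNode(t: Tree): Tree }

  object FoldNeg extends MiniPhase {
    def transformNode(t: Tree): Tree = t match {
      case Neg(Num(n)) => Num(-n)
      case other       => other
    }
  }

  object FoldAdd extends MiniPhase {
    def transformNode(t: Tree): Tree = t match {
      case Add(Num(a), Num(b)) => Num(a + b)
      case other               => other
    }
  }

  // Counts node visits so the two strategies can be compared.
  final class Counter { var visits: Int = 0 }

  // Bottom-up traversal: rebuild the children first, then apply `f` to the node.
  def traverse(t: Tree, c: Counter)(f: Tree => Tree): Tree = {
    c.visits += 1
    val rebuilt = t match {
      case Neg(arg)   => Neg(traverse(arg, c)(f))
      case Add(l, r)  => Add(traverse(l, c)(f), traverse(r, c)(f))
      case leaf       => leaf
    }
    f(rebuilt)
  }

  // Fused: one traversal, every phase of the group is applied at each node.
  def runFused(group: List[MiniPhase], t: Tree, c: Counter): Tree =
    traverse(t, c)(node => group.foldLeft(node)((acc, p) => p.transformNode(acc)))

  // Unfused: one full traversal per phase.
  def runSeparately(group: List[MiniPhase], t: Tree, c: Counter): Tree =
    group.foldLeft(t)((acc, p) => traverse(acc, c)(p.transformNode))
}
```

For the tree `Add(Neg(Num(1)), Num(3))` and the group `List(FoldNeg, FoldAdd)`, both strategies produce `Num(2)`, but the fused run visits 4 nodes while the separate runs visit 7 in total. In this toy model the two strategies happen to agree; in general, fusing changes the order in which phases see nodes, so phases in one group have to be written with that in mind.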
| 152 | +Phases fall into four categories: |
| 153 | + |
| 154 | + - Frontend phases: `FrontEnd`, `PostTyper` and `Pickler`. `FrontEnd` parses the source programs and generates |
| 155 | + untyped abstract syntax trees, which are then typechecked and transformed into typed abstract syntax trees. |
| 156 | + `PostTyper` performs checks and cleanups that require a fully typed program. In particular, it |
| 157 | + |
| 158 | + - creates super accessors representing `super` calls in traits |
| 159 | + - creates implementations of synthetic (compiler-implemented) methods |
| 160 | + - avoids storing parameters passed unchanged from subclass to superclass in duplicate fields. |
| 161 | + |
| 162 | + Finally `Pickler` serializes the typed syntax trees produced by the frontend as TASTY data structures. |
| 163 | + |
| 164 | + - High-level transformations: All phases from `FirstTransform` to `Erasure`. Most of these phases transform |
| 165 | + syntax trees, expanding high-level constructs to more primitive ones. The last phase in the group, `Erasure`, |
| 166 | + translates all types into types supported directly by the JVM. To do this, it performs another type checking |
| 167 | + pass, but using the rules of the JVM's type system instead of Scala's. |
| 168 | + |
| 169 | + - Low-level transformations: All phases from `ElimErasedValueType` to `LabelDefs`. These |
| 170 | + further transform trees until they are essentially a structured version of Java bytecode. |
| 171 | + |
| 172 | + - Code generators: These map the transformed trees to Java classfiles or JavaScript files. |
| 173 | + |
| 174 | + |