GADTs might need a different implementation scheme

Working with opaque types, which are based on our existing GADT machinery, I see more and more problems with our existing approach:

This one is known: 

- GADT maps are different from constraints, so no unfication is possible between variables defined in one and variables defined in the other.

But these have not been voiced before:

 - Knowledge about GADT bounds has spread far into the core algorithms of the compiler. It's not just subtyping, by now we consult the GADT bounds also

   - when computing members of a type
   - when computing base types
   - when following the underlying type of a type proxy

    It's not clear at all whether this is enough, or whether we need to consult GADT maps in more places. 

 - GADT maps pose enormous problems for caching. Every update to a GADT map invalidates all caches that contain values whose computation relied on the previous value (or relied on the fact that there was no GADT entry yet). But there is no way to trace these caches. The alternative of not caching any info that could be affected by GADT bounds is unappealing as well, because it looks very hard to figure out what might be affected later on.

So far, we have not seen much fail empirically, because GADTs were not really fully implemented so very little code relied on them. But if we want that to change, we have to take these problems seriously.

## An alternative approach

Here's a different implementation scheme which avoids the cited problems (whether it will create new ones remains to be seen).

1. When type checking a case in a match, determine all type parameters with term owners that are referenced from the widened scrutinee type. Call these the _GADT parameters_. (This step is the same as in the current scheme)

2. Determine the _support set_ of the GADT parameters. This is the smallest set _S_ of term-owned symbols such that:

 - every GADT parameter is in _S_.
 - if a symbol _x_ is owned by a term and has a type info _T_ that refers to a symbol in _S_, then _x_ is in _S_ (for both term and type symbols _x_).

3. Define a map _M_ over the support set as follows: 

 - Every GADT parameter in _S_ is mapped to a type variable whose `TypeParamRef` origin has the bounds of the original symbol mapped recursively by _M_.
 - Every other symbol _s_ in _S_ is mapped to a fresh symbol of the same kind as _s_, whose info is the original info of _s_ mapped recursively by _M_.

   Contexts have a new field `gadtMap` that contains the current map.

4. A match case gets typechecked in a new context, where a new instance of _M_ as computed in the last step is added to the `gadtMap` field and all TypeVars in _M_ s image are added to the constraint.

5. When typechecking an identifier in `typedIdent`, map the computed type (always a `TermRef` or ` TypeRef`) using the context's GADT map. 

   [Aside: The mapping would be avoided if we could _replace_ old scopes with new scopes in the context. But this means duplicating potentially a stack of context objects, because GADT parameters might be bound in outer contexts. This looks tricky.]

6. Typecheck pattern against scrutinee as usual. This might further constrain images of GADT parameters in the constraint. After that is done,

 - instantiate all GADT TypeVars in _M_ to fresh abstract type symbols, with the bounds copied over and mapped recursively.
 - instantiate all pattern bound TypeVars as usual.

7. Typecheck the guard and RHS of the case as usual. At the end, check that the type of RHS
conforms to the expected type of the match after mapping the latter with _M_. If the type of the RHS does not conform to expected type without applying _M_, insert a cast.








Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

GADTs might need a different implementation scheme #5254

An alternative approach

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

GADTs might need a different implementation scheme #5254

Description

An alternative approach

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions