Skip to content

Fix #5980 #6023

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 19, 2019
Merged

Fix #5980 #6023

merged 1 commit into from
Mar 19, 2019

Conversation

anatoliykmetyuk
Copy link
Contributor

This is a fix for #5980.

Motivation

I have not found a proof anywhere that F[A] & F[B] <:< F[A & B] for all covariant F[+_]. However, this property is featured in tests and docs, and there is a deliberate logic in the sources to distribute type parameters for covariant type constructors.

Also, it seems that the bug originates not really because of the distribution itself but because of inconsistent treatment of how subtyping relationship is inferred for traits and type params. Even if the F[A] & F[B] <:< F[A & B] is not true for all types, I think the relationship should be inferred the same way (in the sense elaborated in the Solution section below) for both type params and traits.

Solution

Consider that we need to compute F[A] & F[B] <:< F[A & B]. F can be either a type param or a trait (see the issue description above). The <:< relationship needs to be computed e.g. when assigning F[A] & F[B] to a value of type F[A & B].

The difference between how traits and type params are treated originates at the match inside compareAppliedType2 of isSubType of TypeComparer.

TypeBounds handles type params among other things and, via compareLower, ultimately leads to the fourthTry. In forthTry, ultimately the following logic is used: "A & B <: C is true if either A <: C or B <: C is true".

In case of ClassInfo, the branch will ultimately lead us to this line which computes the & operation over the two type bounds.

Since the & operation sometimes simplifies the types (as in for covariant F[+_], it may distribute type parameters, so that F[A] & F[B] becomes F[A & B]), the two behaviors (for traits and type bounds) are inconsistent. For traits, the result of the & operation will be checked to be a subtype of the target type, and for type params, no such check is performed.

To summarise, when inferring F[A] & F[B] <:< F[A & B], in case if F is a class or a trait, the result of the & operation on types will be compared to be a subtype of F[A & B]. No such check will be performed if F is a type param.

This PR adds such a check to the corresponding inference logic.

Notes on andType

The problem with &

andType is still not the same as &. However, if you plug in & instead of andType there, Scala 2 library compilation tests will not pass. Precisely, this line will cause an infinite loop when inferring a subtype via isSubType we have just modified. In a minimised form, the offending example looks like this (also added to the tests):

object foo {
  trait A[X]
  trait B extends A[B]
  trait Min[+S <: B with A[S]]

  def c: Any = ???
  c match {
    case pc: Min[_] =>
  }
}

The loop seems to happen on the case pc: Min[_] => typecheck. When traced which types get compared in turn in recur of TypeComparer (via println(s"recCount = ${recCount}; ${tp1.show} <: ${tp2.show}")), under -Yno-deep-subtypes compiler option, the trace is:

// ...
recCount = 0; foo.Min <: class dotty.tools.dotc.typer.ProtoTypes$AnyTypeConstructorProto$
recCount = 0; Nothing <: Nothing
recCount = 0; foo.B & foo.A[LazyRef(foo.Min[_]#S)] <: Any
recCount = 0; _ <: Nothing
recCount = 1; foo.B & foo.A[LazyRef(foo.Min[_]#S)] <: Nothing
recCount = 2; foo.B <: Nothing
recCount = 2; foo.A[LazyRef(foo.Min[_]#S)] <: Nothing

// The previous two comparisons failed. Since they were made to determine if foo.B & foo.A[LazyRef(foo.Min[_]#S)] and
// since we have modified the logic that determins this condition,
// maybeAndType invoked here in attempt to compute the & operation on
// foo.B and foo.A[LazyRef(foo.Min[_]#S)] types
// As part of this operation, it tries to see if either operand is a subtype of the other
// Hence the traces from here are from within the & operation
recCount = 2; foo.A[LazyRef(foo.Min[_]#S)] <: foo.B

// Start of the loop's period
recCount = 2; foo.B <: foo.A[LazyRef(foo.Min[_]#S)]
recCount = 3; foo.A[foo.B] <: foo.A[LazyRef(foo.Min[_]#S)]
recCount = 4; foo.type(foo) <: foo.type(foo)
recCount = 4; LazyRef(foo.Min[_]#S) <: foo.B
recCount = 5; foo.Min[_]#S <: foo.B
recCount = 6; foo.B & foo.A[LazyRef(foo.Min[_]#S)] <: foo.B
recCount = 7; foo.B <: foo.B
recCount = 4; foo.B <: LazyRef(foo.Min[_]#S)
recCount = 5; foo.B <: foo.Min[_]#S
recCount = 6; foo.B <: foo.B & foo.A[LazyRef(foo.Min[_]#S)]
recCount = 7; foo.B <: foo.B

// End of the loop's period, note how the next trace is the same as the one at the beginning of the loop's period start
// From here, the period will repeat itself
recCount = 7; foo.B <: foo.A[LazyRef(foo.Min[_]#S)]
recCount = 8; foo.A[foo.B] <: foo.A[LazyRef(foo.Min[_]#S)]
recCount = 9; foo.type(foo) <: foo.type(foo)
recCount = 9; LazyRef(foo.Min[_]#S) <: foo.B
recCount = 10; foo.Min[_]#S <: foo.B
recCount = 11; foo.B & foo.A[LazyRef(foo.Min[_]#S)] <: foo.B
recCount = 12; foo.B <: foo.B
recCount = 9; foo.B <: LazyRef(foo.Min[_]#S)
recCount = 10; foo.B <: foo.Min[_]#S
recCount = 11; foo.B <: foo.B & foo.A[LazyRef(foo.Min[_]#S)]
recCount = 12; foo.B <: foo.B
recCount = 12; foo.B <: foo.A[LazyRef(foo.Min[_]#S)]
recCount = 13; foo.A[foo.B] <: foo.A[LazyRef(foo.Min[_]#S)]
recCount = 14; foo.type(foo) <: foo.type(foo)
recCount = 14; LazyRef(foo.Min[_]#S) <: foo.B

Basically for some reason at some point foo.B & foo.A[LazyRef(foo.Min[_]#S)] (which is the type bounds of S) gets compared to Nothing. Given that & operation on types as part of its workflow checks its operands to be a subtype of one another, we end up with the above periodic situation if we solve the bug with &. It seems that type aliases are involved here: S is a TypeAlias. Normally, the compiler does not allow for things like type S = B with A[S] because of cyclic references. However, it seems to allow for such a cyclic reference under the special conditions presented above.

Why andType does the job

& delegates to glb computation, and andType basically checks whether it can distribute the type arguments if the two operands are type constructors, falling back to the original AndType otherwise. glb involves subtyping checks between its two arguments, and andType does not. Hence the risk of getting an infinite subtyping check loop is eliminated. This also is enough to make sure the bug in question is solved, since the behavior now is more consistent between traits and type params – more consistent distribution-wise.

Tests

With the & solution, among all the testCompilation, only the issue above was encountered. After replacing & with andType all tests, including the one testing the issue in question, pass. I've also included the tests to describe that weird typecheck looping behavior I encountered, just in case.

Copy link
Member

@dottybot dottybot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello, and thank you for opening this PR! 🎉

All contributors have signed the CLA, thank you! ❤️

Commit Messages

We want to keep history, but for that to actually be useful we have
some rules on how to format our commit messages (relevant xkcd).

Please stick to these guidelines for commit messages:

  1. Separate subject from body with a blank line
  2. When fixing an issue, start your commit message with Fix #<ISSUE-NBR>:
  3. Limit the subject line to 72 characters
  4. Capitalize the subject line
  5. Do not end the subject line with a period
  6. Use the imperative mood in the subject line ("Add" instead of "Added")
  7. Wrap the body at 80 characters
  8. Use the body to explain what and why vs. how

adapted from https://chris.beams.io/posts/git-commit

Have an awesome day! ☀️

@anatoliykmetyuk
Copy link
Contributor Author

anatoliykmetyuk commented Mar 5, 2019

More thoughts on why andType is usable instead of & and gives consistency with the behavior in case of traits.

& is implemented in terms of glb, and glb ultimately uses andType. glb distinguishes the following input cases:

  • Cases where glb is either of its operands – in the scenario of checking A & B <: C, if A & B is simplified to either A or B, it has already been checked at the point where we want to simplify A & B. Hence we don't need to do it again.
  • Either of the operands is an OrTypeOrTypes seem to also be handled in the fourthTry in the AndType(tp11, tp12) branch, right before the modified snippet.
  • Either of the operands is a subtype of another, in which case the result of glb seems to be that subtype. So again, A & B is simplified to either A or B which were already handled.
  • Both operands are ConstantType, in which case return Nothing. So A & B gets simplified to Nothing in this case, which is a subtype of everything. Type bounds don't seem to inherit from ConstantType, so in case of type parameters, the semantics stays the same. However, I have doubts on whether the semantics stays the same for all possible cases when neither A <: C nor B <: C are true – it depends on whether ConstantTypes can get to the fourthTry.
  • In all other cases, use andType.

If used on types which cannot be simplified, andType seems to return these two types under the AndType – essentially, AndType(A, B) stays the same if we can't simplify A and B via andType.

So basically in all cases, in case of type parameters, the semantics of andType should be the same as & when used in type comparison scenario.

However, I'm not completely sure whether it makes for the consistent subtyping behaviour between type params and traits since there's a lot going on there. Would appreciate some feedback on the above logic regarding whether andType can be used in place of & there without loss of consistency between type params and traits.

@odersky
Copy link
Contributor

odersky commented Mar 6, 2019

test performance please

@dottybot
Copy link
Member

dottybot commented Mar 6, 2019

performance test scheduled: 1 job(s) in queue, 1 running.

@dottybot
Copy link
Member

dottybot commented Mar 6, 2019

Performance test finished successfully:

Visit http://dotty-bench.epfl.ch/6023/ to see the changes.

Benchmarks is based on merging with master (5f5b5b0)

@@ -737,7 +737,12 @@ class TypeComparer(initctx: Context) extends ConstraintHandling[AbsentContext] {
return recur(AndType(tp11, tp121), tp2) && recur(AndType(tp11, tp122), tp2)
case _ =>
}
either(recur(tp11, tp2), recur(tp12, tp2))

def maybeAndType = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

either by itself is expensive, so it would be good to avoid the extra call.
Furthermore, andType(tp11, tp12) is known to be a subtype of AndType(tp1, tp2) and we can exploit that.
Finally, != should not be used for type comparisons, since it is meaningless and potentially expensive. Use
=:= for semantic equality or eq for syntactic equality.
So how about:

val tp1norm = andType(tp11, tp12)
if (tp1norm ne tp1) recur(tp1norm, tp2)
else either(recur(tp11, tp2), recur(tp12, tp2))

Copy link
Contributor Author

@anatoliykmetyuk anatoliykmetyuk Mar 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, though this solution kept failing run/mixins1/A_1.scala – minimised example:

trait A[This <: A[This] with B]
trait B extends A[B]

At some point, A[LazyRef(This)] & A[B] ne A[LazyRef(This)] & A[B] returned true. This is due to LazyRef, which is uncached. I modified the solution to avoid needless construction of the AndType if we already have the equivalent object.

Probably related to #6057.

@odersky
Copy link
Contributor

odersky commented Mar 10, 2019

Looks good now, thanks!

@smarter

This comment has been minimized.

@anatoliykmetyuk anatoliykmetyuk force-pushed the master branch 2 times, most recently from a4d6e7c to 8f0f13c Compare March 10, 2019 19:02
@anatoliykmetyuk

This comment has been minimized.

@smarter

This comment has been minimized.

@anatoliykmetyuk anatoliykmetyuk force-pushed the master branch 2 times, most recently from 49ae979 to 938c259 Compare March 10, 2019 19:16
@anatoliykmetyuk

This comment has been minimized.

@smarter

This comment has been minimized.

@@ -1641,6 +1643,18 @@ class TypeComparer(initctx: Context) extends ConstraintHandling[AbsentContext] {
NoType
}

private[this] def andTypeGen(tp1: Type, tp2: Type, op: (Type, Type) => Type
, original: (Type, Type) => Type, isErased: Boolean = ctx.erasedTypes): Type = trace(s"glb(${tp1.show}, ${tp2.show})", subtyping, show = true) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems original is always _ & _. Why not inline it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't inline it for consistency reasons: if there is a degree of freedom in the op parameter to liftIfHK, it would IMO be weird if there wouldn't be a degree of freedom in the original as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the second thought, I agree, _ & _ repeated twice doesn't look nice. I have found a compromise in terms of making the original in andTypeGen a default parameter.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why this function signature should be consistent with liftIfHK. And I don't think having a default parameter is the right way to remove the duplication.

Bonus point: if you inline original, you will avoid allocating a lambda most of the time.

Copy link
Contributor Author

@anatoliykmetyuk anatoliykmetyuk Mar 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why this function signature should be consistent with liftIfHK.

The problem of the control over the execution flow of the function is solved by decoupling the logic from the data. Doing so partially is inconsistent (even if sufficient for the issue in question).

Bonus point: if you inline original, you will avoid allocating a lambda most of the time.

How?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When calling andTypeGen, a lambda is allocated for original. But unless liftIfHK is called, there is no need to allocate this lambda.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it have to do with the fact that andTypeGen passes the reference along but liftIfHK calls it in its body?

…tors

When performing subtyping comparisons of the `F[A] & F[B] <:< C` kind,
if `F` is a type parameter (e.g. as in `trait Foo[F[+_]]`), the compiler
decides the relationship to be true if either `F[A] <: C` or `F[B] <: C`.
However, in case of covariant `F[+_]`, it is possible to distribute the
type arguments when performing the `&` operation: `F[A] & F[B] ~> F[A & B]`.
Hence there is one more case to be considered when checking `<:<`
relationship. This commit adds such a check to the `<:<` logic. Now,
`F[A] & F[B] <:< C` is true (for covariant `F`) if either of the following
is true:
- F[A] <: C
- F[B] <: C
- F[A & B] <: C
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants