Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework defaulting #730

Closed
robdockins opened this issue May 18, 2020 · 13 comments
Closed

Rework defaulting #730

robdockins opened this issue May 18, 2020 · 13 comments
Labels
language Changes or extensions to the language typechecker Issues related to type-checking Cryptol code.
Milestone

Comments

@robdockins
Copy link
Contributor

The defaulting rules, especially for bitvector widths, have long been a source of irritation and confusion in Cryptol. With PR #724, we have a good opportunity/excuse to revisit this topic, especially as we are integrating the long-planned feature of generalizing indexing operations over Integral types.

In my experience, the vast majority of defaulting and resulting warnings occur because of source-file literal values (including finite enumerations). I think our goal should be to eliminate the vast majority of these warnings via smart defaulting so that the remaining cases can be upgraded to errors instead.

I suggested (CF #724 (comment)) the following strategies:

For Literal n a constraints where a is an unconstrained variable or constrained only by Zero, Ring or Integral, we should default without warning to Integer. For Literal n a where a is a variable constrained by Field or Round, we should default to Rational.

In addition, I think we should consider the following rule: All other cases should be an ambiguous type error (including Literal n [_] cases we currently default with warnings). This includes removing special case defaulting rules for indexing primitives, etc. If we adopt this rule, decimal literals at bitvector type will always have to have their widths fixed by the surrounding context (type signatures or an explicit type ascription).

@robdockins robdockins added typechecker Issues related to type-checking Cryptol code. language Changes or extensions to the language labels May 18, 2020
@robdockins
Copy link
Contributor Author

Having thought about this some more, I want to make the following concrete proposal:

  • Numeric literals are the only values subject to defaulting.
  • All numeric literals are polymorphic; however they have a preferred (ground, monomorphic) type they will adopt if their type is not otherwise specified according to their syntax.
  • Any type ambiguities that are not removed by choosing preferred types are errors (i.e., no guessing bitvector widths).

The numeric literals will have preferred types according to the following rules:

  • Regular decimal literals have preferred type Integer
  • Hex, octal and binary literals have preferred type [n] where n is determined by the number of digits written.
  • Fractional literals have preferred type Rational
  • Character literals are treated as numeric literals and have preferred type [8].
  • String literals always have fixed (not preferred) type [n][8] where n is the length of the string.

If the type of two literals needs to be unified and their preferred type does not match, the result is a that the type has no preferred type and the result will be an ambiguous type error unless the type is eventually unified with something else.

This squashes ALL the usual defaulting warnings (they either disappear or become errors) and I think it makes the whole system both more flexible and more predictable.

@weaversa
Copy link
Collaborator

weaversa commented Jun 3, 2020

The proposal overall seems good. However, I prefer the stricter types we have right now on hex, octal, and binary literals. In the years since the change was made, I’ve never written a spec where that caused an issue. It’s nice to run examples in the interpreter without typing all the leading zeroes, but the backtick works just fine there.

We did write specs that called for 6-bit and 7-bit ascii, and so relaxing the character types would be sometimes beneficial.

@robdockins
Copy link
Contributor Author

The idea for relaxing the rules for hex literals, etc, was to deal with the issue mentioned in this comment #744 (comment).

Basically cryptol will print things like 0x15:[5] that aren't valid syntax; we could instead allow them, if we wanted. I don't have a strong opinion.

Regarding character types, are you also suggesting that string literals have more flexible types, so you could directly write, e.g. "ASDF":[_][7]? I suppose we could even go the other way, and allow strings of arbitrary unicode codepoints via "ASDF":[_][24].

@yav
Copy link
Member

yav commented Jun 3, 2020

I like the general plan that Rob is suggesting but I also think that we should keep the literals monophonic, and fix the printer to be smarter about how it shows things.

The reason I think we should keep things monophonic is that defaulting only happens when a type would be otherwise ambiguous (which is itself a bit of a hard thing to specify). If a type is not ambiguous, then we just infer a polymorphic type. If literals are polymorphic, then a lot more declarations end up being polymorphic and you end up with polymorphic code on accident.

We should also keep in mind that we have two defaulting algorithms which serve different purposes:

  1. When the Cryptol program is actually ambiguous
  2. When you ask Cryptol to print a polymorphic value

(1) is quite important for the semantics of the language; (2) is a quality of life thing.

I think folks new to Cryptol often tend to lump these two together as they are both warnings so we should consider how to phrase things better. Perhaps, being more aggressive on turning warnings into errors in (1) if we can't default would help. I am all for that!

@robdockins
Copy link
Contributor Author

As I said, I don't have a strong opinion about the hex, octal and binary literals.

However, I don't think you'd end up with overly-polymorphic code very often. The whole idea of the "preferred" type is that if a type variable has a preferred type, you simply instantiate it at the preferred type instead of generalizing.

@yav
Copy link
Member

yav commented Jun 3, 2020

Hmm, wouldn't that be very confusing? If I understand correctly, just naming a literal would change its behavior... Consider this example:

f : [7]
f = 0x15   -- OK

g : [7]
g = x     -- Not OK

x = 0x15   -- inferred type for `x` would be `[8]`?

Also, at least at the moment, the Literal class does not depend on the notation you actually used to write the literal, so we wouldn't know how to default things, although it wouldn't be hard to parameterize it by an extra radix component, I guess.

Overall, sticking with the simple monomorphic story is a lot simpler.

EDIT: Fix example as it was bogus (thanks @robdockins )

@yav
Copy link
Member

yav commented Jun 3, 2020

By the way, I think that having Cmp around is also OK. So I think maybe a reasonably straight forward story would be:

  1. Start defaulting from a Literal n a constraint.
  2. If a has a Field or Round constraint, then default it to Rational
  3. otherwise, default a to Integer.

Thoughts?

@robdockins
Copy link
Contributor Author

Maybe you meant the types of f and g to be something smaller, like [7]?

In any case, I don't think this is terribly confusing, but obviously opinions can differ.

@robdockins
Copy link
Contributor Author

FWIW, I don't think @yav 's solution helps with #747 or with character literals generally, as they would get defaulted to Integer.

@yav
Copy link
Member

yav commented Jun 3, 2020

Yes that's true, but in that example there are no character literals :-) The whole characters in the type thing is a terrible hack.

@robdockins robdockins added this to the 2.9.0 milestone Jun 19, 2020
@robdockins
Copy link
Contributor Author

I've implemented essentially the suggestion from #730 (comment) in PR #774. This leaves the fixed-width literals alone (they still correspond to precise bitwidths). Decimal literals default to unlimited precision types (Integer or Rational) only; the associated warnings are suppressed by default. Other situations that used to default are now errors instead (except that we will default polymorphic values at the REPL as before).

Overall, I'm fairly pleased with the result, and I think we should seriously consider adopting it.

@atomb
Copy link
Contributor

atomb commented Jun 23, 2020

I'm also quite happy with what I've seen.

@robdockins
Copy link
Contributor Author

Fixed via #774

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
language Changes or extensions to the language typechecker Issues related to type-checking Cryptol code.
Projects
None yet
Development

No branches or pull requests

4 participants