Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature] Char Operations #946

Closed
5 tasks done
gluax opened this issue May 11, 2021 · 13 comments · Fixed by #957
Closed
5 tasks done

[Feature] Char Operations #946

gluax opened this issue May 11, 2021 · 13 comments · Fixed by #957
Assignees
Labels
documentation Improvements or additions to documentation feature A new feature.

Comments

@gluax
Copy link
Contributor

gluax commented May 11, 2021

🚀 Feature

This allows operations +,-,==,!=,<=,<,>=,> to be used with the char type.
Depends on #939.
Depends on #940.
Depends on #942.
Depends on #950.

Motivation

This helps the implementation goal of #929.
Users should be able to perform equality and other basic operations with the type.

Implementation

Operators

  • ==
  • !=

Tests

  • Compiler Test each operation.
    • ==
    • !=
@gluax
Copy link
Contributor Author

gluax commented May 17, 2021

A few things here:
Realizing Rust doesn't let you do +,- operations for chars. Rust says to convert them to bytes then handle those operations.

Second for implementing + and - which of the following should we allow or both?
let b = 'a' + 1field;
let some_char = 'a' + 'a';

@acoglio
Copy link
Collaborator

acoglio commented May 17, 2021

We could go either way in regard to + and - -- having them work directly on char, or requiring conversions with integers. The RFC didn't really discuss that, perhaps implicitly suggesting to go the integer conversion route. However, adding/subtracting characters seems fairly natural to "shift" characters between ranges (for uppercase/lowercase we have explicit functions, but there may be others that make sense). If we support them directly on char, they should be well-defined only if the result is a Unicode code point (e.g. 'A' - 'a' would be an error at build time). If we require conversions with integers, we may want the compiler to optimize patterns like u32_to_char(char_to_u32(x) + char_to_u32(y)) to be just field addition in the R1CS. We can defer all of these decisions to later, since we could have an MVP without + an -.

@acoglio
Copy link
Collaborator

acoglio commented May 17, 2021

The first example

let b = 'a' + 1field;

would be a type error, because char and field are separate types. (We just compile characters to fields in the R1CS.) Unless we add implicit casts of some sort to Leo at some point.

The second example

let some_char = 'a' + 'a';

would be legal if we decide to support + directly on char -- yielding https://www.codetable.net/decimal/194

@gluax
Copy link
Contributor Author

gluax commented May 17, 2021

Got it, then I suppose the question is, do we want to allow the + or - operator or take the Rust approach on this matter and not allow them? I would assume the latter since that's what we've been doing.

@acoglio
Copy link
Collaborator

acoglio commented May 17, 2021

I personally don't have a strong opinion one way or the other. Maybe a very small preference for + and - directly on char just so we don't need to recognize and transform the kinds of patterns discussed above, and the code looks cleaner.

@gluax What do you prefer? What about @bendyarm @damirka @Protryon @collinc97 @howardwu?

(We can always defer the decision -- not needed for an MVP.)

@gluax
Copy link
Contributor Author

gluax commented May 17, 2021

I would personally go for the Rust route of not allowing it.

@collinc97
Copy link
Collaborator

I agree with @gluax . Supporting + and - operations on unicode code point characters would allow leo users to access noncharacters.
While the underlying code point value of a noncharacter would be a valid field in Leo, we would not have a way to represent the character as code text.

The rust motivation for disallowing + and - is similar. "Unicode encoding is not a continuous sequence of characters, for instance U+D800...U+DFFF aren't valid Unicode characters as they cannot be represented in UTF-16."

@acoglio
Copy link
Collaborator

acoglio commented May 17, 2021

I think it's fine to go the Rust route, but I'll just point out for completeness that we could make operations that yield noncharacters illegal in the same way in which we can make additions that overflow illegal -- checked on actual inputs via interpretation.

@acoglio
Copy link
Collaborator

acoglio commented May 17, 2021

The initial design allows any value between 0 and 10FFFF in char, but we could tighten that at some point to exclude code points that are not valid characters.

@acoglio
Copy link
Collaborator

acoglio commented May 17, 2021

Since in Leo we have concrete inputs at compile time, we can do things cheaply that would be expensive in "traditional" languages.

@gluax
Copy link
Contributor Author

gluax commented May 18, 2021

@acoglio Maybe, for now, we can pull out the two operations from this issue and into a new one as I don't believe they are necessary for an MVP.

@acoglio
Copy link
Collaborator

acoglio commented May 18, 2021

@gluax Yes, that sounds like a good idea. I think that == and != are the only really required ones -- to have equality on all types.

Do you know if we have snarkVM targets for the ordering operations on field elements? (For < etc.) I think those are useful operations, but perhaps not needed for an MVP. (We can convert to integers and compare.)

@gluax
Copy link
Contributor Author

gluax commented May 18, 2021

@acoglio I just asked Collin about that; it should be in with == and !=. I am also setting it up so that chars will automatically have them when the operator is implemented for fields.

Whether we decide to implement those or not is another question and I can update #950 accordingly. I don't think it should be too much work imo.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation feature A new feature.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants