Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Features/string library #250

Merged
merged 7 commits into from
Nov 23, 2015
Merged

Features/string library #250

merged 7 commits into from
Nov 23, 2015

Conversation

EliasC
Copy link
Contributor

@EliasC EliasC commented Oct 13, 2015

This is @TobiasWrigstad and my work on adding a string library. It replaces the old type string with a passive class type String which has similar functionality to the immutable strings of Java. Writing "Hello" is now syntactic sugar for new String("Hello"), where the string argument is the embedded C type char*. The class itself can be found in bundles/standard/String.enc, which is the suggested location for the standard library.

Other notable additions in this PR is the char type (written as 'a') and the ability to give warnings during typechecking. See the commit history for more details.

I've tried to prune the history somewhat, but it could probably be squashed into even fewer commits. However, I think the full log makes it easier to see what has been done when doing the review (please let me know if you disagree). When this is deemed ready for merging, I will squash the history into two or three commits.

@EliasC
Copy link
Contributor Author

EliasC commented Oct 13, 2015

I should add that the only warning right now is that the old string type is deprecated.

@EliasC
Copy link
Contributor Author

EliasC commented Oct 16, 2015

Rebased and ready for review!

@kikofernandez
Copy link
Contributor

it complains that there would be conflicts and cannot be merged

@EliasC
Copy link
Contributor Author

EliasC commented Oct 16, 2015

@kikofernandez: Yes, because somebody decided to rebase plenary again today at half past four... ;)

@kikofernandez
Copy link
Contributor

the problem for the failed test is that the par.enc test has not been updated (using old string). It seems like an array of string is now an array of String and I get the error:

*** Error during typechecking ***
"par.enc" (line 39, column 13)
Type 'String' does not match expected type 'string'
In expression:
  new CreditCard("KIKO FERNANDEZ REYES",
                 "INVALID_CARD",
                 ["Online", "Clothes", "Food", "Party"])

line 39 is the new CreditCard("KIKO FERNANDEZ REYES",... but I don't know where... maybe you could throw an error where I can see which argument it's referring to?

I can push the refactoring to par.enc for this PR if you want to

@EliasC
Copy link
Contributor Author

EliasC commented Oct 20, 2015

@kikofernandez Had the change locally. Pushed now!

@EliasC
Copy link
Contributor Author

EliasC commented Oct 20, 2015

maybe you could throw an error where I can see which argument it's referring to?

I totally agree, but it's a problem of error reporting in general. Could you add it to the wailing wall in the encore-lang slack channel?

@kikofernandez
Copy link
Contributor

since we are using String types, we could use the equals method in par.enc

@kikofernandez
Copy link
Contributor

is there any way where:

if ((purchases[i]).equals("Online")) then

which typechecks could be written without the surrounding parenthesis such as:

if (purchases[i].equals("Online")) then

@@ -649,11 +652,14 @@ expr = unit
return $ Peer (meta pos) ty
print = do pos <- getPosition
reserved "print"
val <- expression
return $ Print (meta pos) "{}\n" [val]
arg <- option [] ((:[]) <$> expression)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this change arg <- option [] ((:[]) <$> expression) allows us to write:

let x = something in
  print;

is there any reason for allowing this behavior?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Writing only print will print a newline. This is how Python does it (although that might not me motivation to have it like that).

@kikofernandez
Copy link
Contributor

  • write documentation
  • is it expected that any time I want to create a new String("test") I get the following error message?
*** Error during typechecking ***                                      
 Type 'String' does not match expected type 'char*'

@kikofernandez
Copy link
Contributor

after this is clarified, I think this could be merged (previous squashing of the commits mentioned in the initial commit)

@EliasC
Copy link
Contributor Author

EliasC commented Oct 20, 2015

  • write documentation

Do you mean the scribble stuff, or some kind of per-method-documentation?

  • is it expected that any time I want to create a new String("test") I get the following error message?

Ah, no... This is a bug in the desugaring. What is going on is that "foo" gets desugared into new String("foo"), where the inner "foo" is a c-string. Thus, when you write out new String("test"), the inner string is desugared into a new String, which thus has the wrong type... Will look into it.

@EliasC
Copy link
Contributor Author

EliasC commented Oct 20, 2015

is there any way where:

   if ((purchases[i]).equals("Online")) then

which typechecks could be written without the surrounding parenthesis such as:

   if (purchases[i].equals("Online")) then

This is a general parsing issue, not related to strings. Use the wailing wall (or create an issue).

@kikofernandez
Copy link
Contributor

scribble stuff, fixing the bug and I think it's ready to be merged

@EliasC
Copy link
Contributor Author

EliasC commented Oct 20, 2015

Fix to par.enc bug pushed. I'll let you know when the other issues are sorted out.

@kaeluka
Copy link
Contributor

kaeluka commented Oct 21, 2015

In this case, it makes sense to mention string in the language documentation. In general, however, I think we should not aim to add all stdlib to the language documentation. The stdlib has its own documentation (which the language doc may of course link to)

@EliasC
Copy link
Contributor Author

EliasC commented Oct 21, 2015

@kaeluka: 👍

During the typechecking phase you can now emit warnings which will be
printed at compile time. Warnings are not arbitrary strings (like
errors) but must be explicitly defined as constructors of the type
`Warning`, together with the corresponding case for showing the warning.
Right now there is only one warning `StringDeprecatedWarning`, which is
given when the type `string` is used. It is produced in `Util.hs` in the
following code:

```
| isStringType ty = do
    tcWarning StringDeprecatedWarning
    return ty
```

I believe we should move to this system for errors as well as it
centralizes all possible errors that can be thrown. For example, we
would then write `assertSubtypeOf` (in `Typechecker.hs`) as

```
assertSubtypeOf :: Type -> Type -> TypecheckM ()
assertSubtypeOf sub super =
    unlessM (sub `subtypeOf` super) $
            tcError $ TypeMismatch sub super
```
It turns out that one of the implementations of split had a bug that
would produce incorrect results. It would only be used when patterns of
certain lengths were used, and the test case that triggered that bug was
commented out(!). I fixed this by removing the buggy implementation.
This is related to issue #282. The affected methods were
`string_from_array`, `to_array` and `split`. New test cases have been
added.
@EliasC
Copy link
Contributor Author

EliasC commented Nov 23, 2015

@vhphuc If you feel happy with the changes addressing your comments, you can merge this pull request by hitting the big green button below :)

PhucVH888 added a commit that referenced this pull request Nov 23, 2015
@PhucVH888 PhucVH888 merged commit 42c83e3 into parapluu:features/plenary Nov 23, 2015
@kikofernandez kikofernandez deleted the features/string-library branch November 23, 2015 18:37
@albertnetymk
Copy link
Contributor


  -- forall x string
  -- forall str string
  -- str.occurrences(x) + 1 == str.concat(x).occurrences(x)

class Main {
  def main() : void {
      let
        str = "hello"
        x = ""
      in {
        print str.occurrences(x);
        print str.concatenate(x).occurrences(x);
      }
  }
}

I recall there was some discussion on the behavior of occurrences; the results involved with empty string is surprising.

@EliasC
Copy link
Contributor Author

EliasC commented Nov 24, 2015

I would be more surprised if concatenating with the empty string somehow added "more" empty string to the string. Doing so would violate the monoid identity(?) rule:

forall s : string
    s.concatenate("") == ("").concatenate(s) == s

@EliasC
Copy link
Contributor Author

EliasC commented Nov 24, 2015

Compare with natural numbers and addition:

It is generally the case that a = b => a+x > b, but not when x = 0.

@albertnetymk
Copy link
Contributor

There's nothing odd on the behaviour of concat. I think the surprise originates from calling occurrences with empty string. Maybe, it's sensible to mandate non-empty string as the argument. JS does sth similar. http://stackoverflow.com/questions/4009756/how-to-count-string-occurrence-in-string

@EliasC
Copy link
Contributor Author

EliasC commented Nov 24, 2015

One could also argue that "Hello" contains six occurrences of "":

"_Hello"
"H_ello"
"He_llo"
"Hel_lo"
"Hell_o"
"Hello_"

This would also interplay nicely with the expected property s.occurrences(s) == 1 for the empty string. I think this makes more sense than banning the empty string altogether.

@EliasC
Copy link
Contributor Author

EliasC commented Nov 24, 2015

If you agree with this I will create an issue for it so that it can be fixed after the final PRs to plenary have been merged.

@albertnetymk
Copy link
Contributor

I am possibly too biased, but str.occurrences(x) + 1 == str.concat(x).occurrences(x) seems too correct to be violated.

@EliasC
Copy link
Contributor Author

EliasC commented Nov 24, 2015

@albertnetymk Surely you must agree that str.concat("") == str for all str?

@supercooldave
Copy link

@albertnetymk's invariant str.occurrences(x) + 1 == str.concat(x).occurrences(x) doesn't hold in general. Consider "wawa".occurrences("waw") == 1 and "wawawaw".occurrences("waw") == 3.

@EliasC
Copy link
Contributor Author

EliasC commented Nov 24, 2015

Since this PR has been merged already, maybe we should create an issue so the discussion doesn't get lost?

@albertnetymk
Copy link
Contributor

@EliasC Sure, appending empty string is a no-op for the original string. @supercooldave Indeed, thanks.
I guess the users of String would file it some day, if at all.

@albertnetymk
Copy link
Contributor

Wait a sec. @supercooldave I think "wawawaw".occurrences("waw") == 2, which is confirmed by current implementation of Encore. After each match, we remove the matched the prefix, so str.occurrences(x) + 1 == str.concat(x).occurrences(x) should hold, maybe only for non-empty x.

@supercooldave
Copy link

@albertnetymk: That may well be what the code does, but is that what the specification means?
Perhaps the specification should say count the number of non-overlapping occurrences, if this is the intended meaning.

@EliasC
Copy link
Contributor Author

EliasC commented Nov 25, 2015

Regardless of the implementation of occurrences, since str.concat("") == str for all str it must be the case that str.concat("").occurrences(x) == str.occurrences(x), which is preserved by the length + 1-interpretation.

Python it seems agrees with this:

Python

   str = "hello"
=> None
   len(str)
=> 5
   str.count("")
=> 6

Most languages I've tried seem to not even have such a method, maybe because they couldn't agree within the development teams? :)

@EliasC
Copy link
Contributor Author

EliasC commented Nov 25, 2015

Perhaps the specification should say count the number of non-overlapping occurrences, if this is the intended meaning.

+1

@albertnetymk
Copy link
Contributor

@EliasC count in ruby has a different interface from others http://ruby-doc.org/core-2.2.0/String.html#method-i-count

@supercooldave For python, it dose say "non-overlapping occurrences" in its doc for count. I kind of assumed that occurrences has similar semantics with count in python, but the result is not the same between those two,5 vs 6`.

@EliasC
Copy link
Contributor Author

EliasC commented Nov 25, 2015

@EliasC count in ruby has a different interface from others http://ruby-doc.org/core-2.2.0/String.html#method-i-count

Thanks! I missed this!

I kind of assumed that occurrences has similar semantics with count in python, but the result is not the same between those two,5 vs 6.

I also think that we should have the same semantics as Python does (as I have argued above).

@albertnetymk
Copy link
Contributor

I also think that we should have the same semantics as Python does (as I have argued above).

Currently, we do NOT, right?

@supercooldave
Copy link

We should have a clear specification and the implementation should abide (like The Dude).

@EliasC
Copy link
Contributor Author

EliasC commented Nov 25, 2015

@albertnetymk: Currently we do not.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants