Panic on invalid utf16 instead of an Error #778

Closed
alexanderkjall opened this issue Oct 3, 2020 · 1 comment · Fixed by #853
Labels: bug (Something isn't working), lexer (Issues surrounding the lexer)

Comments

@alexanderkjall

Describe the bug

Parsing a string literal that contains an invalid UTF-16 escape sequence panics. I expected an Error, not a panic.

To Reproduce
The panic can be reproduced with this program:

fn main() {
    let _ = boa::parse("'43\\uDDDDDDDDD19");
}

Expected behavior

An Error.

Build environment (please complete the following information):

  • OS: Ubuntu 20.04
  • Version: 0.10.0
  • Target triple: [e.g. x86_64-unknown-linux-gnu]
  • Rustc version: 1.48.0-nightly (d006f5734 2020-08-28)

Additional context
Full stacktrace:

thread 'main' panicked at 'Could not get next codepoint: DecodeUtf16Error { code: 56797 }', /home/capitol/.cargo/registry/src/gh.neting.cc-1ecc6299db9ec823/Boa-0.10.0/src/syntax/lexer/string.rs:185:42
stack backtrace:
   0: rust_begin_unwind
             at /rustc/d006f5734f49625c34d6fc33bf6b9967243abca8/library/std/src/panicking.rs:483
   1: core::panicking::panic_fmt
             at /rustc/d006f5734f49625c34d6fc33bf6b9967243abca8/library/core/src/panicking.rs:85
   2: core::option::expect_none_failed
             at /rustc/d006f5734f49625c34d6fc33bf6b9967243abca8/library/core/src/option.rs:1221
   3: core::result::Result<T,E>::expect
             at /home/capitol/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/result.rs:933
   4: <boa::syntax::lexer::string::StringLiteral as boa::syntax::lexer::Tokenizer<R>>::lex
             at /home/capitol/.cargo/registry/src/gh.neting.cc-1ecc6299db9ec823/Boa-0.10.0/src/syntax/lexer/string.rs:182
   5: boa::syntax::lexer::Lexer<R>::next
             at /home/capitol/.cargo/registry/src/gh.neting.cc-1ecc6299db9ec823/Boa-0.10.0/src/syntax/lexer/mod.rs:191
   6: boa::syntax::parser::cursor::buffered_lexer::BufferedLexer<R>::fill
             at /home/capitol/.cargo/registry/src/gh.neting.cc-1ecc6299db9ec823/Boa-0.10.0/src/syntax/parser/cursor/buffered_lexer/mod.rs:116
   7: boa::syntax::parser::cursor::buffered_lexer::BufferedLexer<R>::peek
             at /home/capitol/.cargo/registry/src/gh.neting.cc-1ecc6299db9ec823/Boa-0.10.0/src/syntax/parser/cursor/buffered_lexer/mod.rs:201
   8: boa::syntax::parser::cursor::Cursor<R>::peek
             at /home/capitol/.cargo/registry/src/gh.neting.cc-1ecc6299db9ec823/Boa-0.10.0/src/syntax/parser/cursor/mod.rs:56
   9: <boa::syntax::parser::Script as boa::syntax::parser::TokenParser<R>>::parse
             at /home/capitol/.cargo/registry/src/gh.neting.cc-1ecc6299db9ec823/Boa-0.10.0/src/syntax/parser/mod.rs:124
  10: boa::syntax::parser::Parser<R>::parse_all
             at /home/capitol/.cargo/registry/src/gh.neting.cc-1ecc6299db9ec823/Boa-0.10.0/src/syntax/parser/mod.rs:104
  11: boa::parse
             at /home/capitol/.cargo/registry/src/gh.neting.cc-1ecc6299db9ec823/Boa-0.10.0/src/lib.rs:73
  12: boa_reproduce::main
             at ./src/main.rs:2
  13: core::ops::function::FnOnce::call_once
             at /home/capitol/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:227
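
The panic message shows a `DecodeUtf16Error` being unwrapped with `expect` in the string lexer; code 56797 is 0xDDDD, the unpaired low surrogate from the `\uDDDD` escape. As a minimal sketch of the expected behavior, using only the standard library, the decode step can return the error to the caller instead. The helper name `decode_code_units` is hypothetical and is not part of Boa's API; how the lexer would map this into its own error type is left open.

```rust
use std::char::{decode_utf16, DecodeUtf16Error};

/// Hypothetical helper (not Boa's API): decodes UTF-16 code units collected
/// from `\uXXXX` escapes and reports an unpaired surrogate as an `Err`
/// instead of panicking.
fn decode_code_units(units: &[u16]) -> Result<String, DecodeUtf16Error> {
    // Collecting an iterator of `Result<char, _>` into `Result<String, _>`
    // stops at the first error and returns it to the caller.
    decode_utf16(units.iter().copied()).collect()
}
```

Calling `decode_code_units(&[0xDDDD])` yields an `Err` whose `unpaired_surrogate()` is `0xDDDD`, which a caller can turn into a lexer error rather than aborting the program.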
@alexanderkjall alexanderkjall added the bug Something isn't working label Oct 3, 2020
@HalidOdat HalidOdat added the lexer Issues surrounding the lexer label Oct 8, 2020
@jevancc
Contributor

jevancc commented Oct 9, 2020

Hi @HalidOdat, I would like to work on this issue. Could you assign it to me, please?

I've looked into the expected behavior. As a workaround, we could replace each invalid code point (an unpaired surrogate) with a fixed replacement character. However, that cannot handle a case like this:

const x = '\uD834'; // '�', invalid code point
const y = '\uDD1E'; // '�', invalid code point
console.log(x + y); // '��', should be '𝄞'

I think this case is difficult to handle until #736 is resolved. Any ideas?
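
To illustrate why the replacement workaround loses information, here is a small standard-library sketch. It assumes, purely for illustration, that a string value is held either as a Rust `String` (which cannot contain surrogates) or as raw UTF-16 code units; the helper names `lossy` and `concat_units` are hypothetical and do not reflect how Boa stores strings.

```rust
/// Lossy approach: every unpaired surrogate is replaced with U+FFFD
/// as soon as the literal is decoded.
fn lossy(units: &[u16]) -> String {
    String::from_utf16_lossy(units)
}

/// Alternative: keep the raw UTF-16 code units and concatenate those,
/// so two halves of a surrogate pair can still meet later.
fn concat_units(a: &[u16], b: &[u16]) -> Vec<u16> {
    [a, b].concat()
}
```

With `x = [0xD834]` and `y = [0xDD1E]`, joining `lossy(&x)` and `lossy(&y)` gives two replacement characters, while `String::from_utf16(&concat_units(&x, &y))` succeeds and gives the single character U+1D11E.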
