Skip to content
This repository has been archived by the owner on Aug 31, 2023. It is now read-only.

feat(rome_json_parser): JSON Lexer #3809

Merged
merged 3 commits into from
Nov 23, 2022
Merged

feat(rome_json_parser): JSON Lexer #3809

merged 3 commits into from
Nov 23, 2022

Conversation

MichaReiser
Copy link
Contributor

@MichaReiser MichaReiser commented Nov 21, 2022

Summary

Part of #2351

This is the first step toward our own JSON parser. The PR implements a basic JSON lexer with basic error recovery.

Non-standard features implemented by the lexer for better error recovery:

  • Support for comments
  • Support for identifiers
  • Support for single-quoted string

This PR introduces a new rome_js_unicode_table crate that is shared between the json and JS parser and allows for fast unicode resolution.

Test Plan

I wrote a handful of manual tests and copied the JSONTestSuite

@MichaReiser MichaReiser added this to the 11.0.0 milestone Nov 21, 2022
@netlify
Copy link

netlify bot commented Nov 21, 2022

Deploy Preview for docs-rometools canceled.

Name Link
🔨 Latest commit aaac99a
🔍 Latest deploy log https://app.netlify.com/sites/docs-rometools/deploys/637dd782bb58cc0008343fdf

@MichaReiser MichaReiser added A-Parser Area: parser L-JSON Language: JSON labels Nov 21, 2022

#[derive(Debug, Diagnostic, Clone)]
#[diagnostic(category = "parse", severity = Error)]
pub struct ParseDiagnostic {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This diagnostic is a duplicate for now. I plan to move it into a rome_parser crate once I start refactoring the parser.

@github-actions
Copy link

github-actions bot commented Nov 21, 2022

Parser conformance results on ubuntu-latest

js/262

Test result main count This PR count Difference
Total 45879 45879 0
Passed 44936 44936 0
Failed 943 943 0
Panics 0 0 0
Coverage 97.94% 97.94% 0.00%

jsx/babel

Test result main count This PR count Difference
Total 39 39 0
Passed 36 36 0
Failed 3 3 0
Panics 0 0 0
Coverage 92.31% 92.31% 0.00%

symbols/microsoft

Test result main count This PR count Difference
Total 5946 5946 0
Passed 1757 1757 0
Failed 4189 4189 0
Panics 0 0 0
Coverage 29.55% 29.55% 0.00%

ts/babel

Test result main count This PR count Difference
Total 588 588 0
Passed 519 519 0
Failed 69 69 0
Panics 0 0 0
Coverage 88.27% 88.27% 0.00%

ts/microsoft

Test result main count This PR count Difference
Total 16257 16257 0
Passed 12397 12397 0
Failed 3860 3860 0
Panics 0 0 0
Coverage 76.26% 76.26% 0.00%

@MichaReiser MichaReiser force-pushed the feat/json-lexer branch 3 times, most recently from 97ef5cd to d4eda07 Compare November 21, 2022 15:17
@MichaReiser
Copy link
Contributor Author

!bench_parser

@MichaReiser MichaReiser force-pushed the feat/json-lexer branch 2 times, most recently from 5bc9e50 to b246357 Compare November 21, 2022 15:34
@MichaReiser MichaReiser marked this pull request as ready for review November 21, 2022 15:34
@MichaReiser MichaReiser requested a review from xunilrj as a code owner November 21, 2022 15:34
@MichaReiser MichaReiser requested a review from a team November 21, 2022 15:34
@MichaReiser
Copy link
Contributor Author

!bench_parser

@calibre-analytics
Copy link

calibre-analytics bot commented Nov 21, 2022

Comparing feat(rome_json_parser): JSON Lexer Snapshot #5 to median since last deploy of rome.tools.

LCP? CLS? TBT?
Overall
Median across all pages and test profiles
2.25s
from 264ms
0.0
no change
136ms
no change
Chrome Desktop
Chrome Desktop • Cable
2.25s
from 264ms
0.0
no change
328ms
from 22ms
iPhone, 4G LTE
iPhone 12 • 4G LTE
1.08s
from 239ms
0.0
no change
11ms
no change
Motorola Moto G Power, 3G connection
Motorola Moto G Power • Regular 3G
16.4s
from 1.06s
0.0
no change
136ms
no change

1 page tested

 Home

Browser previews

Chrome Desktop iPhone, 4G LTE Motorola Moto G Power, 3G connection
Chrome Desktop iPhone, 4G LTE Motorola Moto G Power, 3G connection

Most significant changes

Value Budget
Total JavaScript Size in Bytes
Chrome Desktop
5.35 MB
from 86.8 KB
Total JavaScript Size in Bytes
iPhone, 4G LTE
5.35 MB
from 86.8 KB
Total JavaScript Size in Bytes
Motorola Moto G Power, 3G connection
5.35 MB
from 86.8 KB
JS Parse & Compile
Motorola Moto G Power, 3G connection
1.66s
from 27ms
JS Parse & Compile
iPhone, 4G LTE
467ms
from 12ms

27 other significant changes: JS Parse & Compile on Chrome Desktop, Largest Contentful Paint on Motorola Moto G Power, 3G connection, First Contentful Paint on Motorola Moto G Power, 3G connection, Total Blocking Time on Chrome Desktop, Total CSS Size in Bytes on Chrome Desktop, Total CSS Size in Bytes on iPhone, 4G LTE, Total CSS Size in Bytes on Motorola Moto G Power, 3G connection, Time to Interactive on Motorola Moto G Power, 3G connection, Total Page Size in Bytes on Chrome Desktop, Total Page Size in Bytes on iPhone, 4G LTE, Total Page Size in Bytes on Motorola Moto G Power, 3G connection, Time to Interactive on Chrome Desktop, Number of Requests on Motorola Moto G Power, 3G connection, Number of Requests on Chrome Desktop, Number of Requests on iPhone, 4G LTE, First Contentful Paint on Chrome Desktop, Largest Contentful Paint on Chrome Desktop, Speed Index on Motorola Moto G Power, 3G connection, Time to Interactive on iPhone, 4G LTE, Largest Contentful Paint on iPhone, 4G LTE, First Contentful Paint on iPhone, 4G LTE, Speed Index on Chrome Desktop, Total HTML Size in Bytes on Chrome Desktop, Total HTML Size in Bytes on iPhone, 4G LTE, Total HTML Size in Bytes on Motorola Moto G Power, 3G connection, Lighthouse Performance Score on Motorola Moto G Power, 3G connection, Lighthouse Performance Score on Chrome Desktop

Calibre: Site dashboard | View this PR | Edit settings | View documentation

@github-actions
Copy link

Parser Benchmark Results

group                                 main                                   pr
-----                                 ----                                   --
parser/checker.ts                     1.00    149.8±9.26ms    17.4 MB/sec    1.02   152.9±12.21ms    17.0 MB/sec
parser/compiler.js                    1.01     90.7±5.30ms    11.6 MB/sec    1.00     89.7±5.89ms    11.7 MB/sec
parser/d3.min.js                      1.00     49.7±3.38ms     5.3 MB/sec    1.11     55.3±4.83ms     4.7 MB/sec
parser/dojo.js                        1.02      4.3±0.19ms    16.0 MB/sec    1.00      4.2±0.23ms    16.3 MB/sec
parser/ios.d.ts                       1.01    129.1±7.71ms    14.5 MB/sec    1.00    128.3±7.33ms    14.5 MB/sec
parser/jquery.min.js                  1.00     14.3±1.45ms     5.8 MB/sec    1.02     14.6±1.03ms     5.7 MB/sec
parser/math.js                        1.05    113.1±4.20ms     5.7 MB/sec    1.00    107.7±5.24ms     6.0 MB/sec
parser/parser.ts                      1.00      3.1±0.19ms    15.6 MB/sec    1.03      3.2±0.15ms    15.1 MB/sec
parser/pixi.min.js                    1.00     69.4±3.66ms     6.3 MB/sec    1.02     71.1±4.60ms     6.2 MB/sec
parser/react-dom.production.min.js    1.00     17.9±1.14ms     6.4 MB/sec    1.08     19.3±1.06ms     6.0 MB/sec
parser/react.production.min.js        1.00   929.1±44.13µs     6.6 MB/sec    1.02   950.4±41.41µs     6.5 MB/sec
parser/router.ts                      1.00      2.5±0.12ms    25.4 MB/sec    1.07      2.7±0.29ms    23.7 MB/sec
parser/tex-chtml-full.js              1.09    151.9±5.76ms     6.0 MB/sec    1.00    139.6±7.88ms     6.5 MB/sec
parser/three.min.js                   1.04     76.9±3.85ms     7.6 MB/sec    1.00     73.7±3.90ms     8.0 MB/sec
parser/typescript.js                  1.01   595.9±35.82ms    15.9 MB/sec    1.00   591.4±29.71ms    16.1 MB/sec
parser/vue.global.prod.js             1.03     23.8±2.56ms     5.1 MB/sec    1.00     23.0±1.92ms     5.2 MB/sec

span: Option<TextRange>,
/// Reference to a file where the issue occurred
#[location(resource)]
file_id: FileId,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a refactor I have in mind for the JS parser and formatter too but we may try it out with the new JSON lexer first: since the parsing (and by extension lexing) only works on a single file, the parser itself doesn't really need to know the FileId of the file it's parsing and may simply return a list of diagnostics without a resource, to be injected at a higher level (probably in the Workspace) using the .with_file_path() diagnostic combinator.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto. Ideally, we should move away from FileId gradually :)
This will make things way easier in the long run! (personal experience)

crates/rome_js_unicode_table/Cargo.toml Outdated Show resolved Hide resolved
crates/rome_json_parser/Cargo.toml Outdated Show resolved Hide resolved
version = "0.0.0"
authors = ["Rome Tools"]
license = "MIT"
description = "An extremely fast JSON parser"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: blazing fast, or turbo fast or nitro fast 🤣

Please don't take this comment seriously!! 😄

@MichaReiser MichaReiser merged commit d9c5c14 into main Nov 23, 2022
@MichaReiser MichaReiser deleted the feat/json-lexer branch November 23, 2022 08:29
@Conaclos
Copy link
Contributor

Conaclos commented Nov 23, 2022

Non-standard features implemented by the lexer

Worth to note that extensions of JSON exist:

Rome could support the most common: JSON and JSONC

jeysal added a commit to jeysal/rometools that referenced this pull request Nov 24, 2022
* upstream/main: (73 commits)
  fix(semantic_analyzers): style/noShoutyConstants does not recognize multiple uses of a constant. (rome#3789)
  feat(rome_js_analyze): useDefaultSwitchClauseLast (rome#3791)
  chore: run rustfmt and typo fix (rome#3840)
  feat(rome_js_analyze): use exhaustive deps support properties (rome#3581)
  website(docs): Fix text formatting (rome#3828)
  feat(rome_js_analyze): `noVoidTypeReturn` (rome#3806)
  feat(rome_cli): expose the `--verbose` flag to the CLI (rome#3812)
  fix(rome_diagnostics): allow diagnostic locations to be created without a resource (rome#3834)
  feat(rome_js_analyze): add noExtraNonNullAssertion rule (rome#3797)
  fix(rome_lsp): lsp friendly catch unwind (rome#3740)
  feat(rome_js_semantic): model improvements (rome#3825)
  feat(rome_json_parser): JSON Lexer (rome#3809)
  feat(rome_js_analyze): implement `noDistractingElements` (rome#3820)
  fix(rome_js_formatter): shothanded named import line break with default import (rome#3826)
  feat(rome_js_analyze): `noConstructorReturn` (rome#3805)
  feat(website): change enabledNurseryRules to All/Recommended select (rome#3810)
  feat(rome_js_analyze): noSetterReturn
  feat(rome_js_analyze): noConstructorReturn
  feat(rome_analyze): suppress rule via code actions (rome#3572)
  feat(rome_js_analyze): `noVar` (rome#3765)
  ...
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A-Parser Area: parser L-JSON Language: JSON
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

4 participants