feat: use `regex` to enhance js engine support #762
✅ Deploy Preview for shiki-next ready!
✅ Deploy Preview for shiki-matsu ready!
Continuing the discussion from antfu/oniguruma-to-js#1, here are some basic

import { regex } from 'regex'
import { Context, replaceUnescaped } from 'regex-utilities'

// POSIX class names mapped to equivalent JS character class contents
const TABLE_POSIX = {
  alnum: '0-9A-Za-z',
  alpha: 'A-Za-z',
  ascii: '\\0-\\x7F',
  blank: ' \\t',
  cntrl: '\\0-\\x1F\\x7F',
  digit: '0-9',
  graph: '!-~',
  lower: 'a-z',
  print: ' -~',
  punct: '!-/:-@[-`{-~',
  space: ' \\t\\r\\n\\v\\f',
  upper: 'A-Z',
  xdigit: '0-9A-Fa-f',
  word: '\\w',
}

function oniguruma(str) {
  // `\h` becomes a hex digit class; escaped quotes, backticks, `#`, and `@` become plain literals
  str = replaceUnescaped(str, '\\\\(?:h|(?<literal>[\'"`#@]))', ({groups: {literal}}, details) => {
    if (literal) {
      return literal
    }
    let hex = TABLE_POSIX.xdigit
    return details.context === Context.DEFAULT ? `[${hex}]` : hex
  })
  // Outside character classes: `\A` becomes `^`, `\z` becomes `$`, and `(?#...)` comments are removed
  str = replaceUnescaped(str, String.raw`\\(?<escape>[Az])|\(\?#[^)]*\)`, ({groups: {escape}}) => {
    if (escape) {
      return escape === 'A' ? '^' : '$'
    }
    return ''
  }, Context.DEFAULT)
  // Inside character classes: `[:name:]` is replaced with its TABLE_POSIX entry
  str = replaceUnescaped(str, String.raw`\[:(?<neg>\^?)(?<posix>[a-z]+):\]`, ({groups: {neg, posix}}) => {
    if (neg) {
      throw new Error('TODO: Support negated POSIX classes')
    }
    let resolved = TABLE_POSIX[posix]
    if (!resolved) {
      throw new Error(`Unknown POSIX class "${posix}"`)
    }
    return resolved
  }, Context.CHAR_CLASS)
  return str
}

// `p` is the pattern string being converted
regex({
  flags: 'dg',
  subclass: true,
  plugins: [oniguruma],
  unicodeSetsPlugin: null,
  disable: {
    x: true,
    n: true,
    v: true,
  },
})({ raw: [p] })

Note that this fixes a couple of errors from
Edited to simplify and add support for
The plugin is adding support for:
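To make the POSIX class step of the plugin above concrete, here is a minimal, dependency-free sketch. It is a simplification, not the plugin itself: `expandPosix` is a hypothetical helper, the table is a subset of `TABLE_POSIX`, and a plain `String.prototype.replace` is used in place of `replaceUnescaped`, so it does not check that a match sits inside a character class and does not skip escaped brackets.

```javascript
// Subset of the POSIX table used by the plugin above.
const POSIX = {
  alpha: 'A-Za-z',
  digit: '0-9',
  xdigit: '0-9A-Fa-f',
};

// Hypothetical helper: expands every [:name:] it finds.
// Assumption: the input only uses [:name:] inside character classes,
// because this sketch does not track regex context the way replaceUnescaped does.
function expandPosix(pattern) {
  return pattern.replace(/\[:(\^?)([a-z]+):\]/g, (match, neg, name) => {
    if (neg) {
      throw new Error('Negated POSIX classes are not handled in this sketch');
    }
    const resolved = POSIX[name];
    if (!resolved) {
      throw new Error(`Unknown POSIX class "${name}"`);
    }
    return resolved;
  });
}

const source = expandPosix('^[[:alpha:]_][[:xdigit:]]+$');
const re = new RegExp(source);

console.log(source);
console.log(re.test('_ff'), re.test('9ff'));
```

The outer brackets of `[[:alpha:]_]` are left in place and only the inner `[:alpha:]` is swapped for `A-Za-z`, which is why the plugin restricts this replacement to `Context.CHAR_CLASS`.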
Thank you! I updated the code based on your suggestions, but unfortunately it still seems to produce a lot of errors. Meanwhile, I am a little worried about the complexity here - I am still up to give
I've just edited the code in my last comment to add support for
I've added a new feature in regex-utilities 2.3.0 and used it in the updated example code to simplify it significantly. Hopefully this is now easier to follow and makes it clearer how to work with
For sure, this makes a lot of sense while you're in the rapid development stage. Even beyond that, though, I think the goals of these two projects might be just far enough apart that you will continue to prefer not to develop on top of
So no worries at all if you want to close this. It was interesting/valuable to discuss this with you and see how
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files

@@            Coverage Diff             @@
##             main     #762      +/-   ##
==========================================
+ Coverage   92.30%   92.32%   +0.01%
==========================================
  Files          70       70
  Lines        4547     4558      +11
  Branches     1009     1009
==========================================
+ Hits         4197     4208      +11
  Misses        345      345
  Partials        5        5
docs/references/engine-js-compat.md
|                 |                             Count |
| :-------------- | --------------------------------: |
| Total Languages |                               213 |
| Fully Supported | [132](#fully-supported-languages) |
This gets a lot better, thanks for the updates!
Cool. I'll try to identify the issues and fix handling for grammars that used to be okay but aren't anymore. I already notice an issue with `regex`'s support for multiple possessive quantifiers in the same pattern that should be easy to fix.
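For readers unfamiliar with possessive quantifiers: JavaScript has no native `a++` syntax, but a widely used emulation is a lookahead that captures the match followed by a backreference to it. The sketch below illustrates that general technique only; it is not claimed to be the exact output `regex` produces. It also shows why several possessive quantifiers in one pattern need care, since each emulation adds a capturing group and the backreference numbers must line up.

```javascript
// Greedy: `a+` can backtrack and give one "a" back, so the trailing `a` matches.
const greedy = /^a+a$/;

// Emulated possessive `a++`: the lookahead captures as many "a"s as it can,
// the backreference consumes exactly that text, and JS never backtracks
// into a lookahead once it has matched.
const possessive = /^(?=(a+))\1a$/;

console.log(greedy.test('aaa'));     // true
console.log(possessive.test('aaa')); // false

// Two emulated possessive quantifiers in the same pattern.
// Each emulation introduces its own capture, so the second one must use \2.
const two = /^(?=(\d+))\1-(?=([a-z]+))\2$/;

console.log(two.test('123-abc'));  // true
console.log(two.test('123-abc4')); // false
```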
Okay, I've run the
pattern = pattern
.replace(/\\�/g, '\\G')
.replace(/\(\{\)/g, '(\\{)')
Removing this increases the number of grammars passing, whether using
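As background on the second replacement in the snippet above: one plausible reason for a workaround like it (an assumption, since the thread does not state the motivation) is that a lone `{` is tolerated by JavaScript's legacy regex syntax but rejected in Unicode mode. The sketch below checks that with a hypothetical `compiles` helper and applies the same replacement to a sample pattern.

```javascript
// Hypothetical helper: reports whether a pattern source compiles with the given flags.
function compiles(source, flags) {
  try {
    new RegExp(source, flags);
    return true;
  } catch {
    return false;
  }
}

// An unescaped lone `{` inside a group is a syntax error with flag `u`...
const rawUnicode = compiles('({)', 'u');
// ...while the escaped form is accepted.
const escapedUnicode = compiles('(\\{)', 'u');

// The same replacement as in the snippet above, applied to a sample pattern.
const fixed = '({)'.replace(/\(\{\)/g, '(\\{)');

console.log(rawUnicode, escapedUnicode, fixed);
```

A targeted string replacement like this only covers the exact `({)` shape, which may be part of why handling it in a general conversion step is preferable.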
This is still slightly lower than the number reported when not using
The culprits for the number of "supported" grammars going slightly down are the following errors that
These are subroutines that Oniguruma supports but
Even though
Looking at the remaining failing cases beyond the currently unsupported subroutines described above, the main culprits seem to be:
Here's what I'd personally do:
Even though the reported number of "supported" grammars would go down slightly (to 172), this should only affect grammars that are already not working correctly in at least some cases. And the number of actually supported grammars would go up significantly, since:
If desired, after landing this change I'd recommend doing additional work in
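The distinction drawn above between "reported as supported" and "actually supported" can be sketched as a compile check that records failures rather than swallowing them. Everything here is hypothetical and for illustration only: `toRegExp` stands in for whatever conversion step is used, `checkGrammar` is an invented name, and the grammar names and patterns are made up.

```javascript
// Hypothetical stand-in for a pattern converter; here it just compiles the source as-is.
function toRegExp(source) {
  return new RegExp(source, 'g');
}

// Compiles every pattern of a grammar and records failures instead of hiding them.
function checkGrammar(name, patterns) {
  const errors = [];
  for (const source of patterns) {
    try {
      toRegExp(source);
    } catch (error) {
      errors.push({ source, message: error.message });
    }
  }
  // A grammar only counts as supported when none of its patterns failed.
  return { name, supported: errors.length === 0, errors };
}

const report = [
  checkGrammar('demo-ok', ['\\bfoo\\b', '(?<name>[a-z]+)']),
  // `a++` (possessive quantifier) is not valid native JS regex syntax.
  checkGrammar('demo-broken', ['\\bfoo\\b', 'a++']),
];

const supportedCount = report.filter(entry => entry.supported).length;
console.log(`${supportedCount} of ${report.length} grammars fully supported`);
```

With this kind of report, a drop in the headline count reflects patterns that were already failing silently, which matches the argument made above.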
Thanks a lot for your deep investigation, that looks very promising!
Yes, I totally agree with that. It's surely better to surface the errors instead of silently eating them. With the current result, I am more than happy to land it, and then we can iterate as we go. However, as the JS engine is mostly designed to work in browsers, the bundle size is something we need to be concerned about. If
The current test runs with
For Shiki, in this particular case, I think it's acceptable as we consider this to be an advanced usage. If it's tricky to provide back-compat, we can just document the minimal runtime requirement for using the feature. Thanks so much for your work and help!
I'll merge this for now; let's iterate on it on the main branch.
Amazing to see this merged. 🎉🙌
If I/we create a new library that extends
It wouldn’t make sense to add extensive Oniguruma translation to the base
But, it absolutely makes sense to me to create a new library (tentative name:
I have a bunch of ideas for this, but I won't be able to start development for it until sometime next week. I’d be happy to add you as a maintainer for this new project (totally up to you!) after the basics are set up. What do you think? Does this seem like a decent plan?
Sounds great, looking forward to it! (and take your time ofc). Thanks!
Cool. I think it will be a fun project so I'm looking forward to it, too.
Okay, in that case I'm going to require native flag
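Where a feature depends on a native RegExp flag and the minimal runtime requirement is documented rather than polyfilled, a consumer can feature-detect the flag before opting in. The following is a minimal sketch; `supportsRegExpFlag` is a hypothetical helper. Which flag is meant above is cut off in the text, so `v` and `d` are used here only because they appear in the options object earlier in the thread.

```javascript
// Hypothetical helper: returns true when the runtime accepts the given RegExp flag.
function supportsRegExpFlag(flag) {
  try {
    new RegExp('', flag);
    return true;
  } catch {
    // Unsupported or invalid flags make the RegExp constructor throw a SyntaxError.
    return false;
  }
}

const hasV = supportsRegExpFlag('v');
const hasD = supportsRegExpFlag('d');

if (!hasV) {
  console.warn('This runtime lacks RegExp flag "v"; fall back to another engine.');
}
console.log({ hasV, hasD });
```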
…#1244)

This PR contains the following updates:

| Package | Change |
|---|---|
| [@shikijs/vitepress-twoslash](https://redirect.github.com/shikijs/shiki) ([source](https://redirect.github.com/shikijs/shiki/tree/HEAD/packages/vitepress-twoslash)) | [`1.16.2` -> `1.16.3`](https://renovatebot.com/diffs/npm/@shikijs%2fvitepress-twoslash/1.16.2/1.16.3) |

### Release Notes

shikijs/shiki (@shikijs/vitepress-twoslash)

### [`v1.16.3`](https://redirect.github.com/shikijs/shiki/releases/tag/v1.16.3)

[Compare Source](https://redirect.github.com/shikijs/shiki/compare/v1.16.2...v1.16.3)

##### 🚀 Features

- Make `createCssVariablesTheme` no longer experimental - by [@antfu](https://redirect.github.com/antfu) [(ac10b)](https://redirect.github.com/shikijs/shiki/commit/ac10b3ac)
- Use `regex` to enhance js engine support - by [@antfu](https://redirect.github.com/antfu) in [https://github.com/shikijs/shiki/issues/762](https://redirect.github.com/shikijs/shiki/issues/762) [(ed362)](https://redirect.github.com/shikijs/shiki/commit/ed362960)

##### [View changes on GitHub](https://redirect.github.com/shikijs/shiki/compare/v1.16.2...v1.16.3)
antfu/oniguruma-to-js#1