Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

url: improve url module performance #1650

Closed
wants to merge 3 commits into from

Conversation

petkaantonov
Copy link
Contributor

There is no compatibility breakage and using eager .href is only 33% slower in url parse benchmark, and it could be made lazy again since nobody deletes .href anyway.

misc/url.js parse(): 1.0241e+5
misc/url.js format(): 1.9642e+5
misc/url.js resolve("../foo/bar?baz=boom"): 23774
misc/url.js resolve("foo/bar"): 43754
misc/url.js resolve("http://nodejs.org"): 33768
misc/url.js resolve("./foo/bar?baz"): 28106

@Fishrock123 Fishrock123 added the url Issues and PRs related to the legacy built-in url module. label May 7, 2015
@Fishrock123
Copy link
Contributor

This is still semver-major, correct?

@petkaantonov
Copy link
Contributor Author

no there should not be breaking changes

@silverwind
Copy link
Contributor

Let's CI this, also npm tests should pass.

@Fishrock123
Copy link
Contributor

@Fishrock123
Copy link
Contributor

make test-npm is happy.

@silverwind
Copy link
Contributor

I'd be fine landing this in a patch release.

@petkaantonov can you outline what's changed from the original implementation?

@petkaantonov
Copy link
Contributor Author

The parsing has been rewritten (.parse()).

Some minor changes in formatting (.format())

Some micro optimization in resolution (.resolveObject()).

@silverwind
Copy link
Contributor

Needs a rebase because of 19ffb5c

@petkaantonov petkaantonov force-pushed the faster-url-parser-2 branch from 4c976d0 to dd0053a Compare May 11, 2015 06:50
@petkaantonov
Copy link
Contributor Author

rebased

@silverwind
Copy link
Contributor

Thanks, got a few linter errors concerning this rule: http://eslint.org/docs/rules/comma-spacing

lib/url.js
   31:22  error  There should be no space before ','  comma-spacing
   32:22  error  There should be no space before ','  comma-spacing
   33:14  error  There should be no space before ','  comma-spacing
   33:28  error  There should be no space before ','  comma-spacing
  228:48  error  There should be no space before ','  comma-spacing
  636:40  error  There should be no space before ','  comma-spacing
  941:48  error  There should be no space before ','  comma-spacing
  950:46  error  There should be no space before ','  comma-spacing

@petkaantonov
Copy link
Contributor Author

Those are not spaces but comments like 0x3A/*':'*/, 0x3F/*'?'*/, ...

@silverwind
Copy link
Contributor

Yeah, it's kind of bullshit that this gets interpreted as a space.

@petkaantonov
Copy link
Contributor Author

should we open issue in eslint?

@silverwind
Copy link
Contributor

Already been done: eslint/eslint#2408

@silverwind
Copy link
Contributor

Probably best to disable the comma-spacing linter rule until that one is resolved.

@yosuke-furukawa
Copy link
Member

How about this ??

const _protocolCharacters = makeAsciiTable([
  [0x61, 0x7A], /*a-z*/
  [0x41, 0x5A], /*A-Z*/
  0x2E, /*'.'*/ 0x2B, /*'+'*/ 0x2D /*'-'*/
]);

@silverwind
Copy link
Contributor

@yosuke-furukawa that's just confusing. also doesn't work in cases like

if (!containsCharacter2(search, 0x23 /*'#'*/, -1))

@yosuke-furukawa
Copy link
Member

Hm... I think this is not so weird.

if (!containsCharacter2(search, 0x23, /*'#'*/ -1))

But +1 for suppress the warnings if you want.

@silverwind
Copy link
Contributor

@yosuke-furukawa would you suggest setting the setting to 1 or 0? Both seem to make make lint pass.

@yosuke-furukawa
Copy link
Member

1 is warn, 0 is silence. 1 is better. When we will re-consider the eslint rules, we can detect the rule violation easily.

@silverwind
Copy link
Contributor

@petkaantonov the PR should pass eslint now. I'd advice changing lines like

  0x2E /*'.'*/, 0x2B /*'+'*/, 0x2D /*'-'*/

to

  0x2E/*'.'*/, 0x2B/*'+'*/, 0x2D/*'-'*/

because I assume eslint will still warn on these after the fix.

@petkaantonov
Copy link
Contributor Author

done

@silverwind
Copy link
Contributor

@petkaantonov thanks. May I ask to add a small test for the delete uri.prop case (#1591)?

@jasnell
Copy link
Member

jasnell commented May 14, 2015

One additional bit to note, the IANA registry for URI schemes lists a few more permanent schemes that require the slash (coap, for instance, http://tools.ietf.org/html/rfc7252) that are currently not listed in the _slashProtocols. Definitely do not need to add those in here but we'll likely want to revisit and add them in later.

@jasnell
Copy link
Member

jasnell commented May 14, 2015

@petkaantonov thank you for persisting with this btw. The improvements look very good on an initial review

@tyscorp
Copy link

tyscorp commented May 14, 2015

It'll be great to see this finally land. Great work @petkaantonov

@alubbe
Copy link

alubbe commented Jun 3, 2015

Hey everybody - what's the status, what's keeping us from merging this? :)

@domenic
Copy link
Contributor

domenic commented Jun 3, 2015

There are still a few outstanding comments, including the request to port the tests from the previous PR. Plus, it needs to be rebased, since it no longer merges cleanly.

@petkaantonov
Copy link
Contributor Author

I am really lacking the time at the moment

@Fishrock123
Copy link
Contributor

reference to original: #1561

@alubbe
Copy link

alubbe commented Jun 10, 2015

No worries @petkaantonov
I was just checking in because Techempower is gearing up to run the next benchmark suite and I remember the low node numbers being part of your motivation to work on this in the first place.
We also landed significant speed ups for the express+jade app, so the next round should yield pretty good numbers.

@ronkorving
Copy link
Contributor

Out of interest, is this still going places?

@jbergstroem
Copy link
Member

@ronkorving it nowadays live in #2303.

@ronkorving
Copy link
Contributor

Ah thanks, better close this one then I guess?

@jbergstroem
Copy link
Member

I think its left open in case petka has time to resume.

@jasnell jasnell added the stalled Issues and PRs that are stalled. label Nov 16, 2015
@ChALkeR ChALkeR added the performance Issues and PRs related to the performance of Node.js. label Feb 16, 2016
@estliberitas estliberitas force-pushed the master branch 2 times, most recently from 7da4fd4 to c7066fb Compare April 26, 2016 05:22
@benjamingr
Copy link
Member

Going to cloes this, Petka said he's likely not going to resume, the changes live in another PR and this can be reopend anyway if anyone disagrees.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Issues and PRs related to the performance of Node.js. stalled Issues and PRs that are stalled. url Issues and PRs related to the legacy built-in url module.
Projects
None yet
Development

Successfully merging this pull request may close these issues.