Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

url: extend url.format to support WHATWG URL #10857

Closed
wants to merge 2 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
42 changes: 42 additions & 0 deletions doc/api/url.md
Original file line number Diff line number Diff line change
Expand Up @@ -198,6 +198,47 @@ The formatting process operates as follows:
string, an [`Error`][] is thrown.
* `result` is returned.

## url.format(URL[, options])

> Stability: 1 - Experimental

* `URL` {URL} A [WHATWG URL][] object
* `options` {Object}
* `auth` {Boolean} `true` if the serialized URL string should include the
username and password, `false` otherwise. Defaults to `true`.
* `fragment` {Boolean} `true` if the serialized URL string should include the
fragment, `false` otherwise. Defaults to `true`.
* `search` {Boolean} `true` if the serialized URL string should include the
search query, `false` otherwise. Defaults to `true`.
* `unicode` (Boolean) `true` if Unicode characters appearing in the host
component of the URL string should be encoded directly as opposed to being
Punycode encoded. Defaults to `false`.

Returns a customizable serialization of a URL String representation of a
[WHATWG URL][] object.

The URL object has both a `toString()` method and `href` property that return
string serializations of the URL. These are not, however, customizable in
any way. The `url.format(URL[, options])` method allows for basic customization
of the output.

For example:

```js
const myURL = new URL('https://a:b@你好你好?abc#foo');

console.log(myURL.href);
// Prints https://a:b@xn--6qqa088eba/?abc#foo

console.log(myURL.toString());
// Prints https://a:b@xn--6qqa088eba/?abc#foo

console.log(url.format(myURL, {fragment: false, unicode: true, auth: false}));
// Prints 'https://你好你好?abc'
```

*Note*: This variation of the `url.format()` method is currently considered to
be experimental.

## url.parse(urlString[, parseQueryString[, slashesDenoteHost]])
<!-- YAML
Expand Down Expand Up @@ -830,3 +871,4 @@ console.log(myURL.origin);
[Punycode]: https://tools.ietf.org/html/rfc5891#section-4.4
[`Map`]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Map
[`array.toString()`]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/toString
[WHATWG URL]: #url_the_whatwg_url_api
60 changes: 30 additions & 30 deletions lib/internal/url.js
Original file line number Diff line number Diff line change
@@ -1,13 +1,5 @@
'use strict';

function getPunycode() {
try {
return process.binding('icu');
} catch (err) {
return require('punycode');
}
}
const punycode = getPunycode();
const util = require('util');
const binding = process.binding('url');
const context = Symbol('context');
Expand All @@ -20,6 +12,7 @@ const kScheme = Symbol('scheme');
const kHost = Symbol('host');
const kPort = Symbol('port');
const kDomain = Symbol('domain');
const kFormat = Symbol('format');

// https://tc39.github.io/ecma262/#sec-%iteratorprototype%-object
const IteratorPrototype = Object.getPrototypeOf(
Expand Down Expand Up @@ -263,18 +256,19 @@ class URL {
}

Object.defineProperties(URL.prototype, {
toString: {
// https://heycam.github.io/webidl/#es-stringifier
writable: true,
enumerable: true,
configurable: true,
[kFormat]: {
enumerable: false,
configurable: false,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two lines are not needed as they are the default. However, the trend in Web APIs seems to be make symbol properties configurable and writable, so I'd recommend sticking to that

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I know they are the default. I prefer to be explicit. Also, because this is an internal only method, I prefer to keep these values

// eslint-disable-next-line func-name-matching
value: function toString(options) {
options = options || {};
const fragment =
options.fragment !== undefined ?
!!options.fragment : true;
const unicode = !!options.unicode;
value: function format(options) {
if (options && typeof options !== 'object')
throw new TypeError('options must be an object');
options = Object.assign({
fragment: true,
unicode: false,
search: true,
auth: true
}, options);
const ctx = this[context];
var ret;
if (this.protocol)
Expand All @@ -284,28 +278,23 @@ Object.defineProperties(URL.prototype, {
const has_username = typeof ctx.username === 'string';
const has_password = typeof ctx.password === 'string' &&
ctx.password !== '';
if (has_username || has_password) {
if (options.auth && (has_username || has_password)) {
if (has_username)
ret += ctx.username;
if (has_password)
ret += `:${ctx.password}`;
ret += '@';
}
if (unicode) {
ret += punycode.toUnicode(this.hostname);
if (this.port !== undefined)
ret += `:${this.port}`;
} else {
ret += this.host;
}
ret += options.unicode ?
domainToUnicode(this.host) : this.host;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the unicode option name from the spec, from compatibility with the old url, or is it up for debate? I don’t think it’s conveying the semantics very well (like, I have to look at the tests in this PR to see which behaviour maps to which value). How does something like decodeIDNA: true/false sound to you?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Strange, my response ended up both in my own review and here, but when I deleted that one in my review this one disappeared too)..

I think options is neither in the spec nor in the old url.format. For me the meaning of unicode is pretty straightforword(maybe it's just me though). The two serialization alternatives are named Domain to ASCII and Domain to Unicode in the spec.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unicode flag is consistent with the URL standard, yes (see: https://url.spec.whatwg.org/#host-parsing). To be fair, however, that is on parse side and this is on the serialization, but there is precedent at least.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unicode flag is consistent with the URL standard, yes (see: https://url.spec.whatwg.org/#host-parsing). To be fair, however, that is on parse side and this is on the serialization, but there is precedent at least.

Eh, if it’s in the standard, even just on the parse side, it’s probably best to leave it like that. Thanks for pointing it out.

For me the meaning of unicode is pretty straightforword(maybe it's just me though).

Well, if it’s working and people know what the option describes, that’s probably just fine then. (At least in my head, unicode: true sounds like it could both mean enabling a) decoding of the hostname from punycode to Unicode or b) encoding of the hostname from Unicode to punycode)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will make sure that the relevant documentation appropriately covers the option and makes its function clear

} else if (ctx.scheme === 'file:') {
ret += '//';
}
if (this.pathname)
ret += this.pathname;
if (typeof ctx.query === 'string')
if (options.search && typeof ctx.query === 'string')
ret += `?${ctx.query}`;
if (fragment & typeof ctx.fragment === 'string')
if (options.fragment && typeof ctx.fragment === 'string')
ret += `#${ctx.fragment}`;
return ret;
}
Expand All @@ -314,11 +303,21 @@ Object.defineProperties(URL.prototype, {
configurable: true,
value: 'URL'
},
toString: {
// https://heycam.github.io/webidl/#es-stringifier
writable: true,
enumerable: true,
configurable: true,
// eslint-disable-next-line func-name-matching
value: function toString() {
return this[kFormat]({});
}
},
href: {
enumerable: true,
configurable: true,
get() {
return this.toString();
return this[kFormat]({});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think keep it as this.toString() is closer to how the spec defines href as the stringifier.

Copy link
Member Author

@jasnell jasnell Jan 18, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The result is the same. Using this[kFormat]({}) here avoids an unnecessary property access.

},
set(input) {
parse(this, input);
Expand Down Expand Up @@ -1120,3 +1119,4 @@ exports.domainToASCII = domainToASCII;
exports.domainToUnicode = domainToUnicode;
exports.encodeAuth = encodeAuth;
exports.urlToOptions = urlToOptions;
exports.formatSymbol = kFormat;
17 changes: 10 additions & 7 deletions lib/url.js
Original file line number Diff line number Diff line change
Expand Up @@ -538,19 +538,22 @@ function autoEscapeStr(rest) {
}

// format a parsed object into a url string
function urlFormat(obj) {
function urlFormat(obj, options) {
// ensure it's an object, and not a string url.
// If it's an obj, this is a no-op.
// this way, you can call url_format() on strings
// to clean up potentially wonky urls.
if (typeof obj === 'string') obj = urlParse(obj);

else if (typeof obj !== 'object' || obj === null)
if (typeof obj === 'string') {
obj = urlParse(obj);
} else if (typeof obj !== 'object' || obj === null) {
throw new TypeError('Parameter "urlObj" must be an object, not ' +
obj === null ? 'null' : typeof obj);

else if (!(obj instanceof Url)) return Url.prototype.format.call(obj);

} else if (!(obj instanceof Url)) {
var format = obj[internalUrl.formatSymbol];
return format ?
format.call(obj, options) :
Url.prototype.format.call(obj);
}
return obj.format();
}

Expand Down
102 changes: 102 additions & 0 deletions test/parallel/test-url-format-whatwg.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,102 @@
'use strict';

require('../common');
const assert = require('assert');
const url = require('url');
const URL = url.URL;

const myURL = new URL('http://xn--lck1c3crb1723bpq4a.com/a?a=b#c');

assert.strictEqual(
url.format(myURL),
'http://xn--lck1c3crb1723bpq4a.com/a?a=b#c'
);

assert.strictEqual(
url.format(myURL, {}),
'http://xn--lck1c3crb1723bpq4a.com/a?a=b#c'
);

const errreg = /^TypeError: options must be an object$/;
assert.throws(() => url.format(myURL, true), errreg);
assert.throws(() => url.format(myURL, 1), errreg);
assert.throws(() => url.format(myURL, 'test'), errreg);
assert.throws(() => url.format(myURL, Infinity), errreg);

// Any falsy value other than undefined will be treated as false.
// Any truthy value will be treated as true.

assert.strictEqual(
url.format(myURL, {fragment: false}),
'http://xn--lck1c3crb1723bpq4a.com/a?a=b'
);

assert.strictEqual(
url.format(myURL, {fragment: ''}),
'http://xn--lck1c3crb1723bpq4a.com/a?a=b'
);

assert.strictEqual(
url.format(myURL, {fragment: 0}),
'http://xn--lck1c3crb1723bpq4a.com/a?a=b'
);

assert.strictEqual(
url.format(myURL, {fragment: 1}),
'http://xn--lck1c3crb1723bpq4a.com/a?a=b#c'
);

assert.strictEqual(
url.format(myURL, {fragment: {}}),
'http://xn--lck1c3crb1723bpq4a.com/a?a=b#c'
);

assert.strictEqual(
url.format(myURL, {search: false}),
'http://xn--lck1c3crb1723bpq4a.com/a#c'
);

assert.strictEqual(
url.format(myURL, {search: ''}),
'http://xn--lck1c3crb1723bpq4a.com/a#c'
);

assert.strictEqual(
url.format(myURL, {search: 0}),
'http://xn--lck1c3crb1723bpq4a.com/a#c'
);

assert.strictEqual(
url.format(myURL, {search: 1}),
'http://xn--lck1c3crb1723bpq4a.com/a?a=b#c'
);

assert.strictEqual(
url.format(myURL, {search: {}}),
'http://xn--lck1c3crb1723bpq4a.com/a?a=b#c'
);

assert.strictEqual(
url.format(myURL, {unicode: true}),
'http://理容ナカムラ.com/a?a=b#c'
);

assert.strictEqual(
url.format(myURL, {unicode: 1}),
'http://理容ナカムラ.com/a?a=b#c'
);

assert.strictEqual(
url.format(myURL, {unicode: {}}),
'http://理容ナカムラ.com/a?a=b#c'
);

assert.strictEqual(
url.format(myURL, {unicode: false}),
'http://xn--lck1c3crb1723bpq4a.com/a?a=b#c'
);

assert.strictEqual(
url.format(myURL, {unicode: 0}),
'http://xn--lck1c3crb1723bpq4a.com/a?a=b#c'
);
3 changes: 2 additions & 1 deletion tools/doc/type-parser.js
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,8 @@ const typeMap = {
'Iterable': jsDocPrefix +
'Reference/Iteration_protocols#The_iterable_protocol',
'Iterator': jsDocPrefix +
'Reference/Iteration_protocols#The_iterator_protocol'
'Reference/Iteration_protocols#The_iterator_protocol',
'URL': 'url.html#url_the_whatwg_url_api'
};

module.exports = {
Expand Down