From 5807b28261e44a47e31683230137da395ddc79d8 Mon Sep 17 00:00:00 2001
From: Rimas
Date: Tue, 7 Mar 2017 16:12:16 +0200
Subject: [PATCH] Add empty host concept for file and non-special URLs

Fixes #258.
---
 url.bs | 76 +++++++++++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 59 insertions(+), 17 deletions(-)

diff --git a/url.bs b/url.bs
index 269c61ae..ec35294e 100644
--- a/url.bs
+++ b/url.bs
@@ -254,9 +254,9 @@ point URLs from A can come from untrusted sources.

Host representation

 A host is a domain, an
-IPv4 address, an IPv6 address, or an opaque host. Typically a host
-serves as a network address, but it is sometimes used as opaque identifier in URLs
-where a network address is not necessary.
+IPv4 address, an IPv6 address, an opaque host, or an empty host.
+Typically a host serves as a network address, but it is sometimes used as opaque
+identifier in URLs where a network address is not necessary.

 The RFCs referenced in the paragraphs below are for informative purposes only. They have no influence on host writing, parsing, and serialization. Unless stated otherwise
@@ -280,11 +280,10 @@ eight 16-bit pieces.

 Support for <zone_id> is intentionally omitted.

-An opaque host is an ASCII string holding data that can be used for
-further processing.
+An opaque host is a non-empty ASCII string holding data that can be used
+for further processing.

-An opaque host is only used by non-special
-URLs.
+An empty host is the empty string.
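As a rough illustration of the two definitions introduced by this change, the sketch below labels a string-valued host as an opaque host or an empty host. The function name `classify_string_host` is hypothetical, and the sketch assumes the caller already knows the string is not a domain; it checks only the non-empty and ASCII conditions stated in the patch.

```python
def classify_string_host(host: str) -> str:
    """Hypothetical helper: label a string host per the definitions above."""
    if host == "":
        # "An empty host is the empty string."
        return "empty host"
    if host.isascii():
        # "An opaque host is a non-empty ASCII string ..."
        return "opaque host"
    raise ValueError("an opaque host must be an ASCII string")
```

Before this patch an opaque host could itself be the empty string; splitting the empty case into its own kind of host is what lets the rest of the patch refer to it by name.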

Host miscellaneous

@@ -383,7 +382,7 @@ up to three ASCII digits per sequence, each representing a decimal number
 XXX should we define the format inline instead just like STD 66? -->

-An valid opaque-host string must be zero or more URL units or:
+A valid opaque-host string must be one or more URL units or:
 "[", followed by a valid IPv6-address string, followed by "]".

 This is not part of the definition of valid host string as it
@@ -768,7 +767,8 @@ no purpose other than being a location the algorithm can jump to.
 IPv6 serializer on host, followed by "]".

-  • Otherwise, host is a domain or opaque host, return host.
+  • Otherwise, host is a domain, opaque host, or empty host,
+    return host.

 The IPv4 serializer takes an
@@ -1002,6 +1002,48 @@ It is initially the empty string.

     A URL's host is null or a host. It is initially null.
+
+    The following table lists allowed URL's scheme /
+    host combinations.
+
+    scheme             | host
+                       | domain | IPv4 address | IPv6 address | opaque host | empty host | null
+    non-"file" special | ✅     | ✅           | ✅           | ❌          | ❌         | ❌
+    "file"             | ✅     | ✅           | ✅           | ❌          | ✅         | ✅
+    non-special        | ❌     | ❌           | ✅           | ✅          | ✅         | ✅
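
The table can also be read as a lookup from scheme category to the set of permitted host kinds. The following is a minimal sketch of that reading; the names `ALLOWED_HOSTS` and `is_allowed_combination` are hypothetical, and deciding which category a concrete scheme falls into is left to the caller.

```python
# Host kinds, in the column order of the table.
HOST_KINDS = (
    "domain", "IPv4 address", "IPv6 address",
    "opaque host", "empty host", "null",
)

# One entry per table row; each set holds the columns marked as allowed.
ALLOWED_HOSTS = {
    'non-"file" special': {"domain", "IPv4 address", "IPv6 address"},
    '"file"': {"domain", "IPv4 address", "IPv6 address", "empty host", "null"},
    "non-special": {"IPv6 address", "opaque host", "empty host", "null"},
}


def is_allowed_combination(scheme_category: str, host_kind: str) -> bool:
    """Hypothetical check of a scheme category / host kind pair against the table."""
    if host_kind not in HOST_KINDS:
        raise ValueError(f"unknown host kind: {host_kind!r}")
    return host_kind in ALLOWED_HOSTS[scheme_category]
```

For example, `is_allowed_combination('"file"', "empty host")` is true, while `is_allowed_combination('non-"file" special', "null")` is false.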
    +

    +

     A URL's port is either null or a 16-bit unsigned integer that identifies a networking port. It is initially null.
@@ -1172,9 +1214,10 @@ switching on base URL's scheme:

    a path-relative-scheme-less-URL string

    "file"

     a scheme-relative-file-URL string

-    a path-absolute-URL string if base URL's host is null
+    a path-absolute-URL string if base URL's host is an
+    empty host

     a path-absolute-non-Windows-file-URL string if base URL's host
-    is non-null
+    is not an empty host

    a path-relative-scheme-less-URL string

    Otherwise

     a scheme-relative-URL string
@@ -1198,8 +1241,8 @@ optionally followed by a path-absolute-URL string.
 "//", followed by an opaque-host-and-port string, optionally followed by a path-absolute-URL string.

-An opaque-host-and-port string must be either an empty
-valid opaque-host string or: a non-empty valid opaque-host string, optionally followed
+An opaque-host-and-port string must be either the empty
+string or: a valid opaque-host string, optionally followed
 by ":" and a URL-port string.

     A scheme-relative-file-URL string must be
@@ -2066,11 +2109,10 @@ string input, optionally with a base URL base, opti
 Windows drive letter, then:

-      1. If url's host is non-null,
-      validation error.
+      1. If url's host is neither the empty string nor null,
+      validation error, set url's host to the empty string.

-      2. Set url's host to null and replace the second
-      code point in buffer with ":".
+      2. Replace the second code point in buffer with ":".

    This is a (platform-independent) Windows drive letter quirk.
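
The revised substeps can be sketched as follows. This is an illustration only, not the parser itself: the `Url` class, its `errors` list, and the function name `apply_drive_letter_quirk` are hypothetical, and the sketch assumes the caller has already established that `buffer` is a Windows drive letter.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Url:
    """Hypothetical stand-in for a URL record; only the host is modelled."""
    host: Optional[str] = None
    errors: List[str] = field(default_factory=list)


def apply_drive_letter_quirk(url: Url, buffer: str) -> str:
    """Apply the two revised substeps and return the updated buffer.

    Assumes buffer is already known to be a Windows drive letter.
    """
    # Substep 1: a host that is neither the empty string nor null is a
    # validation error, and the host is reset to the empty string.
    if url.host is not None and url.host != "":
        url.errors.append("validation error")
        url.host = ""
    # Substep 2: replace the second code point in buffer with ":".
    return buffer[0] + ":" + buffer[2:]
```

Unlike the removed wording, a null host stays null here; only a non-empty host is overwritten, and it becomes the empty string rather than null.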