Commit

Adds BNF and text to extend LANGTAG to support base direction and term constructors for creating directional language-tagged strings.

Fixes #32.
gkellogg committed Sep 25, 2023
1 parent 43e8215 commit 994035b
Showing 3 changed files with 38 additions and 13 deletions.
47 changes: 36 additions & 11 deletions spec/index.html
@@ -85,7 +85,9 @@
<a data-cite="RDF12-CONCEPTS#dfn-subject">subject</a> or
<a data-cite="RDF12-CONCEPTS#dfn-object">object</a> of another
<a data-cite="RDF12-CONCEPTS#dfn-rdf-triple">triple</a>,
making it possible to make statements about other statements.</p>
making it possible to make statements about other statements.
RDF 1.2 N-Triples also adds support for
<a data-cite="RDF12-CONCEPTS#dfn-dir-lang-string">directional language-tagged strings</a>.</p>
</section>

<section id='sotd'>
@@ -231,7 +233,11 @@ <h3>RDF Literals</h3>
are used to identify values such as strings, numbers, dates.</p>

<p>Literals (Grammar production <a href="#grammar-production-literal">Literal</a>)
have a lexical form followed by a language tag, a datatype IRI, or neither.</p>
have a lexical form followed by a
<a data-cite="RDF12-CONCEPTS#dfn-language-tag">language tag</a>
(possibly including <a data-cite="RDF12-CONCEPTS#dfn-base-direction">base direction</a>),
a <a data-cite="RDF12-CONCEPTS#dfn-datatype-iri">datatype IRI</a>,
or neither.</p>

<p>The representation of the lexical form consists of an
initial delimiter <code>&quot;</code> (<span class="codepoint">U+0022</span>),
@@ -252,6 +258,9 @@ <h3>RDF Literals</h3>
is the characters between the delimiters, after processing any escape sequences.
If present, the <a data-cite="RDF12-CONCEPTS#dfn-language-tag">language tag</a>
is preceded by a '<code>@</code>' (<span class="codepoint">U+0040</span>).
If present, the <a data-cite="RDF12-CONCEPTS#dfn-base-direction">base direction</a>
is included in the <a data-cite="RDF12-CONCEPTS#dfn-language-tag">language tag</a>
separated by '<code>--</code>'.
If there is no language tag, there may be a <a data-cite="RDF12-CONCEPTS#dfn-datatype-iri">datatype IRI</a>,
preceded by two concatenated <code>^</code> characters,
each having the code point <span class="codepoint">U+005E</span>.
@@ -267,8 +276,9 @@ <h3>RDF Literals</h3>
<http://example.org/show/218> <http://www.w3.org/2000/01/rdf-schema#label> "That Seventies Show"^^<http://www.w3.org/2001/XMLSchema#string> . # literal with XML Schema string datatype
<http://example.org/show/218> <http://www.w3.org/2000/01/rdf-schema#label> "That Seventies Show" . # same as above
<http://example.org/show/218> <http://example.org/show/localName> "That Seventies Show"@en . # literal with a language tag
<http://example.org/show/218> <http://example.org/show/localName> "That Seventies Show"@en--ltr . # literal with a language tag and base direction
<http://example.org/show/218> <http://example.org/show/localName> "Cette Série des Années Septante"@fr-be . # literal outside of ASCII range with a region subtag
<http://example.org/#spiderman> <http://example.org/text> "This is a multi-line\nliteral with many quotes (\"\"\"\"\")\nand two apostrophes ('')." .
<http://example.org/#spiderman> <http://example.org/text> "This is a multi-line\nliteral with many quotes (\"\"\"\"\") and two apostrophes ('')." .
<http://en.wikipedia.org/wiki/Helium> <http://example.org/elements/atomicNumber> "2"^^<http://www.w3.org/2001/XMLSchema#integer> . # xsd:integer
<http://en.wikipedia.org/wiki/Helium> <http://example.org/elements/specificGravity> "1.663E-4"^^<http://www.w3.org/2001/XMLSchema#double> . # xsd:double
-->
@@ -325,7 +335,7 @@ <h2>A Canonical form of N-Triples</h2>
any of which MUST be a single space (<code>U+0020</code>).</li>
<li><a data-cite="RDF12-CONCEPTS#dfn-literal">Literals</a> with the
datatype <code>http://www.w3.org/2001/XMLSchema#string</code>
MUST NOT use the datatype IRI part of the <a href="#grammar-production-literal">literal</a>,
MUST NOT use the <a data-cite="RDF12-CONCEPTS#dfn-datatype-iri">datatype IRI</a> part of the <a href="#grammar-production-literal">literal</a>,
and are represented using only <a href="#grammar-production-STRING_LITERAL_QUOTE">STRING_LITERAL_QUOTE</a>.
</li>
<li><code><a href="#grammar-production-HEX">HEX</a></code> MUST use only uppercase letters (<code>[A-F]</code>).</li>
@@ -498,7 +508,9 @@ <h3>RDF Term Constructors</h3>
<a data-cite="RDF12-CONCEPTS#dfn-language-tag">language tag</a>
</td>
<td>
The characters following the <code>@</code> form the unicode string of the language tag.
The characters following the <code>@</code> form the Unicode string of the <a data-cite="RDF12-CONCEPTS#dfn-language-tag">language tag</a>
and optionally the <a data-cite="RDF12-CONCEPTS#dfn-base-direction">base direction</a>,
if the matched characters include '<code>--</code>'.
</td>
</tr>
<tr id="handle-STRING_LITERAL_QUOTE">
@@ -523,13 +535,24 @@ <h3>RDF Term Constructors</h3>
<td>
The literal has a lexical form of the first rule argument,
<code>STRING_LITERAL_QUOTE</code>,
and either a language tag of <code>LANGTAG</code>
or a datatype IRI of <code>iri</code>,
and either a <a data-cite="RDF12-CONCEPTS#dfn-language-tag">language tag</a>
with optional <a data-cite="RDF12-CONCEPTS#dfn-base-direction">base direction</a>
from <code><a href="#grammar-production-LANGTAG" class="type langTag">LANGTAG</a></code>
or a <a data-cite="RDF12-CONCEPTS#dfn-datatype-iri">datatype IRI</a> of <code>iri</code>,
depending on which rule matched the input.
If the <code>LANGTAG</code> rule matched,
If the <code><a href="#grammar-production-LANGTAG" class="type langTag">LANGTAG</a></code> rule matched,
it is split into <a data-cite="RDF12-CONCEPTS#dfn-language-tag">language tag</a>
and <a data-cite="RDF12-CONCEPTS#dfn-base-direction">base direction</a>
on '<code>--</code>'.
If there is no <a data-cite="RDF12-CONCEPTS#dfn-base-direction">base direction</a>,
the datatype is <code>rdf:langString</code>
and the language tag is <code>LANGTAG</code>.
If neither a language tag nor a datatype IRI is provided,
and the <a data-cite="RDF12-CONCEPTS#dfn-language-tag">language tag</a> is <code><a href="#grammar-production-LANGTAG" class="type langTag">LANGTAG</a></code>.
If there is a base direction, the datatype is <code>rdf:dirLangString</code>,
the <a data-cite="RDF12-CONCEPTS#dfn-language-tag">language tag</a> is
taken from the portion of the matched <code><a href="#grammar-production-LANGTAG" class="type langTag">LANGTAG</a></code> preceding '<code>--</code>'
and the <a data-cite="RDF12-CONCEPTS#dfn-base-direction">base direction</a>
is taken from the portion of the matched <code><a href="#grammar-production-LANGTAG" class="type langTag">LANGTAG</a></code> following '<code>--</code>'.
If neither a language tag nor a <a data-cite="RDF12-CONCEPTS#dfn-datatype-iri">datatype IRI</a> is provided,
the literal has a datatype of <code>xsd:string</code>.
</td>
</tr>
@@ -743,8 +766,10 @@ <h2>Changes between RDF 1.1 and RDF 1.2</h2>
<li>Separated <a href="#security"></a> from <a href="#sec-mediaReg-n-triples"></a>
and updated language.</li>
<li>Changes <a href="#canonical-ntriples"></a> to clarify
use of datatype IRIs and expand the use of escapes
use of <a data-cite="RDF12-CONCEPTS#dfn-datatype-iri">datatype IRIs</a> and expand the use of escapes
in literals.</li>
<li>Extends <a href="#grammar-production-LANGTAG" class="type langTag">LANGTAG</a> to include
an optional <a data-cite="RDF12-CONCEPTS#dfn-base-direction">base direction</a>.</li>
</ul>
</section>
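
The term-constructor text in this diff says that a matched LANGTAG is split on '--' into a language tag and an optional base direction, and that the literal's datatype follows from which parts are present. A minimal Python sketch of that rule is given here for illustration only. The function names `split_langtag` and `literal_datatype` are hypothetical and do not come from the commit or from any RDF library.

```python
from typing import Optional, Tuple

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XSD = "http://www.w3.org/2001/XMLSchema#"


def split_langtag(langtag: str) -> Tuple[str, Optional[str]]:
    """Split a matched LANGTAG into (language tag, base direction or None).

    Hypothetical helper: assumes the input already matched the LANGTAG
    production, so '--' can occur at most once.
    """
    # Drop the leading '@' that introduces the tag in N-Triples.
    body = langtag[1:] if langtag.startswith("@") else langtag
    language, sep, direction = body.partition("--")
    return language, (direction if sep else None)


def literal_datatype(langtag: Optional[str], datatype_iri: Optional[str]) -> str:
    """Pick the datatype IRI the way the term-constructor prose describes."""
    if langtag is not None:
        _, direction = split_langtag(langtag)
        # A base direction selects rdf:dirLangString, otherwise rdf:langString.
        return RDF + ("dirLangString" if direction is not None else "langString")
    if datatype_iri is not None:
        return datatype_iri
    # Neither a language tag nor a datatype IRI: plain xsd:string.
    return XSD + "string"
```

With this sketch, `split_langtag("@en--ltr")` separates `en` from `ltr`, while `split_langtag("@fr-be")` keeps `fr-be` whole and reports no base direction.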

2 changes: 1 addition & 1 deletion spec/ntriples-bnf.html
@@ -64,7 +64,7 @@ <h3 id="terminals">Productions for terminals</h3>
<td>[11]</td>
<td><code>LANGTAG</code></td>
<td>::=</td>
<td>"<code class="grammar-literal">@</code>" <code class="grammar-brac">[</code><code class="grammar-literal">a-zA-Z</code><code class="grammar-brac">]</code><code class="grammar-plus">+</code> <code class="grammar-paren">(</code>"<code class="grammar-literal">-</code>" <code class="grammar-brac">[</code><code class="grammar-literal">a-zA-Z0-9</code><code class="grammar-brac">]</code><code class="grammar-plus">+</code><code class="grammar-paren">)</code><code class="grammar-star">*</code></td>
<td>"<code class="grammar-literal">@</code>" <code class="grammar-brac">[</code><code class="grammar-literal">a-zA-Z</code><code class="grammar-brac">]</code><code class="grammar-plus">+</code> <code class="grammar-paren">(</code>"<code class="grammar-literal">-</code>" <code class="grammar-brac">[</code><code class="grammar-literal">a-zA-Z0-9</code><code class="grammar-brac">]</code><code class="grammar-plus">+</code><code class="grammar-paren">)</code><code class="grammar-star">*</code> <code class="grammar-paren">(</code>"<code class="grammar-literal">--</code>" <code class="grammar-brac">[</code><code class="grammar-literal">a-zA-Z0-9</code><code class="grammar-brac">]</code><code class="grammar-plus">+</code><code class="grammar-paren">)</code><code class="grammar-opt">?</code></td>
</tr>
<tr id="grammar-production-STRING_LITERAL_QUOTE">
<td>[12]</td>
2 changes: 1 addition & 1 deletion spec/ntriples.bnf
@@ -10,7 +10,7 @@ quotedTriple ::= '<<' subject predicate object '>>'

IRIREF ::= '<' ([^#x00-#x20<>"{}|^`\] | UCHAR)* '>'
BLANK_NODE_LABEL ::= '_:' ( PN_CHARS_U | [0-9] ) ((PN_CHARS|'.')* PN_CHARS)?
LANGTAG ::= "@" [a-zA-Z]+ ( "-" [a-zA-Z0-9]+ )*
LANGTAG ::= "@" [a-zA-Z]+ ( "-" [a-zA-Z0-9]+ )* ( "--" [a-zA-Z0-9]+ )?
STRING_LITERAL_QUOTE ::= '"' ( [^#x22#x5C#xA#xD] | ECHAR | UCHAR )* '"'
UCHAR ::= ( "\u" HEX HEX HEX HEX )
| ( "\U" HEX HEX HEX HEX HEX HEX HEX HEX )
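
The extended LANGTAG production above can be tried out with a regular expression. The following Python sketch transcribes the new production directly. The names `LANGTAG_RE` and `is_langtag` are illustrative assumptions, not part of the commit.

```python
import re

# Direct transcription of the new production:
#   "@" [a-zA-Z]+ ( "-" [a-zA-Z0-9]+ )* ( "--" [a-zA-Z0-9]+ )?
LANGTAG_RE = re.compile(r"@[a-zA-Z]+(?:-[a-zA-Z0-9]+)*(?:--[a-zA-Z0-9]+)?")


def is_langtag(token: str) -> bool:
    """Return True if the whole token matches the extended LANGTAG production."""
    return LANGTAG_RE.fullmatch(token) is not None
```

Because every subtag must be non-empty, '--' can only appear once, as the separator before the base direction. Note that the production itself accepts any alphanumeric run after '--'. This sketch does not restrict it to particular direction values, so a token such as `@en-ltr` is accepted as a plain language tag with no base direction.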