Inventory of existing extensions to SPARQL 1.1
Example:
FILTER("1999"^^xsd:gYear < "2009-01-01T20:20:20Z"^^xsd:dateTime) => true (under strict semantics this is a type error)
Examples:
"2013-11"^^xsd:gYearMonth + "P1Y1M"^^xsd:yearMonthDuration => "P2Y1M"^^xsd:yearMonthDuration
"12"^^xsd:integer * "P1Y"^^xsd:yearMonthDuration => "P12Y"^^xsd:yearMonthDuration
See the GeoSPARQL specification.
Example:
?subj search:matches [
search:query "search terms...";
search:property my:property;
search:score ?score;
search:snippet ?snippet ] .
Jena provides the functions from "XPath and XQuery Functions and Operators 3.1" for the atomic (so not sequences), non-XML-related datatypes, except for the picture string formatting operations (no contribution and no requests so far). It does provide an "afn:sprintf" with the more programmer-centric formatting strings.
Example:
?segment apf:strSplit (?s ", ")
This is covered by issue 6.
The list syntax is reused to become multiple arguments/results for the operation.
https://jena.apache.org/documentation/query/construct-quad.html
CONSTRUCT {
GRAPH :g { ?s :p ?o }
:s ?p :o
} WHERE { ... }
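To illustrate how a quad template is instantiated, here is a small Python sketch. It is not Jena code: the tuple layout `(graph, subject, predicate, object)`, the `DEFAULT_GRAPH` marker and the sample solution are assumptions made for this example, and skipping quads with unbound variables is assumed to mirror ordinary CONSTRUCT behaviour.

```python
DEFAULT_GRAPH = None  # marker for the default graph in this sketch (assumption)

# Template mirroring:  GRAPH :g { ?s :p ?o }   :s ?p :o
template = [
    (":g", "?s", ":p", "?o"),
    (DEFAULT_GRAPH, ":s", "?p", ":o"),
]

def is_var(term):
    """True for terms written as variables, e.g. '?s'."""
    return isinstance(term, str) and term.startswith("?")

def instantiate(template, solutions):
    """Substitute each solution into the template and collect the quads."""
    quads = set()
    for row in solutions:
        for quad in template:
            terms = tuple(row.get(t, t) if is_var(t) else t for t in quad)
            if any(is_var(t) for t in terms):
                continue  # a variable stayed unbound: skip this quad
            quads.add(terms)
    return quads

# Hypothetical solution produced by the WHERE clause.
solutions = [{"?s": ":a", "?p": ":q", "?o": ":b"}]
for quad in sorted(instantiate(template, solutions), key=str):
    print(quad)
```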
https://jena.apache.org/documentation/query/generate-json-from-sparql.html
JSON {
"author": ?author,
"title": ?title
} WHERE { ... }
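As a rough picture of what the JSON template does, the Python sketch below builds one JSON object per solution. It assumes the result is a JSON array of objects and that an unbound variable becomes null; both are assumptions of this sketch, and `apply_template` and the sample data are hypothetical.

```python
import json

# Hypothetical solution sequence, standing in for the result of the WHERE clause.
solutions = [
    {"author": "Jane Doe", "title": "Graphs"},
    {"author": "John Roe", "title": "Queries"},
]

# The template maps JSON keys to variable names, mirroring
# JSON { "author": ?author, "title": ?title }.
template = {"author": "author", "title": "title"}

def apply_template(template, solutions):
    """Build one JSON object per solution; unbound variables become None (null)."""
    return [{key: row.get(var) for key, var in template.items()} for row in solutions]

print(json.dumps(apply_template(template, solutions), indent=2))
```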
BIND(CALL(?x, ?y) AS ?z)
See issue #20
STDEV, STDEV_SAMP, STDEV_POP, VARIANCE, VAR_SAMP, VAR_POP following the SQL operations.
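The difference between the sample and population variants can be checked with Python's standard `statistics` module. This is only an illustration of the arithmetic on made-up values. Which variant the unsuffixed STDEV and VARIANCE map to is not stated here, so the sketch covers only the suffixed forms.

```python
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample data

# Sample statistics divide by n - 1 (the *_SAMP aggregates).
var_samp = statistics.variance(values)
stdev_samp = statistics.stdev(values)

# Population statistics divide by n (the *_POP aggregates).
var_pop = statistics.pvariance(values)
stdev_pop = statistics.pstdev(values)

print("VAR_SAMP  ", var_samp)
print("STDEV_SAMP", stdev_samp)
print("VAR_POP   ", var_pop)
print("STDEV_POP ", stdev_pop)
```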
Fuseki supports GET, PUT, and POST on the whole dataset, treated as quads.
Connectors to Lucene, SOLR, Elastic that implement FTS, sorting, limit/offset, snippet extraction (hit highlighting), facets (including hierarchical), aggregations, sub-aggregations
GraphDB-Mongo integration lets you store voluminous JSON documents in MongoDB and bring only the relevant parts of them into GraphDB as JSON-LD.
Built-in graphs onto:implicit and onto:explicit return only inferred or explicit triples, respectively. This is similar to SPARQL entailment regimes but not compatible with them.
Delete optimization: inferred triples without support are retracted (no full re-inference is needed). The predicate onto:schemaTransaction is used to mark axiomatic (T-Box) triples to make this process more efficient.
sameAs optimization: sameAs-equivalent URLs are treated as a cluster, and combinatorial triple expansion is avoided. The graph onto:disable-sameAs controls whether such triples should be returned in result sets (whether the clustered URLs should be enumerated).
Explain Plan: the graph onto:explain returns a query plan instead of actually executing the query.
Semantic similarity searches based on text and graph (predication) vector embedding (distributional semantics)
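The idea behind the sameAs optimization listed above (treating equivalent URLs as one cluster instead of expanding every triple for every equivalent URL) can be sketched with a union-find structure. This is a generic Python illustration, not GraphDB's implementation; the data and the choice of the lexicographically smallest IRI as representative are assumptions.

```python
def find(parent, x):
    """Return the representative of x's cluster, compressing the path."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def union(parent, a, b):
    """Merge the clusters of a and b."""
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        parent[max(ra, rb)] = min(ra, rb)  # deterministic representative

# Hypothetical data: three IRIs declared equivalent, plus two ordinary triples.
same_as = [("ex:a", "ex:b"), ("ex:b", "ex:c")]
triples = [("ex:a", "ex:knows", "ex:x"), ("ex:c", "ex:knows", "ex:y")]

parent = {}
for a, b in same_as:
    union(parent, a, b)

# Store each triple once, against the cluster representative, instead of
# materialising one copy per equivalent IRI.
stored = {(find(parent, s), p, find(parent, o)) for s, p, o in triples}
print(sorted(stored))
```

Enumerating the clustered URLs in results would then amount to expanding each representative back into its cluster members at query time.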
The documentation for Blazegraph extensions can be found on the Blazegraph wiki.
Named subqueries let you pre-compute solution sets which may be used multiple times within your query. They are useful when you want to process some subset of your data in multiple ways within a single query. You may also have multiple named subqueries. Each named subquery result can be INCLUDEd into the query in one or more places. The solution sets will be stored on the native heap (HTree) if the analytic query mode is enabled.
SELECT ...
WITH {
# Subquery goes here
} AS %NAME
WHERE {
# Main query goes here
INCLUDE %NAME
}
Blazegraph supports persistent named solution sets, which can be created either with INSERT INTO ... SELECT syntax:
INSERT INTO %solutionSet1
SELECT ?product ?reviewer
WHERE {
?product a bsbm-inst:ProductType1 .
?review bsbm:reviewFor ?product ;
rev:reviewer ?reviewer .
?reviewer bsbm:country ?country .
}
or can be managed explicitly by using:
CREATE ( SILENT )? (GRAPH IRIref | SOLUTIONS %VARNAME ( QuadData )? )
DROP ( SILENT )? (GRAPH IRIref | DEFAULT | NAMED | ALL | GRAPHS | SOLUTIONS | SOLUTIONS %VARNAME)
CLEAR ( SILENT )? (GRAPH IRIref | DEFAULT | NAMED | ALL | GRAPHS | SOLUTIONS | SOLUTIONS %VARNAME)
Named solution sets can be used with the INCLUDE %solutionSet syntax described above for named subqueries.
Blazegraph supports an extended syntax that allows attaching triples to a statement used as a subject (RDF reification). Example:
@prefix : <http://bigdata.com> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix dct: <http://purl.org/dc/elements/1.1/> .
:bob foaf:name "Bob" .
<<:bob foaf:age 23>> dct:creator <http://example.com/crawlers#c1> ;
dct:source <http://example.net/homepage-listing.html> .
and the query using this:
PREFIX : <http://bigdata.com>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dct: <http://purl.org/dc/elements/1.1/>
SELECT ?age ?src WHERE {
?bob foaf:name "Bob" .
<<?bob foaf:age ?age>> dct:source ?src .
}
Blazegraph provides an integrated full text indexing and search facility.
Example:
SELECT ?subj ?score
WHERE {
?lit bds:search "mike" .
?lit bds:relevance ?score .
?subj ?p ?lit .
}
This is translated by the query optimizer into a SERVICE clause:
SELECT ?subj ?score
WHERE {
SERVICE <http://www.bigdata.com/rdf/search#search> {
?lit bds:search "mike" .
?lit bds:relevance ?score .
}
?subj ?p ?lit .
}
External FTS engines are supported too:
PREFIX fts: <http://www.bigdata.com/rdf/fts#>
SELECT ?res WHERE {
?res fts:search "Alice" .
?res fts:endpoint "http://localhost:1234/solr/blazegraph/select" .
}
Geospatial datatypes can be queried using Blazegraph’s custom SERVICE extension.
Example:
SELECT * WHERE {
SERVICE geo:search {
?event geo:search "inCircle" .
?event geo:searchDatatype geoliteral:lat-lon-time .
?event geo:predicate example:happened .
?event geo:spatialCircleCenter "48.13743#11.57549" .
?event geo:spatialCircleRadius "100" . # default unit: Kilometers
?event geo:timeStart "1356994800" .
?event geo:timeEnd "1388530799" . # 31.12.2013, 23:59:59
}
}
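What an "inCircle" search with a time window selects can be sketched in Python. The haversine formula, the mean Earth radius, the inclusive bounds, and the sample events are assumptions of this sketch; it does not describe how Blazegraph evaluates the query internally.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    p1, p2 = radians(lat1), radians(lat2)
    dlat, dlon = p2 - p1, radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(p1) * cos(p2) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def in_circle(event, center, radius_km, t_start, t_end):
    """True if the event lies within the circle and the time window."""
    lat, lon, ts = event
    c_lat, c_lon = (float(x) for x in center.split("#"))
    inside = haversine_km(lat, lon, c_lat, c_lon) <= radius_km
    return inside and t_start <= ts <= t_end

# Hypothetical events as (latitude, longitude, Unix timestamp).
events = {
    "augsburg": (48.3705, 10.8978, 1370000000),  # near the centre, in the window
    "berlin": (52.5200, 13.4050, 1370000000),    # in the window, too far away
    "too_late": (48.1374, 11.5755, 1400000000),  # at the centre, after timeEnd
}
hits = [name for name, e in events.items()
        if in_circle(e, "48.13743#11.57549", 100, 1356994800, 1388530799)]
print(hits)
```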
Blazegraph provides a set of algorithms for implementing [graph traversals](https://wiki.blazegraph.com/wiki/index.php/RDF_GAS_API). Example of a breadth-first search (BFS) over the graph:
PREFIX gas: <http://www.bigdata.com/rdf/gas#>
SELECT ?depth (count(?out) as ?cnt) {
SERVICE gas:service {
gas:program gas:gasClass "com.bigdata.rdf.graph.analytics.BFS" .
gas:program gas:in <ip:/112.174.24.90> . # one or more times, specifies the initial frontier.
gas:program gas:out ?out . # exactly once - will be bound to the visited vertices.
gas:program gas:out1 ?depth . # exactly once - will be bound to the depth of the visited vertices.
gas:program gas:maxIterations 4 . # optional limit on breadth first expansion.
gas:program gas:maxVisited 2000 . # optional limit on the #of visited vertices.
}
}
group by ?depth
order by ?depth
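For readers unfamiliar with what the BFS program computes, the Python sketch below performs a breadth-first expansion from an initial frontier, records the depth of each visited vertex, and counts vertices per depth (the GROUP BY step). The exact semantics of the two limits are assumed here, and the graph is made up.

```python
from collections import Counter, deque

def bfs_depths(graph, frontier, max_iterations=None, max_visited=None):
    """Return {vertex: depth} for vertices reachable from the initial frontier."""
    depth = {v: 0 for v in frontier}
    queue = deque(frontier)
    while queue:
        v = queue.popleft()
        if max_iterations is not None and depth[v] >= max_iterations:
            continue  # do not expand beyond the iteration limit
        for w in graph.get(v, ()):
            if w in depth:
                continue  # already visited at an equal or smaller depth
            if max_visited is not None and len(depth) >= max_visited:
                return depth  # visited-vertex budget exhausted
            depth[w] = depth[v] + 1
            queue.append(w)
    return depth

# Hypothetical adjacency list.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": []}
depths = bfs_depths(graph, ["a"], max_iterations=2)

# Equivalent of: SELECT ?depth (count(?out) as ?cnt) ... group by ?depth order by ?depth
per_depth = sorted(Counter(depths.values()).items())
print(per_depth)
```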
Graphs can be members of a virtual graph. Membership in the virtual graph can be declared with triples:
:vg bd:virtualGraph :g1
:vg bd:virtualGraph :g2
and then can be used as:
FROM VIRTUAL GRAPH :vg
FROM NAMED VIRTUAL GRAPH :vg
Blazegraph has SPARQL Update syntax to control incremental truth maintenance and entailments:
DISABLE ENTAILMENTS;
ENABLE ENTAILMENTS;
CREATE ENTAILMENTS;
DROP ENTAILMENTS;
Blazegraph supports query hints using magic triples in SPARQL queries. Query hints may be used to change the default behavior of the query plan generator, or the runtime evaluation of the compiled query plan.
Example:
SELECT ?x ?o
WHERE {
# disable join order optimizer for this group graph pattern.
hint:Query hint:optimizer "None" .
?x rdfs:label ?o .
?x rdf:type foaf:Person .
}
Hint scope can be: Query, SubQuery, Group, GroupAndSubGroups, Prior. See the docs for the full list of hints.
Kineo implements a SQL-like syntax for window functions that allows queries implementing "limit per resource", moving averages, quantiles, etc. For example:
# 3 photos from each country
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?image ?country WHERE {
?image a foaf:Image ;
dcterms:coverage [ foaf:name ?country ; dcterms:type "Country" ] ;
.
}
HAVING (ROW_NUMBER() OVER (PARTITION BY ?country) <= 3)
ORDER BY ?country
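The effect of the ROW_NUMBER() filter can be mimicked in a few lines of Python. This is a sketch of the "limit per partition" idea only, using made-up rows; since the query gives no ordering inside each partition, the sketch simply numbers rows in input order, which is an assumption.

```python
from collections import defaultdict

def limit_per_partition(rows, key, limit):
    """Keep rows whose 1-based row number within their partition is <= limit."""
    seen = defaultdict(int)
    kept = []
    for row in rows:
        seen[row[key]] += 1          # ROW_NUMBER() within the partition
        if seen[row[key]] <= limit:  # the HAVING condition
            kept.append(row)
    return sorted(kept, key=lambda r: r[key])  # ORDER BY ?country

# Hypothetical query solutions.
rows = [
    {"image": "img1", "country": "Japan"},
    {"image": "img2", "country": "France"},
    {"image": "img3", "country": "France"},
    {"image": "img4", "country": "France"},
    {"image": "img5", "country": "France"},
    {"image": "img6", "country": "Japan"},
]
for row in limit_per_partition(rows, "country", 3):
    print(row["country"], row["image"])
```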
This is covered by issue 47.
The SERVICE clause can by default be used to delegate queries to other SPARQL endpoints. Comunica generalizes this behaviour by allowing different kinds of sources to be federated over, such as raw RDF documents.
SELECT *
WHERE {
SERVICE <http://example.org/me.rdf> {
?s ?p ?o .
}
}
Related to issue 10.
Find paths between nodes in the RDF graph:
START ?s [ = <IRI> | <GRAPH PATTERN> ] END ?e [ = <IRI> | <GRAPH PATTERN> ]
VIA <GRAPH PATTERN> | <VAR> | <PATH>
[MAX LENGTH <int>]
[ORDER BY <condition>]
[OFFSET <int>]
[LIMIT <int>]
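As a rough mental model of a path query with a start node, an end node, and a maximum length, the Python sketch below finds one shortest path over a set of edges. Whether the syntax above returns shortest paths, all paths, or something else is not stated here, so returning a single shortest path is an assumption, as are the function name and the sample edges.

```python
from collections import deque

def shortest_path(edges, start, end, max_length=None):
    """Return one shortest path (list of nodes) from start to end, or None."""
    adjacency = {}
    for s, o in edges:
        adjacency.setdefault(s, []).append(o)
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == end:
            return path
        if max_length is not None and len(path) - 1 >= max_length:
            continue  # extending this path would exceed the length limit
        for nxt in adjacency.get(path[-1], ()):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None

# Hypothetical edges matched by the VIA pattern.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "e"), ("e", "d")]
print(shortest_path(edges, "a", "d"))
print(shortest_path(edges, "a", "d", max_length=1))
```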