Add query patterns #273

Merged
merged 36 commits into from
Sep 13, 2024
a90ef65
Add new QuerySource
nck-mlcnv Sep 5, 2024
75c22a5
Add pattern queries
nck-mlcnv Sep 5, 2024
0e881b5
Update tests
nck-mlcnv Sep 5, 2024
cd96228
Update schema
nck-mlcnv Sep 5, 2024
ab115c9
Update test file
nck-mlcnv Sep 5, 2024
31676aa
Update documentation
nck-mlcnv Sep 5, 2024
bbc7a20
Fix default value of config
nck-mlcnv Sep 5, 2024
f9ce17a
Remove exception
nck-mlcnv Sep 5, 2024
8e2a09c
Fix hashcode
nck-mlcnv Sep 5, 2024
7d6ec5c
Revert "Fix hashcode"
nck-mlcnv Sep 5, 2024
e11de85
Revert "Remove exception"
nck-mlcnv Sep 5, 2024
6735344
Refactoring of QueryList
nck-mlcnv Sep 5, 2024
c311579
Remove logging statement
nck-mlcnv Sep 5, 2024
d2d6b96
Update doc
nck-mlcnv Sep 5, 2024
7ff23b5
Update graalvm suite
nck-mlcnv Sep 5, 2024
ed5e74d
Fix native compilation
nck-mlcnv Sep 5, 2024
9545444
Update logging statement
nck-mlcnv Sep 5, 2024
7b11c23
Rename query pattern caching to save
nck-mlcnv Sep 6, 2024
cb12590
Update doc
nck-mlcnv Sep 6, 2024
85ee6e9
Update doc
nck-mlcnv Sep 11, 2024
b5811db
Fix most change requests
nck-mlcnv Sep 11, 2024
61df2d0
Fix test
nck-mlcnv Sep 11, 2024
a2ae16f
Enable stdout for graal script
nck-mlcnv Sep 11, 2024
b08f77a
Fix schema
nck-mlcnv Sep 11, 2024
caaaf3b
Fix schema 2
nck-mlcnv Sep 11, 2024
e37ff28
Cleanup
nck-mlcnv Sep 11, 2024
33723ad
Add detection of normal queries
nck-mlcnv Sep 11, 2024
6d47702
Update doc
nck-mlcnv Sep 11, 2024
b72d8cd
Generate random variable if it already exists
nck-mlcnv Sep 11, 2024
575e853
Add additional variable prefix
nck-mlcnv Sep 13, 2024
c263394
Change placeholder pattern
nck-mlcnv Sep 13, 2024
e651e8d
Add hash to instance file
nck-mlcnv Sep 13, 2024
1e93b95
Merge remote-tracking branch 'origin/feature/query-pattern' into feat…
nck-mlcnv Sep 13, 2024
486e98c
Merge remote-tracking branch 'origin/develop' into feature/query-pattern
nck-mlcnv Sep 13, 2024
c7c97c2
Update doc
nck-mlcnv Sep 13, 2024
cf8644b
don't mengle file and limit filename identifier
bigerl Sep 13, 2024
52 changes: 52 additions & 0 deletions docs/configuration/queries.md
@@ -16,6 +16,7 @@ The `queries` property is an object that contains the following properties:
| order | no | `linear` | The order in which the queries are executed. If set to `linear` the queries will be executed in their order inside the file. If `format` is set to `folder`, queries will be sorted by their file name first. | `random` or `linear` |
| seed | no | `0` | The seed for the random number generator that selects the queries. If multiple workers use the same query handler, their seed will be the sum of the given seed and their worker id. | `12345` |
| lang | no | `SPARQL` | Not used for anything at the moment. | |
| template | no | | If set, queries from `path` will be treated as query templates. See [Query Templates](#query-templates) for more information. | |
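
The per-worker seed rule from the `seed` row can be illustrated with a minimal sketch. It uses Python's `random` module purely for illustration; the function names `worker_seed` and `worker_rng` are hypothetical and are not taken from the project, whose random number generator is not shown here.

```python
import random


def worker_seed(seed: int, worker_id: int) -> int:
    """Per-worker seed as described in the table: given seed plus worker id."""
    return seed + worker_id


def worker_rng(seed: int, worker_id: int) -> random.Random:
    """Hypothetical helper: an independent generator per worker."""
    return random.Random(worker_seed(seed, worker_id))


# Workers sharing one query handler get distinct but reproducible seeds.
seeds = [worker_seed(12345, worker_id) for worker_id in range(3)]
print(seeds)  # [12345, 12346, 12347]
```

Deriving the seed this way keeps runs reproducible while avoiding that every worker picks the same sequence of queries.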

## Format

@@ -98,3 +99,54 @@ tasks:
lang: "SPARQL"
# ... additional worker properties
```

## Query Templates
Query templates are queries that contain placeholders for some terms.
Replacement candidates for the placeholders are retrieved by querying a given endpoint.
Because the candidates are taken from actual query solutions,
the resulting queries will yield results against endpoints that hold the same data.

The placeholders are written in the form of `%%[a-zA-Z0-9_]+%%`, which means that any character sequence consisting
of letters, numbers, and underscores, enclosed by `%%` will be interpreted as a placeholder.
The query templates originated from WatDiv,
where the placeholders are of [similar form](https://dsg.uwaterloo.ca/watdiv/basic-testing.shtml).
If a placeholder name is equal to the name of a variable that already occurs in the query,
the placeholder is given a different variable name during candidate generation, so that the two do not clash.
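
The placeholder syntax described above can be checked with a short regular-expression sketch. The pattern is the one given in the text; the function names `find_placeholders` and `is_template` are hypothetical and only illustrate how a query could be recognised as a template.

```python
import re

# Pattern from the documentation: letters, digits and underscores between "%%".
PLACEHOLDER = re.compile(r"%%([a-zA-Z0-9_]+)%%")


def find_placeholders(query: str) -> list[str]:
    """Return placeholder names in order of first appearance, without duplicates."""
    names: list[str] = []
    for name in PLACEHOLDER.findall(query):
        if name not in names:
            names.append(name)
    return names


def is_template(query: str) -> bool:
    """A query counts as a template here if it contains at least one placeholder."""
    return PLACEHOLDER.search(query) is not None


template = "SELECT * WHERE {?s %%var1%% ?o . ?o <http://exa.com> %%var2%%}"
print(find_placeholders(template))  # ['var1', 'var2']
```

A check like `is_template` is one way to tell templates and normal queries apart when both are mixed in the same file or folder.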

Query templates and normal queries can be mixed in the same file or folder.

An example template:
`SELECT * WHERE {?s %%var1%% ?o . ?o <http://exa.com> %%var2%%}`

This template will then be converted to:
`SELECT ?var1 ?var2 WHERE {?s ?var1 ?o . ?o <http://exa.com> ?var2}`
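
The conversion shown above can be sketched as follows. This is not the project's implementation: it assumes the template starts with `SELECT *`, and the suffix-based renaming used when a placeholder name collides with an existing variable is a hypothetical scheme chosen only for illustration.

```python
import re

PLACEHOLDER = re.compile(r"%%([a-zA-Z0-9_]+)%%")
VARIABLE = re.compile(r"[?$]([a-zA-Z0-9_]+)")


def to_select_query(template: str) -> tuple[str, dict[str, str]]:
    """Turn a template into a SELECT query that projects one variable per placeholder.

    Returns the query and the mapping from placeholder name to variable name.
    """
    taken = set(VARIABLE.findall(template))  # variables already used in the query
    mapping: dict[str, str] = {}
    for name in PLACEHOLDER.findall(template):
        if name in mapping:
            continue
        variable, counter = name, 0
        while variable in taken:  # assumed renaming scheme: append a counter
            variable = f"{name}_{counter}"
            counter += 1
        taken.add(variable)
        mapping[name] = variable

    body = PLACEHOLDER.sub(lambda match: "?" + mapping[match.group(1)], template)
    projection = " ".join("?" + variable for variable in mapping.values())
    # Only the first "SELECT *" is rewritten; other query forms are left untouched.
    query = re.sub(r"SELECT\s+\*", "SELECT " + projection, body, count=1, flags=re.IGNORECASE)
    return query, mapping


query, mapping = to_select_query(
    "SELECT * WHERE {?s %%var1%% ?o . ?o <http://exa.com> %%var2%%}"
)
print(query)
```

The returned mapping is kept so that solutions for the generated variables can later be matched back to the placeholders they stand for.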

The SELECT query is then sent to the given SPARQL endpoint (e.g., DBpedia).
The solutions of this query are used to instantiate the template.
The resulting queries may look like the following:
- `SELECT * WHERE {?s <http://prop/1> ?o . ?o <http://exa.com> "123"}`
- `SELECT * WHERE {?s <http://prop/1> ?o . ?o <http://exa.com> "12"}`
- `SELECT * WHERE {?s <http://prop/2> ?o . ?o <http://exa.com> "1234"}`
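
The instantiation step can be sketched as a plain substitution. The sketch below assumes that the solutions are already available as dictionaries from placeholder name to a serialized RDF term; the hand-written solutions stand in for an endpoint response, and the function name `instantiate` is hypothetical.

```python
import re

PLACEHOLDER = re.compile(r"%%([a-zA-Z0-9_]+)%%")


def instantiate(template: str, solutions: list[dict[str, str]]) -> list[str]:
    """Create one query per solution by replacing each placeholder with its term."""
    instances: list[str] = []
    for solution in solutions:
        instance = PLACEHOLDER.sub(lambda match: solution[match.group(1)], template)
        if instance not in instances:  # skip duplicate instances
            instances.append(instance)
    return instances


template = "SELECT * WHERE {?s %%var1%% ?o . ?o <http://exa.com> %%var2%%}"
# Assumed example solutions, mirroring the instances listed above.
solutions = [
    {"var1": "<http://prop/1>", "var2": '"123"'},
    {"var1": "<http://prop/1>", "var2": '"12"'},
    {"var1": "<http://prop/2>", "var2": '"1234"'},
]
for instance in instantiate(template, solutions):
    print(instance)
```

Because every substituted term comes from a solution of the SELECT query, each instance matches at least one result on data identical to that of the queried endpoint.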

### Configuration
The `template` attribute has the following properties:

| property | required | default | description | example |
|----------|----------|---------|---------------------------------------------------------------------|-----------------------------|
| endpoint | yes | | The endpoint to query. | `http://dbpedia.org/sparql` |
| limit | no | `2000` | The maximum number of instances per query template. | `100` |
| save | no | `true` | If set to `true`, query instances will be saved in a separate file. | `false` |

If the `save` attribute is set to `true`,
the instances will be saved in a separate file in the same directory as the query templates.
If the query templates are stored in a folder, the instances will be saved in the parent directory.
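
The two location rules above collapse into a single one: the instances end up in the parent of the configured `path`. A minimal sketch, in which the function name `instance_directory` is hypothetical and the name of the instance file itself is deliberately left out because it is not specified here:

```python
from pathlib import Path


def instance_directory(query_path: str) -> Path:
    """Directory for the saved instances, following the two rules described above.

    For a template file this is the directory containing the file; for a template
    folder it is the folder's parent. Both are the parent of the given path.
    """
    return Path(query_path).parent


print(instance_directory("./example/queries.txt"))      # example
print(instance_directory("./example/suite/queries/"))   # example/suite
```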

Example of query configuration with query templates:
```yaml
queries:
path: "./example/suite/queries/"
format: "folder"
template:
endpoint: "http://dbpedia.org/sparql"
limit: 100
save: true
```
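
The defaults from the table and the constraints from the schema in this change set (required `endpoint`, `limit` of at least 1, no additional properties) can be summarised in a small validation sketch. The names `TemplateConfig` and `parse_template_config` are hypothetical; this is not how the project itself reads its configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemplateConfig:
    endpoint: str
    limit: int = 2000   # default from the table above
    save: bool = True   # default from the table above


def parse_template_config(raw: dict) -> TemplateConfig:
    """Validate a parsed `template` mapping and fill in the documented defaults."""
    unknown = set(raw) - {"endpoint", "limit", "save"}
    if unknown:
        raise ValueError(f"unknown properties: {sorted(unknown)}")
    endpoint = raw.get("endpoint")
    if not isinstance(endpoint, str):
        raise ValueError("'endpoint' is required and must be a string")
    limit = raw.get("limit", 2000)
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(limit, bool) or not isinstance(limit, int) or limit < 1:
        raise ValueError("'limit' must be an integer >= 1")
    save = raw.get("save", True)
    if not isinstance(save, bool):
        raise ValueError("'save' must be a boolean")
    return TemplateConfig(endpoint=endpoint, limit=limit, save=save)


config = parse_template_config({"endpoint": "http://dbpedia.org/sparql", "limit": 100})
print(config)
```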
6 changes: 5 additions & 1 deletion example-suite.yml
@@ -74,7 +74,11 @@ tasks:
number: 16
requestType: post query
queries:
- path: "./example/queries.txt"
+ path: "./example/query_pattern.txt"
+ pattern:
+ endpoint: "https://dbpedia.org/sparql"
+ limit: 1000
+ save: false
timeout: 180s
completionTarget:
duration: 1000s
2 changes: 1 addition & 1 deletion graalvm/queries.txt
@@ -1 +1 @@
- placeholder
+ SELECT * WHERE {?s %%var1%% ?o . ?o %%var3%% %%var2%%}
4 changes: 4 additions & 0 deletions graalvm/suite.yml
@@ -59,6 +59,10 @@ tasks:
order: "random"
seed: 123
lang: "SPARQL"
template:
endpoint: "http://dbpedia.org/sparql"
limit: 1
save: false
timeout: 2s
connection: Blazegraph
completionTarget:
2 changes: 2 additions & 0 deletions pom.xml
@@ -315,6 +315,8 @@
-O3
-H:-UseCompressedReferences
-H:+UnlockExperimentalVMOptions
--enable-http
--enable-https
</buildArgs>
<metadataRepository>
<enabled>true</enabled>
29 changes: 26 additions & 3 deletions schema/iguana-schema.json
@@ -183,8 +183,8 @@
}
},
"required": [
- "type",
- "directory"
+ "type",
+ "directory"
],
"title": "CSVStorage"
},
@@ -335,9 +335,29 @@
"type": "object",
"unevaluatedProperties": false,
"required": [
- "duration"
+ "duration"
]
},
"Template": {
"type": "object",
"additionalProperties": false,
"properties": {
"endpoint": {
"type": "string"
},
"limit": {
"type": "integer",
"minimum": 1
},
"save": {
"type": "boolean"
}
},
"required": [
"endpoint"
],
"title": "Template"
},
"QueryMixes": {
"properties": {
"number": {
@@ -379,6 +399,9 @@
"lang": {
"type": "string",
"enum": [ "", "SPARQL" ]
},
"template": {
"$ref": "#/definitions/Template"
}
},
"required": [