Enable multiQuery optimization for PropertyMapStep and ElementMapStep…
… [cql-tests] [tp-tests]

Adds possibility to fetch properties and labels of vertices using valueMap, elementMap, propertyMap steps.

Adds fetching modes to properties, values, valueMap, elementMap, propertyMap steps to be able to preFetch all properties (single slice query) or only required properties (separate slice query per each requested property).

Fixes JanusGraph#2444

Signed-off-by: Oleksandr Porunov <alexandr.porunov@gmail.com>
porunov committed Jun 10, 2023
1 parent b757b8b commit 5240e3b
Showing 20 changed files with 1,762 additions and 87 deletions.
6 changes: 6 additions & 0 deletions docs/changelog.md
@@ -189,6 +189,12 @@ GraphBinary is now used as the default MessageSerializer.hb
use `query.batch.has-step-mode = none` as replacement for `query.batch-property-prefetch = false` or use
`query.batch.has-step-mode = all_properties` as replacement for `query.batch-property-prefetch = true`.

+`query.fast-property` has no influence on `values`, `properties`, `valueMap`, `propertyMap`, `elementMap` anymore when
+`query.batch.enabled` is `true`. By default, those steps are configured to fetch only required properties
+(with separate query per property), but the behaviour can be changed with the configuration `query.batch.properties-mode`.
+In case previous behavior is desired, use `query.batch.properties-mode = required_properties_only` for `query.fast-property = false`
+or use `query.batch.properties-mode = all_properties` for `query.fast-property = true`.
+
[Batch processing](https://docs.janusgraph.org/operations/batch-processing/) allows JanusGraph to fetch a batch of
vertices from the storage backend together instead of requesting each vertex individually which leads to a high number
of backend queries.
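
The replacement rules in this changelog hunk can be sketched as a small lookup. This is only an illustration of the mapping stated above, not part of JanusGraph: the helper names `replacement_batch_options` and `to_properties` are hypothetical, and the resulting options only matter when `query.batch.enabled` is `true`.

```python
def replacement_batch_options(fast_property=None, batch_property_prefetch=None):
    """Map legacy settings to the replacements named in the changelog.

    Hypothetical helper: pass the old boolean values, get the new option values.
    A legacy setting left as None is simply not translated.
    """
    options = {}
    if batch_property_prefetch is not None:
        # query.batch-property-prefetch = true  -> has-step-mode = all_properties
        # query.batch-property-prefetch = false -> has-step-mode = none
        options["query.batch.has-step-mode"] = (
            "all_properties" if batch_property_prefetch else "none"
        )
    if fast_property is not None:
        # query.fast-property = true  -> properties-mode = all_properties
        # query.fast-property = false -> properties-mode = required_properties_only
        options["query.batch.properties-mode"] = (
            "all_properties" if fast_property else "required_properties_only"
        )
    return options


def to_properties(options):
    """Render the options as `key = value` lines, one per option."""
    return "\n".join(f"{key} = {value}" for key, value in options.items())


if __name__ == "__main__":
    print(to_properties(replacement_batch_options(fast_property=True)))
```

The mapping is kept as plain data so it can be checked against the changelog text line by line.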
14 changes: 9 additions & 5 deletions docs/configs/janusgraph-cfg.md
@@ -346,7 +346,10 @@ Configuration options for query processing

| Name | Description | Datatype | Default Value | Mutability |
| ---- | ---- | ---- | ---- | ---- |
-| query.fast-property | Whether to pre-fetch all properties on first singular vertex property access. This can eliminate backend calls on subsequent property access for the same vertex at the expense of retrieving all properties at once. This can be expensive for vertices with many properties | Boolean | true | MASKABLE |
+| query.fast-property | Whether to pre-fetch all properties on first singular vertex property access. This can eliminate backend calls on subsequent property access for the same vertex at the expense of retrieving all properties at once. This can be expensive for vertices with many properties.
+This setting is applicable to direct vertex properties access (like `vertex.properties("foo")` but not to `vertex.properties("foo","bar")` because the latter case is not a singular property access).
+This setting is not applicable to the following Gremlin steps: `valueMap`, `propertyMap`, `elementMap`, `properties`, `values` (configuration option `query.batch.properties-mode` should be used to configure their behavior).
+When `true` this setting overwrites `query.batch.has-step-mode` to `all_properties` unless `none` mode is used. | Boolean | true | MASKABLE |
| query.force-index | Whether JanusGraph should throw an exception if a graph query cannot be answered using an index. Doing so limits the functionality of JanusGraph's graph queries but ensures that slow graph queries are avoided on large graphs. Recommended for production use of JanusGraph. | Boolean | false | MASKABLE |
| query.hard-max-limit | If smart-limit is disabled and no limit is given in the query, query optimizer adds a limit in light of possibly large result sets. It works in the same way as smart-limit except that hard-max-limit is usually a large number. Default value is Integer.MAX_VALUE which effectively disables this behavior. This option does not take effect when smart-limit is enabled. | Integer | 2147483647 | MASKABLE |
| query.ignore-unknown-index-key | Whether to ignore undefined types encountered in user-provided index queries | Boolean | false | MASKABLE |
@@ -362,17 +365,18 @@ Configuration options to configure batch queries optimization behavior
| Name | Description | Datatype | Default Value | Mutability |
| ---- | ---- | ---- | ---- | ---- |
| query.batch.enabled | Whether traversal queries should be batched when executed against the storage backend. This can lead to significant performance improvement if there is a non-trivial latency to the backend. If `false` then all other configuration options under `query.batch` namespace are ignored. | Boolean | true | MASKABLE |
-| query.batch.has-step-mode | Properties pre-fetching mode for `has` step. Used only when query.batch.enabled is `true`.<br>Supported modes:<br>- `all_properties` Pre-fetch all vertex properties on any property access<br>- `required_properties_only` Pre-fetch necessary vertex properties for the whole chain of foldable `has` steps<br>- `required_and_next_properties` Prefetch the same properties as with `required_properties_only` mode, but also prefetch
+| query.batch.has-step-mode | Properties pre-fetching mode for `has` step. Used only when query.batch.enabled is `true`.<br>Supported modes:<br>- `all_properties` - Pre-fetch all vertex properties on any property access (fetches all vertex properties in a single slice query)<br>- `required_properties_only` - Pre-fetch necessary vertex properties for the whole chain of foldable `has` steps (uses a separate slice query per each required property)<br>- `required_and_next_properties` - Prefetch the same properties as with `required_properties_only` mode, but also prefetch
properties which may be needed in the next properties access step like `values`, `properties,` `valueMap`, `elementMap`, or `propertyMap`.
In case the next step is not one of those properties access steps then this mode behaves same as `required_properties_only`.
In case the next step is one of the properties access steps with limited scope of properties, those properties will be
pre-fetched together in the same multi-query.
In case the next step is one of the properties access steps with unspecified scope of property keys then this mode
-behaves same as `all_properties`.<br>- `required_and_next_properties_or_all` Prefetch the same properties as with `required_and_next_properties`, but in case the next step is not
-`values`, `properties,` `valueMap`, `elementMap`, or `propertyMap` then acts like `all_properties`.<br>- `none` Skips `has` step batch properties pre-fetch optimization.<br> | String | required_and_next_properties | MASKABLE |
+behaves same as `all_properties`.<br>- `required_and_next_properties_or_all` - Prefetch the same properties as with `required_and_next_properties`, but in case the next step is not
+`values`, `properties,` `valueMap`, `elementMap`, or `propertyMap` then acts like `all_properties`.<br>- `none` - Skips `has` step batch properties pre-fetch optimization.<br> | String | required_and_next_properties | MASKABLE |
| query.batch.limited | Configure a maximum batch size for queries against the storage backend. This can be used to ensure responsiveness if batches tend to grow very large. The used batch size is equivalent to the barrier size of a preceding `barrier()` step. If a step has no preceding `barrier()`, the default barrier of TinkerPop will be inserted. This option only takes effect if `query.batch.enabled` is `true`. | Boolean | true | MASKABLE |
| query.batch.limited-size | Default batch size (barrier() step size) for queries. This size is applied only for cases where `LazyBarrierStrategy` strategy didn't apply `barrier` step and where user didn't apply barrier step either. This option is used only when `query.batch.limited` is `true`. Notice, value `2147483647` is considered to be unlimited. | Integer | 2500 | MASKABLE |
-| query.batch.repeat-step-mode | Batch mode for `repeat` step. Used only when query.batch.enabled is `true`.<br>These modes are controlling how the child steps with batch support are behaving if they placed to the start of the `repeat`, `emit`, or `until` traversals.<br>Supported modes:<br>- `closest_repeat_parent` Child start steps are receiving vertices for batching from the closest `repeat` step parent only.<br>- `all_repeat_parents` Child start steps are receiving vertices for batching from all `repeat` step parents.<br>- `starts_only_of_all_repeat_parents` Child start steps are receiving vertices for batching from the closest `repeat` step parent (both for the parent start and for next iterations) and also from all `repeat` step parents for the parent start. | String | all_repeat_parents | MASKABLE |
+| query.batch.properties-mode | Properties pre-fetching mode for `values`, `properties`, `valueMap`, `propertyMap`, `elementMap` steps. Used only when query.batch.enabled is `true`.<br>Supported modes:<br>- `all_properties` - Pre-fetch all vertex properties on non-singular property access (fetches all vertex properties in a single slice query). On single property access this mode behaves the same as `required_properties_only` mode.<br>- `required_properties_only` - Pre-fetch necessary vertex properties only (uses a separate slice query per each required property)<br>- `none` - Skips vertex properties pre-fetching optimization.<br> | String | required_properties_only | MASKABLE |
+| query.batch.repeat-step-mode | Batch mode for `repeat` step. Used only when query.batch.enabled is `true`.<br>These modes are controlling how the child steps with batch support are behaving if they placed to the start of the `repeat`, `emit`, or `until` traversals.<br>Supported modes:<br>- `closest_repeat_parent` - Child start steps are receiving vertices for batching from the closest `repeat` step parent only.<br>- `all_repeat_parents` - Child start steps are receiving vertices for batching from all `repeat` step parents.<br>- `starts_only_of_all_repeat_parents` - Child start steps are receiving vertices for batching from the closest `repeat` step parent (both for the parent start and for next iterations) and also from all `repeat` step parents for the parent start. | String | all_repeat_parents | MASKABLE |
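
As a reading aid for the `query.batch.properties-mode` row above, the following toy model lists which pre-fetch slice queries each mode would plan for one vertex, going only by the option description. It is a sketch, not JanusGraph code: `planned_slice_queries` and the `<all properties>` label are hypothetical, and it only covers steps called with explicit property keys, since the description does not spell out the key-less case for every mode.

```python
ALL_PROPERTIES_SLICE = "<all properties>"  # placeholder label, not a JanusGraph identifier


def planned_slice_queries(mode, requested_keys):
    """Return the pre-fetch slice queries planned for one vertex (toy model).

    Each returned entry stands for one slice query: either a single property key
    or the placeholder for "all properties of the vertex".
    """
    keys = list(requested_keys)
    if not keys:
        # Assumption of this sketch: only explicit keys are modelled.
        raise ValueError("this sketch only models steps with explicit property keys")
    if mode == "none":
        # Pre-fetching optimization is skipped, so nothing is planned up front.
        return []
    if mode == "required_properties_only":
        # A separate slice query per required property.
        return keys
    if mode == "all_properties":
        # Singular access behaves like required_properties_only;
        # non-singular access uses one slice query for all properties.
        return keys if len(keys) == 1 else [ALL_PROPERTIES_SLICE]
    raise ValueError(f"unsupported query.batch.properties-mode: {mode}")


if __name__ == "__main__":
    for mode in ("all_properties", "required_properties_only", "none"):
        print(mode, planned_slice_queries(mode, ["name", "age"]))
```

The trade-off the table describes shows up directly: `required_properties_only` grows with the number of requested keys, while `all_properties` stays at one slice query at the cost of reading every property.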

### schema
Schema related configuration options

1 comment on commit 5240e3b

@github-actions

Benchmark

| Benchmark suite | Current: 5240e3b | Previous: f3cdce1 | Ratio |
| ---- | ---- | ---- | ---- |
| org.janusgraph.JanusGraphSpeedBenchmark.basicAddAndDelete | 14760.74697172899 ms/op | 14332.161923092626 ms/op | 1.03 |
| org.janusgraph.GraphCentricQueryBenchmark.getVertices | 1360.4920308720389 ms/op | 1321.1248866847918 ms/op | 1.03 |
| org.janusgraph.MgmtOlapJobBenchmark.runClearIndex | 218.94617670869565 ms/op | 219.3187360514493 ms/op | 1.00 |
| org.janusgraph.MgmtOlapJobBenchmark.runReindex | 462.5267936142424 ms/op | 457.21256604333337 ms/op | 1.01 |
| org.janusgraph.JanusGraphSpeedBenchmark.basicCount | 379.5563868881741 ms/op | 339.7662905587447 ms/op | 1.12 |
| org.janusgraph.CQLMultiQueryBenchmark.getIdToOutVerticesProjection | 416.3556832806669 ms/op | 398.7058041528264 ms/op | 1.04 |
| org.janusgraph.CQLMultiQueryBenchmark.getElementsWithUsingEmitRepeatSteps | 30444.796893846426 ms/op | 29790.763599713093 ms/op | 1.02 |
| org.janusgraph.CQLMultiQueryBenchmark.getAllElementsTraversedFromOuterVertex | 14861.368943361136 ms/op | 14584.747217241957 ms/op | 1.02 |
| org.janusgraph.CQLMultiQueryBenchmark.getNeighborNames | 14852.586903136667 ms/op | 14774.73570899399 ms/op | 1.01 |
| org.janusgraph.CQLMultiQueryBenchmark.getVerticesWithDoubleUnion | 593.1339615957111 ms/op | 597.5285513438765 ms/op | 0.99 |
| org.janusgraph.CQLMultiQueryBenchmark.getElementsWithUsingRepeatUntilSteps | 15790.941347369526 ms/op | 16211.509679201707 ms/op | 0.97 |
| org.janusgraph.CQLMultiQueryBenchmark.getAdjacentVerticesLocalCounts | 15240.580481170833 ms/op | 15057.017418793792 ms/op | 1.01 |
| org.janusgraph.CQLMultiQueryBenchmark.getNames | 14814.202739906723 ms/op | 14698.622305737883 ms/op | 1.01 |
| org.janusgraph.CQLMultiQueryBenchmark.getVerticesFilteredByAndStep | 664.149354556395 ms/op | 648.1313542954531 ms/op | 1.02 |
| org.janusgraph.CQLMultiQueryBenchmark.getVerticesFromMultiNestedRepeatStepStartingFromSingleVertex | 21289.677038455 ms/op | 20241.54119706591 ms/op | 1.05 |
| org.janusgraph.CQLMultiQueryBenchmark.getVerticesWithCoalesceUsage | 587.6667770068879 ms/op | 580.6018760218902 ms/op | 1.01 |

This comment was automatically generated by workflow using github-action-benchmark.
