Skip to content

Commit

Permalink
Enable multiQuery optimization for PropertyMapStep and ElementMapStep…
Browse files Browse the repository at this point in the history
… [cql-tests] [tp-tests]

Adds possibility to fetch properties and labels of vertices using valueMap, elementMap, propertyMap steps.

Adds fetching modes to properties, values, valueMap, elementMap, propertyMap steps to be able to preFetch all properties (single slice query) or only required properties (separate slice query per each requested property).

Fixes #2444

Signed-off-by: Oleksandr Porunov <alexandr.porunov@gmail.com>
  • Loading branch information
porunov committed Jun 10, 2023
1 parent b757b8b commit 5240e3b
Show file tree
Hide file tree
Showing 20 changed files with 1,762 additions and 87 deletions.
6 changes: 6 additions & 0 deletions docs/changelog.md
Original file line number Diff line number Diff line change
Expand Up @@ -189,6 +189,12 @@ GraphBinary is now used as the default MessageSerializer.hb
use `query.batch.has-step-mode = none` as replacement for `query.batch-property-prefetch = false` or use
`query.batch.has-step-mode = all_properties` as replacement for `query.batch-property-prefetch = true`.

`query.fast-property` has no influence on `values`, `properties`, `valueMap`, `propertyMap`, `elementMap` anymore when
`query.batch.enabled` is `true`. By default, those steps are configured to fetch only required properties
(with separate query per property), but the behaviour can be changed with the configuration `query.batch.properties-mode`.
In case previous behavior is desired, use `query.batch.properties-mode = required_properties_only` for `query.fast-property = false`
or use `query.batch.properties-mode = all_properties` for `query.fast-property = true`.

[Batch processing](https://docs.janusgraph.org/operations/batch-processing/) allows JanusGraph to fetch a batch of
vertices from the storage backend together instead of requesting each vertex individually which leads to a high number
of backend queries.
Expand Down
14 changes: 9 additions & 5 deletions docs/configs/janusgraph-cfg.md
Original file line number Diff line number Diff line change
Expand Up @@ -346,7 +346,10 @@ Configuration options for query processing

| Name | Description | Datatype | Default Value | Mutability |
| ---- | ---- | ---- | ---- | ---- |
| query.fast-property | Whether to pre-fetch all properties on first singular vertex property access. This can eliminate backend calls on subsequent property access for the same vertex at the expense of retrieving all properties at once. This can be expensive for vertices with many properties | Boolean | true | MASKABLE |
| query.fast-property | Whether to pre-fetch all properties on first singular vertex property access. This can eliminate backend calls on subsequent property access for the same vertex at the expense of retrieving all properties at once. This can be expensive for vertices with many properties.
This setting is applicable to direct vertex properties access (like `vertex.properties("foo")` but not to `vertex.properties("foo","bar")` because the later case is not a singular property access).
This setting is not applicable to the the next Gremlin steps: `valueMap`, `propertyMap`, `elementMap`, `properties`, `values` (configuration option `query.batch.properties-mode` should be used to configure their behavior).
When `true` this setting overwrites `query.batch.has-step-mode` to `all_properties` unless `none` mode is used. | Boolean | true | MASKABLE |
| query.force-index | Whether JanusGraph should throw an exception if a graph query cannot be answered using an index. Doing so limits the functionality of JanusGraph's graph queries but ensures that slow graph queries are avoided on large graphs. Recommended for production use of JanusGraph. | Boolean | false | MASKABLE |
| query.hard-max-limit | If smart-limit is disabled and no limit is given in the query, query optimizer adds a limit in light of possibly large result sets. It works in the same way as smart-limit except that hard-max-limit is usually a large number. Default value is Integer.MAX_VALUE which effectively disables this behavior. This option does not take effect when smart-limit is enabled. | Integer | 2147483647 | MASKABLE |
| query.ignore-unknown-index-key | Whether to ignore undefined types encountered in user-provided index queries | Boolean | false | MASKABLE |
Expand All @@ -362,17 +365,18 @@ Configuration options to configure batch queries optimization behavior
| Name | Description | Datatype | Default Value | Mutability |
| ---- | ---- | ---- | ---- | ---- |
| query.batch.enabled | Whether traversal queries should be batched when executed against the storage backend. This can lead to significant performance improvement if there is a non-trivial latency to the backend. If `false` then all other configuration options under `query.batch` namespace are ignored. | Boolean | true | MASKABLE |
| query.batch.has-step-mode | Properties pre-fetching mode for `has` step. Used only when query.batch.enabled is `true`.<br>Supported modes:<br>- `all_properties` Pre-fetch all vertex properties on any property access<br>- `required_properties_only` Pre-fetch necessary vertex properties for the whole chain of foldable `has` steps<br>- `required_and_next_properties` Prefetch the same properties as with `required_properties_only` mode, but also prefetch
| query.batch.has-step-mode | Properties pre-fetching mode for `has` step. Used only when query.batch.enabled is `true`.<br>Supported modes:<br>- `all_properties` - Pre-fetch all vertex properties on any property access (fetches all vertex properties in a single slice query)<br>- `required_properties_only` - Pre-fetch necessary vertex properties for the whole chain of foldable `has` steps (uses a separate slice query per each required property)<br>- `required_and_next_properties` - Prefetch the same properties as with `required_properties_only` mode, but also prefetch
properties which may be needed in the next properties access step like `values`, `properties,` `valueMap`, `elementMap`, or `propertyMap`.
In case the next step is not one of those properties access steps then this mode behaves same as `required_properties_only`.
In case the next step is one of the properties access steps with limited scope of properties, those properties will be
pre-fetched together in the same multi-query.
In case the next step is one of the properties access steps with unspecified scope of property keys then this mode
behaves same as `all_properties`.<br>- `required_and_next_properties_or_all` Prefetch the same properties as with `required_and_next_properties`, but in case the next step is not
`values`, `properties,` `valueMap`, `elementMap`, or `propertyMap` then acts like `all_properties`.<br>- `none` Skips `has` step batch properties pre-fetch optimization.<br> | String | required_and_next_properties | MASKABLE |
behaves same as `all_properties`.<br>- `required_and_next_properties_or_all` - Prefetch the same properties as with `required_and_next_properties`, but in case the next step is not
`values`, `properties,` `valueMap`, `elementMap`, or `propertyMap` then acts like `all_properties`.<br>- `none` - Skips `has` step batch properties pre-fetch optimization.<br> | String | required_and_next_properties | MASKABLE |
| query.batch.limited | Configure a maximum batch size for queries against the storage backend. This can be used to ensure responsiveness if batches tend to grow very large. The used batch size is equivalent to the barrier size of a preceding `barrier()` step. If a step has no preceding `barrier()`, the default barrier of TinkerPop will be inserted. This option only takes effect if `query.batch.enabled` is `true`. | Boolean | true | MASKABLE |
| query.batch.limited-size | Default batch size (barrier() step size) for queries. This size is applied only for cases where `LazyBarrierStrategy` strategy didn't apply `barrier` step and where user didn't apply barrier step either. This option is used only when `query.batch.limited` is `true`. Notice, value `2147483647` is considered to be unlimited. | Integer | 2500 | MASKABLE |
| query.batch.repeat-step-mode | Batch mode for `repeat` step. Used only when query.batch.enabled is `true`.<br>These modes are controlling how the child steps with batch support are behaving if they placed to the start of the `repeat`, `emit`, or `until` traversals.<br>Supported modes:<br>- `closest_repeat_parent` Child start steps are receiving vertices for batching from the closest `repeat` step parent only.<br>- `all_repeat_parents` Child start steps are receiving vertices for batching from all `repeat` step parents.<br>- `starts_only_of_all_repeat_parents` Child start steps are receiving vertices for batching from the closest `repeat` step parent (both for the parent start and for next iterations) and also from all `repeat` step parents for the parent start. | String | all_repeat_parents | MASKABLE |
| query.batch.properties-mode | Properties pre-fetching mode for `values`, `properties`, `valueMap`, `propertyMap`, `elementMap` steps. Used only when query.batch.enabled is `true`.<br>Supported modes:<br>- `all_properties` - Pre-fetch all vertex properties on non-singular property access (fetches all vertex properties in a single slice query). On single property access this mode behaves the same as `required_properties_only` mode.<br>- `required_properties_only` - Pre-fetch necessary vertex properties only (uses a separate slice query per each required property)<br>- `none` - Skips vertex properties pre-fetching optimization.<br> | String | required_properties_only | MASKABLE |
| query.batch.repeat-step-mode | Batch mode for `repeat` step. Used only when query.batch.enabled is `true`.<br>These modes are controlling how the child steps with batch support are behaving if they placed to the start of the `repeat`, `emit`, or `until` traversals.<br>Supported modes:<br>- `closest_repeat_parent` - Child start steps are receiving vertices for batching from the closest `repeat` step parent only.<br>- `all_repeat_parents` - Child start steps are receiving vertices for batching from all `repeat` step parents.<br>- `starts_only_of_all_repeat_parents` - Child start steps are receiving vertices for batching from the closest `repeat` step parent (both for the parent start and for next iterations) and also from all `repeat` step parents for the parent start. | String | all_repeat_parents | MASKABLE |

### schema
Schema related configuration options
Expand Down
Loading

0 comments on commit 5240e3b

Please sign in to comment.