[vtgate planner] Routing & Merging refactor #12197

systay · 2023-02-01T07:00:24Z

Description

This PR refactors how routing of queries is done during query planning.

Why?

The logic for which routes can be merged together is an important and complex part of the query planner.
Making the code easy to understand and talk about is critical to getting this correct.

The old Route operator consisted of a set of fields:

type Route struct {
    Source              ops.Operator
    RouteOpCode         engine.Opcode
    Keyspace            *vindexes.Keyspace
    VindexPreds         []*VindexPlusPredicates
    Selected            *VindexOption
    SysTableTableSchema []evalengine.Expr
    SysTableTableName   map[string]evalengine.Expr
    SeenPredicates      []sqlparser.Expr
    TargetDestination   key.Destination
    Alternates          map[*vindexes.Keyspace]*Route
    MergedWith          []*Route
}

Of these, only Source, RouteOpCode and MergedWith are valid for all types of routes.
All other fields only make sense for some of the OpCodes that a route can represent.

The fields VindexPreds and Selected only make sense for sharded tables, which are represented by many OpCodes, such as Scatter, EqualUnique, etc.
SysTableTableSchema, SysTableTableName are only used for information_schema tables (OpCode DBA).

In a lot of places, we had to use a switch statement on the OpCode to handle things differently depending on the type of Route we were dealing with.

The Change

To me, this screamed for an interface with multiple implementations, one for each type of route we have.
The new operator now contains:

type Route struct {
    Source     ops.Operator
    MergedWith []*Route
    Routing    Routing
}

The Routing interface is used first to pick the best plan for each table in the query, and then to merge multiple Routes into as few as possible.
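
As a rough illustration of the shape described above, here is a minimal sketch of a Route that delegates to a Routing interface instead of switching on an OpCode. The method set (OpCode, Cost), the field names on the implementations, and the cost numbers are all hypothetical and chosen only for this example; the interface in the PR itself may look quite different.

```go
package main

import "fmt"

// Routing is a hypothetical, simplified stand-in for the interface
// described above. The real method set is not shown in this description.
type Routing interface {
	// OpCode names the engine opcode this routing would produce.
	OpCode() string
	// Cost is an illustrative relative cost used to compare plans.
	Cost() int
}

// ShardedRouting holds state that only makes sense for sharded tables.
type ShardedRouting struct {
	SelectedVindex string // empty means no usable vindex was found
}

func (s *ShardedRouting) OpCode() string {
	if s.SelectedVindex == "" {
		return "Scatter"
	}
	return "EqualUnique"
}

func (s *ShardedRouting) Cost() int {
	if s.SelectedVindex == "" {
		return 20 // made-up number: scatter is expensive
	}
	return 1
}

// InfoSchemaRouting holds state that only applies to information_schema queries.
type InfoSchemaRouting struct {
	TableSchema []string
}

func (*InfoSchemaRouting) OpCode() string { return "DBA" }
func (*InfoSchemaRouting) Cost() int      { return 0 }

// Route no longer carries per-opcode fields; it delegates to Routing.
type Route struct {
	Routing Routing
}

func main() {
	routes := []Route{
		{Routing: &ShardedRouting{}},
		{Routing: &ShardedRouting{SelectedVindex: "user_index"}},
		{Routing: &InfoSchemaRouting{TableSchema: []string{"test"}}},
	}
	for _, r := range routes {
		// No switch on the opcode is needed here.
		fmt.Println(r.Routing.OpCode(), r.Routing.Cost())
	}
}
```

The point of the design is that state such as the selected vindex or the system-table schema lives only on the implementation that needs it.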

While doing this refactoring, I tried to keep the tests intact and only change the code behind. For the few exceptions to this rule, I have added comments in this PR explaining why the change was introduced.

Checklist

"Backport to:" labels have been added if this change should be back-ported
Tests were added or are not required
Documentation was added or is not required

vitess-bot · 2023-02-01T07:00:27Z

systay · 2023-02-13T20:11:45Z

go/vt/vtgate/planbuilder/testdata/filter_cases.json

@@ -1150,7 +1150,7 @@
      "Original": "select Id from user where 1 in ('aa', 'bb')",
      "Instructions": {
        "OperatorType": "Route",
-        "Variant": "Scatter",
+        "Variant": "None",


We now use the evalengine to check whether an expression can be evaluated at plan time. If it can, and the result is false, we can use the None opcode, which is very cheap.
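
To illustrate the idea, here is a small sketch of choosing the opcode after trying to evaluate a predicate at plan time. It does not use the real evalengine; the Expr, Literal, Column and In types and the evalAtPlanTime and chooseOpcode helpers are hypothetical, and literals are compared as plain text rather than with MySQL's coercion rules.

```go
package main

import "fmt"

// Expr is a hypothetical, minimal expression type for this sketch.
type Expr interface{}

// Literal is a constant known at plan time.
type Literal struct{ Val string }

// Column refers to a table column, so its value is only known at runtime.
type Column struct{ Name string }

// In models `Left IN (Right...)`.
type In struct {
	Left  Expr
	Right []Expr
}

// evalAtPlanTime reports the result of the IN predicate and whether it
// could be evaluated without looking at any rows. It is conservative:
// any non-literal operand makes the predicate non-evaluable.
func evalAtPlanTime(in In) (result, ok bool) {
	left, isLit := in.Left.(Literal)
	if !isLit {
		return false, false
	}
	for _, e := range in.Right {
		lit, isLit := e.(Literal)
		if !isLit {
			return false, false
		}
		if lit.Val == left.Val {
			result = true
		}
	}
	return result, true
}

// chooseOpcode picks None when the predicate is provably false,
// and otherwise falls back to Scatter.
func chooseOpcode(pred In) string {
	if result, ok := evalAtPlanTime(pred); ok && !result {
		return "None"
	}
	return "Scatter"
}

func main() {
	constFalse := In{Left: Literal{"1"}, Right: []Expr{Literal{"aa"}, Literal{"bb"}}}
	onColumn := In{Left: Column{"id"}, Right: []Expr{Literal{"1"}}}
	fmt.Println(chooseOpcode(constFalse))
	fmt.Println(chooseOpcode(onColumn))
}
```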

systay · 2023-02-13T20:25:08Z

go/vt/vtgate/planbuilder/testdata/info_schema57_cases.json

@@ -279,9 +279,9 @@
          "Sharded": false
        },
        "FieldQuery": "select RC.CONSTRAINT_NAME, ORDINAL_POSITION from INFORMATION_SCHEMA.KEY_COLUMN_USAGE as KCU, INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS as RC where 1 != 1",
-        "Query": "select RC.CONSTRAINT_NAME, ORDINAL_POSITION from INFORMATION_SCHEMA.KEY_COLUMN_USAGE as KCU, INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS as RC where KCU.TABLE_SCHEMA = :__vtschemaname and KCU.TABLE_NAME = :KCU_TABLE_NAME and KCU.COLUMN_NAME = 'id' and KCU.REFERENCED_TABLE_SCHEMA = 'test' and KCU.CONSTRAINT_NAME = 'data_type_table_id_fkey' and KCU.CONSTRAINT_NAME = RC.CONSTRAINT_NAME order by KCU.CONSTRAINT_NAME asc, KCU.COLUMN_NAME asc",
+        "Query": "select RC.CONSTRAINT_NAME, ORDINAL_POSITION from INFORMATION_SCHEMA.KEY_COLUMN_USAGE as KCU, INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS as RC where KCU.TABLE_SCHEMA = :__vtschemaname and KCU.TABLE_NAME = :KCU_TABLE_NAME and KCU.COLUMN_NAME = 'id' and KCU.REFERENCED_TABLE_SCHEMA = :__vtschemaname and KCU.CONSTRAINT_NAME = 'data_type_table_id_fkey' and KCU.CONSTRAINT_NAME = RC.CONSTRAINT_NAME order by KCU.CONSTRAINT_NAME asc, KCU.COLUMN_NAME asc",


The old query was wrong - we should never use the given schema name. Instead we have to replace the literal value with the argument :__vtschemaname which is then filled in by the vttablet with the name of the underlying MySQL database.
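
A simplified sketch of that rewrite follows. The Comparison type, the rewriteSchemaPredicates helper, and the list of columns treated as schema columns are assumptions made for this example and are not taken from the planner code; only the :__vtschemaname argument name comes from the plans above.

```go
package main

import (
	"fmt"
	"strings"
)

// schemaArg is the bind variable that vttablet fills in with the name
// of the underlying MySQL database.
const schemaArg = ":__vtschemaname"

// Comparison is a hypothetical `Column = Value` predicate.
type Comparison struct {
	Column string
	Value  string
}

// schemaColumns is an assumed list of columns that hold a schema name.
var schemaColumns = map[string]bool{
	"table_schema":            true,
	"constraint_schema":       true,
	"referenced_table_schema": true,
}

// isSchemaColumn strips any table qualifier and ignores case.
func isSchemaColumn(col string) bool {
	if i := strings.LastIndex(col, "."); i >= 0 {
		col = col[i+1:]
	}
	return schemaColumns[strings.ToLower(col)]
}

// rewriteSchemaPredicates replaces every schema literal with schemaArg
// and returns the original values so they can be used for routing.
func rewriteSchemaPredicates(preds []Comparison) ([]Comparison, []string) {
	out := make([]Comparison, 0, len(preds))
	var schemas []string
	for _, p := range preds {
		if isSchemaColumn(p.Column) {
			schemas = append(schemas, p.Value)
			p.Value = schemaArg
		}
		out = append(out, p)
	}
	return out, schemas
}

func main() {
	preds := []Comparison{
		{"KCU.TABLE_SCHEMA", "test"},
		{"KCU.COLUMN_NAME", "id"},
		{"KCU.REFERENCED_TABLE_SCHEMA", "test"},
	}
	rewritten, schemas := rewriteSchemaPredicates(preds)
	for _, p := range rewritten {
		fmt.Printf("%s = %s\n", p.Column, p.Value)
	}
	fmt.Println(schemas)
}
```

In this sketch both schema predicates are rewritten and both original values are collected, while the COLUMN_NAME predicate is left untouched.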

go/vt/vtgate/planbuilder/testdata/info_schema57_cases.json

        "SysTableTableName": "[KCU_TABLE_NAME:VARCHAR(\"data_type_table\")]",
-        "SysTableTableSchema": "[VARCHAR(\"test\")]",
+        "SysTableTableSchema": "[VARCHAR(\"test\"), VARCHAR(\"test\")]",


systay · 2023-02-13T20:27:04Z

go/vt/vtgate/planbuilder/testdata/info_schema57_cases.json

-        "Keyspace": {
-          "Name": "main",
-          "Sharded": false
+        "OperatorType": "Join",


The old plan was wrong. Given WHERE kcu.table_schema = ? AND rc.constraint_schema = ?, we won't know until runtime if the user wants to look at information from the same or different keyspaces/schemas, and so merging these routes into a single one is invalid.

systay · 2023-02-13T20:28:36Z

go/vt/vtgate/planbuilder/testdata/info_schema57_cases.json

@@ -547,7 +571,7 @@
        "FieldQuery": "select fk.referenced_table_name as to_table, fk.referenced_column_name as primary_key, fk.column_name as `column`, fk.constraint_name as `name`, rc.update_rule as on_update, rc.delete_rule as on_delete from information_schema.referential_constraints as rc, information_schema.key_column_usage as fk where 1 != 1",
        "Query": "select fk.referenced_table_name as to_table, fk.referenced_column_name as primary_key, fk.column_name as `column`, fk.constraint_name as `name`, rc.update_rule as on_update, rc.delete_rule as on_delete from information_schema.referential_constraints as rc, information_schema.key_column_usage as fk where rc.constraint_schema = :__vtschemaname and rc.table_name = :rc_table_name and fk.referenced_column_name is not null and fk.table_schema = :__vtschemaname and fk.table_name = :fk_table_name",
        "SysTableTableName": "[fk_table_name:VARCHAR(\"table_name\"), rc_table_name:VARCHAR(\"table_name\")]",
-        "SysTableTableSchema": "[VARCHAR(\"table_schema\"), VARCHAR(\"table_schema\")]",
+        "SysTableTableSchema": "[VARCHAR(\"table_schema\")]",


Since we have merged the two routes, there is no need to list the table_schema value twice.

systay · 2023-02-13T20:30:15Z

go/vt/vtgate/planbuilder/testdata/info_schema80_cases.json

-        "SysTableTableName": "[tc_table_name:VARCHAR(\"table_name\")]",
-        "SysTableTableSchema": "[VARCHAR(\"constraint_schema\"), VARCHAR(\"table_schema\")]",
-        "Table": "information_schema.check_constraints, information_schema.table_constraints"
+        "OperatorType": "Join",


Here we know that the two schemas being searched for are different - WHERE tc.table_schema = 'table_schema' AND ... cc.constraint_schema = 'constraint_schema'. Merging these two was wrong.
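
The merge rule discussed in these comments can be sketched roughly as follows: two information_schema routes are only merged when both schema predicates are plan-time literals with the same value. The SchemaPred type and the canMerge helper are hypothetical names for this illustration, and the real merging logic likely handles more cases (for example, routes without any schema predicate).

```go
package main

import "fmt"

// SchemaPred is a hypothetical description of the schema predicate on
// one information_schema route.
type SchemaPred struct {
	Literal   string // schema name, when known at plan time
	IsBindVar bool   // true when the value is only known at runtime
}

// canMerge reports whether two routes can safely be merged into one.
// A bind variable means the target schema is unknown until runtime,
// so merging is rejected; differing literals are rejected as well.
func canMerge(a, b SchemaPred) bool {
	if a.IsBindVar || b.IsBindVar {
		return false
	}
	return a.Literal == b.Literal
}

func main() {
	same := canMerge(SchemaPred{Literal: "table_schema"}, SchemaPred{Literal: "table_schema"})
	different := canMerge(SchemaPred{Literal: "table_schema"}, SchemaPred{Literal: "constraint_schema"})
	unknown := canMerge(SchemaPred{IsBindVar: true}, SchemaPred{IsBindVar: true})
	fmt.Println(same, different, unknown)
}
```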

systay · 2023-02-13T20:33:57Z

go/vt/vtgate/planbuilder/testdata/reference_cases.json

@@ -22,7 +22,7 @@
      "Original": "select * from ambiguous_ref_with_source",
      "Instructions": {
        "OperatorType": "Route",
-        "Variant": "Reference",
+        "Variant": "Unsharded",


ambiguous_ref_with_source exists both as an unsharded table in the main keyspace, and as a reference table in the user keyspace. The latter is a copy of the unsharded table spread out to all shards so that all joins can be local.

During route planning we have decided that we want to send the query to the unsharded main keyspace. The OpCode is more accurate if it's Unsharded for this route.

frouioui

This is great, it's an important change that makes things easier for all of us and that fills some gaps we had, thank you @systay 🙏🏻

I left a few questions, comments and nits after this first pass

Signed-off-by: Andres Taylor <andres@planetscale.com>

go/vt/vtgate/engine/route.go

go/vt/vtgate/executor_framework_test.go

go/vt/vtgate/planbuilder/operator_transformers.go

harshit-gangal · 2023-02-15T06:24:10Z

go/vt/vtgate/planbuilder/operators/info_schema_planning.go

+type InfoSchemaRouting struct {
+	SysTableTableSchema []sqlparser.Expr
+	SysTableTableName   map[string]sqlparser.Expr
+	Table               *QueryTable
+}
+
+func (isr *InfoSchemaRouting) UpdateRoutingParams(_ *plancontext.PlanningContext, rp *engine.RoutingParameters) error {
+	rp.SysTableTableSchema = nil
+	for _, expr := range isr.SysTableTableSchema {


Now that we do not merge if SysTableTableSchema is different, should SysTableTableSchema be a single Expr rather than a slice of Expr? Similarly, do we need SysTableTableName as a map?

valid points. wdyt about doing these fixes in a separate PR? this one has grown enough for now :)

We should add a task for it

go/vt/vtgate/planbuilder/operators/info_schema_planning.go

frouioui

Looks good to me!

go/vt/vtgate/planbuilder/operators/reference_routing.go

go/vt/vtgate/planbuilder/operators/sharded_routing.go

go/vt/vtgate/planbuilder/operators/merging.go

The rewriting on v16 didn't consider the case where we already had an extract subquery. In that case we don't extract again, to avoid infinite recursion. This does not affect v17 and later as this was fixed in the refactor in vitessio#12197. Signed-off-by: Dirkjan Bussink <d.bussink@gmail.com>

systay added Type: Internal Cleanup Component: Query Serving labels Feb 1, 2023

vitess-bot bot added NeedsDescriptionUpdate The description is not clear or comprehensive enough, and needs work NeedsWebsiteDocsUpdate What it says labels Feb 1, 2023

systay force-pushed the routing-take-3 branch 4 times, most recently from 752194e to b19d35e Compare February 9, 2023 09:40

systay added the Skip CI Skip CI actions from running label Feb 9, 2023

systay force-pushed the routing-take-3 branch 5 times, most recently from 3b61762 to 8f8d98f Compare February 13, 2023 15:50

systay removed NeedsWebsiteDocsUpdate What it says Skip CI Skip CI actions from running labels Feb 13, 2023

systay force-pushed the routing-take-3 branch from 5addfe1 to 118a776 Compare February 13, 2023 16:52

systay commented Feb 13, 2023

systay removed the NeedsDescriptionUpdate The description is not clear or comprehensive enough, and needs work label Feb 14, 2023

systay marked this pull request as ready for review February 14, 2023 11:05

systay requested review from harshit-gangal, frouioui and GuptaManan100 as code owners February 14, 2023 11:05

frouioui reviewed Feb 14, 2023

systay added 3 commits February 14, 2023 15:51

review feedback

c764429

Signed-off-by: Andres Taylor <andres@planetscale.com>

tidy up method after review feedback

ff8a713

Signed-off-by: Andres Taylor <andres@planetscale.com>

more cleanup - fix goland warnings in new files

1352119

Signed-off-by: Andres Taylor <andres@planetscale.com>