Create interface for indexible tables in IndexedTableAccess
#1938
0e965aa to 2cfc612
2cfc612 to 4be5ccc
…le implements this interface but other structs can too.
…o longer necessarily a resolved table.
…Node, not plan.ResolvedTable
…functions that support it.
50e9d91 to 777d1bb
I like the generalization. I am not sure how UnresolvedTable fits in, but those only exist on non-execution paths now. I think it would be a mistake not to add a small test indexable table function in GMS as part of these changes, but I think it is equally good to get the first set of changes in and then follow up with the test. TestTableFunctions and memory/sequence_table.go are a reference for how to do this.
New interface seems fine to me; TableNode is a fine name.
… ResolvedTable with TableNode.
32a29f0 to 999201d
Prior to this PR, it was not possible for system table functions to have indexes. (You could make the table function implement …) It’s now possible to write a system table function that can be optimized. The workflow looks like this:
Partitions were a feature inherited from GMS, designed to facilitate parallelism: a table is divided into multiple partitions that can each be iterated over separately. Dolt doesn’t support parallelism, but we still use partitions because indexes are implemented via partitions: every node that is indexable converts index lookups into a PartitionIterator.
These don’t need to be full implementations of the
The process of implementing these methods is cumbersome but straightforward, so long as you’re only exposing a single index on a single column. Optimizing on multiple columns becomes more complicated. In addition, there may be optimizations that can’t be expressed as an Index. For instance, imagine a hypothetical system table function that has N columns, where any combination of these columns can be filtered on efficiently. In order to make sure that every combination of filters can be done efficiently in the current framework, the node would need to provide N! different indexes.
A better solution would be to allow system tables (and system table functions) to be aware of filters when generating their rows. Conceptually, this could be done by having system tables implement an interface which consumes a filter expression and produces a new system table node. In cases where no optimization can be performed, the interface returns the original system table unchanged. Then we add an optimization that runs after we push filters down the tree and pattern-matches for filters whose child nodes implement this interface. If we decide to optimize more system table functions in the future, we should strongly consider this better solution, since it will result in cleaner, more readable, more maintainable code that’s faster to write.
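As a rough illustration of the filter-aware proposal, the sketch below shows one possible shape for it. Nothing here exists in GMS or Dolt: Filter, FilterAware, WithFilter, FilterNode, gridTable and pushFilters are all hypothetical names, and the filter expression is reduced to a single "column = value" comparison to keep the example small.

```go
package main

import "fmt"

// Filter is a hypothetical stand-in for a filter expression: Column = Value.
type Filter struct {
	Column string
	Value  int
}

type Row map[string]int

// Node is a minimal plan node that can produce rows.
type Node interface{ Rows() []Row }

// FilterAware is the hypothetical interface from the proposal: it consumes a
// filter and returns a node plus a flag saying whether the filter was absorbed.
// When nothing can be optimized it returns the original node and false.
type FilterAware interface {
	Node
	WithFilter(f Filter) (Node, bool)
}

// FilterNode applies its filter row by row on top of its child.
type FilterNode struct {
	Filter Filter
	Child  Node
}

func (f FilterNode) Rows() []Row {
	var out []Row
	for _, r := range f.Child.Rows() {
		if r[f.Filter.Column] == f.Filter.Value {
			out = append(out, r)
		}
	}
	return out
}

// gridTable is a hypothetical system table with one row per (a, b) pair in
// [0, size). It can generate rows efficiently for any fixed a and/or b.
type gridTable struct {
	size  int
	fixed map[string]int
}

// values lists the candidate values for a column, honoring a fixed value.
func (g gridTable) values(col string) []int {
	if v, ok := g.fixed[col]; ok {
		if v < 0 || v >= g.size {
			return nil
		}
		return []int{v}
	}
	all := make([]int, g.size)
	for i := range all {
		all[i] = i
	}
	return all
}

func (g gridTable) Rows() []Row {
	var out []Row
	for _, a := range g.values("a") {
		for _, b := range g.values("b") {
			out = append(out, Row{"a": a, "b": b})
		}
	}
	return out
}

func (g gridTable) WithFilter(f Filter) (Node, bool) {
	if f.Column != "a" && f.Column != "b" {
		return g, false // unknown column: leave the filter in place
	}
	if _, already := g.fixed[f.Column]; already {
		return g, false // column already fixed: let the FilterNode handle it
	}
	fixed := map[string]int{f.Column: f.Value}
	for k, v := range g.fixed {
		fixed[k] = v
	}
	return gridTable{size: g.size, fixed: fixed}, true
}

// pushFilters is the pattern-matching pass: a FilterNode whose child is
// FilterAware is replaced by the optimized child when the filter is absorbed.
func pushFilters(n Node) Node {
	f, ok := n.(FilterNode)
	if !ok {
		return n
	}
	child := pushFilters(f.Child)
	if fa, ok := child.(FilterAware); ok {
		if optimized, absorbed := fa.WithFilter(f.Filter); absorbed {
			return optimized
		}
	}
	return FilterNode{Filter: f.Filter, Child: child}
}

func main() {
	plan := FilterNode{Filter{"a", 1}, FilterNode{Filter{"b", 2}, gridTable{size: 3}}}
	optimized := pushFilters(plan)
	fmt.Printf("%T %v\n", optimized, optimized.Rows())
}
```

Under these assumptions the table absorbs any combination of filters on its columns through one method, rather than exposing a separate index per combination.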
Currently, only ResolvedTables are allowed to have indexes. There exists an interface, sql.IndexAddressable, which any node or table can implement in order to be a candidate for index-based optimization. But in practice, implementing that interface won't actually do anything because the IndexedTableAccess struct explicitly requires a ResolvedTable.
This PR replaces the ResolvedTable field in IndexedTableAccess with a new interface tentatively called TableNode, although a more specific name would probably be better. In order for a node to be used for index-based optimization, it must implement this interface, and the table returned by the UnderlyingTable method must implement sql.IndexAddressable.
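The requirement described above can be sketched roughly as follows. The types are simplified local stand-ins rather than the real GMS definitions: Table and IndexAddressable are cut down to a couple of methods, IndexNames is a hypothetical method added only for this example, and NewIndexedTableAccess, sequenceNode and plainNode are illustrative names.

```go
package main

import (
	"errors"
	"fmt"
)

// Table is a simplified stand-in for sql.Table.
type Table interface {
	Name() string
}

// IndexAddressable is a simplified stand-in for sql.IndexAddressable.
// IndexNames is a hypothetical method used only in this sketch.
type IndexAddressable interface {
	Table
	IndexNames() []string
}

// TableNode is the new interface: any plan node that can return the table
// it wraps, not only a ResolvedTable.
type TableNode interface {
	UnderlyingTable() Table
}

// IndexedTableAccess holds a TableNode instead of a concrete ResolvedTable.
type IndexedTableAccess struct {
	Node  TableNode
	Index string
}

var ErrNotIndexAddressable = errors.New("underlying table is not index addressable")

// NewIndexedTableAccess accepts a node only if its underlying table
// implements IndexAddressable and exposes the requested index.
func NewIndexedTableAccess(node TableNode, index string) (*IndexedTableAccess, error) {
	addressable, ok := node.UnderlyingTable().(IndexAddressable)
	if !ok {
		return nil, ErrNotIndexAddressable
	}
	for _, name := range addressable.IndexNames() {
		if name == index {
			return &IndexedTableAccess{Node: node, Index: index}, nil
		}
	}
	return nil, fmt.Errorf("index %q not found on table %q", index, addressable.Name())
}

// sequenceTable is a hypothetical table function result with one index.
type sequenceTable struct{}

func (sequenceTable) Name() string         { return "sequence_table" }
func (sequenceTable) IndexNames() []string { return []string{"value_idx"} }

// sequenceNode is a plan node that is not a ResolvedTable but still
// satisfies TableNode.
type sequenceNode struct{}

func (sequenceNode) UnderlyingTable() Table { return sequenceTable{} }

// plainTable has no indexes, so it does not implement IndexAddressable.
type plainTable struct{}

func (plainTable) Name() string { return "plain" }

type plainNode struct{}

func (plainNode) UnderlyingTable() Table { return plainTable{} }

func main() {
	ita, err := NewIndexedTableAccess(sequenceNode{}, "value_idx")
	fmt.Println(ita != nil, err)

	_, err = NewIndexedTableAccess(plainNode{}, "value_idx")
	fmt.Println(err)
}
```

The check happens on the table returned by UnderlyingTable rather than on the node itself, which mirrors the two-part requirement in the description: the node implements TableNode, and its table implements the index interface.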