-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for Parameterized type #52
Support for Parameterized type #52
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #52 +/- ##
==========================================
+ Coverage 54.53% 54.93% +0.39%
==========================================
Files 27 35 +8
Lines 6355 6697 +342
==========================================
+ Hits 3466 3679 +213
- Misses 2697 2829 +132
+ Partials 192 189 -3 ☔ View full report in Codecov by Sentry. |
ff46f15
to
1a64c57
Compare
* Separate AnyType. This will be helpful in match method * Added support for ParameterizedFixedChar/VarChar/FixedBinary/Decimal * Added parser support for Parameterized/PrecisionTimestamp/PrecisionTimestampTz
1a64c57
to
5beb08c
Compare
types/parameterized_decimal_type.go
Outdated
func (m *ParameterizedDecimalType) WithNullability(n Nullability) Type { | ||
m.Nullability = n | ||
return m |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as previous comment, this should probably be a value receiver, not a pointer receiver
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this got converted back to a pointer receiver, please change it to a value receiver?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I deliberately converted it to pointer receiver. This is to catch error of using wrong type at compile time instead of run time.
Basically, if a interface is implemented with a pointer receiver. Value receiver won't implement this.
Since pointer of ParameterizedDecimalType implements "FuncDefArgType" so value receiver won't satisfy interface "FuncDefArgType"
I feel this is helpful if we have "FuncDefArgType" and need to switch case on "type". We can get compile error than
(sample program is here)
Let me know if this sounds ok
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that now, unlike with other types, calling WithNullability
actually changes ParameterizedDecimalType
as opposed to nearly all other cases where calling that just returns a copy with the nullability updated.
This inconsistency will be a problem for users
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In update PR, parameterized type don't implement WithNullability
Basically parametrised params implements separate interface "FundDefArgType" and som don't overload Type interface
2887a3c
to
2a81d60
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm struggling with how this pattern extends to things like List, LIST, STRUCT<T1, T2>, etc.
@@ -3,6 +3,7 @@ | |||
package parser |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I generally think we're just creating more debt by maintaining multiple parsers in the project rather than just using antlr everywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will fix it in follow up PR to rely on antlr
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there an antlr implementation for Go that we can use which is easy to utilize and plug into the types?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Antlr has generators for most languages, including go. Have used it extensively elsewhere and find it much more reliable/easy to maintain than writing parsers in multiple languages.
647ad12
to
a46c537
Compare
a46c537
to
a6ef7c8
Compare
@jacques-n @zeroshade
I have captured changes in PR description. Kindly read it first to get an high level idea of changes. Thanks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made a new pass on this, left a bunch of comments
@@ -3,6 +3,7 @@ | |||
package parser |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there an antlr implementation for Go that we can use which is easy to utilize and plug into the types?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the overloading of Type here is wrong. You're upgrading Type to mean both concrete or parameterized. I think that has undesirable side effects and should be avoided.
A better way to approach is to keep Type to mean concrete types only. That's what users are going to be using in most of substrait-go. That's the only that exists in plans. Let's keep it that narrow.
We then have ParameterizedType. It could also be called FunctionDefinitionArgumentType since that it is the only place this concept needs to exist. This is a type that can be used in the definition of function arguments and can be parameterized or concrete. FunctionDefinitionArgumentType can't be used in plans and doesn't need to be converted to Protobuf binary values. In the protobuf files, this type is called ParameterizedType. Note that Parameterized types should not be confused with Compound Types which are types that are concrete types that have multiple components.
The third thing is what we call type expressions. Each function variant should have a type expression. This is mostly missing from substrait-go. That is the expression that is handed the concrete types and works with the FunctionDefinitionArgumentType values to figure things out.
Please update so that Type means what it means today and let's upgrade FunctionVariant to be defined using a new ParametertizedType or FunctionDefintitionArugmentType (or another name if you prefer).
A good test is all existing substrait-go type construction/manipulation code (outside of function variant construction) should have no changes. For example, this shouldn't be needed.
* Made parameter of function argument as separate interface "FuncDefArgType"
Updated PR to use interface "FuncDefArgType" for function parameters. Kindly check @jacques-n |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a few small comments but looks pretty good from my pov.
@@ -369,14 +371,24 @@ type ( | |||
WithNullability(Nullability) Type | |||
} | |||
|
|||
ParameterizedType interface { | |||
// CompositeType this represents a concrete type having components | |||
CompositeType interface { | |||
Type | |||
ParameterString() string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add docs to these methods
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also, how important is it that these methods are exported?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added doc
these are used in function registry which is in separate package. Basically, to convert substrait type to local name and to initialize local type registry.
60f3ad0
to
e9e3cb7
Compare
e9e3cb7
to
7beaaeb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks for the iterations!
Kindly check updated PR @zeroshade. Thanks |
Enhancements:
Added interface "FuncDefArgType". This represents argument of a function in substrait extension
Function implementations input argument are feched from parser via method ArgType method. ArgType returns interface "FuncDefArgType". Added another method to parser expression "RetType" to return Type (concrete) as return type. This is temporary and should be fixed with Type Derivation. To keep behavior same for return type I kept "RetType" implementations same to return "Type"
Added support for parameterized type
Added interface "LeafParameter". This indicates parameter which are not nested
* These can be of two type, concrete and abstract (name in code is "ConcreteIntParam" and "VariableIntParam")
* This allows representation of a leaf parameter. This also helps in evaluating if a given parameter is leaf or not (a parameter can be FuncDefArgType too)
Renamed "ParameterizedType" to "CompositeType". This was representing concrete serializable type so changed name to reflect so
Fixed string representation of type to be uniform for fields having more than one type (map, struct) to have space between different parameters
Enhanced existing PrecisionTimestampType, PrecisionTimestampTzType, StructType, ListType, MapType to indicate they are concrete parameterized type (added BaseString and ParameterString) method
Separate "AnyType" from existing nonParamType. This will be helpful in match method in follow up PR to match argument against parameter
Added API "HasSyncParams". This takes []FuncDefArgType as argument and returns true if any abstract leaf parameter is shared across different arguments. This will be used in Match API in follow up PR to act on such FunctionVariants