Nov 2022 Release
This is a periodic rollup release. It contains many features, bug fixes, and improvements.
The core API has still not stabilized and will not be guaranteed until 1.0.
Merged PRs
go-mysql-server
- 1364: Updated and expanded engine examples
TheREADME.md
had an updated example, but the actual_example/main.go
file did not. I've expanded the example file a bit to include information on setting up users, and slightly simplified theREADME.md
example. I've also added tests for everything, so that if anything breaks, we'll know we need to update both the example file and theREADME.md
portion. - 1363: Convert errors during
ComPrepare
toSQLError
We were already converting errors inComStmtExecute
,ComMultiQuery
, andComQuery
toSQLError
so that the correct error codes would be sent to clients. This change adds that support toComPrepare
, too.
Added a unit test for that case and took the opportunity to simplify the interface forCastSQLError
a little bit.
This change helps get Prisma support a little further along (dolthub/dolt#4511), but it doesn't look like it fully resolves everything Prisma needs to work with Dolt. - 1361: Fixed collation check on foreign key columns
- 1358: fix example package
close #1357
This fixes runtime panic raised by example app in/_example
.
I checked SQL client can obtain response in my local machine.~/go-mysql-server/_example$ go build ~/go-mysql-server/_example$ ./_example
$ mysql --host=127.0.0.1 --port=3306 --database=mydb -u root Reading table information for completion of table and column names You can turn off this feature to get a quicker startup with -A Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 1 Server version: 5.7.9-Vitess Copyright (c) 2000, 2022, Oracle and/or its affiliates. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> select * from mytable; +----------+-------------------+-------------------------------+---------------------+ | name | email | phone_numbers | created_at | +----------+-------------------+-------------------------------+---------------------+ | Evil Bob | evilbob@gmail.com | ["555-666-555","666-666-666"] | 2018-04-18 09:41:13 | | Jane Doe | jane@doe.com | [] | 2018-04-18 09:41:13 | | John Doe | john@doe.com | ["555-555-555"] | 2018-04-18 09:41:13 | | John Doe | johnalt@doe.com | [] | 2018-04-18 09:41:13 | +----------+-------------------+-------------------------------+---------------------+ 4 rows in set (0.00 sec)
- 1356: add tests for sql type Zero() functions
Implementers of GMS might expect similar values to be returned by Convert() and Zero(). For decimal and enum implementations this was not the case and has been fixed. - 1355: Allow any select statement for
CREATE TABLE AS SELECT ...
Also fixes a semantics bug in the schema produced by some such statements. - 1354: Bug fix for pushdownSort handling missing cols qualified with a table name
The Dolt bump for my GMS change to fix an alias issue in sort node pushdown triggered an error with matching missing column names now that we can include qualified column names.
This PR adds a repro for that case to GMS and fixes the issue by ensuring we create aUnresolvedQualifiedColumn
when the missing column is qualified with a table name. I've run Dolt tests locally and confirmed there shouldn't be any other test failures in the next bump. - 1353: Fix panic for
show keys from information_schema.columns
Does this by removing the special handling ofinformation_schema.columns
as a separate node type, treats it just like any otherResolvedTable.
In the process, effectively rewrote how we handle column default values by 1) moving most logic to happen in theOnceBefore
batch, rather than default rules, and 2) splitting it up into multiple passes that each have a single purpose. I found in the process of 1) that the previous rules had a lot of side effects and unintended ordering constraints, so introduced new rules and tweaked others to eliminate those. - 1352: Update sort node to use alias reference when it will resolve missing columns
Fixes: dolthub/dolt#3016
Other changes:- Refactors the existing OrderBy/GroupBy tests into a ScriptTest.
- Introduces a new interface,
sql.Projector
, that unitesGroupBy
,Window
, andProject
if a caller just needs to get the projected expressions.
- 1351: Add support for database collations
This allows setting the database collation, such that all newly created tables within a database (that do not explicitly set their collation) will inherit the database collation.
Builds on dolthub/vitess#199 - 1350: Subquery row iter fix field indexes
Recent changes to subquery scope visibility use the scope to communicate column definition availability; i.e., we do not pass the scope into subqueries we have determined to not depend on the outer scope, marking the same scope as cacheable. This analysis change needs a corresponding runtime change to indicate whether the scope row is expected at execution time. - 1349: Skip process tracking when prepreparing queries
When we prepare a statement, theQueryProcess
node we create is bound to the current context's PID. This is not the same PID as the context that will execute a statement created from that template, which results in ProcessList metadata not being properly cleaned up after a query has finished processing.
Fixes: dolthub/dolt#4601
I couldn't find a great way to test this in the GMS package, but I'm working on a test in dolt that I'll link to shortly. - 1348: fix visibility for on duplicate key update
Currently, we treatplan.InsertInto.Source
independently fromDestination
, and is not considered one ofInsertInto
's children. It is evaluated much later in the analysis process in the ruleresolveInsertRules
in a similar way as subqueries (we recurse the analyzer on it). This is problematic if we want to reference tables fromSource
.
In this PR, I resolve the tables forInsertInto.Source
and added extra logic to correctly index those tables' columns.
There is a special case foron duplicate key update <expr>
in that the LHS of the expr can only seeInsert.Destination
while the RHS can seeInsert.Destination
andInsert.Source
.
Partial fix for: dolthub/dolt#4562
Note: This does not work for CTEs
This is only kind of a fix for the issue. The right way to fix this is probably to completely resolveInsertInto.Source
before doing anything else, but I wasn't able to get that working yet. - 1343: allow adding new primary key to table with > 1 row iff it has auto_increment
fix for: dolthub/dolt#4581
tests in dolt becausememory.Table
doesn't implementRewriteableTable
dolthub/dolt#4593 - 1341: fix comparison for geometry types
fix for: dolthub/dolt#3451
Maybe all geometry comparisons should just default to their EWKB formats; pretty confident this is what MySQL does. - 1339: Update DateAdd/DateSub to return correct types.
This change fixes dolthub/dolt#4376
Previous implementation was hard-coded to return sql.Date, but now we are following the MySQL standard and return a type based on the inputs.
The tests in the repo are verifying that the correct data is returned, but we're not testing the actual SQL data that is received, so none of our tests are catching this case yet. We should open a new work item to start testing the actual SQL that's being returned by dolt. For this bug, testing was performed using a local version of dolt with these changes.
The failing query from the original bug is now working:SELECT NOW(), DATE_ADD(NOW(), INTERVAL 14 DAY), dolt_version(); +----------------------------+----------------------------------+----------------+ | NOW() | DATE_ADD(NOW(), INTERVAL 14 DAY) | dolt_version() | +----------------------------+----------------------------------+----------------+ | 2022-10-18 18:53:24.406345 | 2022-11-01 18:53:24.406345 | 0.50.4 | +----------------------------+----------------------------------+----------------+ 1 row in set (0.02 sec)
- 1337: add support for
GeometryCollection
pt. 3
Part 10 of fix for: dolthub/dolt#3638
Changes:- these functions now work with
GeometryCollection
st_srid
st_swap
- these functions now work with
- 1336: can't alter blob, json, and geometry columns when other defaults are defined
fix for: dolthub/dolt#4543
Note: the syntax here is actually invalid in MySQL? - 1335: add support for
GeometryCollection
pt. 2
Part 9 of fix for: dolthub/dolt#3638
Changes:- refactor a bit of wkt code
- added missing implementation and tests for wkt for
multipolygon
- these functions now work with
GeometryCollection
st_geometrycollectionfromtext
st_geomcollfromtext
st_geomcollfromtxt
st_geomfromtext
st_astext
st_aswkt
st_dimension
- 1334: multipolygon support
- 1333: prevent creating views with same name as existing table
also fix for: dolthub/dolt#4549 - 1332: add support for
GeometryCollection
pt. 1
Part 8 of fix for: dolthub/dolt#3638
Changes:- have the deserialize data and write data methods return their own byte counts, to more easily track where we are in main deserailizer
- adding
GeometryCollection
type - these functions now work with
GeometryCollection
GeometryCollection
GeomCollection
st_aswkb
st_geometrycollectionfromwkb
st_geomcollfromwkb
- 1330: add support for
MultiPolygon
pt. 3
Part 7 of fix for: dolthub/dolt#3638
TODO: rebase to main, left as a merge tojames/mpoly2
for better readability
Changes:- these functions now work with
MultiPolygon
ST_ASGEOJSON
ST_GEOMFROMGEOJSON
- lots of tests
- these functions now work with
- 1329: explain output includes FilteredTable
We already print filtered table in the debug string forIndexedTableAccess
, this adds it to regular explain andResolvedTable
. - 1327: add support for
MultiPolygon
pt. 2
Part 6 of fix for: dolthub/dolt#3638
TODO: rebase to main, left as a merge tojames/mpoly
for better readability
Changes:- these functions now work with
MultiPolygon
ST_ASWKT
ST_GEOMFROMTEXT
- added engine tests
- these functions now work with
- 1324: add support for
MultiPolygon
pt. 1
Part 5 of fix for: dolthub/dolt#3638
TODO: rebase to main, left as a merge tojames/mline3
for better readability
Changes:- contains small fixes to
multilinestring
comments and allocation length - these functions now work with
MultiPolygon
MULTIPOLYGON
ST_ASWKB
ST_GEOMFROMWKB
ST_MPOLYFROMWKB
ST_MULTIPOLYGEOMFROMWKB
- added engine tests
- contains small fixes to
- 1323: add support for
MultiLineString
pt. 3
Part 4 of fix for: dolthub/dolt#3638
TODO: rebase to main, left as a merge tojames/mline2
for better readability
Changes:- these functions now work with
MultiLineString
ST_GEOMFROMGEOJSON
ST_ASGEOJSON
- added engine tests
- these functions now work with
- 1322: add support for
MultiLineString
pt. 2
Part 3 of fix for: dolthub/dolt#3638
TODO: rebase to main, left as a merge tojames/mline
for better readability
Changes:- small fix to polygon test
- these functions now work with
MultiLineString
ST_SWAP
ST_SRID
ST_DIMENSION
MULTILINESTRING
- 1321: add support for
MultiLineString
pt. 1
Part 2 of fix for: dolthub/dolt#3638
Changes:- small copy-pasta comment fixes
- adding multilinestring type and structs
MultiLineString
support for these sql functionsST_MULTILINESTRINGFROMWKB
ST_MULTILINESTRINGFROMTEXT
ST_MLINESTRINGFROMWKB
ST_MLINESTRINGFROMTEXT
ST_GEOMFROMWKB
ST_ASWKB
ST_ASTEXT
- 1320: rewrite table on unique key creation
Fix for: dolthub/dolt#4420
In-memory tables don't support unique keys, so tests are in dolt:
dolthub/dolt#4512 - 1316: Slight port correction on README example
- 1315: Updated README example to fit current usage
We haven't used[]string{""}
for JSON values in a long time, the README definitely needed an update. - 1311: internal/sockstate: Fix sockets ending up in blocking mode when we monitor them for disconnects.
As documented inos
, calling *os.File.Fd() will put the socket into blocking mode, makes SetDeadline methods stop working, etc. This syscall to SetNonblocking restores desired functionality.
This was discovered when testing clustering control plane operations in Dolt, which rely on the timely ability to terminate all client connections by calling Close() on them.
We were able to reproduce the issue on macOS by doing the same File() behavior there, despite not having an implementation to actually use the fd for external monitoring of the connection state. To keep as much behavioral parity going forward, I left the fd translation in, since we've observed that it can radically change behavior. - 1310: Derived table outer scope visibility
Add support for outer scope visibility for derived tables, as introduced in MySQL 8.0.14.
References: - 1309: Adding Support for
MULTIPOINT
and some refactoring
Part 1 of fix for: dolthub/dolt#3638
It looks like PRs for this are gonna get pretty big, so I'm going to split them up for easier reviewing.
Changes:- greatly improved code reuse for spatial types
- refactored code to be more readable
- some functions were misnamed
FROMWKT
, when they should've beenFROMTEXT
ST_POINTFROMWKT
->ST_POINTFROMTEXT
ST_LINEFROMWKT
->ST_LINEFROMTEXT
ST_POLYFROMWKT
->ST_POLYFROMTEXT
- missing function aliases
ST_LINESTRINGFROMTEXT
ST_LINESTRINGFROMWKB
ST_POLYGONFROMTEXT
ST_POLYGONFROMWKB
ST_GEOMETRYFROMTEXT
ST_GEOMETRYFROMWKB
- these functions now work with
MULTIPOINT
ST_SWAP
ST_SRID
ST_ASWKB
ST_MULTIPOINTFROMWKB
ST_MULTIPOINTFROMTEXT
ST_MPOINTFROMWKB
ST_MPOINTFROMTEXT
ST_GEOMFROMWKB
ST_ASGEOJSON
ST_FROMGEOJSON
ST_DIMENSION
- fixed bug where x and y geojson values were being swapped
- 1308: parallelize static lookups
- 1307: allow
having
node access to all tables in itself
HAVING
clause can reference column that is not in its select result or group by result (there can be no group by clause). These column references are from tables inHAVING
node that it is not in its immediate children nodes. - 1304: integration plan regressions
This PR addresses two problems:- Optimization rules that depend on pattern matching can fail to trigger with exchange nodes, which we failed to account for in testing. This fixes a small indexed join bug and runs integration query plans with parallelism = 2.
IndexedInSubqueryFilter
should be prevented when the static side of the join takes a dependency on the index lookup. The check is now looser, and opaque nodes cannot disallow the transform. When we introduce subqueries that do reference outer scopes, we will need to be more careful and do deeper validation.
- 1303: adding support for
st_area
,st_perimeter
,st_length
Only the Cartesian portion of these function is working, geodetic calculations are quite a bit harder.
I just throw an error for any unsupported functionality.
st_area
is calculated using shoelace formula.
Basic idea is to slice polygon into a bunch of triangles and sum up their areas. The function is not defined for polygons that intersect themselves, but from my testing it seems like MySQL returns the same values for these edge cases.
st_perimeter
is a NOT supported in MySQL, but it is in PostGIS (a postgres plugin for spatial types): https://postgis.net/docs/ST_Perimeter.html
Fix for: dolthub/dolt#4451 - 1301: Ensuring read-only check is executed for prepared and non-prepared statements
Quick fix for: dolthub/dolt#4434
Tested locally with a python script to verify that prepared statements to read-only dbs are now correctly blocked. After getting this fix out, we will follow up with more formal, automated test to cover this. - 1297: Add query plan integration tests
- New integration test queries
- Fix some join planning bugs related to i) column casing and ii) TableWrappers
- 1296: server: context.go: Expose *SessionManager.KillConnection.
- 1293: no column reference can be made on
dual
table
Depends on dolthub/vitess#196
No GetField reference can be made on dual table. To differentiate between dual table and `dual` table(which can be created).
Dual table is constructed as ResolvedTable with empty table name and a single column with empty name in the parser.
Any column reference that becomes deferredColumn is replaced into alias that is present inprojectedAliases
inreorderProjection
rule.
Any GetField column reference is not allowed on dual table. - 1291: allow adding column to table with autoincrement column
fix for: dolthub/dolt#2587 - 1290: New join planning
Join planning has many high level components:- table order
- tree shape (left deep, right deep, bushy, etc)
- indexes for lookup joins
- filter arrangement
- choosing between logically equivalent plans (costing)
This PR does not fix all of these components, but it does provides a structure for gracefully increasing the complexity of each.
The new memo data structure is a memory efficient IR for SQL queries that 1) groups equivalent plans by their output schemas, and 2) generalizes child relationships. For example, here is an example memo for a join query:
"Relational expression" each have their own expression group, within which several physical/concrete implementation can reside. An expression group is defined in terms of expression group relationships, not physical implementations, because the output from any relational expression within a group will be the same. Groups are usually keyed by a hash of the operator type and the child group keys, or in the case of scalar expression by the literal parameter values (this last point will be useful for prepared statements).select * from ab inner join uv on a = u full join pq on a = p memo: ├── G1: (tablescan: ab) ├── G2: (tablescan: uv) ├── G3: (tablescan: pq) ├── G4: (innerJoin 2 1) (innerJoin 1 2) └── G5: (leftJoin 4 3)
All valid reorderings of the join tree are added to the memo byjoinOrderBuilder
.
Applying indexes is a separate step, where we consider an indexed version for every join implementation where the outer (right) table is an indexable data source. Adding indexed plans to the query above yields:Hashed joins are added in the same manner:memo: ├── G1: (tablescan: ab) ├── G2: (tablescan: uv) ├── G3: (tablescan: pq) ├── G4: (indexedJoin 1 2) (indexedJoin 2 1) (innerJoin 2 1) (innerJoin 1 2) └── G5: (indexedJoin 4 3) (leftJoin 4 3)
This memo is missing several components that would make it more useful, including other node and expression types. For example, filter pushdown would be much easier in the memo. You could imagine extending the memo above with sql.Expression tree memo groups (scalar relations). Canonical (simplified) expression groups only requiring one allocation, and the scalar expression group caches properties like table and column dependencies similar tomemo: ├── G1: (tablescan: ab) ├── G2: (tablescan: uv) ├── G3: (tablescan: pq) ├── G4: (hashJoin 1 2) (hashJoin 2 1) (indexedJoin 1 2) (indexedJoin 2 1) (innerJoin 2 1) (innerJoin 1 2) └── G5: (hashJoin 4 3) (indexedJoin 4 3) (leftJoin 4 3)
relProp
column and table dependencies. We would need to build a memo prior to join planning to use filters for join planning, and run other transformation rules on the memo. I'd expect the memo to absorb rules over time, which will simplify transform logic and improve analyzer memory consumption. I added some starters to make this easier in the future, like codegen'd memo expressions, and a move towards visitor interfaces for IR transformations.
A fully exhausted memo tree is converted back into a physical plan first through a costing process. Costing builds the fastest implementation for child group expressions bottom-up, using only the best child implementations for costing parent expressions and eventually the root node.
The coster is designed to use histogram statistics to pick the fastest join plan. We only provide a subset of index metadata and table sizes at this moment. Additionally, costing should apply to generic SQL nodes and expressions, not just join trees and join leaves.
Other notes:- we miss join plans that result from transitive predicates (ex:
ab.a = uv.u
+uv.u = xy.x
=>ab.a = xy.x
). - we miss join plan optimizations resulting from table functional dependencies (ex: eliminate redundant filters if we already join on a a primary key), and null-allowing joins (ex:
IS NULL
). - we could more aggressively apply filters to join trees.
- converting from the memo to an execution plan is fraught for the same reasons it was before this PR, but tree building and fixup is condensed in
exec_builder
, which i'd expect to expand to include more exprGroup types in the future. - there is a hack for fixing up field indexes in join trees not considered for reordering.
- right joins are converted to left joins after
expandStars
intransposeRightJoins
. - non-left deep trees necessitate passing parent rows down the right side of join trees, which were previously only table sources (left deep).
- joinOrder hinting is rewritten to more easily fit within exprGroup costing.
- I started to use bitmaps to track expression group column dependencies, which should also be used for filter applicability.
- indexedJoinIter is duplicate and should be deprecated, but i had a hard time quickly removing it.
- every join node is condensed into
*plan.JoinNode
and differentiated based on its join type, which lets me bucket different join types more easily (refer toJoinType
type methods). - applyHashLookups should be deprecated, given that we will preemptively build hash joins. If hash joins do not select the lookup, we should either not cache the subquery or improve the coster to make better decisions.
- there are probably bugs where we apply hash joins to non-cacheable subquery scopes for derived tables. Need more testing.
Additional refs: - https://www.researchgate.net/publication/262216932_On_the_correct_and_complete_enumeration_of_the_core_search_space
- https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/opt/xform/join_order_builder.go.
- https://15721.courses.cs.cmu.edu/spring2019/papers/22-optimizer1/xu-columbia-thesis1998.pdf
- 1288: add ValidateSession() to Session interface
dolthub/dolt#4395 depends on this PR - 1281: Alias resolution updates
This change moves our alias identification and column name qualification code closer to MySQL's behavior. The major changes are:- identify available alias, table, and column names on a per-scope level (instead of the previous "nestingLevel" that was based off of the node depth of the execution plan tree)
- supply additional context to the
qualifyExpression
method so that the containing node can be used to switch on different alias visibility rules
There are still more nuanced edge cases in MySQL's alias behavior that we should continue working on matching (e.g. dolthub/dolt#4535, dolthub/dolt#4537), but this change moves us a nice step forward.
Dolt CI Tests: - dolthub/dolt#4410
Fixes: - #525
- dolthub/dolt#4344
- 1280: More join types
AddFullOuterJoin
,SemiJoin
, andAntiJoin
.
None of these new join nodes are safe for join ordering transformations yet. They are explicitly excluded from join planning.
The getField indexes for these three nodes' join conditions deserve more consideration. I excluded them from auto-fixup after manually correcting join condition get fields for the appropriate schemas.
FullOuterJoin
uses a union distinct execution operator, which is correct but a lot slower than a merge join-esque operator.
SemiJoin
andAntiJoin
rearrange subquery expression scopes. I separateresolve
andfinalizeSubqueryExpressions
to perform decorrelation before predicate pushdown (where we were panicking on FixUpExpressions) and join ordering (we want to decorrelate scopes before join planning).
Other:- query plan tests added for exist hoisting edge cases i did not catch on first pass
- fixed bug with CTE stars
- 1279: Andy/mysql validator
- 1278: should not be returning hex on the wire for geometry types
Fix for: dolthub/dolt#4390
For improved display purposes, I changed the SQL method to convert the raw binary of spatial types to their hex equivalent (#1068).
DBeaver expects the binary in WKB format to generate their maps, so ourSQL
method must return them in binary - 1276: Apply sql_select_limit to first scope only
re: dolthub/dolt#4391 and dolthub/dolt#4353 - 1275: Throw error in alter table set default if default fails rule(s)
This PR makesalter column set default
error when its default expression fails the default expression rules. Previously the DDL succeeded and when an insert or any other query was performed the default would cause a continuous error.alter table t alter column col1 set default '{\"bye\":1}' -- errors with: TEXT, BLOB, GEOMETRY, and JSON types may only have expression default values
- 1273: fix test ordering
some tests on dolt were failing forJSON_TABLES
, due to abiguousORDER BY
Companion PR: dolthub/dolt#4355 - 1272: Removed exponential time complexity for foreign key analysis
This fixes #1268
Foreign key analysis created acyclical trees that were traversed during query execution to emulate cascade operations. This meant that cyclical foreign keys were converted to an acyclical tree. Normally this isn't possible as cyclical trees are infinitely traversable, but MySQL has a depth limit of 15, which allowed us to materialize an acyclic tree with a maximum height of 15 nodes. This, however, lead to trees with an exponential number of nodes: roughly(number_of_fks)¹⁵ × 1.5
nodes in the tree. With just 3 foreign keys, we'd get a tree with roughly 22 million nodes, which would take forever to process.
This PR completely changes the analysis step to now generate cyclical trees. In addition, depth checks are now properly implemented (during query execution rather than during analysis), being represented by a returned error once the depth limit has been reached. Interestingly, MySQL is supposed to process up to 15 operations (returning an error on the 16th), but cyclical foreign keys will error on the 15th operation. I think this is a bug in MySQL, but nonetheless the behavior has been duplicated here.
I also updated thetimestamp_test.go
file to grab an unused port. This prevents test failures due to requesting an already-in-use port. Not related to this PR in particular, but it was annoying to deal with so I fixed it. - 1271: enginetest: Add repro enginetest for unique key error
repro for Dolt bug #4372 - 1270: No new context for every json SQL convert
perf here: dolthub/dolt#4364 - 1269: cleanup top-level directories
- 1261: use json_table in crossjoin that references table column on left
Fix for: dolthub/dolt#4340
To use a column as theexpr
argument tojson_table
, the column must be part of aCROSS JOIN
and has to be on the left side of a cross join (reference). Oddly, you cannot reference the column directly nor through a subquery (stackoverflow reference).
As a result, this is partially an aliasing issue, which I do address in this PR.
Added special column resolving rules forJSONTable
for when its under a Join and subquery expression - 1259: Add sponsor button to GitHub
- 1258: Bug Fix: change
unix_timestamp(<expr>)
to return 0 when it can't convertexpr
to a date
MySQL's implementation ofunix_timestamp(<expr>)
returns0
when the expression can't be converted to a date and logs a warning. This PR changes Dolt to have the same behavior. It also includes a bug fix forctx.ClearWarnings
to correctly clear warnings.mysql> select unix_timestamp(1577995200); +----------------------------+ | unix_timestamp(1577995200) | +----------------------------+ | 0 | +----------------------------+ 1 row in set, 1 warning (0.00 sec) mysql> select unix_timestamp("jason"); +-------------------------+ | unix_timestamp("jason") | +-------------------------+ | 0.000000 | +-------------------------+ 1 row in set, 1 warning (0.00 sec) mysql> show warnings; +---------+------+-----------------------------------+ | Level | Code | Message | +---------+------+-----------------------------------+ | Warning | 1292 | Incorrect datetime value: 'jason' | +---------+------+-----------------------------------+ 1 row in set (0.00 sec)
- 1257: allow compare decimals with different precision and scale
- 1256: add check for chan is closed in Listener.Close()
dolthub/dolt#4301 depends on this PR.
A test for it is added in dolthub/dolt#4301 - 1255: fix
numberTypeImpl.SQL()
to convert from any type
The SQL method fornumberTypeImpl
expectsv
to already be the right type for thebaseType
, this is not always the case.
CASE
statements have an error guard for this behavior, which shouldn't be there.
There are likely other types that haveSQL()
methods that don't work for all types.
Changes:- added type switch for types that are unconvertable to number
- remove analyzer rule 'validateCaseResultTypesId`
Closes dolthub/dolt#4306
- 1253: Support Varchar(MAX)
related to dolthub/dolt#2261 - 1252: Reimplemented LIKE expressions to add collation support
Previously, we were converting the patterns fromLIKE
expressions into patterns that the standard regex parser would understand. Said parser does not support collations (as implemented in MySQL), therefore this is a custom-developed pattern matcher forLIKE
expressions that fully supports collations.
This is ONLY for patterns on theLIKE
expression. We still need to implement full regex parsing to supportREGEXP_LIKE
.. Once we haveREGEXP_LIKE
completed, we may revert this back to the regex conversion process, but that will be no time soon.
This also fixes dolthub/dolt#4303 - 1250: stripNodes in subquery block apply_hash_lookup optimization
A HashLookup requires two rules, 1) converting a JOIN -> SUBQUERY_ALIAS => JOIN => CACHED_RESULT(SQ), 2) converting the CACHED_RESULT into a HASHED_LOOKUP.
Prevent StripRowNode from breaking the pattern matcher for the first step. Other nodes can still interrupt the rule, but we have a test now at least. - 1249:
UpdateJoin
bug fixes
Bug fixes for dolthub/dolt#4288- qualifying check constraint expressions when multiple tables are being updated
- support finding UpdatableTables through ProcessIndexableTable and ProcessTable nodes
- erroring out when a table updater isn't available for a SetField expression, instead of silently dropping it
This does not address another UpdateJoin bug I just found while testing: dolthub/dolt#4304. Mentioning that one here so we have this context connected when we go to fix it.
I verified that Dolt's enginetest suite runs correctly with these changes.
- 1248: feat:(sql function) support last_insert_id with expr
mysql support parameter for LAST_INSERT_ID(expr), it is useful in upsert query when we want to get the updated recored id
ref: https://dev.mysql.com/doc/refman/8.0/en/information-functions.html#function_last-insert-id - 1246: adding
LEAD()
window function
Partial fix for: dolthub/dolt#4260 - 1245: adding
LAST_VALUE()
window function
Partial fix for: dolthub/dolt#4260 - 1244: use decimal type for arithmetic functions
fixes dolthub/dolt#4269
sum will be either float64 or decimal.Decimal type, but return type is decided on the type of the child expression of sum expression. This causes some test results to be modified into int from float.
the test results indecimal.Decimal
type is hard to get the exact matching using its convert method, so result is checked in string format by usingStringFixed()
method of the decimal.Decimal result value. - 1240: Additional fixes to collation coercion
Coercion rules specify that Unicode and non-Unicode characters may the Unicode collation applied in the event that they're being compared, concatenated, etc. I'm not exactly sure how to determine what a "Unicode" collation is, so I'm assuming that all collations that use more than one byte are Unicode. I can almost guarantee that this is incorrect, however it's a better approximation of behavior than not considering it at all. - 1239: More point range optimizations
SQL range now exposes whether a range is a point lookup, so the integrator does not have to check for itself. - 1238: Added limited support for collation coercibility
This adds a sort of "fix" for the first issue identified in #1232. This is basically a hack for collation coercibility, which is a significant undertaking. This should suffice in the meantime. - 1231: Return errors instead of asserting inside
runQueryPrepared
Test PR: - 1230: fix applying bindvars to filters and table functions
Changes:- Allow bindvars to be applied to FilteredTables using a new
DeferredFilteredTable
node - No longer pushdown filters in post prepared analysis
Test PR: - dolthub/dolt#4252
- Allow bindvars to be applied to FilteredTables using a new
- 1229: fix filters being dropped from prepareds on
dolt_diff
tables
TransformProjections
was not transferring filters from certain prepared plans that did not contain projections.
Fix for - 1228: Resolve CTEs ignored recursive changes
fixes regression where we do not propagate nested changes in cte resolve - 1227: Fixed collation error message, added more collations
Fixes dolthub/dolt#4236, in addition to adding two new collations:utf8mb3_unicode_ci
utf8mb4_unicode_520_ci
- 1226: recursive ctes resolve limit filter sort
General outline:- Union nodes absorb LIMIT, ORDERBY, and DISTINCT logic
- Recursive CTEs embed a UNION node, and the analyzer treats them the same for the most part
- Rewrite
resolve_cte
to bridge the gap between WITH -> rCTE -> UNION nodes
Re: (1), Union is basically collapsed into a single node. I think all scopes should probably look like this longterm for a few reasons. Adding an attribute grammar for resolving tables and columns becomes a lot easier. Transformation rules are simpler because they do not have to see through the tree within a scope. Logical nodes can be separate from physical nodes. Moving from a tree to a hash map of logical nodes is the main way of doing costed planning.
Re: (2), the easiest way to support arbitrary left/right nodes in recursive ctes is to try to run the regular analyzer on them. Notably, MySQL prevents many types of top-level nodes in rCTEs, which we have not prevented yet. For example, GROUPYBY, ORDERBY, LIMIT are not permitted in the middle of an rCTE (I think because the results are ambiguous). We do not error on those right now, but maybe should in a follow-up PR.
Re: (3), rCTEs are structurally parsed as Unions, and can be resolved like unions in most of the analyzer, but we need several manual steps to address the ways rCTEs are functionally different. rCTEs should be separated into recursive and non-recursive chains of UNIONs (refer to Aaron's comment). If a rCTE has no recursive portions, we convert to a regular CTE. We manually resolve references to the recursive binding, converting them fromUnresolvedTable
toRecursiveTable
. There is a chicken and egg problem where we have to resolve the left-side of the union before we can construct and apply a schema'dRecursiveTable
to the right.
- 1225: Hoist CTEs above Filter, Limit, OrderBy, and Having
- 1224: SET NAMES collation fix, results have proper charset
Fixes two issues.SET NAMES
would cause all statements that feature aCOLLATE
expression to error if the character set and collation did not match, however such a comparison only makes sense for string literals (at parse time). This has been fixed, as the parse-time check only occurs for literals, and checks that require analysis (such as table columns) now occur at the appropriate time. In addition, the system variablecharacter_set_results
was not used when returning string results, resulting in unexpected behavior, which has also been fixed. - 1223: fix infinite recursion during cte self-reference
Changes:- don't resolve subqueries where there are CTEs of the same name; transformations are bottom up, so they've been resolved at least once already
Fix for: - dolthub/dolt#4173
- don't resolve subqueries where there are CTEs of the same name; transformations are bottom up, so they've been resolved at least once already
- 1222: use float input as str for decimal type
The scale of float input in INSERT statement gets rounded when converting to float64 value.
The scale of input for DECIMAL data type should not be rounded. - 1221: 1 % 0 should not panic but return NULL
From https://dev.mysql.com/doc/refman/8.0/en/mathematical-functions.html#function_mod:MOD(N,0) returns NULL.
- 1217: any number of column is allowed for where exists subquery
Any number of column from the subquery is allowed. The result ofWHERE EXISTS subquery
is dependent on the number of rows rather than the number of columns
Fix for dolthub/dolt#3772 - 1216: fix infinite recursion for recursive cte used with union subquery
Changes:- recursive case actually hoists
With
node up - no need to recursively call analyzer rule on parent node
Test PR: - dolthub/dolt#4203
Fix for: - dolthub/dolt#4200
- recursive case actually hoists
- 1215: Collation fixes
Fixes the following issues:- dolthub/dolt#4172
- dolthub/dolt#4168
CREATE TABLE
and friends will now take the table collation into consideration. This extends to adding a modifying columns, which will use the table collation when no collation is specified. Additionally, thelatin1
character set has been added. Lastly, we properly block usage of non-implemented character sets and collations.
- 1214: Bug fix for table alias resolution
Fixes: dolthub/dolt#4183 - 1213: Show all currently-implemented charsets with default collations
- 1212: fix more prepared enginetests
Moved these rules to run during preprepared, as the explain queries for history table were missingExchange
nodeinSubqueryIndexesId
AutocommitId
TrackProcessId
parallelizeId
clearWarningsId
Tests:- dolthub/dolt#4179
- 1211: fix str_to_date to returns time or date or datetime type results
STR_TO_DATE()
returns result type depending on Date given in the query. If only date parts like year, month and/or day was defined in Date argument, then the result is only Date type without time parts. If Date defined is date with time, then Datetime type is returned.
Currently, STR_TO_DATE allows zero-date. - 1208: Handle IN expressions with NULL values for string typed columns
fixes string assertion panic here - 1207: fix some filters being dropped from history tables
Changes:- run
pruneTables
inprePreparedSelector
, to transfer projections correctly - added
Filters()
getter intosql.FilteredTables
interface transferProjections
also copies over table filters
Test PR:- dolthub/dolt#4163
- run
- 1206: Fix alias error that should qualify correctly
- 1205: Fixed inability to drop user without host being specified
Fixes github.com/dolthub/issues/4135 - 1204: Fix timestamp comparisons and add regression test
Summary
This PR fixes another regression introduced by #1061. That PR altered the way that timestamps are represented internally (they went fromstring
to[]byte
) without updating all of the locations where the previous representation was assumed.
The particular regression this PR fixes deals with timestamp comparisons. A test likeshould returnCREATE TABLE mytable (t TIMESTAMP); INSERT INTO mytable (t) VALUES ('1990-01-01'), ('2020-01-01'); SELECT COUNT(1) FROM mytable WHERE t > '2000-01-01';
1
, but today it returns2
.
The core issue is that the filter (e.g.,WHERE t > '2000-01-01'
) needs to compare timestamps. That comparison logic has type coercion, but the type coercion only knows how to coerce a string into a timestamp, and it is now being handed a[]byte
.
The fix here is two-fold:- increase the priority of timestamp type coercion so that it takes precedence over binary coercion
- ensure that the
convertValue
function correctly converts[]byte
into timestamps (and dates)
Motivation
#1139
This is not strictly speaking the same issue, but something wonky is going on with timestamps and I'd like to get to the bottom of it, and then fix it "for good". - 1203: Stop requiring
order by
clause for range frames that only use unbounded preceding/following and current row
We were requiring theorder by
clause for all range frame queries, but it is optional when the frame only usesunbounded preceding/following
andcurrent row
.
Fixes: dolthub/dolt#4141 - 1202: moveJoinConditionsToFilter bug
The move join condition rule selects filters not needed as a join condition, identify the target parent for the filter, and then moves the plucked filters to EVERY child of the parent of the join node. The correct behavior is to move the filter immediately above the join whose condition we plucked the filter from. - 1201: Add end-to-end tests for timestamp insertion, and patch the conversion issue
Summary
This PR introduces ane2etests/
directory with "end to end" tests:- set up a memory engine
- start the server
- connect using the normal Go mysql driver
- execute queries
This can catch issues that a unit test against the engine itself cannot.
There is only a single test thus far, and it is fairly verbose. I know that there is at least one more issue with timestamps, and I will open a follow-up PR to address that issue as soon as I track down its source. For now I'm opting for clarity rather than code reusability.
Lastly, this PR introduces a little patch to the timestamp conversion logic, which was accidentally broken in #1061 because it only handlesstring
but the server layer now passes it[]byte
. This patch is very dumb and simple: just convert[]byte
tostring
before parsing.
Motivation
#1139 - 1199: adding support for
RANK
andDENSE_RANK
Fix for:- dolthub/dolt#4126
Note: queries in MySQL seem to be ordered by the ranks by default.
- dolthub/dolt#4126
- 1198: adding implicit type conversion to
in <subquery>
statements
fix for: dolthub/dolt#4057 - 1197: Indexed joins range scans ignore non-equalities
When building index lookups, do not about when we see non-equality expressions. Instead, continue with the equality subset.
Ex:SELECT pk,i,f FROM one_pk RIGHT JOIN niltable ON pk=i and pk > 0 - RightJoin((one_pk.pk = niltable.i) AND (one_pk.pk > 0)) - ├─ Table(one_pk) - │ └─ columns: [pk] - └─ Table(niltable) - └─ columns: [i f] + Project(one_pk.pk, niltable.i, niltable.f) + └─ RightIndexedJoin((one_pk.pk = niltable.i) AND (one_pk.pk > 0)) + ├─ Table(niltable) + │ └─ columns: [i f] + └─ IndexedTableAccess(one_pk) + ├─ index: [one_pk.pk] + └─ columns: [pk]
- 1195: SkipResultsCheck logic for TestScriptPrepared
- 1194: Support Anonymous User
Fixes dolthub/dolt#4090 - 1193: Fixed typos in engine tests (skipped tests not actually getting skipped)
- 1192: support CONV() function
Fix for dolthub/dolt#4099 - 1190: add support for
BIT_AND
,BIT_OR
, andBIT_XOR
functions
fix for: dolthub/dolt#4098 - 1189: Fixed Privilege-Related Issues
Fixes the following issues: - 1188: Bug fix for recursive view definitions causing stack overflow during analysis
- 1185: empty host is any host
Fixes issue where empty string for host should be "%".
Context: #1151 (comment) - 1184: Added more wire tests for collations
- 1183: index lookup refactor
New implementation divides lookup management into analysis and exec components. During analysis,*plan.IndexedTableAccess
embeds anIndexedTable
. TheIndexLookup
itself is static, dynamic lookups will be passed multiple lookup objects at exec time.Minimal changes totype IndexLookup struct { fmt.Stringer Index Index Ranges RangeCollection } type IndexAddressable interface { // AsIndexedAccess returns a table that can perform scans constrained to an IndexLookup AsIndexedAccess(Index) IndexedTable // GetIndexes returns an array of this table's Indexes GetIndexes(ctx *Context) ([]Index, error) } type IndexAddressableTable interface { Table IndexAddressable } type IndexedTable interface { Table // LookupPartitions returns partitions scanned by the given IndexLookup LookupPartitions(*Context, IndexLookup) (PartitionIter, error) }
DriverIndexedLookup
andForeignKeyUpdater
to accommodate changes. - 1182: Use literal for Enum/set default
Currently enum / set default value will be evaluated(converted to int) and store ininformation_schema
, think better to leave as their literals.
PS: I'm also investigating if this issue related to dolthub/dolt#4503 - 1179: Preserve original enum value case on retrieval
When retrieving enum values forShowCreateTable
,Information_Schema.Columns
, andDescribe/ShowColumns
, they were always being converted to lowercase, which doesn't match MySQL's behavior. Instead, we should preserve the case each enum value was defined with, whenever we return those values.
This did not affect enum values returned as part of table data – these values were already correctly preserving the originally defined case.
Fixes: dolthub/dolt#3991
Fixes: #1161 - 1177: Secondary Lookup Range Templates
Do not use the IndexBuilder for secondary lookups. Directly create the equality ranges and/or nullable filters.
index_join_scan
latency change locally on OXS: 4.8ms -> 3.8 ms (-20%).
No interface changes, regular Dolt bump passes at the time the PR was created. - 1174: sql/analyzer: Update Index Selection heuristics to prioritize Primary Key indexes
In Dolt, this increases out TPC-C throughput by 2x for the new NBF. - 1172: Allow common table expressions in INSERT, UPDATE, DELETE statements
This fixes dolthub/dolt#4025 - 1170: Fix lock binary typing errors
- 1169: sql/parse: Add support for sqlparser.Union nodes that carry ORDER BY and LIMIT clauses.
- 1168: Add a series of skipped tests related to functionality gaps
- 1166: /.github/workflows: bump ubuntu 22.04
- 1159: Pushing
AsOf
expressions down to tables used in unions and subqueries in view definitions
Tables used in unions and subqueries in a view definition weren't getting updated with anasof
expression when one was used on the view. This PR extends the resolve_views analyzer rule to close those gaps.
Fixes: dolthub/dolt#4011
Added Dolt enginetests, too: dolthub/dolt#4028 - 1158: Add support for deprecated BINARY attribute after charset
Fixes dolthub/dolt#4019 - 1156: Refactoring ExternalStoredProcedureDatabase to ExternalStoredProcedureProvider
From dolthub/dolt#3934, we want customers to be able to call thedolt_clone
stored procedure when no database is selected. - 1151: displaying database and table grants
Makes it so that database and table specific grants show up for querySHOW GRANTS for <user>
Fix for: dolthub/dolt#3589
Also somewhat fix for: dolthub/dolt#4007
Testing PR: dolthub/dolt#4017 - 1150: /{.github,go.mod}: bump go to 1.19
- 1149: Prevent JSON and GEOMETRY columns from having literal default values
MySQL disallows BLOB, TEXT, JSON, and GEOMETRY columns from having a literal column default value:
https://dev.mysql.com/doc/refman/8.0/en/data-type-defaults.html#data-type-defaults.html#data-type-defaults-explicit
go-mysql-server already prevents BLOB and TEXT column types; this PR extends that check to cover JSON and GEOMETRY columns as well.
Fixes: dolthub/dolt#4003 - 1148: Big fix for resolving column default values in correct order
Fixes an issue where we weren't pulling the column definitions in the same order as the InsertInto specified the column values when resolving and validating column default values.
We did have a test for this case, but because the types in the test were int and varchar, the default values were being coerced silently and the tests were passing when they should have failed. I've updated that test and also added a new test case using an enum that closely matches the case the customer reported.
Fixes: dolthub/dolt#4004 - 1147: Real Collation Support
This PR adds true support for collations. Most of the files in theencodings
package were generated from collation-extractor, and thus inflate the overall line count.
Many tests have been implemented, however I do plan on replicating most of the engine tests over the wire, but I do not feel that that is a blocker (just additional validation). - 1146: fix inverted index IN filter error
- 1145: Remove decorator nodes, add info back in table plan printing
Delete*plan.DecoratedNode
and helper methods.
*plan.ResolvedTable
and*plan.IndexedTableAccess
compensate by printing range and column (logical) properties.- Projected table access on [pk c1 c2 c3 c4 c5] - └─ IndexedTableAccess(one_pk on [one_pk.pk] with ranges: [{[NULL, ∞)}]) + IndexedTableAccess(one_pk) + ├─ index: [one_pk.pk] + ├─ filters: [{[NULL, ∞)}] + └─ columns: [pk c1 c2 c3 c4 c5]
Project(a.i, a.s) └─ IndexedJoin(a.i = b.i) ├─ TableAlias(a) - │ └─ Projected table access on [i s] - │ └─ Table(mytable) + │ └─ Table(mytable) + │ └─ columns: [i s] └─ TableAlias(b) - └─ Projected table access on [i] - └─ IndexedTableAccess(mytable on [mytable.i]) + └─ IndexedTableAccess(mytable) + ├─ index: [mytable.i] + └─ columns: [i]
- 1143: distinct count supports multiple columns
Fix for: dolthub/dolt#3978
PR for testing: dolthub/dolt#3989 - 1142: adding
json_table
function
Adds some support forjson_table
function, which generates a table given a JSON string and column definitions.
Fix for: dolthub/dolt#2163
TODO:NESTED
FOR ORDINALITY
ON EMPTY
ON ERROR
- 1141: Bug fix for
now/current_timestamp
column default values not in parens
Our column default logic was conflating whether a default value was a literal value or an expression when thenow
orcurrent_timestamp
functions were used as the column default. Those column defaults were marked as literals, even though they are expressions that need to be resolved and evaluated. These two functions are the only functions that may be used as a column default value without being enclosed in parens, and when used that way, they were not getting resolved and causing a panic (see linked issue for more details).
This change makesColumnDefaultValue.IsLiteral
always return true if the default value is a literal value and always return false if the default value is some expression that needs to be resolved and evaluated. To keep consistency with MySQL's behavior, it requires that we track whether the column default was enclosed in parens when defined. This is necessary because MySQL allows the now/current_timestamp functions to be used without parens for datetime/timestamp column, but must be specified in parens when used as the default for any other column type. Since that validation is done in a separate spot from the parsing, we need to track that as part of ColumnDefaultValue.
Testing: Dolt engine tests and Dolt unit tests all pass correctly with these changes.
Fixes: dolthub/dolt#2618 - 1140: Error test fixes, extra test for EmptyTable
- 1138: Improving error message when a stored procedure isn't found and no DB is selected
Previously, if a user tried to call a stored procedure with no database selected, they would get this error message:We've seen some users connect to a sql-server, but not use a database and be confused by this error message, so this change adds extra content to the error message when no current database is selected:stored procedure "%s" does not exist
stored procedure "%s" does not exist: this might be because no database is selected
- 1135: Support UPDATE IGNORE
This pul request implements UPDATE IGNORE behavior. CC: - 1134: Populate max field length response metadata
The MySQL wire protocol includes a "column length" field as part of the Column Definition that tells clients the maximum amount of bytes they should expect for each serialized value in that column. Before this change, go-mysql-server was not populating this field and some MySQL clients were unable to parse responses without that max field byte length metadata. This change starts populating that field, with values identical to what MySQL returns.
It was pretty straightforward to plumb this new piece of data through, but things got a little tricky with some of the stringtypes, so I cleaned up that code a little bit to better distinguish between themaxChars
,maxBytes
, andmaxResponseBytes
for a stringtype.
Fixes: dolthub/dolt#3914
Fixes: dolthub/dolt#3893 - 1133: Add indexes to keyless data tests and validate engine after TestQueryWithContext
- 1132: Additional book keeping for dropping auto_increment
This PR add additional logic in the memory implementation of table and validate_create_table to handle cases where auto_increment is dropped in an ALTER TABLE MOIDFY statement. - 1130: Better read only errors
- 1129:
json_contains
bug fixes
Bug fixes for Dolt's implementation ofjson_contains
(MySQL Documentation)
Fixes: dolthub/dolt#3895 - 1127: super user is localhost, and don't persist superuser
Companion PR: dolthub/dolt#3810 - 1125: revert superuser to be any host
- 1124: still read JSON privilege files, but write flatbuffer format
fix for: dolthub/dolt#3859
tests are in dolt pr
companion pr: dolthub/dolt#3860 - 1123: Allow plugin for other auth types
Implementation doc: https://docs.google.com/document/d/1ts7ht9p-VkNgYhD2ouXRGNn8DdPP8wHTNm_T8voSDTE/edit?usp=sharing
Needed vitess changes: dolthub/vitess#172 - 1122: Prune *plan.ResolvedTable columns
Table schemas change after we prune columns by pushing projections into them. This is a one-way process, tables can't become un-projected after pushdown.
edit: I rewrote this completely to be more organized and avoid some of the resolve issues from before. - 1120: Support where clause for
show variables
Where clause is supported only forvariable_name
column of result table ofshow variables
query.
Fixes dolthub/dolt#3834 - 1118: superuser is localhost
Companion PR: dolthub/dolt#3810 - 1117: Add foreign key tests where the parent table has inverted primary keys
- 1116: Add tests for unique key violations and fix engine to pass them
- 1115: pulling out large JSON test scripts
The large JSON test scripts have been causinggo test -race
to fail for Dolt (on ubuntu at least). This PR removes them from go-mysql-server and PR dolthub/dolt#3824 moves them into Dolt.
Confirmed thatgo test -race
for Dolt's enginetests passes again on Ubuntu after this change. - 1114: support unix socket
dolthub/dolt#3796 is dependent of this PR
Merged in Aaron's branch with supporting two-listeners - 1113: Init RO sess vars
I needed a way to initialize read-only session variables so I can set them when a new user connects to a value that depends on which user connects. - 1112: Added ErrLockDeadlock, mapped to MySQL error code 1213 and SQLSTATE 40001
- 1111: Bug fixes for AlterPK and column defaults
Operations that modify a table's schema were not consistently resolving column default values:AlterPK
was missed in resolve column default switch cases, and also wasn't considering its expressions when checking for resolution.AlterDefaultSet
was resolving the wrong schema (its output schema,OkResultSchema
, instead of the target schema).
While I was in there, I cleaned up the code to more consistently use thesql.TargetSchema
interface and combined some copy/pasted switch cases.
Updated existing tests to use column default values and verified they caught both of these failures. Have already run Dolt engine tests and verified they pass correctly.
Fixes: dolthub/dolt#3788
- 1110: Allow floats to specify precision and scale to match MySQL's behavior
MySQL supports a non-standard SQL syntax for specifying precision and scale for floats and doubles:
https://dev.mysql.com/doc/refman/8.0/en/floating-point-types.html
MySQL has marked this as deprecated, but existing MySQL applications (e.g. NocoDB) use this, so we should support it for compatibility. Doubles already support this notation.
Fixes: dolthub/dolt#3778 - 1109: Escape backticks in identifier names in
show create table
Updated show create table output to match MySQL's backtick escaping behavior.
Related to: dolthub/dolt#3779 - 1108: sql/session.go: Migrate from opentracing to opentelemetry.
- 1106: Close duplicated file descriptor
- 1105: Performance improvements in permission checking and view resolution
- 1104: Expand support for explicit column defaults in insert into statements
The previous code for handling explicit column defaults in insert into statements couldn't handle many valid cases of using explicit column defaults. To handle inserting explicit column default values in all cases, we need to move to a different approach.
This change leavesDefaultColumn
values in theplan.Values
expression tuples and updates the analyzer rules to parse and resolveColumnDefaultValues
in those expressions.
I've run the Dolt engine tests and verified that all pass with this change.
Fixes: dolthub/dolt#3741 and #527 - 1102: remove
ValidateCanPersist()
removes theValidateCanPersist()
fromPersister
interface, as it is no longer used.
Companion PR: dolthub/dolt#3732 - 1101: Bump ruby-mysql from 2.9.14 to 2.10.0 in /_integration/ruby
Bumps ruby-mysql from 2.9.14 to 2.10.0.Commits
631384e
version 2.10.04124bec
support OPT_LOAD_DATA_LOCAL_DIR5b47f75
update constants and client error from 8.044520a1
Merge pull request #28 from duerst/master20ace1b
Update charset.rbb2e2ac4
version 2.9.146cf6ee1
support JSON type on MySQL 5.798a0954
test for MySQL 5.7ed7063a
add collations and constants from MySQL 5.7.10- See full diff in compare view
[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=ruby-mysql&package-manager=bundler&previous-version=2.9.14&new-version=2.10.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) ---Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) - `@dependabot use these labels` will set the current labels as the default for future PRs for this repo and language - `@dependabot use these reviewers` will set the current reviewers as the default for future PRs for this repo and language - `@dependabot use these assignees` will set the current assignees as the default for future PRs for this repo and language - `@dependabot use this milestone` will set the current milestone as the default for future PRs for this repo and language You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/dolthub/go-mysql-server/network/alerts). - 1100: type convert simplifications
dolt bump here: dolthub/dolt#3757
This expands someConvert
methods into smaller helper functions that can be used more precisely:- avoid interface conversions
- sidestep bound checks that don't apply on the read path unless we are explicitly running a
Cast
These are a bit rough and ready, the interfaces and semantic check organization could be formalized, but gives 15% boost on table scan.
- 1099: Adjust limits on varbinary/json data
- Relaxed limit on incoming varbinary fields to allow for large JSON fields, which have a max size of 1GB and come across the wire as varbinary types.
- Added a test to ensure limits on varbinary fields are still being honored.
- Added a size limit on JSON fields at 1GB.
Fixes: dolthub/dolt#3728
- 1098: Added safety check to index filter pushdown
- 1093: raising upper bound for
@@join_complexity_limit
to 20 - 1092: Expose
JoinComplexityLimit
system variable
Fix for: #1091 - 1090: Changed character sets and collations to use integer IDs
- 1088: Avoid
NewEmptyContext()
to optimizeIN
filters - 1087: Change all info schema columns to capital letters
Update the information_schema implementation to use capital letters for column names. This matches the MYSQL implementation
This runs against Dolt correctly: dolthub/dolt#3719 - 1083: Update columns table numeric precision and scale
This fills in the numeric_precision and numeric_scale columns of the information_schema.column table - 1082: Integrate latest Vitess along with new LOAD DATA test case
- 1080: sql/analyzer: Make indexed joins more aware of NULL safe comparisons.
Fix joins and indexed joins using <=>. Ensure that joins using = do not scan NULL values in the index. - 1078: Improve how index lookups and ranges address NULL values.
- 1075: adding histogram builder
Also removes unnecessary call to AnalyzeTable in information_schema
Companion PR: dolthub/dolt#3365 - 1073: Fixed returning null type instead of null value
A few places, such asNULLIF()
, were returning the null type (sql.Null
) when they should have been returning a null value (nil
in Go). This has been fixed. - 1071: Revert #1070 and #1067. Needs more testing and some edge cases considered.
- 1069: Removed array type and non-compliant functions
- 1068: Serialize Geometry better
Fixes the SQL methods for point, linestring, polygon, and geometry to return the hex representations of the values.
Additionally, fixes the hex method to allow conversion of point, linestring, polygon, and geometry.
Fix for: dolthub/dolt#3645
Companion PR: https://github.com/dolthub/dolt/pull/3653/files
TODO: move geometry serialization package out of dolt into gms (in next pass). - 1067: Improve NULL handling in JOINs and index lookups on NULL comparisons.
INNER JOIN ON a <=> b was broken because of the implementation of
NullSafeEquals. The implementation returned an Int8, but the join
implementation was expecting a Boolean.
SELECT WHERE indexed_column = NULL was broken because index selection
would select the index, and also drop the filter. - 1066: Improved error message for converting unsupported geospatial types
Improved the error message to tell customers that a specific geospatial data type they tried to use is currently unsupported.
Fixes: dolthub/dolt#3512 - 1065: Support describe for views
This pr:- Introduces a new analyze rules to ensure
describe
works on views - Fixes a bug in the return of SHOW COLUMNS
- Makes sure describe works with expressions
- Introduces a new analyze rules to ensure
- 1064: Add more exhaustive cases for int, uint conversion
fix for dolthub/dolt#3632 - 1063: Adding a unique index should throw an error if duplicate rows exist
- 1062: adding enginetests to scriptgen
- 1061: blob enginetests
- 1060: remove suffix check
NOW()
does not have prefix(
but has suffix)
- 1059: Adding EngineTests for polygons with multiple linestrings
- 1044: Type value changes & row type assertions
This PR has two primary goals:- All types now pass their values around in a form most suitable to that type.
- Enforce that the aforementioned types are always passed to integrators, such that integrators do not need to do type validation on their end.
To elaborate on these points, some types already passed around values that were sensible and a best fit for that type, such asMEDIUMINT
returning anint32
. Other types, such asDECIMAL
, passed around strings, which necessitated conversions for integrators to be able to properly persist the values. Not only that, there is currently no guarantee that a row's values each have their best fit type (aBIGINT
can work with anint8
, but it should be able to always expectint64
). To make these guarantees, I'm adding a check at all GMS-integrator junctions that pass asql.Row
and verifying that the value types are exactly what that column expects.
This may have the side effect of changing the output of aSELECT
statement for integrators. As aSELECT
statement simply returns asql.RowIter
, additional logic will be needed to convert all values to their canonical MySQL representation. This can easily be achieved by passing all values throughType.SQL()
before display.
- 1036: single quoted default value in create table statements
Current Dolt default literal values are stores in double quotes, but MySQL does not parse double quoted default literal value inCREATE TABLE
statements for both sql shell and importing dumps.
Fixes dolthub/dolt#3218
Added character set and collation columns forshow create view
- 998: Column Statistics
Gathers the following column stats:- mean
- min
- max
- count
- null count
- distinct count
and stores them in a modifiedinformation_schema.column_statistics
table.
There is now a newsql.Histogram
struct that contains the column stats; the contents and format ofsql.Histogram
come from a mixture of Cockroach and MySQL.
Additionally, there is also a TableStatistics interface, which is meant to be implemented if people want statistics on that table.
vitess
- 199: Added ALTER DATABASE parsing
Adds support for parsingALTER DATABASE
queries. - 197: adding better support for transaction statements
MySQL reference: https://dev.mysql.com/doc/refman/8.0/en/commit.html
adds support forfix for: dolthub/dolt#4436BEGIN [WORK] COMMIT [WORK] [AND [NO] CHAIN] [[NO] RELEASE] ROLLBACK [WORK] [AND [NO] CHAIN] [[NO] RELEASE]
- 196: allow
dual
only for select from statements if used without back-ticks - 195: Allow other options to precede COLLATE in column type defintions
Fixes dolthub/dolt#4403
Previously,COLLATE
had to immediately follow the column type in a column definition. As per the above issue, this is not a required rule in MySQL. The fix involved moving all collation-related options from thecolumn_type
rule to thecolumn_type_options
. This caused a conflict incolumn_type_options
, ascolumn_default
also has aCOLLATE
rule. The conflicting rule resolved to the following:For reference, this is the newcolumn_type_options DEFAULT value_expression COLLATE ID
COLLATE
rule incolumn_type_options
:Given the MySQL expressioncolumn_type_options COLLATE ID
DEFAULT "xyz" COLLATE utf8mb4_bin
, it could match either rule, which caused the conflict.value_expression
is too permissive in this context, and although we filter out invalid expressions in GMS, we need to be more restrictive here in the parser to prevent conflicts. In addition, the above example should put the collation on the column, as it is not possible to add a collation to a default string literal (must use the expression form:DEFAULT ("xyz" COLLATE utf8mb4_bin)
).
To fix this, thecolumn_default
rule was updated to be vastly more restrictive. This also highlighted some tests in GMS that enforce incorrect behavior, but those have been fixed ahead-of-time, and will be incorporated into the bump PR.
NOTE: For some reason, we had tests allowing the use ofutc_timestamp
,utc_date
, etc. as default values without the need for parentheses. This is not allowed in MySQL, so I am unsure as to why we allowed them in the first place. Perhaps because they're similar toCURRENT_TIMESTAMP
, so we just allowed all of them? Regardless, the tests have been removed as they no longer pass (as they should not pass). - 194: adding missing keywords to token.go
I forgot to add some keywords used inROW_FMT
table option, which caused a bats test to fail - 193: adding support for
PARTITION
syntax for table creation
Parsingtable_options
for statements likeCREATE TABLE (column_definitions) table_options
better matches MySQL
Can parsepartition_options
based off of https://dev.mysql.com/doc/refman/8.0/en/create-table.html
Fix for: dolthub/dolt#4358 - 192: add full join parsing
- 191: go/mysql: query.go: Put strings into BindVars as copies, not slices of the recyclable wire packet buffers.
Cherry picks vitessio/vitess#5562. - 190: Adding back support for using reserved keywords unquoted in InsertInto
I previously excludedreserved_sql_id
and only includedsql_id
because I didn't think reserved keywords should be allowed as identifiers in insert into columns without being quoted, but it appears we have tests in Dolt that rely on that (e.g.sql-diff.bats calls: INSERT INTO test (pk, int, string, boolean, float, uint, uuid) values ...
) and I figured if we can support those without the quoting, it seems like a nice feature for customers, so I switched back toreserved_sql_id
.
It is confusing that we refer to identifiers as "reserved" keywords when they don't require backtick quoting though and would be nice to tidy that up in our grammar in the future. - 189: Grammar fixes for customer issue using non-reserved keywords in
insert into
statements
A customer reported an issue on Discord with usinginsert into t(pk, comment) values (1, 1)
. Comment is a keyword, but not a reserved keyword, so this is valid syntax, but our parser was failing on it.
A few changes to tidy up:- Updated tests to trigger the customer reported parsing bug when an (unquoted) keyword is used in a list of columns in an insert into statement.
- renamed
column_name_safe_reserved_keyword
tocolumn_name_safe_keyword
, since the contents are keywords, but are not "reserved" keywords that need to be backtick quoted - fixed a bug with recursion for insert into column names (
ins_column_list
rule) - removed
avg
,count
,sum
,min
,max
from lists of reserved keywords, since they are not reserved in MySQL and can parse correctly in tests without being backtick quoted. - added some more comments to try and make the intent of different grammar rules clearer.
- 188: Support Varchar(MAX)
related to dolthub/dolt#2261 - 187: CTEs on unions parse
- 186: add implicit
from dual
toselect x
in recursive cte statements - 185: add support for
grant all privileges to <user>
- 184: Added parse tests for all functions
- 183: resolving unreserved keyword inconsistencies with MySQL
Tests have been updated to document the inconsistencies below.
Changes:- unreserved many keywords
- allow reserved keywords to be used as identifiers when they are qualified
Words that parse in dolt, but don't in MySQL: dual
minute_second
Words that don't parse in dolt (only in quries that use these in where conditions), but do in MySQL:escape
next
off
sql_cache
sql_no_cache
Fix for: dolthub/dolt#3977
- 182: Fix error in REPEAT function
- 180: Support for common table expressions in INSERT, UPDATE, DELETE
- 179: allow unescaped reserved keywords when prefixed with table name, and add more reserved keywords
Changes:- fixed misidentified keywords as column safe
- fixed incorrect tests
- every reserved keyword in MySQL is reserved in our grammar now
- exceptions are
DUAL
andMINUTE_SECOND
Note: - we are still missing many unreserved keywords
- some of our reserved keywords are unreserved in MySQL
Fix for: dolthub/dolt#3979
- 178: Added BINARY to match deprecated functionality
Fixes dolthub/dolt#4019 - 177: Removed unused methods
Follow up from #175 - 176: /{.github,go.mod}: bump go to 1.19
- 175: Remove all package-level flags
Summary
This is my attempt to perform a minimally invasive purge of all package-level flags.
I would appreciate some input on:- whether any of the upstream repos that use this vitess fork will be impacted by these changes
- whether this change should be more invasive (e.g., if you don't use the
AuthServerClientCert
it may be better to remove it entirely rather than just prune the flags)
Motivation
dolthub/vitess#174Testing
I've removed all package-level flags:The existing tests still pass:ryanpbrewster@argon:~/p/g/dolthub-vitess$ rg "flag\.\w" go/mysql/auth_server_clientcert.go 35: if flag.CommandLine.Lookup("mysql_server_ssl_ca").Value.String() == "" {
ryanpbrewster@argon:~/p/g/dolthub-vitess$ go test go/... ok go/ast (cached) ok go/build (cached) ok go/build/constraint (cached) ok go/constant (cached) ok go/doc (cached) ok go/format (cached) ok go/importer (cached) ok go/internal/gccgoimporter (cached) ok go/internal/gcimporter (cached) ok go/internal/srcimporter (cached) ? go/internal/typeparams [no test files] ok go/parser (cached) ok go/printer (cached) ok go/scanner (cached) ok go/token (cached) ok go/types (cached)
- 173: parse json_table syntax
TODO:NESTED
FOR ORDINALITY
ON EMPTY
ON ERROR
- 172: Changes for adding new auth method
- 171: Add rows parsing for LOAD DATA IGNORE and address flaky tls test
This PR does two things- Addresses dolthub/dolt#2548 by introducing new conditions in sql.y
- Addresses flaky tls server tests by updating tests to use Tls13 and adding an additional error check
- 170: Adding support for
use db/branch
syntax
Currently,use <database>/<branch>
works only from the mysql client, because it parses theuse
command on the client side and sends it to the server as a different command (not a query). This means this syntax does not work withdolt sql
or with other clients that don't parse this command client side and translate it (e.g. go-sql-driver/mysql).
This PR adds support to our parser for this syntax, so thatuse <database>/<branch>
can be used consistently. - 169: Adding support for specifying a primary key name
MySQL allows you to specify an optional name following a primary key definition when creating or altering a table. This name is always ignored though – all primary keys in MySQL are always namedPRIMARY
.
Fixes: dolthub/dolt#3674 - 168: Remove glog dependency, reduce log levels, remove all logutilpb usage and support.
Closed Issues
- 1357:
_example/main.go
raises panic - 1319: Filter pushdown doesn't show in explain plan even though pushdown was successful
- 1331: CreateTable: DEFAULT CHARACTER SET missing
- 945: Client hangs if go-mysql-server takes too long to return query results
- 1164: Support the DO statement
- 1162: Support the EXTRACT function
- 1128: superusers should be hidden
- 1289: Some queries using
HAVING
can't be executed withoutGROUPBY
- 1314: Cannot connect to database to query MemoryTable
- 750: Copy the mysql behaviour regarding reserved keywords
- 738: mysql
SERIAL
alias unsupported - 644:
SHOW TABLE CREATE
PRIMARY KEY
ordering - 611: Bad error message for type conversion error on insert
- 610: EXISTS expects a single column in subquery result, should support any number
- 592: Creating a table without a current DB give confusing error message
- 562: Intermittent DATA RACE error when creating MySQL Servers
- 561: Support for generated columns
- 526: Cleanup PrimaryKey error vs UniqueKey error vs DuplicateEntry
- 513: Need tests of PREPARE
- 498: Inserting into UINT32 column can cause error with leading 0
- 479: Cannot use expression index in indexed join
- 476: Can a Table implement sql.IndexAddressableTable without implementing sql.DriverIndexableTable?
- 383: Unbound variables in subqueries
- 463: anyone writing the rocksdb backend?
- 1300: Inserts for some values get dropped when accessed via a trigger for inserts.
- 1144: Is it possible to run raw sql queries to mock the database?
- 525: Column alias in where clause should be an error
- 1268: Analyzing self-referential foreign keys triggering an infinite loop
- 1232: Case-sensitive comparison of string literal
- 1219: How can i change the log level?
- 1096: go 1.17 support
- 1218: The Decimal type treats the same value with different input format differently
- 1153: proposal: Refactor Indexed Table Access
- 1139: Error with passing time.Time as a field for saving
- 1161: ENUM values are always lowercased regardless of actual case
- 1091: expose
JoinComplexityLimit
system variable - 1089: I proxies mysql-8.0.23 to go-mysql-server,some problem confuse me
- 527: Gap in coverage for DEFAULT keyword in inserts
- 1086: Support Describe for information_schema tables
- 787: Add support for describing views
- 1055: REGEXP '^[-]?[0-9]+$' fails on int64 column.
- 174: Remove package-level Vitess flags