Skip to content

Nov 2022 Release

Compare
Choose a tag to compare
@zachmu zachmu released this 01 Nov 17:48
· 3975 commits to main since this release
621dc30

This is a periodic rollup release. It contains many features, bug fixes, and improvements.

The core API has still not stabilized and will not be guaranteed until 1.0.

Merged PRs

go-mysql-server

  • 1364: Updated and expanded engine examples
    The README.md had an updated example, but the actual _example/main.go file did not. I've expanded the example file a bit to include information on setting up users, and slightly simplified the README.md example. I've also added tests for everything, so that if anything breaks, we'll know we need to update both the example file and the README.md portion.
  • 1363: Convert errors during ComPrepare to SQLError
    We were already converting errors in ComStmtExecute, ComMultiQuery, and ComQuery to SQLError so that the correct error codes would be sent to clients. This change adds that support to ComPrepare, too.
    Added a unit test for that case and took the opportunity to simplify the interface for CastSQLError a little bit.
    This change helps get Prisma support a little further along (dolthub/dolt#4511), but it doesn't look like it fully resolves everything Prisma needs to work with Dolt.
  • 1361: Fixed collation check on foreign key columns
  • 1358: fix example package
    close #1357
    This fixes runtime panic raised by example app in /_example.
    I checked SQL client can obtain response in my local machine.
    ~/go-mysql-server/_example$ go build
    ~/go-mysql-server/_example$ ./_example
    $ mysql --host=127.0.0.1 --port=3306 --database=mydb -u root
    Reading table information for completion of table and column names
    You can turn off this feature to get a quicker startup with -A
    Welcome to the MySQL monitor.  Commands end with ; or \g.
    Your MySQL connection id is 1
    Server version: 5.7.9-Vitess
    Copyright (c) 2000, 2022, Oracle and/or its affiliates.
    Oracle is a registered trademark of Oracle Corporation and/or its
    affiliates. Other names may be trademarks of their respective
    owners.
    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
    mysql> select * from mytable;
    +----------+-------------------+-------------------------------+---------------------+
    | name     | email             | phone_numbers                 | created_at          |
    +----------+-------------------+-------------------------------+---------------------+
    | Evil Bob | evilbob@gmail.com | ["555-666-555","666-666-666"] | 2018-04-18 09:41:13 |
    | Jane Doe | jane@doe.com      | []                            | 2018-04-18 09:41:13 |
    | John Doe | john@doe.com      | ["555-555-555"]               | 2018-04-18 09:41:13 |
    | John Doe | johnalt@doe.com   | []                            | 2018-04-18 09:41:13 |
    +----------+-------------------+-------------------------------+---------------------+
    4 rows in set (0.00 sec)
  • 1356: add tests for sql type Zero() functions
    Implementers of GMS might expect similar values to be returned by Convert() and Zero(). For decimal and enum implementations this was not the case and has been fixed.
  • 1355: Allow any select statement for CREATE TABLE AS SELECT ...
    Also fixes a semantics bug in the schema produced by some such statements.
  • 1354: Bug fix for pushdownSort handling missing cols qualified with a table name
    The Dolt bump for my GMS change to fix an alias issue in sort node pushdown triggered an error with matching missing column names now that we can include qualified column names.
    This PR adds a repro for that case to GMS and fixes the issue by ensuring we create a UnresolvedQualifiedColumn when the missing column is qualified with a table name. I've run Dolt tests locally and confirmed there shouldn't be any other test failures in the next bump.
  • 1353: Fix panic for show keys from information_schema.columns
    Does this by removing the special handling of information_schema.columns as a separate node type, treats it just like any other ResolvedTable.
    In the process, effectively rewrote how we handle column default values by 1) moving most logic to happen in the OnceBefore batch, rather than default rules, and 2) splitting it up into multiple passes that each have a single purpose. I found in the process of 1) that the previous rules had a lot of side effects and unintended ordering constraints, so introduced new rules and tweaked others to eliminate those.
  • 1352: Update sort node to use alias reference when it will resolve missing columns
    Fixes: dolthub/dolt#3016
    Other changes:
    • Refactors the existing OrderBy/GroupBy tests into a ScriptTest.
    • Introduces a new interface, sql.Projector, that unites GroupBy, Window, and Project if a caller just needs to get the projected expressions.
  • 1351: Add support for database collations
    This allows setting the database collation, such that all newly created tables within a database (that do not explicitly set their collation) will inherit the database collation.
    Builds on dolthub/vitess#199
  • 1350: Subquery row iter fix field indexes
    Recent changes to subquery scope visibility use the scope to communicate column definition availability; i.e., we do not pass the scope into subqueries we have determined to not depend on the outer scope, marking the same scope as cacheable. This analysis change needs a corresponding runtime change to indicate whether the scope row is expected at execution time.
  • 1349: Skip process tracking when prepreparing queries
    When we prepare a statement, the QueryProcess node we create is bound to the current context's PID. This is not the same PID as the context that will execute a statement created from that template, which results in ProcessList metadata not being properly cleaned up after a query has finished processing.
    Fixes: dolthub/dolt#4601
    I couldn't find a great way to test this in the GMS package, but I'm working on a test in dolt that I'll link to shortly.
  • 1348: fix visibility for on duplicate key update
    Currently, we treat plan.InsertInto.Source independently from Destination, and is not considered one of InsertInto's children. It is evaluated much later in the analysis process in the rule resolveInsertRules in a similar way as subqueries (we recurse the analyzer on it). This is problematic if we want to reference tables from Source.
    In this PR, I resolve the tables for InsertInto.Source and added extra logic to correctly index those tables' columns.
    There is a special case for on duplicate key update <expr> in that the LHS of the expr can only see Insert.Destination while the RHS can see Insert.Destination and Insert.Source.
    Partial fix for: dolthub/dolt#4562
    Note: This does not work for CTEs
    This is only kind of a fix for the issue. The right way to fix this is probably to completely resolve InsertInto.Source before doing anything else, but I wasn't able to get that working yet.
  • 1343: allow adding new primary key to table with > 1 row iff it has auto_increment
    fix for: dolthub/dolt#4581
    tests in dolt because memory.Table doesn't implement RewriteableTable
    dolthub/dolt#4593
  • 1341: fix comparison for geometry types
    fix for: dolthub/dolt#3451
    Maybe all geometry comparisons should just default to their EWKB formats; pretty confident this is what MySQL does.
  • 1339: Update DateAdd/DateSub to return correct types.
    This change fixes dolthub/dolt#4376
    Previous implementation was hard-coded to return sql.Date, but now we are following the MySQL standard and return a type based on the inputs.
    The tests in the repo are verifying that the correct data is returned, but we're not testing the actual SQL data that is received, so none of our tests are catching this case yet. We should open a new work item to start testing the actual SQL that's being returned by dolt. For this bug, testing was performed using a local version of dolt with these changes.
    The failing query from the original bug is now working:
    SELECT NOW(), DATE_ADD(NOW(), INTERVAL 14 DAY), dolt_version();
    +----------------------------+----------------------------------+----------------+
    | NOW()                      | DATE_ADD(NOW(), INTERVAL 14 DAY) | dolt_version() |
    +----------------------------+----------------------------------+----------------+
    | 2022-10-18 18:53:24.406345 | 2022-11-01 18:53:24.406345       | 0.50.4         |
    +----------------------------+----------------------------------+----------------+
    1 row in set (0.02 sec)
    
  • 1337: add support for GeometryCollection pt. 3
    Part 10 of fix for: dolthub/dolt#3638
    Changes:
    • these functions now work with GeometryCollection
    • st_srid
    • st_swap
  • 1336: can't alter blob, json, and geometry columns when other defaults are defined
    fix for: dolthub/dolt#4543
    Note: the syntax here is actually invalid in MySQL?
  • 1335: add support for GeometryCollection pt. 2
    Part 9 of fix for: dolthub/dolt#3638
    Changes:
    • refactor a bit of wkt code
    • added missing implementation and tests for wkt for multipolygon
    • these functions now work with GeometryCollection
    • st_geometrycollectionfromtext
    • st_geomcollfromtext
    • st_geomcollfromtxt
    • st_geomfromtext
    • st_astext
    • st_aswkt
    • st_dimension
  • 1334: multipolygon support
  • 1333: prevent creating views with same name as existing table
    also fix for: dolthub/dolt#4549
  • 1332: add support for GeometryCollection pt. 1
    Part 8 of fix for: dolthub/dolt#3638
    Changes:
    • have the deserialize data and write data methods return their own byte counts, to more easily track where we are in main deserailizer
    • adding GeometryCollection type
    • these functions now work with GeometryCollection
    • GeometryCollection
    • GeomCollection
    • st_aswkb
    • st_geometrycollectionfromwkb
    • st_geomcollfromwkb
  • 1330: add support for MultiPolygon pt. 3
    Part 7 of fix for: dolthub/dolt#3638
    TODO: rebase to main, left as a merge to james/mpoly2 for better readability
    Changes:
    • these functions now work with MultiPolygon
    • ST_ASGEOJSON
    • ST_GEOMFROMGEOJSON
    • lots of tests
  • 1329: explain output includes FilteredTable
    We already print filtered table in the debug string for IndexedTableAccess, this adds it to regular explain and ResolvedTable.
  • 1327: add support for MultiPolygon pt. 2
    Part 6 of fix for: dolthub/dolt#3638
    TODO: rebase to main, left as a merge to james/mpoly for better readability
    Changes:
    • these functions now work with MultiPolygon
    • ST_ASWKT
    • ST_GEOMFROMTEXT
    • added engine tests
  • 1324: add support for MultiPolygon pt. 1
    Part 5 of fix for: dolthub/dolt#3638
    TODO: rebase to main, left as a merge to james/mline3 for better readability
    Changes:
    • contains small fixes to multilinestring comments and allocation length
    • these functions now work with MultiPolygon
    • MULTIPOLYGON
    • ST_ASWKB
    • ST_GEOMFROMWKB
    • ST_MPOLYFROMWKB
    • ST_MULTIPOLYGEOMFROMWKB
    • added engine tests
  • 1323: add support for MultiLineString pt. 3
    Part 4 of fix for: dolthub/dolt#3638
    TODO: rebase to main, left as a merge to james/mline2 for better readability
    Changes:
    • these functions now work with MultiLineString
    • ST_GEOMFROMGEOJSON
    • ST_ASGEOJSON
    • added engine tests
  • 1322: add support for MultiLineString pt. 2
    Part 3 of fix for: dolthub/dolt#3638
    TODO: rebase to main, left as a merge to james/mline for better readability
    Changes:
    • small fix to polygon test
    • these functions now work with MultiLineString
    • ST_SWAP
    • ST_SRID
    • ST_DIMENSION
    • MULTILINESTRING
  • 1321: add support for MultiLineString pt. 1
    Part 2 of fix for: dolthub/dolt#3638
    Changes:
    • small copy-pasta comment fixes
    • adding multilinestring type and structs
    • MultiLineString support for these sql functions
    • ST_MULTILINESTRINGFROMWKB
    • ST_MULTILINESTRINGFROMTEXT
    • ST_MLINESTRINGFROMWKB
    • ST_MLINESTRINGFROMTEXT
    • ST_GEOMFROMWKB
    • ST_ASWKB
    • ST_ASTEXT
  • 1320: rewrite table on unique key creation
    Fix for: dolthub/dolt#4420
    In-memory tables don't support unique keys, so tests are in dolt:
    dolthub/dolt#4512
  • 1316: Slight port correction on README example
  • 1315: Updated README example to fit current usage
    We haven't used []string{""} for JSON values in a long time, the README definitely needed an update.
  • 1311: internal/sockstate: Fix sockets ending up in blocking mode when we monitor them for disconnects.
    As documented in os, calling *os.File.Fd() will put the socket into blocking mode, makes SetDeadline methods stop working, etc. This syscall to SetNonblocking restores desired functionality.
    This was discovered when testing clustering control plane operations in Dolt, which rely on the timely ability to terminate all client connections by calling Close() on them.
    We were able to reproduce the issue on macOS by doing the same File() behavior there, despite not having an implementation to actually use the fd for external monitoring of the connection state. To keep as much behavioral parity going forward, I left the fd translation in, since we've observed that it can radically change behavior.
  • 1310: Derived table outer scope visibility
    Add support for outer scope visibility for derived tables, as introduced in MySQL 8.0.14.
    References:
  • 1309: Adding Support for MULTIPOINT and some refactoring
    Part 1 of fix for: dolthub/dolt#3638
    It looks like PRs for this are gonna get pretty big, so I'm going to split them up for easier reviewing.
    Changes:
    • greatly improved code reuse for spatial types
    • refactored code to be more readable
    • some functions were misnamed FROMWKT, when they should've been FROMTEXT
    • ST_POINTFROMWKT -> ST_POINTFROMTEXT
    • ST_LINEFROMWKT -> ST_LINEFROMTEXT
    • ST_POLYFROMWKT -> ST_POLYFROMTEXT
    • missing function aliases
    • ST_LINESTRINGFROMTEXT
    • ST_LINESTRINGFROMWKB
    • ST_POLYGONFROMTEXT
    • ST_POLYGONFROMWKB
    • ST_GEOMETRYFROMTEXT
    • ST_GEOMETRYFROMWKB
    • these functions now work with MULTIPOINT
    • ST_SWAP
    • ST_SRID
    • ST_ASWKB
    • ST_MULTIPOINTFROMWKB
    • ST_MULTIPOINTFROMTEXT
    • ST_MPOINTFROMWKB
    • ST_MPOINTFROMTEXT
    • ST_GEOMFROMWKB
    • ST_ASGEOJSON
    • ST_FROMGEOJSON
    • ST_DIMENSION
    • fixed bug where x and y geojson values were being swapped
  • 1308: parallelize static lookups
  • 1307: allow having node access to all tables in itself
    HAVING clause can reference column that is not in its select result or group by result (there can be no group by clause). These column references are from tables in HAVING node that it is not in its immediate children nodes.
  • 1304: integration plan regressions
    This PR addresses two problems:
    1. Optimization rules that depend on pattern matching can fail to trigger with exchange nodes, which we failed to account for in testing. This fixes a small indexed join bug and runs integration query plans with parallelism = 2.
    2. IndexedInSubqueryFilter should be prevented when the static side of the join takes a dependency on the index lookup. The check is now looser, and opaque nodes cannot disallow the transform. When we introduce subqueries that do reference outer scopes, we will need to be more careful and do deeper validation.
  • 1303: adding support for st_area, st_perimeter, st_length
    Only the Cartesian portion of these function is working, geodetic calculations are quite a bit harder.
    I just throw an error for any unsupported functionality.
    st_area is calculated using shoelace formula.
    Basic idea is to slice polygon into a bunch of triangles and sum up their areas. The function is not defined for polygons that intersect themselves, but from my testing it seems like MySQL returns the same values for these edge cases.
    st_perimeter is a NOT supported in MySQL, but it is in PostGIS (a postgres plugin for spatial types): https://postgis.net/docs/ST_Perimeter.html
    Fix for: dolthub/dolt#4451
  • 1301: Ensuring read-only check is executed for prepared and non-prepared statements
    Quick fix for: dolthub/dolt#4434
    Tested locally with a python script to verify that prepared statements to read-only dbs are now correctly blocked. After getting this fix out, we will follow up with more formal, automated test to cover this.
  • 1297: Add query plan integration tests
    1. New integration test queries
    2. Fix some join planning bugs related to i) column casing and ii) TableWrappers
  • 1296: server: context.go: Expose *SessionManager.KillConnection.
  • 1293: no column reference can be made on dual table
    Depends on dolthub/vitess#196
    No GetField reference can be made on dual table. To differentiate between dual table and `dual` table(which can be created).
    Dual table is constructed as ResolvedTable with empty table name and a single column with empty name in the parser.
    Any column reference that becomes deferredColumn is replaced into alias that is present in projectedAliases in reorderProjection rule.
    Any GetField column reference is not allowed on dual table.
  • 1291: allow adding column to table with autoincrement column
    fix for: dolthub/dolt#2587
  • 1290: New join planning
    Join planning has many high level components:
    • table order
    • tree shape (left deep, right deep, bushy, etc)
    • indexes for lookup joins
    • filter arrangement
    • choosing between logically equivalent plans (costing)
      This PR does not fix all of these components, but it does provides a structure for gracefully increasing the complexity of each.
      The new memo data structure is a memory efficient IR for SQL queries that 1) groups equivalent plans by their output schemas, and 2) generalizes child relationships. For example, here is an example memo for a join query:
    select * from ab
    inner join uv on a = u
    full join pq on a = p
    memo:
    ├── G1: (tablescan: ab)
    ├── G2: (tablescan: uv)
    ├── G3: (tablescan: pq)
    ├── G4: (innerJoin 2 1) (innerJoin 1 2)
    └── G5: (leftJoin 4 3)
    
    "Relational expression" each have their own expression group, within which several physical/concrete implementation can reside. An expression group is defined in terms of expression group relationships, not physical implementations, because the output from any relational expression within a group will be the same. Groups are usually keyed by a hash of the operator type and the child group keys, or in the case of scalar expression by the literal parameter values (this last point will be useful for prepared statements).
    All valid reorderings of the join tree are added to the memo by joinOrderBuilder.
    Applying indexes is a separate step, where we consider an indexed version for every join implementation where the outer (right) table is an indexable data source. Adding indexed plans to the query above yields:
    memo:
    ├── G1: (tablescan: ab)
    ├── G2: (tablescan: uv)
    ├── G3: (tablescan: pq)
    ├── G4: (indexedJoin 1 2) (indexedJoin 2 1) (innerJoin 2 1) (innerJoin 1 2)
    └── G5: (indexedJoin 4 3) (leftJoin 4 3)
    
    Hashed joins are added in the same manner:
    memo:
    ├── G1: (tablescan: ab)
    ├── G2: (tablescan: uv)
    ├── G3: (tablescan: pq)
    ├── G4: (hashJoin 1 2) (hashJoin 2 1) (indexedJoin 1 2) (indexedJoin 2 1) (innerJoin 2 1) (innerJoin 1 2)
    └── G5: (hashJoin 4 3) (indexedJoin 4 3) (leftJoin 4 3)
    
    This memo is missing several components that would make it more useful, including other node and expression types. For example, filter pushdown would be much easier in the memo. You could imagine extending the memo above with sql.Expression tree memo groups (scalar relations). Canonical (simplified) expression groups only requiring one allocation, and the scalar expression group caches properties like table and column dependencies similar to relProp column and table dependencies. We would need to build a memo prior to join planning to use filters for join planning, and run other transformation rules on the memo. I'd expect the memo to absorb rules over time, which will simplify transform logic and improve analyzer memory consumption. I added some starters to make this easier in the future, like codegen'd memo expressions, and a move towards visitor interfaces for IR transformations.
    A fully exhausted memo tree is converted back into a physical plan first through a costing process. Costing builds the fastest implementation for child group expressions bottom-up, using only the best child implementations for costing parent expressions and eventually the root node.
    The coster is designed to use histogram statistics to pick the fastest join plan. We only provide a subset of index metadata and table sizes at this moment. Additionally, costing should apply to generic SQL nodes and expressions, not just join trees and join leaves.
    Other notes:
    • we miss join plans that result from transitive predicates (ex: ab.a = uv.u + uv.u = xy.x => ab.a = xy.x).
    • we miss join plan optimizations resulting from table functional dependencies (ex: eliminate redundant filters if we already join on a a primary key), and null-allowing joins (ex: IS NULL).
    • we could more aggressively apply filters to join trees.
    • converting from the memo to an execution plan is fraught for the same reasons it was before this PR, but tree building and fixup is condensed in exec_builder, which i'd expect to expand to include more exprGroup types in the future.
    • there is a hack for fixing up field indexes in join trees not considered for reordering.
    • right joins are converted to left joins after expandStars in transposeRightJoins.
    • non-left deep trees necessitate passing parent rows down the right side of join trees, which were previously only table sources (left deep).
    • joinOrder hinting is rewritten to more easily fit within exprGroup costing.
    • I started to use bitmaps to track expression group column dependencies, which should also be used for filter applicability.
    • indexedJoinIter is duplicate and should be deprecated, but i had a hard time quickly removing it.
    • every join node is condensed into *plan.JoinNode and differentiated based on its join type, which lets me bucket different join types more easily (refer to JoinType type methods).
    • applyHashLookups should be deprecated, given that we will preemptively build hash joins. If hash joins do not select the lookup, we should either not cache the subquery or improve the coster to make better decisions.
    • there are probably bugs where we apply hash joins to non-cacheable subquery scopes for derived tables. Need more testing.
      Additional refs:
    • https://www.researchgate.net/publication/262216932_On_the_correct_and_complete_enumeration_of_the_core_search_space
    • https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/opt/xform/join_order_builder.go.
    • https://15721.courses.cs.cmu.edu/spring2019/papers/22-optimizer1/xu-columbia-thesis1998.pdf
  • 1288: add ValidateSession() to Session interface
    dolthub/dolt#4395 depends on this PR
  • 1281: Alias resolution updates
    This change moves our alias identification and column name qualification code closer to MySQL's behavior. The major changes are:
    • identify available alias, table, and column names on a per-scope level (instead of the previous "nestingLevel" that was based off of the node depth of the execution plan tree)
    • supply additional context to the qualifyExpression method so that the containing node can be used to switch on different alias visibility rules
      There are still more nuanced edge cases in MySQL's alias behavior that we should continue working on matching (e.g. dolthub/dolt#4535, dolthub/dolt#4537), but this change moves us a nice step forward.
      Dolt CI Tests:
    • dolthub/dolt#4410
      Fixes:
    • #525
    • dolthub/dolt#4344
  • 1280: More join types
    Add FullOuterJoin, SemiJoin, and AntiJoin.
    None of these new join nodes are safe for join ordering transformations yet. They are explicitly excluded from join planning.
    The getField indexes for these three nodes' join conditions deserve more consideration. I excluded them from auto-fixup after manually correcting join condition get fields for the appropriate schemas.
    FullOuterJoin uses a union distinct execution operator, which is correct but a lot slower than a merge join-esque operator.
    SemiJoin and AntiJoin rearrange subquery expression scopes. I separate resolve and finalizeSubqueryExpressions to perform decorrelation before predicate pushdown (where we were panicking on FixUpExpressions) and join ordering (we want to decorrelate scopes before join planning).
    Other:
    • query plan tests added for exist hoisting edge cases i did not catch on first pass
    • fixed bug with CTE stars
  • 1279: Andy/mysql validator
  • 1278: should not be returning hex on the wire for geometry types
    Fix for: dolthub/dolt#4390
    For improved display purposes, I changed the SQL method to convert the raw binary of spatial types to their hex equivalent (#1068).
    DBeaver expects the binary in WKB format to generate their maps, so our SQL method must return them in binary
  • 1276: Apply sql_select_limit to first scope only
    re: dolthub/dolt#4391 and dolthub/dolt#4353
  • 1275: Throw error in alter table set default if default fails rule(s)
    This PR makes alter column set default error when its default expression fails the default expression rules. Previously the DDL succeeded and when an insert or any other query was performed the default would cause a continuous error.
    alter table t alter column col1 set default '{\"bye\":1}'
    -- errors with: TEXT, BLOB, GEOMETRY, and JSON types may only have expression default values
  • 1273: fix test ordering
    some tests on dolt were failing for JSON_TABLES, due to abiguous ORDER BY
    Companion PR: dolthub/dolt#4355
  • 1272: Removed exponential time complexity for foreign key analysis
    This fixes #1268
    Foreign key analysis created acyclical trees that were traversed during query execution to emulate cascade operations. This meant that cyclical foreign keys were converted to an acyclical tree. Normally this isn't possible as cyclical trees are infinitely traversable, but MySQL has a depth limit of 15, which allowed us to materialize an acyclic tree with a maximum height of 15 nodes. This, however, lead to trees with an exponential number of nodes: roughly (number_of_fks)¹⁵ × 1.5 nodes in the tree. With just 3 foreign keys, we'd get a tree with roughly 22 million nodes, which would take forever to process.
    This PR completely changes the analysis step to now generate cyclical trees. In addition, depth checks are now properly implemented (during query execution rather than during analysis), being represented by a returned error once the depth limit has been reached. Interestingly, MySQL is supposed to process up to 15 operations (returning an error on the 16th), but cyclical foreign keys will error on the 15th operation. I think this is a bug in MySQL, but nonetheless the behavior has been duplicated here.
    I also updated the timestamp_test.go file to grab an unused port. This prevents test failures due to requesting an already-in-use port. Not related to this PR in particular, but it was annoying to deal with so I fixed it.
  • 1271: enginetest: Add repro enginetest for unique key error
    repro for Dolt bug #4372
  • 1270: No new context for every json SQL convert
    perf here: dolthub/dolt#4364
  • 1269: cleanup top-level directories
  • 1261: use json_table in crossjoin that references table column on left
    Fix for: dolthub/dolt#4340
    To use a column as the expr argument to json_table, the column must be part of a CROSS JOIN and has to be on the left side of a cross join (reference). Oddly, you cannot reference the column directly nor through a subquery (stackoverflow reference).
    As a result, this is partially an aliasing issue, which I do address in this PR.
    Added special column resolving rules for JSONTable for when its under a Join and subquery expression
  • 1259: Add sponsor button to GitHub
  • 1258: Bug Fix: change unix_timestamp(<expr>) to return 0 when it can't convert expr to a date
    MySQL's implementation of unix_timestamp(<expr>) returns 0 when the expression can't be converted to a date and logs a warning. This PR changes Dolt to have the same behavior. It also includes a bug fix for ctx.ClearWarnings to correctly clear warnings.
    mysql> select unix_timestamp(1577995200);
    +----------------------------+
    | unix_timestamp(1577995200) |
    +----------------------------+
    |                          0 |
    +----------------------------+
    1 row in set, 1 warning (0.00 sec)
    mysql> select unix_timestamp("jason");
    +-------------------------+
    | unix_timestamp("jason") |
    +-------------------------+
    |                0.000000 |
    +-------------------------+
    1 row in set, 1 warning (0.00 sec)
    mysql> show warnings;
    +---------+------+-----------------------------------+
    | Level   | Code | Message                           |
    +---------+------+-----------------------------------+
    | Warning | 1292 | Incorrect datetime value: 'jason' |
    +---------+------+-----------------------------------+
    1 row in set (0.00 sec)
  • 1257: allow compare decimals with different precision and scale
  • 1256: add check for chan is closed in Listener.Close()
    dolthub/dolt#4301 depends on this PR.
    A test for it is added in dolthub/dolt#4301
  • 1255: fix numberTypeImpl.SQL() to convert from any type
    The SQL method for numberTypeImpl expects v to already be the right type for the baseType, this is not always the case.
    CASE statements have an error guard for this behavior, which shouldn't be there.
    There are likely other types that have SQL() methods that don't work for all types.
    Changes:
    • added type switch for types that are unconvertable to number
    • remove analyzer rule 'validateCaseResultTypesId`
      Closes dolthub/dolt#4306
  • 1253: Support Varchar(MAX)
    related to dolthub/dolt#2261
  • 1252: Reimplemented LIKE expressions to add collation support
    Previously, we were converting the patterns from LIKE expressions into patterns that the standard regex parser would understand. Said parser does not support collations (as implemented in MySQL), therefore this is a custom-developed pattern matcher for LIKE expressions that fully supports collations.
    This is ONLY for patterns on the LIKE expression. We still need to implement full regex parsing to support REGEXP_LIKE.. Once we have REGEXP_LIKE completed, we may revert this back to the regex conversion process, but that will be no time soon.
    This also fixes dolthub/dolt#4303
  • 1250: stripNodes in subquery block apply_hash_lookup optimization
    A HashLookup requires two rules, 1) converting a JOIN -> SUBQUERY_ALIAS => JOIN => CACHED_RESULT(SQ), 2) converting the CACHED_RESULT into a HASHED_LOOKUP.
    Prevent StripRowNode from breaking the pattern matcher for the first step. Other nodes can still interrupt the rule, but we have a test now at least.
  • 1249: UpdateJoin bug fixes
    Bug fixes for dolthub/dolt#4288
    • qualifying check constraint expressions when multiple tables are being updated
    • support finding UpdatableTables through ProcessIndexableTable and ProcessTable nodes
    • erroring out when a table updater isn't available for a SetField expression, instead of silently dropping it
      This does not address another UpdateJoin bug I just found while testing: dolthub/dolt#4304. Mentioning that one here so we have this context connected when we go to fix it.
      I verified that Dolt's enginetest suite runs correctly with these changes.
  • 1248: feat:(sql function) support last_insert_id with expr
    mysql support parameter for LAST_INSERT_ID(expr), it is useful in upsert query when we want to get the updated recored id
    ref: https://dev.mysql.com/doc/refman/8.0/en/information-functions.html#function_last-insert-id
  • 1246: adding LEAD() window function
    Partial fix for: dolthub/dolt#4260
  • 1245: adding LAST_VALUE() window function
    Partial fix for: dolthub/dolt#4260
  • 1244: use decimal type for arithmetic functions
    fixes dolthub/dolt#4269
    sum will be either float64 or decimal.Decimal type, but return type is decided on the type of the child expression of sum expression. This causes some test results to be modified into int from float.
    the test results in decimal.Decimal type is hard to get the exact matching using its convert method, so result is checked in string format by using StringFixed() method of the decimal.Decimal result value.
  • 1240: Additional fixes to collation coercion
    Coercion rules specify that Unicode and non-Unicode characters may the Unicode collation applied in the event that they're being compared, concatenated, etc. I'm not exactly sure how to determine what a "Unicode" collation is, so I'm assuming that all collations that use more than one byte are Unicode. I can almost guarantee that this is incorrect, however it's a better approximation of behavior than not considering it at all.
  • 1239: More point range optimizations
    SQL range now exposes whether a range is a point lookup, so the integrator does not have to check for itself.
  • 1238: Added limited support for collation coercibility
    This adds a sort of "fix" for the first issue identified in #1232. This is basically a hack for collation coercibility, which is a significant undertaking. This should suffice in the meantime.
  • 1231: Return errors instead of asserting inside runQueryPrepared
    Test PR:
  • 1230: fix applying bindvars to filters and table functions
    Changes:
    • Allow bindvars to be applied to FilteredTables using a new DeferredFilteredTable node
    • No longer pushdown filters in post prepared analysis
      Test PR:
    • dolthub/dolt#4252
  • 1229: fix filters being dropped from prepareds on dolt_diff tables
    TransformProjections was not transferring filters from certain prepared plans that did not contain projections.
    Fix for
  • 1228: Resolve CTEs ignored recursive changes
    fixes regression where we do not propagate nested changes in cte resolve
  • 1227: Fixed collation error message, added more collations
    Fixes dolthub/dolt#4236, in addition to adding two new collations:
    • utf8mb3_unicode_ci
    • utf8mb4_unicode_520_ci
  • 1226: recursive ctes resolve limit filter sort
    General outline:
    1. Union nodes absorb LIMIT, ORDERBY, and DISTINCT logic
    2. Recursive CTEs embed a UNION node, and the analyzer treats them the same for the most part
    3. Rewrite resolve_cte to bridge the gap between WITH -> rCTE -> UNION nodes
      Re: (1), Union is basically collapsed into a single node. I think all scopes should probably look like this longterm for a few reasons. Adding an attribute grammar for resolving tables and columns becomes a lot easier. Transformation rules are simpler because they do not have to see through the tree within a scope. Logical nodes can be separate from physical nodes. Moving from a tree to a hash map of logical nodes is the main way of doing costed planning.
      Re: (2), the easiest way to support arbitrary left/right nodes in recursive ctes is to try to run the regular analyzer on them. Notably, MySQL prevents many types of top-level nodes in rCTEs, which we have not prevented yet. For example, GROUPYBY, ORDERBY, LIMIT are not permitted in the middle of an rCTE (I think because the results are ambiguous). We do not error on those right now, but maybe should in a follow-up PR.
      Re: (3), rCTEs are structurally parsed as Unions, and can be resolved like unions in most of the analyzer, but we need several manual steps to address the ways rCTEs are functionally different. rCTEs should be separated into recursive and non-recursive chains of UNIONs (refer to Aaron's comment). If a rCTE has no recursive portions, we convert to a regular CTE. We manually resolve references to the recursive binding, converting them from UnresolvedTable to RecursiveTable. There is a chicken and egg problem where we have to resolve the left-side of the union before we can construct and apply a schema'd RecursiveTable to the right.
  • 1225: Hoist CTEs above Filter, Limit, OrderBy, and Having
  • 1224: SET NAMES collation fix, results have proper charset
    Fixes two issues. SET NAMES would cause all statements that feature a COLLATE expression to error if the character set and collation did not match, however such a comparison only makes sense for string literals (at parse time). This has been fixed, as the parse-time check only occurs for literals, and checks that require analysis (such as table columns) now occur at the appropriate time. In addition, the system variable character_set_results was not used when returning string results, resulting in unexpected behavior, which has also been fixed.
  • 1223: fix infinite recursion during cte self-reference
    Changes:
    • don't resolve subqueries where there are CTEs of the same name; transformations are bottom up, so they've been resolved at least once already
      Fix for:
    • dolthub/dolt#4173
  • 1222: use float input as str for decimal type
    The scale of float input in INSERT statement gets rounded when converting to float64 value.
    The scale of input for DECIMAL data type should not be rounded.
  • 1221: 1 % 0 should not panic but return NULL
    From https://dev.mysql.com/doc/refman/8.0/en/mathematical-functions.html#function_mod:

    MOD(N,0) returns NULL.

  • 1217: any number of column is allowed for where exists subquery
    Any number of column from the subquery is allowed. The result of WHERE EXISTS subquery is dependent on the number of rows rather than the number of columns
    Fix for dolthub/dolt#3772
  • 1216: fix infinite recursion for recursive cte used with union subquery
    Changes:
  • 1215: Collation fixes
    Fixes the following issues:
    • dolthub/dolt#4172
    • dolthub/dolt#4168
      CREATE TABLE and friends will now take the table collation into consideration. This extends to adding a modifying columns, which will use the table collation when no collation is specified. Additionally, the latin1 character set has been added. Lastly, we properly block usage of non-implemented character sets and collations.
  • 1214: Bug fix for table alias resolution
    Fixes: dolthub/dolt#4183
  • 1213: Show all currently-implemented charsets with default collations
  • 1212: fix more prepared enginetests
    Moved these rules to run during preprepared, as the explain queries for history table were missing Exchange node
    • inSubqueryIndexesId
    • AutocommitId
    • TrackProcessId
    • parallelizeId
    • clearWarningsId
      Tests:
    • dolthub/dolt#4179
  • 1211: fix str_to_date to returns time or date or datetime type results
    STR_TO_DATE() returns result type depending on Date given in the query. If only date parts like year, month and/or day was defined in Date argument, then the result is only Date type without time parts. If Date defined is date with time, then Datetime type is returned.
    Currently, STR_TO_DATE allows zero-date.
  • 1208: Handle IN expressions with NULL values for string typed columns
    fixes string assertion panic here
  • 1207: fix some filters being dropped from history tables
    Changes:
    • run pruneTables in prePreparedSelector, to transfer projections correctly
    • added Filters() getter into sql.FilteredTables interface
    • transferProjections also copies over table filters
      Test PR:
    • dolthub/dolt#4163
  • 1206: Fix alias error that should qualify correctly
  • 1205: Fixed inability to drop user without host being specified
    Fixes github.com/dolthub/issues/4135
  • 1204: Fix timestamp comparisons and add regression test

    Summary

    This PR fixes another regression introduced by #1061. That PR altered the way that timestamps are represented internally (they went from string to []byte) without updating all of the locations where the previous representation was assumed.
    The particular regression this PR fixes deals with timestamp comparisons. A test like
    CREATE TABLE mytable (t TIMESTAMP);
    INSERT INTO mytable (t) VALUES ('1990-01-01'), ('2020-01-01');
    SELECT COUNT(1) FROM mytable WHERE t > '2000-01-01';
    
    should return 1, but today it returns 2.
    The core issue is that the filter (e.g., WHERE t > '2000-01-01') needs to compare timestamps. That comparison logic has type coercion, but the type coercion only knows how to coerce a string into a timestamp, and it is now being handed a []byte.
    The fix here is two-fold:
    1. increase the priority of timestamp type coercion so that it takes precedence over binary coercion
    2. ensure that the convertValue function correctly converts []byte into timestamps (and dates)

    Motivation

    #1139
    This is not strictly speaking the same issue, but something wonky is going on with timestamps and I'd like to get to the bottom of it, and then fix it "for good".
  • 1203: Stop requiring order by clause for range frames that only use unbounded preceding/following and current row
    We were requiring the order by clause for all range frame queries, but it is optional when the frame only uses unbounded preceding/following and current row.
    Fixes: dolthub/dolt#4141
  • 1202: moveJoinConditionsToFilter bug
    The move join condition rule selects filters not needed as a join condition, identify the target parent for the filter, and then moves the plucked filters to EVERY child of the parent of the join node. The correct behavior is to move the filter immediately above the join whose condition we plucked the filter from.
  • 1201: Add end-to-end tests for timestamp insertion, and patch the conversion issue

    Summary

    This PR introduces an e2etests/ directory with "end to end" tests:
    • set up a memory engine
    • start the server
    • connect using the normal Go mysql driver
    • execute queries
      This can catch issues that a unit test against the engine itself cannot.
      There is only a single test thus far, and it is fairly verbose. I know that there is at least one more issue with timestamps, and I will open a follow-up PR to address that issue as soon as I track down its source. For now I'm opting for clarity rather than code reusability.
      Lastly, this PR introduces a little patch to the timestamp conversion logic, which was accidentally broken in #1061 because it only handles string but the server layer now passes it []byte. This patch is very dumb and simple: just convert []byte to string before parsing.

    Motivation

    #1139
  • 1199: adding support for RANK and DENSE_RANK
    Fix for:
    • dolthub/dolt#4126
      Note: queries in MySQL seem to be ordered by the ranks by default.
  • 1198: adding implicit type conversion to in <subquery> statements
    fix for: dolthub/dolt#4057
  • 1197: Indexed joins range scans ignore non-equalities
    When building index lookups, do not about when we see non-equality expressions. Instead, continue with the equality subset.
    Ex:
    SELECT pk,i,f FROM one_pk RIGHT JOIN niltable ON pk=i and pk > 0
    -	RightJoin((one_pk.pk = niltable.i) AND (one_pk.pk > 0))
    -	 	├─ Table(one_pk)
    -		 │   └─ columns: [pk]
    -		 └─ Table(niltable)
    -		     └─ columns: [i f]
    +	Project(one_pk.pk, niltable.i, niltable.f)
    +	     └─ RightIndexedJoin((one_pk.pk = niltable.i) AND (one_pk.pk > 0))
    +		     ├─ Table(niltable)
    +		     │   └─ columns: [i f]
    +		     └─ IndexedTableAccess(one_pk)
    +		         ├─ index: [one_pk.pk]
    +		        └─ columns: [pk]
  • 1195: SkipResultsCheck logic for TestScriptPrepared
  • 1194: Support Anonymous User
    Fixes dolthub/dolt#4090
  • 1193: Fixed typos in engine tests (skipped tests not actually getting skipped)
  • 1192: support CONV() function
    Fix for dolthub/dolt#4099
  • 1190: add support for BIT_AND, BIT_OR, and BIT_XOR functions
    fix for: dolthub/dolt#4098
  • 1189: Fixed Privilege-Related Issues
    Fixes the following issues:
  • 1188: Bug fix for recursive view definitions causing stack overflow during analysis
  • 1185: empty host is any host
    Fixes issue where empty string for host should be "%".
    Context: #1151 (comment)
  • 1184: Added more wire tests for collations
  • 1183: index lookup refactor
    New implementation divides lookup management into analysis and exec components. During analysis, *plan.IndexedTableAccess embeds an IndexedTable. The IndexLookup itself is static, dynamic lookups will be passed multiple lookup objects at exec time.
    type IndexLookup struct {
    fmt.Stringer
    Index  Index
    Ranges RangeCollection
    }
    type IndexAddressable interface {
    // AsIndexedAccess returns a table that can perform scans constrained to an IndexLookup
    AsIndexedAccess(Index) IndexedTable
    // GetIndexes returns an array of this table's Indexes
    GetIndexes(ctx *Context) ([]Index, error)
    }
    type IndexAddressableTable interface {
    Table
    IndexAddressable
    }
    type IndexedTable interface {
    Table
    // LookupPartitions returns partitions scanned by the given IndexLookup
    LookupPartitions(*Context, IndexLookup) (PartitionIter, error)
    }
    Minimal changes to DriverIndexedLookup and ForeignKeyUpdater to accommodate changes.
  • 1182: Use literal for Enum/set default
    Currently enum / set default value will be evaluated(converted to int) and store in information_schema, think better to leave as their literals.
    PS: I'm also investigating if this issue related to dolthub/dolt#4503
  • 1179: Preserve original enum value case on retrieval
    When retrieving enum values for ShowCreateTable,Information_Schema.Columns, and Describe/ShowColumns, they were always being converted to lowercase, which doesn't match MySQL's behavior. Instead, we should preserve the case each enum value was defined with, whenever we return those values.
    This did not affect enum values returned as part of table data – these values were already correctly preserving the originally defined case.
    Fixes: dolthub/dolt#3991
    Fixes: #1161
  • 1177: Secondary Lookup Range Templates
    Do not use the IndexBuilder for secondary lookups. Directly create the equality ranges and/or nullable filters.
    index_join_scan latency change locally on OXS: 4.8ms -> 3.8 ms (-20%).
    No interface changes, regular Dolt bump passes at the time the PR was created.
  • 1174: sql/analyzer: Update Index Selection heuristics to prioritize Primary Key indexes
    In Dolt, this increases out TPC-C throughput by 2x for the new NBF.
  • 1172: Allow common table expressions in INSERT, UPDATE, DELETE statements
    This fixes dolthub/dolt#4025
  • 1170: Fix lock binary typing errors
  • 1169: sql/parse: Add support for sqlparser.Union nodes that carry ORDER BY and LIMIT clauses.
  • 1168: Add a series of skipped tests related to functionality gaps
  • 1166: /.github/workflows: bump ubuntu 22.04
  • 1159: Pushing AsOf expressions down to tables used in unions and subqueries in view definitions
    Tables used in unions and subqueries in a view definition weren't getting updated with an asof expression when one was used on the view. This PR extends the resolve_views analyzer rule to close those gaps.
    Fixes: dolthub/dolt#4011
    Added Dolt enginetests, too: dolthub/dolt#4028
  • 1158: Add support for deprecated BINARY attribute after charset
    Fixes dolthub/dolt#4019
  • 1156: Refactoring ExternalStoredProcedureDatabase to ExternalStoredProcedureProvider
    From dolthub/dolt#3934, we want customers to be able to call the dolt_clone stored procedure when no database is selected.
  • 1151: displaying database and table grants
    Makes it so that database and table specific grants show up for query SHOW GRANTS for <user>
    Fix for: dolthub/dolt#3589
    Also somewhat fix for: dolthub/dolt#4007
    Testing PR: dolthub/dolt#4017
  • 1150: /{.github,go.mod}: bump go to 1.19
  • 1149: Prevent JSON and GEOMETRY columns from having literal default values
    MySQL disallows BLOB, TEXT, JSON, and GEOMETRY columns from having a literal column default value:
    https://dev.mysql.com/doc/refman/8.0/en/data-type-defaults.html#data-type-defaults.html#data-type-defaults-explicit
    go-mysql-server already prevents BLOB and TEXT column types; this PR extends that check to cover JSON and GEOMETRY columns as well.
    Fixes: dolthub/dolt#4003
  • 1148: Big fix for resolving column default values in correct order
    Fixes an issue where we weren't pulling the column definitions in the same order as the InsertInto specified the column values when resolving and validating column default values.
    We did have a test for this case, but because the types in the test were int and varchar, the default values were being coerced silently and the tests were passing when they should have failed. I've updated that test and also added a new test case using an enum that closely matches the case the customer reported.
    Fixes: dolthub/dolt#4004
  • 1147: Real Collation Support
    This PR adds true support for collations. Most of the files in the encodings package were generated from collation-extractor, and thus inflate the overall line count.
    Many tests have been implemented, however I do plan on replicating most of the engine tests over the wire, but I do not feel that that is a blocker (just additional validation).
  • 1146: fix inverted index IN filter error
  • 1145: Remove decorator nodes, add info back in table plan printing
    Delete *plan.DecoratedNode and helper methods.
    *plan.ResolvedTable and *plan.IndexedTableAccess compensate by printing range and column (logical) properties.
    -                      Projected table access on [pk c1 c2 c3 c4 c5]
    -                          └─ IndexedTableAccess(one_pk on [one_pk.pk] with ranges: [{[NULL, ∞)}])
    +                      IndexedTableAccess(one_pk)
    +                          ├─ index: [one_pk.pk]
    +                          ├─ filters: [{[NULL, ∞)}]
    +                          └─ columns: [pk c1 c2 c3 c4 c5]
    Project(a.i, a.s)
    └─ IndexedJoin(a.i = b.i)
    ├─ TableAlias(a)
    -                           │   └─ Projected table access on [i s]
    -                           │       └─ Table(mytable)
    +                           │   └─ Table(mytable)
    +                           │       └─ columns: [i s]
    └─ TableAlias(b)
    -                               └─ Projected table access on [i]
    -                                   └─ IndexedTableAccess(mytable on [mytable.i])
    +                               └─ IndexedTableAccess(mytable)
    +                                   ├─ index: [mytable.i]
    +                                   └─ columns: [i]
  • 1143: distinct count supports multiple columns
    Fix for: dolthub/dolt#3978
    PR for testing: dolthub/dolt#3989
  • 1142: adding json_table function
    Adds some support for json_table function, which generates a table given a JSON string and column definitions.
    Fix for: dolthub/dolt#2163
    TODO:
    • NESTED
    • FOR ORDINALITY
    • ON EMPTY
    • ON ERROR
  • 1141: Bug fix for now/current_timestamp column default values not in parens
    Our column default logic was conflating whether a default value was a literal value or an expression when the now or current_timestamp functions were used as the column default. Those column defaults were marked as literals, even though they are expressions that need to be resolved and evaluated. These two functions are the only functions that may be used as a column default value without being enclosed in parens, and when used that way, they were not getting resolved and causing a panic (see linked issue for more details).
    This change makes ColumnDefaultValue.IsLiteral always return true if the default value is a literal value and always return false if the default value is some expression that needs to be resolved and evaluated. To keep consistency with MySQL's behavior, it requires that we track whether the column default was enclosed in parens when defined. This is necessary because MySQL allows the now/current_timestamp functions to be used without parens for datetime/timestamp column, but must be specified in parens when used as the default for any other column type. Since that validation is done in a separate spot from the parsing, we need to track that as part of ColumnDefaultValue.
    Testing: Dolt engine tests and Dolt unit tests all pass correctly with these changes.
    Fixes: dolthub/dolt#2618
  • 1140: Error test fixes, extra test for EmptyTable
  • 1138: Improving error message when a stored procedure isn't found and no DB is selected
    Previously, if a user tried to call a stored procedure with no database selected, they would get this error message:
    stored procedure "%s" does not exist
    
    We've seen some users connect to a sql-server, but not use a database and be confused by this error message, so this change adds extra content to the error message when no current database is selected:
    stored procedure "%s" does not exist: this might be because no database is selected
    
  • 1135: Support UPDATE IGNORE
    This pul request implements UPDATE IGNORE behavior. CC:
  • 1134: Populate max field length response metadata
    The MySQL wire protocol includes a "column length" field as part of the Column Definition that tells clients the maximum amount of bytes they should expect for each serialized value in that column. Before this change, go-mysql-server was not populating this field and some MySQL clients were unable to parse responses without that max field byte length metadata. This change starts populating that field, with values identical to what MySQL returns.
    It was pretty straightforward to plumb this new piece of data through, but things got a little tricky with some of the stringtypes, so I cleaned up that code a little bit to better distinguish between the maxChars, maxBytes, and maxResponseBytes for a stringtype.
    Fixes: dolthub/dolt#3914
    Fixes: dolthub/dolt#3893
  • 1133: Add indexes to keyless data tests and validate engine after TestQueryWithContext
  • 1132: Additional book keeping for dropping auto_increment
    This PR add additional logic in the memory implementation of table and validate_create_table to handle cases where auto_increment is dropped in an ALTER TABLE MOIDFY statement.
  • 1130: Better read only errors
  • 1129: json_contains bug fixes
    Bug fixes for Dolt's implementation of json_contains (MySQL Documentation)
    Fixes: dolthub/dolt#3895
  • 1127: super user is localhost, and don't persist superuser
    Companion PR: dolthub/dolt#3810
  • 1125: revert superuser to be any host
  • 1124: still read JSON privilege files, but write flatbuffer format
    fix for: dolthub/dolt#3859
    tests are in dolt pr
    companion pr: dolthub/dolt#3860
  • 1123: Allow plugin for other auth types
    Implementation doc: https://docs.google.com/document/d/1ts7ht9p-VkNgYhD2ouXRGNn8DdPP8wHTNm_T8voSDTE/edit?usp=sharing
    Needed vitess changes: dolthub/vitess#172
  • 1122: Prune *plan.ResolvedTable columns
    Table schemas change after we prune columns by pushing projections into them. This is a one-way process, tables can't become un-projected after pushdown.
    edit: I rewrote this completely to be more organized and avoid some of the resolve issues from before.
  • 1120: Support where clause for show variables
    Where clause is supported only for variable_name column of result table of show variables query.
    Fixes dolthub/dolt#3834
  • 1118: superuser is localhost
    Companion PR: dolthub/dolt#3810
  • 1117: Add foreign key tests where the parent table has inverted primary keys
  • 1116: Add tests for unique key violations and fix engine to pass them
  • 1115: pulling out large JSON test scripts
    The large JSON test scripts have been causing go test -race to fail for Dolt (on ubuntu at least). This PR removes them from go-mysql-server and PR dolthub/dolt#3824 moves them into Dolt.
    Confirmed that go test -race for Dolt's enginetests passes again on Ubuntu after this change.
  • 1114: support unix socket
    dolthub/dolt#3796 is dependent of this PR
    Merged in Aaron's branch with supporting two-listeners
  • 1113: Init RO sess vars
    I needed a way to initialize read-only session variables so I can set them when a new user connects to a value that depends on which user connects.
  • 1112: Added ErrLockDeadlock, mapped to MySQL error code 1213 and SQLSTATE 40001
  • 1111: Bug fixes for AlterPK and column defaults
    Operations that modify a table's schema were not consistently resolving column default values:
    • AlterPK was missed in resolve column default switch cases, and also wasn't considering its expressions when checking for resolution.
    • AlterDefaultSet was resolving the wrong schema (its output schema, OkResultSchema, instead of the target schema).
      While I was in there, I cleaned up the code to more consistently use the sql.TargetSchema interface and combined some copy/pasted switch cases.
      Updated existing tests to use column default values and verified they caught both of these failures. Have already run Dolt engine tests and verified they pass correctly.
      Fixes: dolthub/dolt#3788
  • 1110: Allow floats to specify precision and scale to match MySQL's behavior
    MySQL supports a non-standard SQL syntax for specifying precision and scale for floats and doubles:
    https://dev.mysql.com/doc/refman/8.0/en/floating-point-types.html
    MySQL has marked this as deprecated, but existing MySQL applications (e.g. NocoDB) use this, so we should support it for compatibility. Doubles already support this notation.
    Fixes: dolthub/dolt#3778
  • 1109: Escape backticks in identifier names in show create table
    Updated show create table output to match MySQL's backtick escaping behavior.
    Related to: dolthub/dolt#3779
  • 1108: sql/session.go: Migrate from opentracing to opentelemetry.
  • 1106: Close duplicated file descriptor
  • 1105: Performance improvements in permission checking and view resolution
  • 1104: Expand support for explicit column defaults in insert into statements
    The previous code for handling explicit column defaults in insert into statements couldn't handle many valid cases of using explicit column defaults. To handle inserting explicit column default values in all cases, we need to move to a different approach.
    This change leaves DefaultColumn values in the plan.Values expression tuples and updates the analyzer rules to parse and resolve ColumnDefaultValues in those expressions.
    I've run the Dolt engine tests and verified that all pass with this change.
    Fixes: dolthub/dolt#3741 and #527
  • 1102: remove ValidateCanPersist()
    removes the ValidateCanPersist() from Persister interface, as it is no longer used.
    Companion PR: dolthub/dolt#3732
  • 1101: Bump ruby-mysql from 2.9.14 to 2.10.0 in /_integration/ruby
    Bumps ruby-mysql from 2.9.14 to 2.10.0.
    Commits

    [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=ruby-mysql&package-manager=bundler&previous-version=2.9.14&new-version=2.10.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. [//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) ---
    Dependabot commands and options
    You can trigger Dependabot actions by commenting on this PR: - `@dependabot rebase` will rebase this PR - `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `@dependabot merge` will merge this PR after your CI passes on it - `@dependabot squash and merge` will squash and merge this PR after your CI passes on it - `@dependabot cancel merge` will cancel a previously requested merge and block automerging - `@dependabot reopen` will reopen this PR if it is closed - `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) - `@dependabot use these labels` will set the current labels as the default for future PRs for this repo and language - `@dependabot use these reviewers` will set the current reviewers as the default for future PRs for this repo and language - `@dependabot use these assignees` will set the current assignees as the default for future PRs for this repo and language - `@dependabot use this milestone` will set the current milestone as the default for future PRs for this repo and language You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/dolthub/go-mysql-server/network/alerts).
  • 1100: type convert simplifications
    dolt bump here: dolthub/dolt#3757
    This expands some Convert methods into smaller helper functions that can be used more precisely:
    • avoid interface conversions
    • sidestep bound checks that don't apply on the read path unless we are explicitly running a Cast
      These are a bit rough and ready, the interfaces and semantic check organization could be formalized, but gives 15% boost on table scan.
  • 1099: Adjust limits on varbinary/json data
    • Relaxed limit on incoming varbinary fields to allow for large JSON fields, which have a max size of 1GB and come across the wire as varbinary types.
    • Added a test to ensure limits on varbinary fields are still being honored.
    • Added a size limit on JSON fields at 1GB.
      Fixes: dolthub/dolt#3728
  • 1098: Added safety check to index filter pushdown
  • 1093: raising upper bound for @@join_complexity_limit to 20
  • 1092: Expose JoinComplexityLimit system variable
    Fix for: #1091
  • 1090: Changed character sets and collations to use integer IDs
  • 1088: Avoid NewEmptyContext() to optimize IN filters
  • 1087: Change all info schema columns to capital letters
    Update the information_schema implementation to use capital letters for column names. This matches the MYSQL implementation
    This runs against Dolt correctly: dolthub/dolt#3719
  • 1083: Update columns table numeric precision and scale
    This fills in the numeric_precision and numeric_scale columns of the information_schema.column table
  • 1082: Integrate latest Vitess along with new LOAD DATA test case
  • 1080: sql/analyzer: Make indexed joins more aware of NULL safe comparisons.
    Fix joins and indexed joins using <=>. Ensure that joins using = do not scan NULL values in the index.
  • 1078: Improve how index lookups and ranges address NULL values.
  • 1075: adding histogram builder
    Also removes unnecessary call to AnalyzeTable in information_schema
    Companion PR: dolthub/dolt#3365
  • 1073: Fixed returning null type instead of null value
    A few places, such as NULLIF(), were returning the null type (sql.Null) when they should have been returning a null value (nil in Go). This has been fixed.
  • 1071: Revert #1070 and #1067. Needs more testing and some edge cases considered.
  • 1069: Removed array type and non-compliant functions
  • 1068: Serialize Geometry better
    Fixes the SQL methods for point, linestring, polygon, and geometry to return the hex representations of the values.
    Additionally, fixes the hex method to allow conversion of point, linestring, polygon, and geometry.
    Fix for: dolthub/dolt#3645
    Companion PR: https://github.com/dolthub/dolt/pull/3653/files
    TODO: move geometry serialization package out of dolt into gms (in next pass).
  • 1067: Improve NULL handling in JOINs and index lookups on NULL comparisons.
    INNER JOIN ON a <=> b was broken because of the implementation of
    NullSafeEquals. The implementation returned an Int8, but the join
    implementation was expecting a Boolean.
    SELECT WHERE indexed_column = NULL was broken because index selection
    would select the index, and also drop the filter.
  • 1066: Improved error message for converting unsupported geospatial types
    Improved the error message to tell customers that a specific geospatial data type they tried to use is currently unsupported.
    Fixes: dolthub/dolt#3512
  • 1065: Support describe for views
    This pr:
    1. Introduces a new analyze rules to ensure describe works on views
    2. Fixes a bug in the return of SHOW COLUMNS
    3. Makes sure describe works with expressions
  • 1064: Add more exhaustive cases for int, uint conversion
    fix for dolthub/dolt#3632
  • 1063: Adding a unique index should throw an error if duplicate rows exist
  • 1062: adding enginetests to scriptgen
  • 1061: blob enginetests
  • 1060: remove suffix check
    NOW() does not have prefix ( but has suffix )
  • 1059: Adding EngineTests for polygons with multiple linestrings
  • 1044: Type value changes & row type assertions
    This PR has two primary goals:
    1. All types now pass their values around in a form most suitable to that type.
    2. Enforce that the aforementioned types are always passed to integrators, such that integrators do not need to do type validation on their end.
      To elaborate on these points, some types already passed around values that were sensible and a best fit for that type, such as MEDIUMINT returning an int32. Other types, such as DECIMAL, passed around strings, which necessitated conversions for integrators to be able to properly persist the values. Not only that, there is currently no guarantee that a row's values each have their best fit type (a BIGINT can work with an int8, but it should be able to always expect int64). To make these guarantees, I'm adding a check at all GMS-integrator junctions that pass a sql.Row and verifying that the value types are exactly what that column expects.
      This may have the side effect of changing the output of a SELECT statement for integrators. As a SELECT statement simply returns a sql.RowIter, additional logic will be needed to convert all values to their canonical MySQL representation. This can easily be achieved by passing all values through Type.SQL() before display.
  • 1036: single quoted default value in create table statements
    Current Dolt default literal values are stores in double quotes, but MySQL does not parse double quoted default literal value in CREATE TABLE statements for both sql shell and importing dumps.
    Fixes dolthub/dolt#3218
    Added character set and collation columns for show create view
  • 998: Column Statistics
    Gathers the following column stats:
    • mean
    • min
    • max
    • count
    • null count
    • distinct count
      and stores them in a modified information_schema.column_statistics table.
      There is now a new sql.Histogram struct that contains the column stats; the contents and format of sql.Histogram come from a mixture of Cockroach and MySQL.
      Additionally, there is also a TableStatistics interface, which is meant to be implemented if people want statistics on that table.

vitess

  • 199: Added ALTER DATABASE parsing
    Adds support for parsing ALTER DATABASE queries.
  • 197: adding better support for transaction statements
    MySQL reference: https://dev.mysql.com/doc/refman/8.0/en/commit.html
    adds support for
    BEGIN [WORK]
    COMMIT [WORK] [AND [NO] CHAIN] [[NO] RELEASE]
    ROLLBACK [WORK] [AND [NO] CHAIN] [[NO] RELEASE]
    fix for: dolthub/dolt#4436
  • 196: allow dual only for select from statements if used without back-ticks
  • 195: Allow other options to precede COLLATE in column type defintions
    Fixes dolthub/dolt#4403
    Previously, COLLATE had to immediately follow the column type in a column definition. As per the above issue, this is not a required rule in MySQL. The fix involved moving all collation-related options from the column_type rule to the column_type_options. This caused a conflict in column_type_options, as column_default also has a COLLATE rule. The conflicting rule resolved to the following:
    column_type_options DEFAULT value_expression COLLATE ID
    
    For reference, this is the new COLLATE rule in column_type_options:
    column_type_options COLLATE ID
    
    Given the MySQL expression DEFAULT "xyz" COLLATE utf8mb4_bin, it could match either rule, which caused the conflict. value_expression is too permissive in this context, and although we filter out invalid expressions in GMS, we need to be more restrictive here in the parser to prevent conflicts. In addition, the above example should put the collation on the column, as it is not possible to add a collation to a default string literal (must use the expression form: DEFAULT ("xyz" COLLATE utf8mb4_bin)).
    To fix this, the column_default rule was updated to be vastly more restrictive. This also highlighted some tests in GMS that enforce incorrect behavior, but those have been fixed ahead-of-time, and will be incorporated into the bump PR.
    NOTE: For some reason, we had tests allowing the use of utc_timestamp, utc_date, etc. as default values without the need for parentheses. This is not allowed in MySQL, so I am unsure as to why we allowed them in the first place. Perhaps because they're similar to CURRENT_TIMESTAMP, so we just allowed all of them? Regardless, the tests have been removed as they no longer pass (as they should not pass).
  • 194: adding missing keywords to token.go
    I forgot to add some keywords used in ROW_FMT table option, which caused a bats test to fail
  • 193: adding support for PARTITION syntax for table creation
    Parsing table_options for statements like CREATE TABLE (column_definitions) table_options better matches MySQL
    Can parse partition_options based off of https://dev.mysql.com/doc/refman/8.0/en/create-table.html
    Fix for: dolthub/dolt#4358
  • 192: add full join parsing
  • 191: go/mysql: query.go: Put strings into BindVars as copies, not slices of the recyclable wire packet buffers.
    Cherry picks vitessio/vitess#5562.
  • 190: Adding back support for using reserved keywords unquoted in InsertInto
    I previously excluded reserved_sql_id and only included sql_id because I didn't think reserved keywords should be allowed as identifiers in insert into columns without being quoted, but it appears we have tests in Dolt that rely on that (e.g. sql-diff.bats calls: INSERT INTO test (pk, int, string, boolean, float, uint, uuid) values ...) and I figured if we can support those without the quoting, it seems like a nice feature for customers, so I switched back to reserved_sql_id.
    It is confusing that we refer to identifiers as "reserved" keywords when they don't require backtick quoting though and would be nice to tidy that up in our grammar in the future.
  • 189: Grammar fixes for customer issue using non-reserved keywords in insert into statements
    A customer reported an issue on Discord with using insert into t(pk, comment) values (1, 1). Comment is a keyword, but not a reserved keyword, so this is valid syntax, but our parser was failing on it.
    A few changes to tidy up:
    • Updated tests to trigger the customer reported parsing bug when an (unquoted) keyword is used in a list of columns in an insert into statement.
    • renamed column_name_safe_reserved_keyword to column_name_safe_keyword, since the contents are keywords, but are not "reserved" keywords that need to be backtick quoted
    • fixed a bug with recursion for insert into column names (ins_column_list rule)
    • removed avg,count,sum,min,max from lists of reserved keywords, since they are not reserved in MySQL and can parse correctly in tests without being backtick quoted.
    • added some more comments to try and make the intent of different grammar rules clearer.
  • 188: Support Varchar(MAX)
    related to dolthub/dolt#2261
  • 187: CTEs on unions parse
  • 186: add implicit from dual to select x in recursive cte statements
  • 185: add support for grant all privileges to <user>
  • 184: Added parse tests for all functions
  • 183: resolving unreserved keyword inconsistencies with MySQL
    Tests have been updated to document the inconsistencies below.
    Changes:
    • unreserved many keywords
    • allow reserved keywords to be used as identifiers when they are qualified
      Words that parse in dolt, but don't in MySQL:
    • dual
    • minute_second
      Words that don't parse in dolt (only in quries that use these in where conditions), but do in MySQL:
    • escape
    • next
    • off
    • sql_cache
    • sql_no_cache
      Fix for: dolthub/dolt#3977
  • 182: Fix error in REPEAT function
  • 180: Support for common table expressions in INSERT, UPDATE, DELETE
  • 179: allow unescaped reserved keywords when prefixed with table name, and add more reserved keywords
    Changes:
    • fixed misidentified keywords as column safe
    • fixed incorrect tests
    • every reserved keyword in MySQL is reserved in our grammar now
    • exceptions are DUAL and MINUTE_SECOND
      Note:
    • we are still missing many unreserved keywords
    • some of our reserved keywords are unreserved in MySQL
      Fix for: dolthub/dolt#3979
  • 178: Added BINARY to match deprecated functionality
    Fixes dolthub/dolt#4019
  • 177: Removed unused methods
    Follow up from #175
  • 176: /{.github,go.mod}: bump go to 1.19
  • 175: Remove all package-level flags

    Summary

    This is my attempt to perform a minimally invasive purge of all package-level flags.
    I would appreciate some input on:
    • whether any of the upstream repos that use this vitess fork will be impacted by these changes
    • whether this change should be more invasive (e.g., if you don't use the AuthServerClientCert it may be better to remove it entirely rather than just prune the flags)

    Motivation

    dolthub/vitess#174

    Testing

    I've removed all package-level flags:
    ryanpbrewster@argon:~/p/g/dolthub-vitess$ rg "flag\.\w"
    go/mysql/auth_server_clientcert.go
    35:     if flag.CommandLine.Lookup("mysql_server_ssl_ca").Value.String() == "" {
    
    The existing tests still pass:
    ryanpbrewster@argon:~/p/g/dolthub-vitess$ go test go/...
    ok      go/ast  (cached)
    ok      go/build        (cached)
    ok      go/build/constraint     (cached)
    ok      go/constant     (cached)
    ok      go/doc  (cached)
    ok      go/format       (cached)
    ok      go/importer     (cached)
    ok      go/internal/gccgoimporter       (cached)
    ok      go/internal/gcimporter  (cached)
    ok      go/internal/srcimporter (cached)
    ?       go/internal/typeparams  [no test files]
    ok      go/parser       (cached)
    ok      go/printer      (cached)
    ok      go/scanner      (cached)
    ok      go/token        (cached)
    ok      go/types        (cached)
    
  • 173: parse json_table syntax
    TODO:
    • NESTED
    • FOR ORDINALITY
    • ON EMPTY
    • ON ERROR
  • 172: Changes for adding new auth method
  • 171: Add rows parsing for LOAD DATA IGNORE and address flaky tls test
    This PR does two things
    1. Addresses dolthub/dolt#2548 by introducing new conditions in sql.y
    2. Addresses flaky tls server tests by updating tests to use Tls13 and adding an additional error check
  • 170: Adding support for use db/branch syntax
    Currently, use <database>/<branch> works only from the mysql client, because it parses the use command on the client side and sends it to the server as a different command (not a query). This means this syntax does not work with dolt sql or with other clients that don't parse this command client side and translate it (e.g. go-sql-driver/mysql).
    This PR adds support to our parser for this syntax, so that use <database>/<branch> can be used consistently.
  • 169: Adding support for specifying a primary key name
    MySQL allows you to specify an optional name following a primary key definition when creating or altering a table. This name is always ignored though – all primary keys in MySQL are always named PRIMARY.
    Fixes: dolthub/dolt#3674
  • 168: Remove glog dependency, reduce log levels, remove all logutilpb usage and support.

Closed Issues

  • 1357: _example/main.go raises panic
  • 1319: Filter pushdown doesn't show in explain plan even though pushdown was successful
  • 1331: CreateTable: DEFAULT CHARACTER SET missing
  • 945: Client hangs if go-mysql-server takes too long to return query results
  • 1164: Support the DO statement
  • 1162: Support the EXTRACT function
  • 1128: superusers should be hidden
  • 1289: Some queries using HAVING can't be executed without GROUPBY
  • 1314: Cannot connect to database to query MemoryTable
  • 750: Copy the mysql behaviour regarding reserved keywords
  • 738: mysql SERIAL alias unsupported
  • 644: SHOW TABLE CREATE PRIMARY KEY ordering
  • 611: Bad error message for type conversion error on insert
  • 610: EXISTS expects a single column in subquery result, should support any number
  • 592: Creating a table without a current DB give confusing error message
  • 562: Intermittent DATA RACE error when creating MySQL Servers
  • 561: Support for generated columns
  • 526: Cleanup PrimaryKey error vs UniqueKey error vs DuplicateEntry
  • 513: Need tests of PREPARE
  • 498: Inserting into UINT32 column can cause error with leading 0
  • 479: Cannot use expression index in indexed join
  • 476: Can a Table implement sql.IndexAddressableTable without implementing sql.DriverIndexableTable?
  • 383: Unbound variables in subqueries
  • 463: anyone writing the rocksdb backend?
  • 1300: Inserts for some values get dropped when accessed via a trigger for inserts.
  • 1144: Is it possible to run raw sql queries to mock the database?
  • 525: Column alias in where clause should be an error
  • 1268: Analyzing self-referential foreign keys triggering an infinite loop
  • 1232: Case-sensitive comparison of string literal
  • 1219: How can i change the log level?
  • 1096: go 1.17 support
  • 1218: The Decimal type treats the same value with different input format differently
  • 1153: proposal: Refactor Indexed Table Access
  • 1139: Error with passing time.Time as a field for saving
  • 1161: ENUM values are always lowercased regardless of actual case
  • 1091: expose JoinComplexityLimit system variable
  • 1089: I proxies mysql-8.0.23 to go-mysql-server,some problem confuse me
  • 527: Gap in coverage for DEFAULT keyword in inserts
  • 1086: Support Describe for information_schema tables
  • 787: Add support for describing views
  • 1055: REGEXP '^[-]?[0-9]+$' fails on int64 column.
  • 174: Remove package-level Vitess flags