Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

adding json_table function #1142

Merged
merged 38 commits into from
Aug 16, 2022
Merged

adding json_table function #1142

merged 38 commits into from
Aug 16, 2022

Conversation

jycor
Copy link
Contributor

@jycor jycor commented Jul 29, 2022

Adds some support for json_table function, which generates a table given a JSON string and column definitions.

Fix for: dolthub/dolt#2163

TODO:

  • NESTED
  • FOR ORDINALITY
  • ON EMPTY
  • ON ERROR

Copy link
Member

@zachmu zachmu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good start, see comments

}

// NewJSONTable creates a new in memory table from the JSON formatted data, a jsonpath path string, and table spec.
func NewJSONTable(data []byte, path string, spec *sqlparser.TableSpec, alias sqlparser.TableIdent, schema sql.PrimaryKeySchema) (sql.Node, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As in the other PR for vitess, this needs to take an expression for the JSON data, not a []byte

Also sqlparser.* should not make it down to this layer of the code, all that translation should be handled in parse.go

rowIdx uint64
}

var _ sql.Table = &JSONTable{}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might consider modelling this as a sql.TableFunction instead

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did originally, but I had trouble implementing the NewInstance method

Copy link
Member

@zachmu zachmu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs more tests, missing some necessary things

@@ -495,6 +495,8 @@ func WidenRow(sch sql.Schema, row sql.Row) sql.Row {
vw = uint64(x)
case float32:
vw = float64(x)
case float64:
vw = float64(float32(x))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are rounding errors when using float64

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There really shouldn't be, this is a sign of a bug

Floats should round-trip with no precision loss

Does it come from parsing JSON?

@@ -111,6 +111,11 @@ func TestJoinQueries(t *testing.T) {
enginetest.TestJoinQueries(t, enginetest.NewMemoryHarness("simple", 1, testNumPartitions, true, nil))
}

// TestJoinQueries runs the canonical test queries against a single threaded index enabled harness.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copy pasta error

"github.com/dolthub/go-mysql-server/sql"
)

var JSONTableQueryTests = []QueryTest{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need some tests joining json tables to physical tables

If they don't work, add skipped tests

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also tests where the expression for json data comes from a column, not a string literal

func (t *JSONTable) RowIter(ctx *sql.Context, row sql.Row) (sql.RowIter, error) {
// TODO: need to resolve function calls like concat()
// data must evaluate to JSON string
data, err := t.dataExpr.Eval(ctx, nil)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to eval with a row

Copy link
Member

@zachmu zachmu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking better, the table iteration needs a little love

@@ -495,6 +495,8 @@ func WidenRow(sch sql.Schema, row sql.Row) sql.Row {
vw = uint64(x)
case float32:
vw = float64(x)
case float64:
vw = float64(float32(x))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There really shouldn't be, this is a sign of a bug

Floats should round-trip with no precision loss

Does it come from parsing JSON?


// RowIter implements the sql.Node interface
func (t *JSONTable) RowIter(ctx *sql.Context, row sql.Row) (sql.RowIter, error) {
// TODO: need to resolve function calls like concat()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO is out of date, or no?

if err != nil {
return nil, err
}
strData, ok := data.(string)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't cast to a string like this, use Type.Convert()

}

// Fill in table with data
for colIdx, p := range t.colPaths {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't do any of this iteration in RowIter, and don't store it all in memory

Do it one row at a time in iter.Next()

Copy link
Member

@zachmu zachmu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, nice work


var JSONTableQueryTests = []QueryTest{
{
Query: "SELECT * FROM JSON_TABLE(NULL,\"$[*]\" COLUMNS(x int path \"$.a\")) as t;",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For all these tests, use ' instead of " for your quote delimiter (except where required because it's a json string).

Consider using backtick quoted golang strings so you don't need to escape " in json strings either.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants