Generic support for INSERT/PUT ... VALUES with partial column spec #1394

Open: wants to merge 11 commits into base: master

@sumwale (Contributor) commented Jul 29, 2019

These changes add support for the following constructs on all tables, in addition to those already supported:

// simple constructs
INSERT (INTO|OVERWRITE) <table> VALUES ...
PUT INTO <table> VALUES ...
// with partial columns that can be out of order with table definition
INSERT (INTO|OVERWRITE) <table>(<col1>...) VALUES ...
INSERT (INTO|OVERWRITE) <table>(<col1>...) SELECT ...
PUT INTO <table>(<col1>...) VALUES ...
PUT INTO <table>(<col1>...) SELECT ...
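
To make the "partial columns that can be out of order" semantics concrete, here is a minimal Python sketch of how one VALUES row could be mapped back onto the table definition. It is an illustration only, not code from this PR: the function name `expand_row`, the case-insensitive matching, and filling omitted columns with NULL (`None`) are all assumptions made for the example.

```python
from typing import Any, List, Optional, Sequence


def expand_row(
    table_columns: Sequence[str],
    insert_columns: Sequence[str],
    values: Sequence[Any],
) -> List[Optional[Any]]:
    """Reorder one VALUES row into table-definition order.

    Columns missing from insert_columns are filled with None (SQL NULL);
    that fill rule is an assumption of this sketch, not taken from the PR.
    """
    if len(insert_columns) != len(values):
        raise ValueError("column list and VALUES row differ in length")
    # Column names are matched case-insensitively here (an assumption).
    lookup = {name.lower(): i for i, name in enumerate(insert_columns)}
    if len(lookup) != len(insert_columns):
        raise ValueError("duplicate column in column list")
    known = {name.lower() for name in table_columns}
    unknown = [c for c in insert_columns if c.lower() not in known]
    if unknown:
        raise ValueError(f"unknown columns: {unknown}")
    # Emit one value per table column, in table-definition order.
    return [
        values[lookup[col.lower()]] if col.lower() in lookup else None
        for col in table_columns
    ]


# Roughly: INSERT INTO t(score, id) VALUES (9.5, 1) on a table (id, name, score)
row = expand_row(["id", "name", "score"], ["score", "id"], [9.5, 1])
# row == [1, None, 9.5]
```

The same mapping would apply per row to the SELECT forms, with the query's output columns taking the place of the VALUES row.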

Changes proposed in this pull request

  • add the SnappyParser "inlineTable" rule to insert/put and remove the "subSelectQuery" rule,
    which was added specifically to avoid the "inlineTable" rule
  • enhance the AnalyzeMutableOperations rule to handle the "partial columns" case,
    which is parsed as a TableValueFunction: locate the columns in the target table and
    add an appropriate Project on the child
  • capture the query string for possible later use in the "partial column" construct for
    row tables where an explicit non-null DEFAULT value may have been specified for a column;
    in this case switch to DMLExternalTable, which may fail if the query contains Spark-only functions
  • remove the custom PutIntoValuesColumnTable and handle INSERT and PUT in a consistent way
  • handle the QuestionMark case in VALUES() generically, as a LocalRelation resolved by Spark
  • remove ResolveRelationsExtended and instead make the table a proper child of DMLExternalTable
    so that it is resolved in the normal way by ResolveRelations
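
The DEFAULT-value handling described in the list above can be sketched as a small decision function. This is a hedged illustration in Python rather than the PR's Scala code: `choose_strategy` and the labels "project" and "forward_query" are hypothetical names, and the condition (a row table with an omitted column that has an explicit non-null DEFAULT) is one reading of the description.

```python
from typing import Mapping, Optional, Sequence


def choose_strategy(
    is_row_table: bool,
    column_defaults: Mapping[str, Optional[str]],
    insert_columns: Sequence[str],
) -> str:
    """Return "project" or "forward_query" (both labels are hypothetical).

    column_defaults maps each table column to its DEFAULT expression,
    or None when no explicit non-null DEFAULT was declared.
    """
    given = {c.lower() for c in insert_columns}
    # Omitted columns that carry an explicit non-null DEFAULT.
    omitted_with_default = [
        col
        for col, default in column_defaults.items()
        if col.lower() not in given and default is not None
    ]
    if is_row_table and omitted_with_default:
        # Assumed reading: the captured query string is handed to the row
        # store (the DMLExternalTable path), so Spark-only functions in the
        # query may fail there.
        return "forward_query"
    # Otherwise a Project over the child plan reorders and fills the columns.
    return "project"


defaults = {"id": None, "status": "'NEW'"}
choose_strategy(True, defaults, ["id"])            # "forward_query"
choose_strategy(False, defaults, ["id"])           # "project"
choose_strategy(True, defaults, ["status", "id"])  # "project"
```

Under this reading, the limitation on Spark-only functions applies only when the query string is forwarded, which matches the ROW-table restriction noted for the release notes.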

New unit tests to be added shortly.

Patch testing

precheckin

ReleaseNotes.txt changes

Document support for VALUES and partial column specification in INSERT/PUT. Also document the
limitation that Spark-only functions cannot be used with partial column specification for ROW tables.

Other PRs

NA

sumwale added 2 commits July 30, 2019 03:58
- consistently use JDBCMutableRelation.executeUpdate, which correctly sets the current schema,
  instead of direct JDBC calls
- for dropIndex when JDBCMutableRelation has not been resolved, SnappySession execution
  resolves the full table name before passing to dropRowStoreIndex
- update SHOW INDEXES example output
- minor formatting changes in ExternalStoreUtils and jdbcExtensions
@sumwale sumwale changed the base branch from SNAP-2885 to master July 30, 2019 17:03
@dshirish (Contributor) left a comment

For complex datatypes, will we still continue using "insert into<> select <>" syntax?

@dshirish (Contributor) left a comment

It looks like the unit tests mentioned have yet to be pushed.
However, the code change itself looks good to me.

@sumwale (Contributor, Author) commented Aug 1, 2019

For complex datatypes, will we still continue using "insert into<> select <>" syntax?

No, ARRAY/STRUCT/MAP will work now in VALUES like they do in SELECT.

@sumwale sumwale force-pushed the master branch 3 times, most recently from 2c254f0 to 0f2888f Compare October 18, 2021 17:01
@sumwale sumwale force-pushed the master branch 2 times, most recently from a466d26 to ea127bd Compare April 12, 2022 10:05
@sumwale sumwale force-pushed the master branch 2 times, most recently from 99ec79c to c7b84fa Compare June 12, 2022 04:19