BigQuery, MySQL, SQLite and SQL Server support #27

Merged
merged 4 commits into from
May 29, 2024

Conversation

wendy-aw
Contributor

  • Added a new script that translates the Postgres .sql files into the other dialects. Translation uses both sqlglot and regex operations.
  • The script translates the files and also creates the databases in the different database systems. It then checks that the values have been inserted properly.
  • Some of the original .sql files have been modified so that the inserted values strictly adhere to the data types set out in the DDL statements.
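
The translate, create, and check flow described above can be sketched roughly as follows. This is a minimal illustration and not the script from this PR: it uses only the standard library (one regex rewrite plus `sqlite3`) and leaves out the sqlglot step entirely. The `courses` table, the `postgres_to_sqlite` and `load_and_check` helpers, and the expected row count are all assumptions made up for the example.

```python
import re
import sqlite3

# Hypothetical Postgres-flavoured input; the table and rows are illustrative only.
POSTGRES_SQL = """
CREATE TABLE courses (
    course_id SERIAL PRIMARY KEY,
    course_name VARCHAR(100),
    credits INTEGER
);
INSERT INTO courses (course_id, course_name, credits) VALUES
(1, 'Introduction to Computer Science', 3),
(2, 'Advanced Calculus', 4);
"""


def postgres_to_sqlite(sql: str) -> str:
    """Apply one regex rewrite; a real translator needs many more rules."""
    # SQLite has no SERIAL type; INTEGER PRIMARY KEY gives an auto-assigned rowid.
    return re.sub(
        r"\bSERIAL\s+PRIMARY\s+KEY\b",
        "INTEGER PRIMARY KEY",
        sql,
        flags=re.IGNORECASE,
    )


def load_and_check(sql: str, expected_rows: dict[str, int]) -> dict[str, int]:
    """Run the script in a fresh in-memory database and verify row counts."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(sql)
        counts = {}
        for table, want in expected_rows.items():
            (got,) = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()
            if got != want:
                raise ValueError(f"{table}: expected {want} rows, found {got}")
            counts[table] = got
        return counts
    finally:
        conn.close()


if __name__ == "__main__":
    print(load_and_check(postgres_to_sqlite(POSTGRES_SQL), {"courses": 2}))
```

Checking row counts right after loading is a cheap way to catch statements that a dialect silently skipped or rejected; comparing actual values would be a stricter check.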

@wendy-aw wendy-aw requested review from wongjingping and rishsriv May 28, 2024 14:28
This will create one new SQL file per database per dialect.
For SQLite, the `.db` files will be saved in the folder `sqlite_dbs`.
Note that BigQuery, MySQL and SQLite do not support Postgres-style schemas (a namespace inside a database), so the SQL files are modified to skip schema creation.
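
One way the schema-skipping step could look is sketched below, assuming the Postgres files contain a `CREATE SCHEMA` statement and schema-qualified table names. The `strip_schema` helper and the `academic` schema name are hypothetical and are not taken from the script in this PR.

```python
import re


def strip_schema(sql: str, schema: str) -> str:
    """Remove CREATE SCHEMA statements and `schema.` qualifiers for one schema."""
    name = re.escape(schema)
    # Drop "CREATE SCHEMA [IF NOT EXISTS] <schema>;" together with its line break.
    sql = re.sub(
        rf"^\s*CREATE\s+SCHEMA\s+(?:IF\s+NOT\s+EXISTS\s+)?{name}\s*;[ \t]*\n?",
        "",
        sql,
        flags=re.IGNORECASE | re.MULTILINE,
    )
    # Turn "<schema>.table" into "table".
    # Limitation: this also rewrites matches inside string literals.
    return re.sub(rf"\b{name}\.", "", sql)


if __name__ == "__main__":
    example = (
        "CREATE SCHEMA IF NOT EXISTS academic;\n"
        "CREATE TABLE academic.courses (id INT);\n"
        "INSERT INTO academic.courses VALUES (1);\n"
    )
    print(strip_schema(example, "academic"))
```

A plain regex is enough for files with a predictable layout; a parser-based rewrite would be safer if schema names can also appear inside data values.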

Member

Very helpful! Thanks for laying this down so clearly :D

(2, 'Advanced Calculus', 'Mathematics', 'MATH201', 4, 'CS101', NULL, 'This course covers advanced topics in calculus.', 1, 3, false, false, true, true, 5, 4, 2, 3),
(3, 'Introduction to Physics', 'Physics', 'PHYS101', 3, NULL, 'MATH201', 'This course provides an introduction to physics principles.', 2, 1, true, true, true, true, 8, 4, 3, 5),
(4, 'Distributed Databases', 'Computer Science', 'CS302', 3, NULL, 'CS101', 'This course provides an introduction to distributed databases.', 2, 2, true, true, false, true, 4, 2, 1, 5)
(1, 'Introduction to Computer Science', 'Computer Science', 'CS101', '3', NULL, NULL, 'This course introduces the basics of computer science.', 2, 2, true, false, true, false, 10, 5, 3, 4),
Member

@rishsriv rishsriv May 29, 2024

Thanks for the fix here! TIL that Postgres likely automatically cast this to a string earlier
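
For a feel of how this kind of silent coercion can hide a type mismatch, here is a small analogy using SQLite's type affinity. It is not a reproduction of the Postgres behaviour, and the `demo` table is a made-up example. Stricter engines may reject such values instead of converting them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical two-column table: one INTEGER column, one TEXT column.
conn.execute("CREATE TABLE demo (credits INTEGER, code TEXT)")
# Both literals have the "wrong" type for their column.
conn.execute("INSERT INTO demo VALUES ('3', 101)")
# typeof() reports the storage class SQLite actually used for each value.
stored_types = conn.execute(
    "SELECT typeof(credits), typeof(code) FROM demo"
).fetchone()
conn.close()
print(stored_types)  # ('integer', 'text')
```

The insert succeeds and each value is stored with the column's type, so the mismatch in the source file goes unnoticed until the same file is loaded into a stricter system.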

Member

THANK YOU FOR THIS! Wow this would've taken forever to get right. Much appreciated!

Collaborator

+100 to this!

Member

@rishsriv rishsriv left a comment

Thanks for the fantastic work here Wendy!

@rishsriv rishsriv merged commit 4bd1222 into main May 29, 2024
2 checks passed
@rishsriv rishsriv deleted the wendy/ddl_dialects branch May 29, 2024 02:27
@@ -228,6 +228,19 @@ def load_embeddings(emb_path: str) -> tuple[dict, dict]:
"journal.journalname,text,Name or title of the journal",
],
},
"broker": {
Collaborator

Thanks for adding these placeholders!

Collaborator

+100 to this!

3 participants