Add broker and ewallet schema #16

Merged on Mar 26, 2024 (1 commit)

wongjingping (Collaborator) commented on Mar 26, 2024

Add the broker and ewallet schemas (and data):

  • broker has camelCased table/column names and a standard prefix for all column names from the same table. broker also avoids the use of FOREIGN KEY/REFERENCES.
  • ewallet contains a transactional tech/e-commerce company's schema. ewallet has the schema name consumer_div. prepended to each table's name, and uses REFERENCES.

Together with the 2 earlier schemas:

  • car_dealership has a sales-ish inventory schema that mimics a sales function in a company.
  • derm_treatment mimics a pharmaceutical company tracking clinical trials and various experimental treatments. It contains cohort-style data in a wide format.
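To make the two naming conventions concrete, here is a minimal sketch that runs in SQLite via Python's sqlite3 module. All table and column names below are hypothetical and are not taken from the PR, and the real schemas may target a different database. In SQLite the consumer_div. prefix is emulated by attaching a second in-memory database under that name.

```python
import sqlite3

# Hypothetical broker-style DDL: camelCased names, one prefix per table,
# and no FOREIGN KEY/REFERENCES (joins are by convention only).
BROKER_DDL = """
CREATE TABLE xTicker (
    xTickerId INTEGER PRIMARY KEY,
    xTickerSymbol TEXT NOT NULL
);
CREATE TABLE xDailyPrice (
    xDpTickerId INTEGER NOT NULL,  -- matches xTicker.xTickerId, but undeclared
    xDpDate TEXT NOT NULL,
    xDpClose REAL
);
"""

# Hypothetical ewallet-style DDL: every table is qualified with consumer_div.
# and relationships are declared with REFERENCES. SQLite resolves the parent
# table inside the same attached database, so it is written unqualified here.
EWALLET_DDL = """
CREATE TABLE consumer_div.users (
    uid INTEGER PRIMARY KEY,
    username TEXT NOT NULL
);
CREATE TABLE consumer_div.wallet_transactions (
    txid INTEGER PRIMARY KEY,
    sender_id INTEGER REFERENCES users(uid),
    amount REAL
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS consumer_div")
conn.executescript(BROKER_DDL)
conn.executescript(EWALLET_DDL)


def foreign_keys(schema, table):
    """Return (child column, parent table, parent column) for each declared FK."""
    rows = conn.execute(f"PRAGMA {schema}.foreign_key_list({table})").fetchall()
    return [(r[3], r[2], r[4]) for r in rows]


broker_fks = foreign_keys("main", "xDailyPrice")
ewallet_fks = foreign_keys("consumer_div", "wallet_transactions")
print(broker_fks)
print(ewallet_fks)
```

The contrast is the point of having both schemas: with the broker style the join keys are not declared anywhere and can only be inferred from the column names, while the ewallet style states them explicitly.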

We also sparsified the descriptions for the car_dealership schema to avoid excessively long contexts with redundant information.
Added tests to check for the number of columns for each schema.
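A column-count test of this kind could look roughly like the sketch below. How the PR's tests are actually implemented is not shown in this thread, so the DDL, the expected count, and the helper name count_columns are all hypothetical; the sketch loads the DDL into an in-memory SQLite database and counts columns with PRAGMA table_info.

```python
import sqlite3

# Hypothetical stand-in schema and expected count, for illustration only.
DDL = {
    "demo": """
        CREATE TABLE cars (id INTEGER PRIMARY KEY, make TEXT, model TEXT);
        CREATE TABLE sales (id INTEGER PRIMARY KEY, car_id INTEGER);
    """,
}
EXPECTED_NUM_COLUMNS = {"demo": 5}


def count_columns(ddl: str) -> int:
    """Create the tables in a scratch database and count all their columns."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(ddl)
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table'"
            )
        ]
        return sum(
            len(conn.execute(f"PRAGMA table_info({t})").fetchall())
            for t in tables
        )
    finally:
        conn.close()


def test_num_columns():
    # Fails if a column is added or dropped without updating the expectation.
    for name, ddl in DDL.items():
        assert count_columns(ddl) == EXPECTED_NUM_COLUMNS[name]
```

A check like this catches a column being dropped or duplicated by accident when a schema file is edited.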

Counted the number of tokens (using codellama's tokenizer) required to tokenize each schema's SQL (without the INSERT statements, and not using the JSON):

Schema          Num_tokens
broker          715
car_dealership  867
derm_treatment  1218
ewallet         1484
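A measurement like the one above could be reproduced along these lines. This is a sketch under assumptions, not the script used for the PR: the helper names strip_inserts and count_tokens are hypothetical, and the tokenizer is passed in as a plain callable so the example runs without downloading a model.

```python
import re
from typing import Callable, Sequence

# Matches one INSERT INTO statement, up to a semicolon that ends a line.
# Assumption: statements end with ";" at the end of a line.
_INSERT_RE = re.compile(
    r"^\s*INSERT\s+INTO\b.*?;[ \t]*$",
    re.IGNORECASE | re.DOTALL | re.MULTILINE,
)


def strip_inserts(sql: str) -> str:
    """Drop INSERT statements so only the schema definition is measured."""
    return _INSERT_RE.sub("", sql).strip() + "\n"


def count_tokens(sql: str, encode: Callable[[str], Sequence]) -> int:
    """Count tokens in the DDL using any encode(text) -> sequence callable."""
    return len(encode(strip_inserts(sql)))


sample_sql = """CREATE TABLE t (a INT, b TEXT);
INSERT INTO t VALUES (1, 'x');
INSERT INTO t
  VALUES (2, 'y');
"""

# str.split is only a stand-in tokenizer to keep the sketch self-contained.
whitespace_count = count_tokens(sample_sql, str.split)
print(whitespace_count)
```

With the Hugging Face transformers package installed, the encode method of AutoTokenizer.from_pretrained("codellama/CodeLlama-7b-hf") could be passed instead of str.split. The checkpoint name is an assumption, since the thread only says "codellama's tokenizer", and encode adds special tokens by default, so counts can differ slightly depending on how they are handled.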

And here is the number of tokens required to tokenize each glossary:

Schema          Num_tokens
broker          172
car_dealership  265
derm_treatment  365
ewallet         325
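If the schema SQL and the glossary end up in the same prompt (an assumption; the thread does not say how they are combined), the per-schema context cost is roughly the sum of the two counts reported above. A quick way to tabulate that:

```python
# Token counts copied from the two tables in this PR description.
schema_tokens = {
    "broker": 715,
    "car_dealership": 867,
    "derm_treatment": 1218,
    "ewallet": 1484,
}
glossary_tokens = {
    "broker": 172,
    "car_dealership": 265,
    "derm_treatment": 365,
    "ewallet": 325,
}

# Combined cost per schema, ignoring any separator or template tokens.
combined = {
    name: schema_tokens[name] + glossary_tokens[name] for name in schema_tokens
}

for name, total in combined.items():
    print(f"{name:<16}{total}")
```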

Add tests for num columns
@wongjingping wongjingping requested a review from rishsriv March 26, 2024 10:01
rishsriv (Member) left a comment

Amazing. Thank you!

@rishsriv rishsriv merged commit d8a46d3 into main Mar 26, 2024
2 checks passed
@rishsriv rishsriv deleted the jp/broker_ewallet branch March 26, 2024 10:03