SS-185: Turn Global-chem into a biobrick #317

Open
Sulstice opened this issue Jul 19, 2024 · 2 comments
Labels
enhancement New feature or request good first issue Good for newcomers knowledge graphs

@Sulstice
Collaborator

We are going to take Global-Chem and turn it into a biobrick.

I opened up the issue here, and the code for it is at https://github.com/Global-Chem/global-chem-brick. The first thing to do is create a Python file that converts the CSV from global-chem into a Parquet file.

It should be a one-liner. Lay out the directory like the one labeled here: https://github.com/biobricks-ai/drugbank-open.

Screenshot 2024-07-19 at 3 08 23 PM

And place your script in the global-chem brick repository: https://github.com/Global-Chem/global-chem-brick

@Sulstice Sulstice added the enhancement New feature or request label Jul 19, 2024
@Sulstice
Collaborator Author

Sulstice commented Jul 22, 2024

@Nickspizza001 So let's start slow with this one.

The first thing to do is take the global-chem TSV and convert it into a Parquet file.

  1. Create a directory called transformers.
  2. In that directory, write a Python script that takes in the TSV file and converts it to a Parquet file. Tell me why Parquet is used.

Open a pull request and show me.

@Sulstice Sulstice added this to the v2.0 milestone Jul 22, 2024
@Nickspizza001
Collaborator

https://github.com/Global-Chem/global-chem-brick/tree/dami

Parquet is used because it is a columnar format, so it compresses large files better than row-based text formats such as CSV or TSV.
