feat: add step to generate association data #888
src/gentropy/common/utils.py
),
f.lit(0.0),
lambda acc, x: acc
+ x["score"]/f.pow(x["pos"], 2)/f.lit(sum(1 / ((i + 1)**2) for i in range(100)))
Why are only the first 100 terms used?
Also, there should be a division by 1.644 somewhere...
That part is the division by ~1.644.
I initially summed the first 1000 terms (1.6439..) but changed it to 100 (1.6349..).
Should I just change it to f.lit(1.644)?
Sorry, I misread it. All is fine. But I would use 1000 terms (to be consistent with the platform, if the documentation is correct).
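For reference, the figures quoted in this thread can be checked with a few lines of plain Python. This is only a sketch: `basel_partial_sum` is a hypothetical helper name, not something in gentropy, and it reproduces just the `sum(1 / ((i + 1)**2) for i in range(n))` term from the diff.

```python
import math


def basel_partial_sum(n_terms: int) -> float:
    """Return 1/1**2 + 1/2**2 + ... + 1/n_terms**2 (hypothetical helper)."""
    return sum(1.0 / (i + 1) ** 2 for i in range(n_terms))


# The infinite series converges to pi**2 / 6, which is the ~1.644 mentioned above.
limit = math.pi**2 / 6

for n_terms in (100, 1000):
    partial = basel_partial_sum(n_terms)
    # The gap to the limit shrinks roughly like 1 / n_terms.
    print(f"{n_terms:>5} terms: {partial:.4f} (gap to limit: {limit - partial:.4f})")
```

With 100 terms the divisor is about 1.6350 and with 1000 terms about 1.6439, so the choice shifts the normalised score by roughly half a percent.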
This PR adds a new step to gentropy that generates disease/target associations (direct and indirect) from l2g evidence. It also adds a new method to compute the harmonic sum of a set of values.
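As a rough illustration of the harmonic sum mentioned here, the sketch below uses plain Python rather than the Spark column expression in the diff. It assumes scores are ranked in descending order, each score is divided by the square of its 1-based position, and the total is normalised by the same partial sum discussed in the review. The function name, the `max_terms` parameter, and the truncation to the top `max_terms` scores are assumptions for illustration, not the PR's actual implementation.

```python
from typing import Iterable


def harmonic_sum(scores: Iterable[float], max_terms: int = 1000) -> float:
    """Normalised harmonic sum of scores (illustrative sketch, not gentropy's API).

    Assumes each score lies in [0, 1]; the result then also lies in [0, 1].
    """
    # Rank scores from highest to lowest and keep at most max_terms of them
    # (the truncation is an assumption made for this sketch).
    ranked = sorted(scores, reverse=True)[:max_terms]

    # Weight each score by 1 / position**2, with positions starting at 1.
    weighted = sum(score / pos**2 for pos, score in enumerate(ranked, start=1))

    # Theoretical maximum: every one of the max_terms scores equals 1.0.
    max_value = sum(1.0 / pos**2 for pos in range(1, max_terms + 1))

    return weighted / max_value


# Example: two evidence scores for a single disease/target pair.
print(harmonic_sum([0.5, 1.0]))
```

Because the scores are sorted before weighting, the result does not depend on the order in which evidence arrives, and a single strong score contributes more than several weak ones.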
✨ Context
We want to generate associations from l2g evidence without relying on the platform ETL.
🛠 What does this PR implement
This PR adds a step to generate direct and indirect associations from l2g evidence and saves them as parquet files.
🙈 Missing
🚦 Before submitting
dev
branch?make test
)?poetry run pre-commit run --all-files
)?