Remove has_peptide_quantifications values from proteomics records in workflow_execution_set (make-rdf workflow) #270

Merged
turbomam merged 6 commits into main from exclude-metaproteomicsanalysis
Oct 2, 2024

Conversation

turbomam
Member

@turbomam turbomam commented Oct 2, 2024

This PR eliminates the conversion of metaproteomics analyses to RDF, as there are still all_proteins values that lack prefixes.

This could also be fixed by identifying opportunities to set a @base prefix in the schema or as part of the RDF conversion process.
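
To make "values that lack prefixes" concrete, here is a minimal sketch of how such values might be located before conversion. It assumes each metaproteomics record carries a has_peptide_quantifications list whose entries hold an all_proteins list of string identifiers; that layout, the helper name find_unprefixed, and the sample identifiers are all illustrative assumptions, not taken from this PR.

```python
def find_unprefixed(records, slot="all_proteins"):
    """Return (record id, value) pairs whose value is not of the form 'prefix:local'.

    Assumes (hypothetically) that `slot` holds a list of strings inside each
    entry of a record's has_peptide_quantifications list.
    """
    hits = []
    for record in records:
        for quant in record.get("has_peptide_quantifications", []):
            for value in quant.get(slot, []):
                prefix, sep, local = value.partition(":")
                # A usable CURIE needs a non-empty prefix, a colon, and a local part.
                if not (prefix and sep and local):
                    hits.append((record.get("id"), value))
    return hits


# Made-up sample data, for illustration only.
sample = [
    {
        "id": "nmdc:example-analysis-1",
        "has_peptide_quantifications": [
            {"all_proteins": ["nmdc:example-protein-1", "UNPREFIXED_PROTEIN_2"]}
        ],
    },
    {"id": "nmdc:example-analysis-2"},  # no peptide quantifications at all
]

print(find_unprefixed(sample))
```

A check like this only reports candidates; whether a bare value should receive a default prefix (for example through a base declaration) is the open question raised above.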

@cmungall is investigating whether it might be _base, see


Guidelines

Soft Schema Freeze

The nmdc-schema and berkeley-schema-fy24 schemas are under a soft freeze, which means changes should not be made that have any downstream implications. To ensure this, all PRs created during the freeze will be closely reviewed with every component of the NMDC system in mind.

Reviewers

To ensure no changes are made unexpectedly, PR creators will use this PR template to tag and notify all task coordinators. Review should be specifically requested from all Berkeley Schema Roll Out task coordinators that you expect to be affected by this PR.

We expect task coordinators to review PRs and provide feedback/approval within 1 week of when they are identified as reviewers.

PRs will NOT be merged until all task coordinators (or their delegates) have approved it; either here on GitHub (via "Review changes > Approve" or an equivalent comment) or verbally.

Expedition, questions, and discussion can happen at any meeting.

Delays in review & merging should be addressed in meetings or with NMDC leadership.

If you expect the changes to impact this component... | ...request a review from this person
Metadata, Schema | @mslarae13
Runtime, Mongo database, Database migrations | @eecavanna, who will pull in @shreddd as needed
Postgres, Ingest | @naglepuff
Data Portal | @aclum
Workflows: MG & MT | @mbthornton-lbl
Workflows: MetaB & NOM | @corilo
Workflows: LipidO | @kheal
Workflows: MetaP | @SamuelPurvine
ETL code | @sujaypatil96
Jupyter notebooks | @brynnz22

PR Information

What type of PR is this? (check all applicable)

  • Refactor
  • Feature
  • Bug Fix
  • Optimization
  • Documentation
  • Schema change: Structure and content
    • created, updated, or deleted a class, slot, or enum
    • changed whether a slot is multivalued
    • changed the way a slot is assigned to a class
    • changed the permissible_values of an enum
    • etc.
  • Schema change: Cleanup and preparation
    • updated the description of a class, slot, or enum
    • updated the mappings of a class, slot, or enum to an ontology
    • added an enum for future use (it is not in the range of any slot)
    • etc.

Description

PRs should be small and concise.

Aim to create small, focused pull requests that fulfill a single purpose. Smaller pull requests are easier and faster to review and merge, leave less room to introduce bugs, and provide a clearer history of changes.

  • Replace this text with a description of what this PR branch contains. Please keep in mind that all reviewers will be reading this description. Example: "In this branch, I..."

Related Issues

All PRs should relate to or fix an issue(s). Please identify the issue(s) below.

  • Related Issue(s): #
  • Fixes: #

Did you add/update any tests?

  • Yes
  • No (Add a justification below)
  • I need help with writing tests

Could this schema change make it so any valid data becomes invalid?

This is a question about what the schema allows. It is not a question about what happens to exist in the NMDC database right now.

Example: If, in this PR branch, you renamed a slot from foo to foo_bar, the answer to this question would be "yes," even if nothing in the NMDC database currently uses the foo slot.

More examples: slot or class name changes, changes to a slot's multivalued state, changes to a slot's range (e.g. string to integer), changes to slot assignments to classes, changes to an enum's permissible_values

  • Yes (A migrator is required)
  • No
  • I need help determining this

If you answered "Yes", does this PR branch include that migrator?

  • Yes
  • No, this PR is incomplete and I need help writing the migrator

Does this PR have any downstream implications?

Examples: any change here that requires a change to workflows, workflow automation, the Mongo-to-Postgres ingest process, Jupyter notebooks, the Runtime, etc.

  • Yes (Explain below)
  • No

@turbomam turbomam changed the title from "add updated project.Makefile" to "skip RDF conversion of metaproteomics analyses" Oct 2, 2024

github-actions bot commented Oct 2, 2024

PR Preview Action v1.4.8
🚀 Deployed preview to https://microbiomedata.github.io/berkeley-schema-fy24/pr-preview/pr-270/
on branch gh-pages at 2024-10-02 23:12 UTC

more hyphens for pure-export --output option
@turbomam turbomam marked this pull request as ready for review October 2, 2024 20:16
@eecavanna eecavanna self-requested a review October 2, 2024 20:54
@turbomam
Member Author

turbomam commented Oct 2, 2024

just remove the one problematic slot

@turbomam
Member Author

turbomam commented Oct 2, 2024

yq eval 'del(.workflow_execution_set[].has_peptide_quantifications)'
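
For readers without yq at hand, the same deletion can be sketched in Python on data that has already been loaded into a dictionary. This is an illustration of what the expression above does, not the code used in the make-rdf workflow; the helper name drop_slot and the sample records are hypothetical.

```python
import copy


def drop_slot(database,
              collection="workflow_execution_set",
              slot="has_peptide_quantifications"):
    """Return a copy of `database` with `slot` removed from every record in `collection`.

    Records that never had the slot are left unchanged, and the input is not mutated.
    """
    cleaned = copy.deepcopy(database)
    for record in cleaned.get(collection, []):
        record.pop(slot, None)  # no error if the key is absent
    return cleaned


# Made-up sample data, for illustration only.
database = {
    "workflow_execution_set": [
        {
            "id": "nmdc:example-analysis-1",
            "has_peptide_quantifications": [{"all_proteins": ["UNPREFIXED_PROTEIN_2"]}],
        },
        {"id": "nmdc:example-analysis-2"},
    ]
}

cleaned = drop_slot(database)
print(cleaned)
```

Working on a deep copy keeps the original dump intact, which helps if the slot is to be restored once the prefix problem is resolved.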

@turbomam
Member Author

turbomam commented Oct 2, 2024

thanks @aclum

i'm just deleting has_peptide_quantifications values now

@turbomam turbomam merged commit 0e93b74 into main Oct 2, 2024
@turbomam turbomam deleted the exclude-metaproteomicsanalysis branch October 2, 2024 23:10
@turbomam turbomam changed the title from "skip RDF conversion of metaproteomics analyses" to "Remove has_peptide_quantifications values from proteomics records in workflow_execution_set (make-rdf workflow)" Oct 3, 2024
@turbomam turbomam changed the title from "Remove has_peptide_quantifications values from proteomics records in workflow_execution_set (make-rdf workflow)" to "Remove has_peptide_quantifications values from proteomics records in workflow_execution_set (make-rdf workflow)" Oct 3, 2024
@turbomam turbomam mentioned this pull request Oct 3, 2024
17 tasks
@turbomam
Member Author

turbomam commented Oct 3, 2024

This can probably be reverted after
