Review and fix "required" and "default" flags for vocab download #278
Comments
I agree with most of the entries in the table. Ones I would question whether they should be required in the future: US Census. We could also remove the idea of "Required" in the interest of transparency and have a note appear on the page that a vocabulary is "Highly Recommended" when it is what we currently consider "Required", while still affording the user the opportunity to deselect it. Then we would only have a boolean "Default" for each vocabulary, which the user can edit when creating their vocabulary download.
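A minimal sketch of what that single-boolean model could look like, assuming a hypothetical `Vocabulary` record with a `default` flag and a `highly_recommended` note. None of these names come from Athena, and the flag values in the example are purely illustrative, not a proposal:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vocabulary:
    """Hypothetical record for one entry in the download form."""
    vocabulary_id: str
    default: bool = False             # pre-selected, but editable by the user
    highly_recommended: bool = False  # shown with a note, never enforced


def initial_selection(vocabs):
    """Return the vocabulary IDs that are ticked when the form opens."""
    return {v.vocabulary_id for v in vocabs if v.default}


def deselection_warnings(vocabs, selected):
    """List a warning for each highly recommended vocabulary the user removed."""
    return [
        f"{v.vocabulary_id} is highly recommended; remove it at your own risk."
        for v in vocabs
        if v.highly_recommended and v.vocabulary_id not in selected
    ]


# Illustrative flag values only.
vocabs = [
    Vocabulary("Gender", default=True, highly_recommended=True),
    Vocabulary("US Census", default=True),
    Vocabulary("OSM"),
]

selected = initial_selection(vocabs)
selected.discard("Gender")  # the user is allowed to deselect it
warnings = deselection_warnings(vocabs, selected)
```

In this sketch nothing is forced into the bundle: the user only ever gets a warning, which keeps the download contents equal to what is shown as selected.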
Hi, thanks for the input, @fdefalco
Is there a timeline for implementation of this particular feature?
I had hoped @cgreich would give us his final "placet". I would then hand over the above list for processing by the vocab team, and it should go to Athena with the next release.
@ssuvorov-fls - could you check whether the above new settings would break anything once they end up in Athena? Can we test-run this in a QA instance?
I think the unspoken convention was to include everything that goes to a Domain missing its respective table, so that you don't miss the concepts for such "service" things as gender_concept_id, unit_concept_id, modifier_concept_id, route_concept_id, etc., because it's not really obvious which vocabularies to pick if you want to add one more table/domain to your CDM. Region_concept_id somehow didn't materialize into a field, but it explains why OSM and US Census are there.
I wouldn't do it, because users who are updating their ETLs from some old vocabulary versions will just lose the concepts that appear in their mappings. I would never do it for the small "service" vocabularies.
I didn't get the logic behind this. How is gender more important than race? And why is Sponsor better than Geography?
I don't think it's a great choice before we have cleaned up the EAV data. Otherwise, people will start mapping to UKB, PPI and NAACCR. And that's already the case.
OSM is, however, one of the reasons this whole discussion started... I guess I would still take it out of "required".
Hmm... have we mapped the old type concepts over to the new ones? If so, it would make sense to keep them. But otherwise, aren't they simply useless now and all non-standard?
Well, this is derived a little from how it was before. Gender is really indispensable, whereas Race & Ethnicity is, as we know, US centric... and they are still marked as default, so most people will keep them in their download. They just have a choice to deselect.
Of course we would not follow that notion blindly, and hence the above are not marked as default. But you cannot prevent people from selecting them for download, unless we make them something like license-restricted (only restricted by something other than a license).
The original intent of the discussion was to promote transparency and flexibility in vocabulary download. As it stands, vocabularies that are not listed or selected are included in the download, so for transparency, they should be listed and selected by default. For flexibility, the user can have the option to unselect vocabularies. I'm not sure what benefit preventing a user from unselecting a vocabulary would provide. If you reject defaults, you should be doing so for a well-understood reason. Perhaps a warning on the page that says 'Default vocabularies are selected to provide important concepts to most ETL processes, remove them from the selected vocabularies at your own risk.' :)
@cgreich has an even stricter view on this. I think he used the word "dogmatic". Let's hear him out. (Christian, one exception to the rule should be vocabularies that have standard items but are also license-restricted, such as CDT or ISBT.)
I think Patrick echoed my concern on transparency here: https://forums.ohdsi.org/t/osm-vocabulary/16303/11
Are we debating here or there?
We are discussing the changes to be made as part of this issue here, informed by the conversation there. I don't think there is any debate regarding the need for transparency about which vocabularies are included in a download. I imagine the remaining debate is whether or not to give the user the ability to control whether 'default' vocabularies are included. My vote is that the user is given control, with a stern warning about why defaults should be left as is.
Hang on a sec. Right now, the thinking is we have three categories (not two):
The proprietary vocabularies are in the Rest category, since they need to be individually clicked and processed anyway. We will have to change Athena to always include all standard concepts (easy), and create different sets of recommended vocabularies (North America, Europe, Rest of World maybe). Not a big deal, but will require some work.
A recent forum post highlighted the issue that some vocabularies are set to "required" and as such are always part of a download bundle and cannot be deselected. Among these, the vocabularies Korean Revenue Code, OSM and SOPT in particular seem unlikely to be indispensable for an OMOP CDM.
These are the ones currently marked as "OMOP required":
I guess we can remove all the ones with "Type" in their name, except for the new Type Concepts, which have replaced them. The respective concepts per vocabulary ID could probably also be retired.
There was also the notion to mark more vocabularies as default that have standard concepts.
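As a rough way to check which vocabularies carry standard concepts, one could tally the `standard_concept` flag per `vocabulary_id` in the CONCEPT table, where `S` marks standard concepts and `C` marks classification concepts. The sketch below assumes a tab-delimited CONCEPT.csv as delivered in a vocabulary download; the helper names and the sample rows are made up for illustration and are not real counts:

```python
import csv
from collections import Counter


def read_concept_rows(path):
    """Yield rows of a tab-delimited CONCEPT.csv as dicts (path is an assumption)."""
    with open(path, newline="", encoding="utf-8") as handle:
        yield from csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)


def count_standard_concepts(rows):
    """Map vocabulary_id -> Counter of 'S' (standard) and 'C' (classification)."""
    counts = {}
    for row in rows:
        flag = row.get("standard_concept") or ""
        if flag in ("S", "C"):
            counts.setdefault(row["vocabulary_id"], Counter())[flag] += 1
    return counts


# Made-up sample rows, only to show the shape of the result.
sample_rows = [
    {"vocabulary_id": "Gender", "standard_concept": "S"},
    {"vocabulary_id": "Gender", "standard_concept": "S"},
    {"vocabulary_id": "OSM", "standard_concept": "C"},
    {"vocabulary_id": "SOPT", "standard_concept": ""},
]

counts = count_standard_concepts(sample_rows)
```

On a real download, `count_standard_concepts(read_concept_rows("CONCEPT.csv"))` would give the per-vocabulary counts; vocabularies with no standard or classification concepts simply do not appear in the result.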
Here are the ones with standard concepts or classifications and their respective count together with a proposal how to set the default and required flags:
Please review @cgreich and @fdefalco !
Thanks - mik