-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Store demo case file paths as absolute paths #4517
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #4517 +/- ##
==========================================
+ Coverage 84.63% 84.68% +0.04%
==========================================
Files 310 310
Lines 18598 18612 +14
==========================================
+ Hits 15741 15761 +20
+ Misses 2857 2851 -6 ☔ View full report in Codecov by Sentry. |
I don't really understand why we need to refactor case build and case loader (#4399) just to fix the fact that demo case doesn't have absolute but relative paths to the files? The purpose of pydantic is to validate and customise fields and this is just the case. |
This is not just about the demo case. The problem is most obvious there, since all new devs/users encounter it. It affects all use of relative paths on loading. Some of these are being checked for at various places in the code base, with varying success. I do not mind seeing a pydantic class fixing the issue, it is great for it: we just need to make the abstraction clear, and move over other similar tasks from build to it as well so one knows where to look for them. Adding one more task to what is currently primarily the pydantic load validator makes its role more confusing. Is it for validation? For initial load conversion to db? Or perhaps for reading db documents for display? As we discussed in #4388. We can refactor it, but I don't see why we should do all that refactoring before merging the first PR? And I dont see why we should start by solving a part of the original issue with this PR? |
Mmm sorry but I don't really like the approach of PR #4388 to fix the issue. I agree with you that having the pydantic validation plus the case build is confusing and stuff might be refactored to be more clear. This probably comes from the fact that we had many checks in build before we switch to pydantic, but since now we have pydantic I think we should make the. most of it. In current main there is a method that already checks if the paths are valid. So instead of having one hundred lines of extra code to fix the paths outside the validation, you can have just 2 in that pydanic function that do just the same? It's way more easy to maintain as well I think? |
I think in these case we might solve the impasse like this: we leave both PRs open and we see who gets tired to see them laying around and approves one first 🤣 |
Right, neither is the solution we would like in the long run, and we could argue for a long time on wether its better to have something that solves the problem, and how complete an abstraction should be before introducing it. I think this is a multiplication of effort for no reason other than a realisation about a possible future path, but since you made it, I will review it and ask that it at least satisfies solving #1394. Once it works, we could then make sure to solve #4399 before the next release. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks convincing on VCF, but still does not fix e.g. custom images path. If you want, you could step through the changes in #4388 for the places I know of.
I haven't checked all places downstream that attempt to make an abs path for redundancy. I guess we may still want several of them to cover for old relative paths in the db.
The CaseLoader class now does part of what build_case() did before. I think the division you sketch is really good, but there are several equivalent sub functionalities left in build_case() that should move to CaseLoader to keep us from looking back and forth between the two.
Abstraction wise, we would want things that only needs to be done when parsing the config file in CaseLoader, and they should as per parsing in general also preferably not require the use of other db resources. Build would then handle the additional resources, and not have to worry about file integrity or syntax. We traditionally reserve dual use for the build function, so that they could sometimes be called when storing back items from prepped display object to db, but this is afaik not used for case objects - we only do field-by-field updates on them live.
@dnil I've fixed paths to images (cases and vars), chromograph and reviewer. Comparing with the demo case config it looks like that's all |
At the seminar now but will check later! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, this should now close #1394 afaik, so I approve. I guess we could say it starts #4399 as well, but there are other simpler value changes in build
. They can also be determined by just looking at the config file (e.g. variable type casts, booleans for valid config values (like has_svvariants
) and should go with the CaseLoader when allowing it to change values. But we can do them after.
item_path: str = values.get(item) | ||
if ( | ||
item_path | ||
): # it is an incomplete path -> replace the path to the directory with the abs path to the directory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we move the comments to the docstring? There is only one thing happening here really, if the key exists on the obj. Using more descriptive variable names would also alleviate the need to comment the code. 😊
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
item_path_dirname: str = dirname( | ||
item_path | ||
) # path to the directory containing multiple files | ||
values[item] = item_path.replace( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
like here.. values[item]
could be made into a short essay describing what they represent..
values[item] = item_path.replace( | |
config_values[chromograph_path_item] = item_path.replace( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now I have replaced the loop in the model check with for key in cls.model_fields:
, so I think it's clear that the keys used for collecting the values are the model fields!
return False | ||
def _resource_abs_path(string_path: str) -> str: | ||
"""Return the absolute path to a resource file.""" | ||
if exists(string_path) and isabs(string_path): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you could flatten this conditional as
if exists(string_path) and isabs(string_path): | |
if not exists(string_path): | |
raise FileExistsError(f"Path {string_path} could not be found on disk.") | |
if isabs(string_path): | |
return string_path | |
return abspath(string_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I like it!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed!
Co-authored-by: Daniel Nilsson <daniel.k.nilsson@gmail.com>
…-Genomics/scout into pydantic_fix_case_paths
Quality Gate passedIssues Measures |
I tested it again locally and seems to parse stuff fine. Merging when tests pass. Thanks for the useful review @dnil! |
This PR adds a functionality or fixes a bug.
Testing on cg-vm1 server (Clinical Genomics Stockholm)
Prepare for testing
scout-stage
and the server iscg-vm1
.ssh <USER.NAME>@cg-vm1.scilifelab.se
sudo -iu hiseq.clinical
ssh localhost
podman ps
systemctl --user stop scout.target
systemctl --user start scout@<this_branch>
systemctl --user status scout.target
scout-stage
) to be used for testing by other users.Testing on hasta server (Clinical Genomics Stockholm)
Prepare for testing
ssh <USER.NAME>@hasta.scilifelab.se
us; paxa -u <user> -s hasta -r scout-stage
. You can also use the WSGI Pax app available at https://pax.scilifelab.se/.conda activate S_scout; pip freeze | grep scout-browser
bash /home/proj/production/servers/resources/hasta.scilifelab.se/update-tool-stage.sh -e S_scout -t scout -b <this_branch>
us; scout --version
paxa
procedure, which will release the allocated resource (scout-stage
) to be used for testing by other users.How to test:
scout setup demo
using main branch and this branchExpected outcome:
The functionality should be working
Take a screenshot and attach or copy/paste the output.
Review: