Store demo case file paths as absolute paths #4517

northwestwitch · 2024-03-20T08:45:11Z

This PR adds a functionality or fixes a bug.

Alternative to Fix remaining relative paths on load #4388 but using the pydantic case checker

Testing on cg-vm1 server (Clinical Genomics Stockholm)

Prepare for testing

Make sure the PR is pushed and available on Docker Hub
First book your testing time using the Pax software available at https://pax.scilifelab.se/. The resource you are going to call dibs on is scout-stage and the server is cg-vm1.
ssh <USER.NAME>@cg-vm1.scilifelab.se
sudo -iu hiseq.clinical
ssh localhost
(optional) Find out which scout branch is currently deployed on cg-vm1: podman ps
Stop the service with current deployed branch: systemctl --user stop scout.target
Start the scout service with the branch to test: systemctl --user start scout@<this_branch>
Make sure the branch is deployed: systemctl --user status scout.target
After testing is done, repeat procedure at https://pax.scilifelab.se/, which will release the allocated resource (scout-stage) to be used for testing by other users.

Testing on hasta server (Clinical Genomics Stockholm)

Prepare for testing

ssh <USER.NAME>@hasta.scilifelab.se
Book your testing time using the Pax software. us; paxa -u <user> -s hasta -r scout-stage. You can also use the WSGI Pax app available at https://pax.scilifelab.se/.
(optional) Find out which scout branch is currently deployed on hasta: conda activate S_scout; pip freeze | grep scout-browser
Deploy the branch to test: bash /home/proj/production/servers/resources/hasta.scilifelab.se/update-tool-stage.sh -e S_scout -t scout -b <this_branch>
Make sure the branch is deployed: us; scout --version
After testing is done, repeat the paxa procedure, which will release the allocated resource (scout-stage) to be used for testing by other users.

How to test:

Locally, run scout setup demo using main branch and this branch
Check on MongoDB compass that file paths for demo cases are all absolute paths (on main branch they are relative paths)
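
As an alternative to eyeballing the documents in Compass, the check can be scripted against an exported case document. This is only a sketch: the helper name, the suffix list and the example keys are assumptions made for illustration, not part of scout.

```python
from pathlib import PurePosixPath
from typing import Any, Iterator, Tuple

# Assumption: string values ending in one of these suffixes are file paths.
FILE_SUFFIXES = (".vcf", ".vcf.gz", ".bam", ".cram", ".png", ".svg", ".yaml", ".ped")


def iter_relative_paths(doc: Any, prefix: str = "") -> Iterator[Tuple[str, str]]:
    """Yield (dotted_key, value) for every path-like string that is not absolute."""
    if isinstance(doc, dict):
        for key, value in doc.items():
            yield from iter_relative_paths(value, f"{prefix}.{key}" if prefix else str(key))
    elif isinstance(doc, list):
        for index, value in enumerate(doc):
            yield from iter_relative_paths(value, f"{prefix}.{index}" if prefix else str(index))
    elif isinstance(doc, str) and doc.endswith(FILE_SUFFIXES):
        # POSIX semantics: a path is absolute when it starts with "/".
        if not PurePosixPath(doc).is_absolute():
            yield prefix, doc
```

Calling `list(iter_relative_paths(case_document))` on a case loaded from this branch should give an empty list, while the same call on a case loaded from main should list the offending keys.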

Expected outcome:
The functionality should be working
Take a screenshot and attach or copy/paste the output.

Review:

code approved by DN
tests executed by CR, DN

northwestwitch · 2024-03-20T09:42:43Z

Main branch:

This branch:

codecov · 2024-03-20T09:43:46Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 84.68%. Comparing base (8ceb4df) to head (10a3a36).

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #4517      +/-   ##
==========================================
+ Coverage   84.63%   84.68%   +0.04%     
==========================================
  Files         310      310              
  Lines       18598    18612      +14     
==========================================
+ Hits        15741    15761      +20     
+ Misses       2857     2851       -6


dnil · 2024-03-20T11:16:24Z

I see the will here, and it is good, but this does not fully solve either #4399 (it starts it) or #1394, while it conflicts with #4388, plus it's a duplication of effort which we have precious little time for. Let me eat something first, but I'm currently not a fan. 😆

northwestwitch · 2024-03-20T11:50:10Z

> I see the will here, and it is good, but this does not fully solve either #4399 (it starts it) or #1394, while it conflicts with #4388, plus it's a duplication of effort which we have precious little time for. Let me eat something first, but I'm currently not a fan.

I don't really understand why we need to refactor case build and case loader (#4399) just to fix the fact that the demo case has relative rather than absolute paths to its files. The purpose of pydantic is to validate and customise fields, and this is exactly such a case.
I think this is a simpler solution to the problem here. I also understood from the discussion in your PR that you agreed on a solution like this, instead of the code present in your PR? Perhaps I misunderstood?

dnil · 2024-03-20T12:37:46Z

This is not just about the demo case. The problem is most obvious there, since all new devs/users encounter it. It affects all use of relative paths on loading. Some of these are being checked for at various places in the code base, with varying success.

I do not mind seeing a pydantic class fixing the issue, it is great for it: we just need to make the abstraction clear, and move over other similar tasks from build to it as well so one knows where to look for them. Adding one more task to what is currently primarily the pydantic load validator makes its role more confusing. Is it for validation? For initial load conversion to db? Or perhaps for reading db documents for display? As we discussed in #4388.

We can refactor it, but I don't see why we should do all that refactoring before merging the first PR? And I don't see why we should start by solving a part of the original issue with this PR?

northwestwitch · 2024-03-20T12:46:20Z

> This is not just about the demo case. The problem is most obvious there, since all new devs/users encounter it. It affects all use of relative paths on loading. Some of these are being checked for at various places in the code base, with varying success.

> I do not mind seeing a pydantic class fixing the issue, it is great for it: we just need to make the abstraction clear, and move over other similar tasks from build to it as well so one knows where to look for them. Adding one more task to what is currently primarily the pydantic load validator makes its role more confusing. Is it for validation? For initial load conversion to db? Or perhaps for reading db documents for display? As we discussed in #4388.

> We can refactor it, but I don't see why we should do all that refactoring before merging the first PR? And I don't see why we should start by solving a part of the original issue with this PR?

Mmm, sorry, but I don't really like the approach of PR #4388 to fix the issue. I agree with you that having the pydantic validation plus the case build is confusing and that things could be refactored to be clearer. This probably comes from the fact that we had many checks in build before we switched to pydantic, but now that we have pydantic I think we should make the most of it. In current main there is a method that already checks if the paths are valid. So instead of having one hundred lines of extra code to fix the paths outside the validation, you can have just 2 in that pydantic function that do just the same? It's also much easier to maintain, I think.
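
To illustrate the "validate, then normalise in the same place" idea without quoting the actual scout code, here is a minimal sketch. It uses a stdlib dataclass as a stand-in so that it runs without dependencies; with pydantic the same two steps would sit inside a field validator. The class and field names are assumptions.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class VcfFiles:
    """Hypothetical stand-in for a pydantic model that holds a VCF path."""

    vcf_snv: str

    def __post_init__(self) -> None:
        path = Path(self.vcf_snv)
        # Validation step: reject values that do not point to an existing file.
        if not path.is_file():
            raise ValueError(f"{self.vcf_snv} is not a valid file path")
        # Normalisation step: store the absolute form of the validated path.
        self.vcf_snv = str(path.resolve())
```

Because the conversion happens right where the path is already being checked, every caller receives absolute paths without any extra pass over the config.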

northwestwitch · 2024-03-20T13:31:56Z

I think in this case we might solve the impasse like this: we leave both PRs open and see who gets tired of seeing them lying around and approves one first 🤣

dnil · 2024-03-20T13:54:11Z

Right, neither is the solution we would like in the long run, and we could argue for a long time on whether it's better to have something that solves the problem, and how complete an abstraction should be before introducing it. I think this is a multiplication of effort for no reason other than a realisation about a possible future path, but since you made it, I will review it and ask that it at least satisfies solving #1394. Once it works, we could then make sure to solve #4399 before the next release.

dnil

Looks convincing on VCF, but still does not fix e.g. custom images path. If you want, you could step through the changes in #4388 for the places I know of.

I haven't checked all places downstream that attempt to make an abs path for redundancy. I guess we may still want several of them to cover for old relative paths in the db.

The CaseLoader class now does part of what build_case() did before. I think the division you sketch is really good, but there are several equivalent sub functionalities left in build_case() that should move to CaseLoader to keep us from looking back and forth between the two.

Abstraction-wise, we would want things that only need to be done when parsing the config file to live in CaseLoader, and, as with parsing in general, they should preferably not require the use of other db resources. Build would then handle the additional resources, and not have to worry about file integrity or syntax. We traditionally reserve dual use for the build functions, so that they can sometimes be called when storing back items from a prepped display object to the db, but this is afaik not used for case objects: we only do field-by-field updates on them live.
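
A rough sketch of that division, to make the boundary concrete. The function names, the `owner` and `display_name` keys, and the `get_institute` callback are all assumptions made for illustration; this is not the existing CaseLoader or build_case() code.

```python
from pathlib import Path
from typing import Any, Callable, Dict, Optional


def load_case_config(raw: Dict[str, Any], config_dir: str) -> Dict[str, Any]:
    """Parsing stage (hypothetical): needs only the config and the file system."""
    parsed = dict(raw)
    # Relative entries are resolved against the config location;
    # absolute entries pass through unchanged.
    parsed["vcf_files"] = {
        key: str((Path(config_dir) / value).resolve())
        for key, value in raw.get("vcf_files", {}).items()
    }
    return parsed


def build_case(
    parsed: Dict[str, Any],
    get_institute: Callable[[str], Optional[Dict[str, Any]]],
) -> Dict[str, Any]:
    """Build stage (hypothetical): adds what requires other db resources."""
    institute = get_institute(parsed["owner"])
    if institute is None:
        raise ValueError(f"Institute {parsed['owner']} not found")
    case_obj = dict(parsed)
    case_obj["owner_display_name"] = institute["display_name"]
    return case_obj
```

With this split, the first function can be tested with nothing but a dict and a directory, and the second can assume that paths and syntax have already been dealt with.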

northwestwitch · 2024-03-21T07:47:53Z

@dnil I've fixed paths to images (cases and vars), chromograph and reviewer. Comparing with the demo case config it looks like that's all

dnil · 2024-03-21T07:59:58Z

> @dnil I've fixed paths to images (cases and vars), chromograph and reviewer. Comparing with the demo case config it looks like that's all

At the seminar now but will check later!

# Conflicts: # CHANGELOG.md

dnil

Thank you, this should now close #1394 afaik, so I approve. I guess we could say it starts #4399 as well, but there are other, simpler value changes in build. They can also be determined by just looking at the config file (e.g. variable type casts and booleans for valid config values, like has_svvariants) and should go with the CaseLoader when allowing it to change values. But we can do them after.
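
For reference, the kind of config-only value change meant here could look roughly like the sketch below. The helper name, the `vcf_sv`/`vcf_sv_research` keys and the `rank_score_threshold` field are assumptions used for illustration, not a description of the current build code.

```python
from typing import Any, Dict


def derive_config_values(config: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: value changes that need nothing but the config itself."""
    vcf_files = config.get("vcf_files", {})
    derived = dict(config)
    # Boolean flag derived from which VCF entries are present (key names assumed).
    derived["has_svvariants"] = bool(
        vcf_files.get("vcf_sv") or vcf_files.get("vcf_sv_research")
    )
    # Type cast: accept "5" as well as 5 for a numeric field (field name assumed).
    if "rank_score_threshold" in config:
        derived["rank_score_threshold"] = float(config["rank_score_threshold"])
    return derived
```

None of this needs a database lookup, which is why it fits on the loader side of the split.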

dnil · 2024-03-26T06:02:01Z