Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix remaining relative paths on load #4388

Closed
wants to merge 72 commits into from
Closed

Fix remaining relative paths on load #4388

wants to merge 72 commits into from

Conversation

dnil
Copy link
Collaborator

@dnil dnil commented Feb 1, 2024

This PR adds a functionality or fixes a bug.

Several relative paths can potentially get stored on db on load, relative to the working directory at load time. This is very seldom what is intended: most endpoints that serve files in fact attempt to convert to absoulte paths when used, but if the server app is started in a different directory, this will fail. For CG Solna, this is also rarely a problem since our configs use abs file paths in general. It is however noticeable on demo setups and potentially confusing to new users.

The downside might be it closes a potential workaround for container volume mounting, where the absolute path might not be the same on the CLI as on the web app.

On the total, making the behaviour predictable (absolute paths) is preferable.

Testing on cg-vm1 server (Clinical Genomics Stockholm)

Prepare for testing

  1. Make sure the PR is pushed and available on Docker Hub
  2. Fist book your testing time using the Pax software available at https://pax.scilifelab.se/. The resource you are going to call dibs on is scout-stage and the server is cg-vm1.
  3. ssh <USER.NAME>@cg-vm1.scilifelab.se
  4. sudo -iu hiseq.clinical
  5. ssh localhost
  6. (optional) Find out which scout branch is currently deployed on cg-vm1: podman ps
  7. Stop the service with current deployed branch: systemctl --user stop scout.target
  8. Start the scout service with the branch to test: systemctl --user start scout@<this_branch>
  9. Make sure the branch is deployed: systemctl --user status scout.target
  10. After testing is done, repeat procedure at https://pax.scilifelab.se/, which will release the allocated resource (scout-stage) to be used for testing by other users.
Testing on hasta server (Clinical Genomics Stockholm)

Prepare for testing

  1. ssh <USER.NAME>@hasta.scilifelab.se
  2. Book your testing time using the Pax software. us; paxa -u <user> -s hasta -r scout-stage. You can also use the WSGI Pax app available at https://pax.scilifelab.se/.
  3. (optional) Find out which scout branch is currently deployed on cg-vm1: conda activate S_scout; pip freeze | grep scout-browser
  4. Deploy the branch to test: bash /home/proj/production/servers/resources/hasta.scilifelab.se/update-tool-stage.sh -e S_scout -t scout -b <this_branch>
  5. Make sure the branch is deployed: us; scout --version
  6. After testing is done, repeat the paxa procedure, which will release the allocated resource (scout-stage) to be used for testing by other users.

How to test:

  1. how to test it, possibly with real cases/data

Expected outcome:
The functionality should be working
Take a screenshot and attach or copy/paste the output.

Review:

  • code approved by
  • tests executed by

Copy link

codecov bot commented Feb 1, 2024

Codecov Report

Attention: Patch coverage is 96.42857% with 2 lines in your changes are missing coverage. Please review.

Project coverage is 84.66%. Comparing base (8ceb4df) to head (3b88047).

Files Patch % Lines
scout/build/individual.py 90.00% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4388      +/-   ##
==========================================
+ Coverage   84.63%   84.66%   +0.02%     
==========================================
  Files         310      310              
  Lines       18598    18643      +45     
==========================================
+ Hits        15741    15784      +43     
- Misses       2857     2859       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@dnil
Copy link
Collaborator Author

dnil commented Feb 1, 2024

Update demo case, before:
Screenshot 2024-02-01 at 10 08 05

After:
Screenshot 2024-02-01 at 10 08 40

@dnil
Copy link
Collaborator Author

dnil commented Feb 1, 2024

Meh, still:
Screenshot 2024-02-01 at 10 10 30
Recurses!

@northwestwitch
Copy link
Member

Recurses!

In the sense of cursing again? 😆

@dnil
Copy link
Collaborator Author

dnil commented Feb 1, 2024

Better!
Screenshot 2024-02-01 at 10 54 37

@dnil
Copy link
Collaborator Author

dnil commented Feb 1, 2024

Recurses!

In the sense of cursing again? 😆

Doubly so! Cursing again because the recursion is cursed again.

@dnil dnil marked this pull request as ready for review February 1, 2024 10:09
@northwestwitch
Copy link
Member

northwestwitch commented Feb 2, 2024

Wouldn't be easier to fix relative paths directly when the case load config is validated with Pydantic? The validator has a function that checks if the path (relative or complete) exists on disk. Ifwe change that function to not only do the validation, but also set the paths to full path then you don't need any extra code I think.

The validator itself has this tiny function that does the trick for demo files (at least those for which is required a validation at the moment)

@dnil
Copy link
Collaborator Author

dnil commented Feb 2, 2024

Wouldn't be easier to fix relative paths directly when the case load config is validated with Pydantic? The validator has a function that checks if the path (relative or complete) exists on disk. Ifwe change that function to not only do the validation, but also set the paths to full path then you don't need any extra code I think.

The validator itself has this tiny function that does the trick for demo files (at least those for which is required a validation at the moment)

I wouldn't say easier, but I'm looking through now, and Im starting to agree that it would make sense to try to do it there when we anyway touch those files/variables.

As you say, there is already one that is called for some of the demo files, that must just be slightly bugged since we do have tons of relative paths on the demo.

I guess my only fundamental issue with that is that it will confound the role of that class, as it changes the returned values compared to what is in the config file as such, before we get around to storing it on db. If we start doing this there, we could start doing all of the build_case() tasks there as well. Which on the whole I really don't mind! In contrast to some of the weirder build functions, build_case() is only called on load, so should be somewhat straightforward.

@dnil
Copy link
Collaborator Author

dnil commented Feb 2, 2024

So, let's merge this perhaps: unless you find something wrong with it, that would really be the easiest. Then we can refactor the functionality of build_case() into the Pydantic CaseLoader, which will then graduate from just validation (and silently fixing some castable variable types) into a CaseBuilder.

@northwestwitch
Copy link
Member

As you say, there is already one that is called for some of the demo files, that must just be slightly bugged since we do have tons of relative paths on the demo.

I think it just checks that the relative paths are valid, but it doesn't fix it yet.

Regarding the pydantic class role: It does the fixing of other fields, for instance converting strings to integers, fixes the path for the str variants images, etc. So why not this?

@dnil
Copy link
Collaborator Author

dnil commented Feb 2, 2024

As you say, there is already one that is called for some of the demo files, that must just be slightly bugged since we do have tons of relative paths on the demo.

I think it just checks that the relative paths are valid, but it doesn't fix it yet.

Right, right; it checks that the absolute path version of the relative one would be valid.

Regarding the pydantic class role: It does the fixing of other fields, for instance converting strings to integers, fixes the path for the str variants images, etc. So why not this?

No, exactly - I agree! That would probably be the most standard use of the Pydantic Class, and is consistent with it being called to parse the case config. But there are more things that build_case() does, that it should do. Like check if the config had a case owner, sorting individuals affected first. Most of the renaming of file keys into db keys would make sense there as well, wouldn't it? I guess we still want a small build, bringing in data from other documents, like panels, phenotype terms etc but much could just be done in the CaseLoader!

Copy link

sonarcloud bot commented Mar 25, 2024

Quality Gate Passed Quality Gate passed

Issues
1 New issue
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

@dnil dnil closed this Mar 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants