
Suggested changes to indices post consult session #4308

Merged
merged 104 commits into from
Mar 26, 2024

Conversation

dnil
Collaborator

@dnil dnil commented Dec 21, 2023

This PR adds a functionality or fixes a bug.

Testing on cg-vm1 server (Clinical Genomics Stockholm)

Prepare for testing

  1. Make sure the PR is pushed and available on Docker Hub
  2. First book your testing time using the Pax software available at https://pax.scilifelab.se/. The resource you are going to call dibs on is scout-stage and the server is cg-vm1.
  3. ssh <USER.NAME>@cg-vm1.scilifelab.se
  4. sudo -iu hiseq.clinical
  5. ssh localhost
  6. (optional) Find out which scout branch is currently deployed on cg-vm1: podman ps
  7. Stop the service with current deployed branch: systemctl --user stop scout.target
  8. Start the scout service with the branch to test: systemctl --user start scout@<this_branch>
  9. Make sure the branch is deployed: systemctl --user status scout.target
  10. After testing is done, repeat procedure at https://pax.scilifelab.se/, which will release the allocated resource (scout-stage) to be used for testing by other users.
Testing on hasta server (Clinical Genomics Stockholm)

Prepare for testing

  1. ssh <USER.NAME>@hasta.scilifelab.se
  2. Book your testing time using the Pax software. us; paxa -u <user> -s hasta -r scout-stage. You can also use the WSGI Pax app available at https://pax.scilifelab.se/.
  3. (optional) Find out which scout branch is currently deployed on cg-vm1: conda activate S_scout; pip freeze | grep scout-browser
  4. Deploy the branch to test: bash /home/proj/production/servers/resources/hasta.scilifelab.se/update-tool-stage.sh -e S_scout -t scout -b <this_branch>
  5. Make sure the branch is deployed: us; scout --version
  6. After testing is done, repeat the paxa procedure, which will release the allocated resource (scout-stage) to be used for testing by other users.

How to test:

  1. how to test it, possibly with real cases/data

Expected outcome:
The functionality should be working
Take a screenshot and attach or copy/paste the output.

Review:

  • code approved by
  • tests executed by

@dnil
Collaborator Author

dnil commented Dec 21, 2023

scout [primary] scout> db.variant.aggregate( [ { $indexStats: { } } ] )
[
  {
    name: 'caseid_variantrank',
    key: { case_id: 1, category: 1, variant_rank: 1 },
    host: 'cg-mongo1-prod.scilifelab.se:27019',
    accesses: { ops: Long("15929"), since: ISODate("2023-11-16T10:27:35.410Z") },
    spec: {
      v: 1,
      key: { case_id: 1, category: 1, variant_rank: 1 },
      name: 'caseid_variantrank'
    }
  },
  {
    name: 'case_id_1_category_1_variant_type_1_rank_score_-1',
    key: { case_id: 1, category: 1, variant_type: 1, rank_score: -1 },
    host: 'cg-mongo1-prod.scilifelab.se:27019',
    accesses: { ops: Long("363994"), since: ISODate("2023-11-16T10:27:35.410Z") },
    spec: {
      v: 1,
      key: { case_id: 1, category: 1, variant_type: 1, rank_score: -1 },
      name: 'case_id_1_category_1_variant_type_1_rank_score_-1'
    }
  },
  {
    name: 'caseid_variantid',
    key: { case_id: 1, category: 1, variant_id: 1 },
    host: 'cg-mongo1-prod.scilifelab.se:27019',
    accesses: { ops: Long("413455"), since: ISODate("2023-11-16T10:27:35.410Z") },
    spec: {
      v: 1,
      key: { case_id: 1, category: 1, variant_id: 1 },
      name: 'caseid_variantid'
    }
  },
  {
    name: '_id_',
    key: { _id: 1 },
    host: 'cg-mongo1-prod.scilifelab.se:27019',
    accesses: {
      ops: Long("23395992"),
      since: ISODate("2023-11-16T10:27:35.410Z")
    },
    spec: { v: 2, key: { _id: 1 }, name: '_id_' }
  },
  {
    name: 'caseid_rankscore',
    key: { case_id: 1, category: 1, rank_score: -1 },
    host: 'cg-mongo1-prod.scilifelab.se:27019',
        end: 1
      },
      name: 'caseid_category_chromosome_start_end'
    }
  },
  {
    name: 'sanger',
    key: { sanger_ordered: 1 },
    host: 'cg-mongo1-prod.scilifelab.se:27019',
    accesses: { ops: Long("0"), since: ISODate("2023-11-16T10:27:35.410Z") },
    spec: { v: 1, key: { sanger_ordered: 1 }, name: 'sanger', sparse: true }
  },
  {
    name: 'hgnc_symbols_1_rank_score_-1_category_1_variant_type_1',
    key: { hgnc_symbols: 1, rank_score: -1, category: 1, variant_type: 1 },
    host: 'cg-mongo1-prod.scilifelab.se:27019',
    accesses: { ops: Long("160"), since: ISODate("2023-11-16T10:27:35.410Z") },
    spec: {
      v: 1,
      key: { hgnc_symbols: 1, rank_score: -1, category: 1, variant_type: 1 },
      name: 'hgnc_symbols_1_rank_score_-1_category_1_variant_type_1',
      partialFilterExpression: { rank_score: { '$gte': 5 } }
    }
  },
  {
    name: 'base_variant_idx',
    key: {
      case_id: 1,
      category: 1,
      variant_type: 1,
      panels: 1,
      variant_rank: 1
    },
    host: 'cg-mongo1-prod.scilifelab.se:27019',
    accesses: { ops: Long("22398"), since: ISODate("2023-11-16T10:27:35.410Z") },
    spec: {
      v: 1,
      key: {
        case_id: 1,
        category: 1,
        variant_type: 1,
        panels: 1,
        variant_rank: 1
      },
      name: 'base_variant_idx'
    }
  }
]
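
For repeated checks, the same statistics can be pulled and ranked from Python. This is only a minimal sketch: the helper name `index_usage` is hypothetical, and `collection` is assumed to be a pymongo `Collection` (or anything exposing a compatible `aggregate` method).

```python
def index_usage(collection):
    """Return (index name, ops) pairs sorted from least to most used.

    `collection` is assumed to be a pymongo Collection. The ops counter is
    the one reported under `accesses.ops` by the $indexStats stage, counted
    from the `since` timestamp of each index.
    """
    stats = collection.aggregate([{"$indexStats": {}}])
    # int() normalises the 64-bit integer type the driver returns for ops
    usage = [(doc["name"], int(doc["accesses"]["ops"])) for doc in stats]
    return sorted(usage, key=lambda pair: pair[1])
```

Indexes at the top of the returned list, such as `sanger` with 0 ops in the output above, are the first candidates to review before dropping anything.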

requirements.txt (outdated, resolved)
@@ -62,11 +54,11 @@
),
IndexModel(
[
("variant_id", ASCENDING),
Collaborator Author


And I think we are good with those really; the rest we had was for indexes that we keep around, but that are not in the official constants file. Before we start dropping on stage, some testing is in order so we know it is safe to do on prod with consistent performance.
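
To see which indexes are kept around without being in the constants file, a rough sketch like the following could help. The helper name `undeclared_indexes` and the `declared_names` argument are hypothetical; `collection` is assumed to be a pymongo `Collection`, and the names are assumed to have been collected from the constants file beforehand.

```python
def undeclared_indexes(collection, declared_names):
    """List indexes that exist in the database but are not declared.

    `collection` is assumed to be a pymongo Collection; its
    index_information() method returns a dict keyed by index name.
    `declared_names` is a hypothetical iterable of the index names
    defined in the constants file.
    """
    existing = set(collection.index_information())
    # The default _id_ index always exists and is never declared explicitly
    return sorted(existing - set(declared_names) - {"_id_"})
```

Anything this returns on stage is a candidate for dropping once testing shows performance stays consistent.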


codecov bot commented Dec 21, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 84.68%. Comparing base (8c3dadc) to head (5305875).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #4308   +/-   ##
=======================================
  Coverage   84.68%   84.68%           
=======================================
  Files         310      310           
  Lines       18612    18612           
=======================================
  Hits        15761    15761           
  Misses       2851     2851           


@dnil dnil marked this pull request as ready for review December 21, 2023 16:40
Member

@northwestwitch northwestwitch left a comment


Looks good to me. I'm closing my PR, which is a partial duplication of this one!

scout/constants/indexes.py (outdated, resolved)
@northwestwitch
Member

I wanted to deploy and test this during the holidays but did not find the time in the end. We have to find another window of opportunity. It is easier on stage than on prod, but still cumbersome if we are in the middle of testing changes from other PRs.

@dnil
Collaborator Author

dnil commented Mar 18, 2024

Verifying that update is indeed applied to stage:
Screenshot 2024-03-18 at 09 44 33

@dnil
Collaborator Author

dnil commented Mar 18, 2024

Pruning the original collections for the ones that got rearranged.

The most tenacious one is this:
Screenshot 2024-03-18 at 09 54 10
since it is in use on loading, and in particular when updating the variant_rank
Screenshot 2024-03-18 at 10 00 20
It is a one-off: the compute to sort might have been done in excess when updating cases, but we have actually deleted the old variants before inserting again.

It is also used in that delete variants command that we discussed at some point. It sees very little use, and could easily be rewritten to sort on variant_rank.

@dnil
Collaborator Author

dnil commented Mar 18, 2024

Stage variant is now nicely in sync with constants:
Screenshot 2024-03-18 at 10 17 18
Proceeding with loqusdb now.


sonarcloud bot commented Mar 26, 2024

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

@dnil
Collaborator Author

dnil commented Mar 26, 2024

This has been running stably on stage for a while. Let's merge, aiming at starting the background reindex over easter.

@dnil dnil merged commit 0248dd4 into main Mar 26, 2024
20 checks passed
@northwestwitch northwestwitch deleted the index_suggestions branch April 9, 2024 11:46
3 participants