Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: TDR dev dataset is stale (HumanCellAtlas/dcp2#17) #3441

Merged
merged 3 commits into from
Oct 2, 2021

Conversation

jessebrennan
Copy link
Contributor

@jessebrennan jessebrennan commented Sep 17, 2021

HumanCellAtlas/dcp2#17

Author

  • PR title references issue
  • PR title matches issue title (preceded by Fix: for bugs) or there is a good reason why they're different
  • Title of main commit references issue
  • PR is connected to Zenhub issue and description links to issue

Author (reindex)

  • Added r tag to commit title or this PR does not require reindexing
  • Added reindex label to PR or this PR does not require reindexing

Author (freebies & chains)

  • Freebies are blocked on this PR or there are no freebies in this PR
  • Freebies are referenced in commit titles or there are no freebies in this PR
  • This PR is blocked by previous PR in the chain or this PR is not chained to another PR
  • Added chain label to the blocking PR or this PR is not chained to another PR

Author (upgrading)

  • Documented upgrading of deployments in UPGRADING.rst or this PR does not require upgrading
  • Added u tag to commit title or this PR does not require upgrading
  • Added upgrade label to PR or this PR does not require upgrading
  • Added announcement to PR description or this PR does not require announcement

Author (requirements, before every review)

  • Ran make requirements_update or this PR leaves requirements*.txt, common.mk and Makefile untouched
  • Added R tag to commit title or this PR leaves requirements*.txt untouched
  • Added reqs label to PR or this PR leaves requirements*.txt untouched

Author (before every review)

  • make integration_test passes in personal deployment or this PR does not touch functionality that could break the IT
  • Rebased branch on develop, squashed old fixups

Primary reviewer (after approval)

  • Commented in issue about demo expectations or labelled issue as no demo
  • Decided if PR can be labeled no sandbox
  • PR title is appropriate as title of merge commit
  • Moved ticket to Approved column
  • Assigned PR to an operator

Operator (before pushing merge the commit)

  • Checked reindex label and r commit title tag
  • Checked that demo expectations are clear or issue is labeled as no demo
  • Rebased and squashed branch
  • Sanity-checked history
  • Pushed PR branch to Github
  • Follow upgrading instructions with sandbox selected
  • Branch pushed to Gitlab and added sandbox label or PR is labeled no sandbox
  • Build passed in sandbox or PR is labeled no sandbox
  • Started reindex in sandbox or this PR does not require reindexing sandbox
  • Checked for failures in sandbox or this PR does not require reindexing sandbox
  • Follow upgrading instructions with dev selected
  • Added PR reference to merge commit title
  • Collected commit title tags in merge commit title
  • Moved linked issue to Merged column
  • Pushed merge commit to Github

Operator (after pushing the merge commit)

  • Announce that devs will need to follow upgrading instructions
  • Moved freebies to Merged column or there are no freebies in this PR
  • Shortened the PR chain or this PR is not the base of another PR
  • Verified that N reviews labelling is accurate
  • Pushed merge commit to Gitlab or this changes can be pushed later, together with another PR
  • Deleted PR branch from Github and Gitlab

Operator (reindex)

  • Started reindex in dev or this PR does not require reindexing or does not target dev
  • Checked for failures in dev or this PR does not require reindexing or does not target dev
  • Announce whether reindexing these snapshots succeeded or failed
  • Started reindex in prod or this PR does not require reindexing or does not target prod
  • Checked for failures in prod or this PR does not require reindexing or does not target prod

Operator

  • Unassigned PR

@github-actions github-actions bot added the orange [process] Done by the Azul team label Sep 17, 2021
@jessebrennan jessebrennan added the reindex:dev [process] PR requires reindexing dev label Sep 17, 2021
@codecov
Copy link

codecov bot commented Sep 17, 2021

Codecov Report

Merging #3441 (d5c935f) into develop (019f103) will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff            @@
##           develop    #3441   +/-   ##
========================================
  Coverage    82.02%   82.02%           
========================================
  Files          123      123           
  Lines        14417    14417           
========================================
  Hits         11825    11825           
  Misses        2592     2592           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 019f103...d5c935f. Read the comment docs.

@coveralls
Copy link

coveralls commented Sep 17, 2021

Coverage Status

Coverage remained the same at 82.265% when pulling d5c935f on issues/jesse/dcp2-17-dev-snapshots into 019f103 on develop.

@amarjandu amarjandu removed their assignment Sep 17, 2021
'tdr:datarepo-dev-87beea11:snapshot/hca_dev_ae71be1dddd84feb9bed24c3ddb6e1ad__20210916_20210916:4',
# Managed access snapshots:
'tdr:datarepo-dev-d4b988d6:snapshot/hca_dev_a004b1501c364af69bbd070c06dbc17d__20210830_20210903:4',
'tdr:datarepo-dev-02c59b72:snapshot/hca_dev_99101928d9b14aafb759e97958ac7403__20210830_20210903:4',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1190 links total, 84 under the common prefix 4

@jessebrennan
Copy link
Contributor Author

Since we can't attach this to its ticket via Zenhub because of a permissions issue @hannes-ucsc mentioned, I'm assigning myself so that it stays on my board.

Copy link
Contributor

@amarjandu amarjandu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to update google project id.

'tdr:datarepo-dev-4c3e6011:snapshot/hca_dev_248fcf0316c64a41b6ccaad4d894ca42__20210907_20210907:4',
'tdr:datarepo-dev-76de829d:snapshot/hca_dev_2043c65a1cf84828a6569e247d4e64f1__20210831_20210907:4',
'tdr:datarepo-dev-24e9529e:snapshot/hca_dev_f83165c5e2ea4d15a5cf33f3550bffde__20210901_20210908:4',
'tdr:datarepo-dev-87beea11:snapshot/hca_dev_ae71be1dddd84feb9bed24c3ddb6e1ad__20210916_20210916:4',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The project name is wrong here. datarepo-dev-1dce87e5 should be used.
TDR team updated the spreadsheet.
https://docs.google.com/spreadsheets/d/1yYiB3TrX0wvLCHvtWY1GTZZcAf5MmMREgn52OKCHsqs/edit#gid=1005857707

'tdr:datarepo-dev-4c3e6011:snapshot/hca_dev_248fcf0316c64a41b6ccaad4d894ca42__20210907_20210907:',
'tdr:datarepo-dev-76de829d:snapshot/hca_dev_2043c65a1cf84828a6569e247d4e64f1__20210831_20210907:',
'tdr:datarepo-dev-24e9529e:snapshot/hca_dev_f83165c5e2ea4d15a5cf33f3550bffde__20210901_20210908:',
'tdr:datarepo-dev-87beea11:snapshot/hca_dev_ae71be1dddd84feb9bed24c3ddb6e1ad__20210916_20210916:',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here too

Comment on lines 155 to 158
# Managed access snapshots:
'tdr:datarepo-dev-d4b988d6:snapshot/hca_dev_a004b1501c364af69bbd070c06dbc17d__20210830_20210903:',
'tdr:datarepo-dev-02c59b72:snapshot/hca_dev_99101928d9b14aafb759e97958ac7403__20210830_20210903:',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed my mind. We should list these in the same order as they occur in the spreadsheet.

Each of the two MA sources should have the comment

# Managed access:

preceding it.

@hannes-ucsc hannes-ucsc added the 0 reviews [process] Lead didn't request any changes label Sep 21, 2021
@hannes-ucsc hannes-ucsc removed their assignment Sep 21, 2021
@jessebrennan jessebrennan force-pushed the issues/jesse/dcp2-17-dev-snapshots branch from 55aa7e1 to 237cf41 Compare September 21, 2021 17:47
@hannes-ucsc
Copy link
Member

I'll provide demo instructions later.

@nadove-ucsc nadove-ucsc force-pushed the issues/jesse/dcp2-17-dev-snapshots branch from ca832a8 to dcd6e1b Compare October 2, 2021 00:33
@nadove-ucsc nadove-ucsc added the sandbox [process] Resolution is being verified in sandbox deployment label Oct 2, 2021
@nadove-ucsc
Copy link
Contributor

Failed in GitLab due to inconsistent snapshot storage locations: https://gitlab.dev.singlecell.gi.ucsc.edu/ucsc/azul/-/jobs/30152

The offending snapshots (location != 'US') are

TDRClient.TDRSource(project='datarepo-dev-bdc9f342', id='d5222430-0f7a-4bd1-8706-c58b4397a9b8', location='us-central1')
TDRClient.TDRSource(project='datarepo-dev-12b7a9e1', id='58a0eacf-0caa-449b-9fc8-4fe629e99176', location='us-central1')
TDRClient.TDRSource(project='datarepo-dev-02c59b72', id='cddb722a-4229-47bb-ba86-736b7898406f', location='us-central1')
TDRClient.TDRSource(project='datarepo-dev-71926fdc', id='403ef1b1-7b7f-4e13-bf4a-d898f047310a', location='us-central1')
TDRClient.TDRSource(project='datarepo-dev-dbc582d9', id='79f24905-a9e2-4328-91da-fc18036b77eb', location='us-central1')
TDRClient.TDRSource(project='datarepo-dev-10f0610a', id='2c5901f6-67bf-40cc-aea1-621a60c19baa', location='us-central1')
TDRClient.TDRSource(project='datarepo-dev-71de019e', id='768c4624-c07b-4496-8b7b-76275a149774', location='us-central1')
TDRClient.TDRSource(project='datarepo-dev-78bae095', id='f142a2f7-4c50-458d-a047-27bdb8e0d367', location='us-central1')
TDRClient.TDRSource(project='datarepo-dev-a9252919', id='90c2d68f-a000-4c8d-a493-8be93a71b44d', location='us-central1')
TDRClient.TDRSource(project='datarepo-dev-1c2c69d9', id='c594fc9a-f1c5-4770-acaa-013c2852e660', location='us-central1')
TDRClient.TDRSource(project='datarepo-dev-a198b032', id='38c7e1c4-db48-4cd7-bdf2-36bebc169eac', location='us-central1')
TDRClient.TDRSource(project='datarepo-dev-d4b988d6', id='2c4ec571-f489-4277-b215-da8dedd86803', location='us-central1')
TDRClient.TDRSource(project='datarepo-dev-7b7daff7', id='385eb563-005e-43ed-adb7-d3c9e0a8ae93', location='us-central1')

@nadove-ucsc nadove-ucsc force-pushed the issues/jesse/dcp2-17-dev-snapshots branch from 0c99854 to d5c935f Compare October 2, 2021 02:15
@nadove-ucsc nadove-ucsc merged commit ff56875 into develop Oct 2, 2021
@nadove-ucsc nadove-ucsc deleted the issues/jesse/dcp2-17-dev-snapshots branch October 2, 2021 03:55
@nadove-ucsc nadove-ucsc restored the issues/jesse/dcp2-17-dev-snapshots branch October 2, 2021 06:30
@nadove-ucsc nadove-ucsc deleted the issues/jesse/dcp2-17-dev-snapshots branch October 2, 2021 06:37
@hannes-ucsc
Copy link
Member

Reindex in dev resulted in 16 failed notifications, no failed tallies.

Due to the catalog duplication, there are 8 distinct failures.

7 tracked in HumanCellAtlas/dcp2#45
1 tracked in HumanCellAtlas/dcp2#48

@hannes-ucsc
Copy link
Member

@jessebrennan, the checklist item "Announce whether reindexing these snapshots succeeded or failed" is lacking the information where to announce that.

@hannes-ucsc
Copy link
Member


Before upgrading to this commit, run::

python scripts/reindex.py --delete --catalog dcp2ebi
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't delete the corresponding IT catalog it2ebi and the flag name is --catalogs not --catalog.

@hannes-ucsc hannes-ucsc added 2 reviews [process] Lead requested changes twice and removed 1 review [process] Lead requested changes once labels Oct 4, 2021
@hannes-ucsc
Copy link
Member

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2 reviews [process] Lead requested changes twice orange [process] Done by the Azul team reindex:dev [process] PR requires reindexing dev sandbox [process] Resolution is being verified in sandbox deployment upgrade [process] PR includes commit requiring manual upgrade
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants