Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert datetime values to strings before indexing (#3416) #3419

Merged

Conversation

dsotirho-ucsc
Copy link
Contributor

@dsotirho-ucsc dsotirho-ucsc commented Sep 10, 2021

#3416

Author

  • PR title references issue
  • PR title matches issue title (preceded by Fix: for bugs) or there is a good reason why they're different
  • Title of main commit references issue
  • PR is connected to Zenhub issue and description links to issue

Author (reindex)

  • Added r tag to commit title or this PR does not require reindexing
  • Added reindex label to PR or this PR does not require reindexing

Author (freebies & chains)

  • Freebies are blocked on this PR or there are no freebies in this PR
  • Freebies are referenced in commit titles or there are no freebies in this PR
  • This PR is blocked by previous PR in the chain or this PR is not chained to another PR
  • Added chain label to the blocking PR or this PR is not chained to another PR

Author (upgrading)

  • Documented upgrading of deployments in UPGRADING.rst or this PR does not require upgrading
  • Added u tag to commit title or this PR does not require upgrading
  • Added upgrade label to PR or this PR does not require upgrading
  • Added announcement to PR description or this PR does not require announcement

Author (requirements, before every review)

  • Ran make requirements_update or this PR leaves requirements*.txt, common.mk and Makefile untouched
  • Added R tag to commit title or this PR leaves requirements*.txt untouched
  • Added reqs label to PR or this PR leaves requirements*.txt untouched

Author (before every review)

  • make integration_test passes in personal deployment or this PR does not touch functionality that could break the IT
  • Rebased branch on develop, squashed old fixups

Primary reviewer (after approval)

  • Commented in issue about demo expectations or labelled issue as no demo
  • Decided if PR can be labeled no sandbox
  • PR title is appropriate as title of merge commit
  • Moved ticket to Approved column
  • Assigned PR to an operator

Operator (before pushing merge the commit)

  • Checked reindex label and r commit title tag
  • Checked that demo expectations are clear or issue is labeled as no demo
  • Rebased and squashed branch
  • Sanity-checked history
  • Pushed PR branch to Github
  • Branch pushed to Gitlab and added sandbox label or PR is labeled no sandbox
  • Build passed in sandbox or PR is labeled no sandbox
  • Started reindex in sandbox or this PR does not require reindexing sandbox
  • Checked for failures in sandbox or this PR does not require reindexing sandbox
  • Added PR reference to merge commit title
  • Collected commit title tags in merge commit title
  • Moved linked issue to Merged column
  • Pushed merge commit to Github

Operator (after pushing the merge commit)

  • Made announcement requested by author or PR description does not contain an announcement
  • Moved freebies to Merged column or there are no freebies in this PR
  • Shortened the PR chain or this PR is not the base of another PR
  • Verified that N reviews labelling is accurate
  • Pushed merge commit to Gitlab or this changes can be pushed later, together with another PR
  • Deleted PR branch from Github and Gitlab

Operator (reindex)

  • Started reindex in dev or this PR does not require reindexing or does not target dev
  • Checked for failures in dev or this PR does not require reindexing or does not target dev
  • Started reindex in prod or this PR does not require reindexing or does not target prod
  • Checked for failures in prod or this PR does not require reindexing or does not target prod

Operator

  • Unassigned PR

@github-actions github-actions bot added the orange [process] Done by the Azul team label Sep 10, 2021
@dsotirho-ucsc dsotirho-ucsc added reindex:dev [process] PR requires reindexing dev base [process] Another PR needs to be rebased before merging this one labels Sep 10, 2021
@codecov
Copy link

codecov bot commented Sep 10, 2021

Codecov Report

Merging #3419 (e7a3355) into develop (0a40364) will decrease coverage by 0.00%.
The diff coverage is 100.00%.

❗ Current head e7a3355 differs from pull request most recent head 3f17533. Consider uploading reports for the commit 3f17533 to get more accurate results
Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #3419      +/-   ##
===========================================
- Coverage    82.66%   82.65%   -0.01%     
===========================================
  Files          123      123              
  Lines        14218    14212       -6     
===========================================
- Hits         11753    11747       -6     
  Misses        2465     2465              
Impacted Files Coverage Δ
src/azul/service/hca_response_v5.py 92.41% <ø> (-0.11%) ⬇️
src/azul/time.py 76.47% <ø> (ø)
src/azul/indexer/document.py 95.89% <100.00%> (+0.01%) ⬆️
src/azul/plugins/metadata/hca/transform.py 98.64% <100.00%> (+0.02%) ⬆️
src/azul/service/avro_pfb.py 95.76% <100.00%> (-0.43%) ⬇️
test/service/test_manifest_async.py 100.00% <0.00%> (ø)
src/azul/service/manifest_service.py 94.75% <0.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0a40364...3f17533. Read the comment docs.

@coveralls
Copy link

coveralls commented Sep 10, 2021

Coverage Status

Coverage remained the same at 82.885% when pulling 3f17533 on issues/danielsotirhos/3416-convert-datetime-to-string into 0a40364 on develop.

@dsotirho-ucsc dsotirho-ucsc removed the request for review from amarjandu September 10, 2021 22:21
@dsotirho-ucsc dsotirho-ucsc removed the base [process] Another PR needs to be rebased before merging this one label Sep 14, 2021
@dsotirho-ucsc dsotirho-ucsc force-pushed the issues/danielsotirhos/3416-convert-datetime-to-string branch from c808295 to 521df86 Compare September 14, 2021 22:19
Copy link
Member

@achave11-ucsc achave11-ucsc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Only made an optional suggestion.

if value is None:
return self.null
else:
return format_dcp2_datetime(value)
parse_dcp2_datetime(value) # to verify a valid dcp2 datetime string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure that this comment is necessary. Up to you.

@achave11-ucsc achave11-ucsc removed their assignment Sep 14, 2021
@dsotirho-ucsc dsotirho-ucsc force-pushed the issues/danielsotirhos/3416-convert-datetime-to-string branch from 521df86 to ea3d269 Compare September 15, 2021 17:17
src/azul/time.py Outdated
Comment on lines 136 to 138
if d is None:
return None
else:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A fixpoint like this is a smell. If None is a fixpoint in one function why is it not a fixpoint of every function?

Also, format and parse functions should be each other's inverse and this breaks that.

@hannes-ucsc hannes-ucsc removed their assignment Sep 17, 2021
@hannes-ucsc hannes-ucsc added 0 reviews [process] Lead didn't request any changes 1 review [process] Lead requested changes once and removed 0 reviews [process] Lead didn't request any changes labels Sep 17, 2021
update_date = format_dcp2_datetime(entity.update_date)
return {
'submission_date': format_dcp2_datetime(entity.submission_date),
'update_date': update_date,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
'update_date': update_date,
'update_date': None
if entity.update_date is None else
format_dcp2_datetime(entity.update_date),

src/azul/time.py Outdated
Comment on lines 129 to 133
>>> format_dcp2_datetime(None)
Traceback (most recent call last):
...
AttributeError: 'NoneType' object has no attribute 'tzinfo'

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
>>> format_dcp2_datetime(None)
Traceback (most recent call last):
...
AttributeError: 'NoneType' object has no attribute 'tzinfo'

@hannes-ucsc hannes-ucsc added 2 reviews [process] Lead requested changes twice and removed 1 review [process] Lead requested changes once labels Sep 17, 2021
@hannes-ucsc hannes-ucsc removed their assignment Sep 17, 2021
@jessebrennan jessebrennan force-pushed the issues/danielsotirhos/3416-convert-datetime-to-string branch from 73a24e4 to 3f17533 Compare September 21, 2021 23:40
@jessebrennan jessebrennan added the sandbox [process] Resolution is being verified in sandbox deployment label Sep 21, 2021
@jessebrennan jessebrennan merged commit bd666cd into develop Sep 22, 2021
@jessebrennan jessebrennan deleted the issues/danielsotirhos/3416-convert-datetime-to-string branch September 22, 2021 20:07
@jessebrennan
Copy link
Contributor

Failures during reindexing in dev:

[
  {
    "spec": "tdr:broad-jade-dev-data:snapshot/hca_dev_20201203___20210524_lattice:/2",
    "uuid": "40733888-3b9f-500e-84fd-1a18a64aadbe"
  },
  {
    "spec": "tdr:broad-jade-dev-data:snapshot/hca_dev_20201203___20210524_lattice:/2",
    "uuid": "94565940-7b2f-5fd7-beb1-dc309687fe09"
  },
  {
    "spec": "tdr:broad-jade-dev-data:snapshot/hca_dev_20201203___20210524_lattice:/2",
    "uuid": "9d0f5cd1-0ef6-5eed-8dd7-f15d36a34185"
  },
  {
    "spec": "tdr:broad-jade-dev-data:snapshot/hca_dev_20201203___20210524_lattice:/2",
    "uuid": "ae338c4e-650f-545b-8631-3c8d755aa2f9"
  },
  {
    "spec": "tdr:broad-jade-dev-data:snapshot/hca_dev_20201203___20210524_lattice:/2",
    "uuid": "c12a6ca2-3234-5182-ad0b-f8c2fb881ab5"
  }
]

These failure are expected because of:

@hannes-ucsc
Copy link
Member

What's uuid above?

@jessebrennan
Copy link
Contributor

It's the bundle UUID. To produce this I ran:

jq '[.messages[].Body.notification|{"spec": .source.spec, "uuid": .match.bundle_uuid}]' azul-notifications_fail-dev.json 

@jessebrennan jessebrennan removed their assignment Sep 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2 reviews [process] Lead requested changes twice orange [process] Done by the Azul team reindex:dev [process] PR requires reindexing dev sandbox [process] Resolution is being verified in sandbox deployment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants