Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow to use different hashing algorithms #133

Merged
merged 7 commits into from
Feb 5, 2020

Conversation

Xarthisius
Copy link
Contributor

@Xarthisius Xarthisius commented Jan 9, 2020

Rationale

I'm trying to use pooch to manage/fetch data stored in external data repositories. All of them provide some sort of checksum, unfortunately it's not always SHA256. Ideally I'd like to generate pooch's registry file using hashes obtained directly from a given data repository. However, I need to be able to use proper hashing algorithm for data verification.

Implementation

Extend registry format from:

name hash url

to

name alg:hash url

Whenever old format registry is detected (i.e. : is not present in hash) sha256: is prepended to the hash automatically, which ensures backward compatibility.

TODO

  • Update docs

@welcome
Copy link

welcome bot commented Jan 9, 2020

💖 Thanks for opening this pull request! 💖

Please make sure you read our contributing guidelines and abide by our code of conduct.

A few things to keep in mind:

  • Remember to run make format to make sure your code follows our style guide.
  • If you need help writing tests, take a look at the existing ones for inspiration. If you don't know where to start, let us know and we'll walk you through it.
  • All new features should be documented. It helps to write the docstrings for your functions/classes before writing the code. This will help you think about your code design and results in better code.
  • No matter what, we are really grateful that you put in the effort to do this! 🎉

Copy link
Member

@leouieda leouieda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Xarthisius thank you for taking the time to implement this! 🎉 I was actually just thinking of this a couple of days ago. I left a few comments (mostly minor) that need to be addressed before moving forward. Let me know if you need any feedback/help/opinion.

pooch/core.py Outdated Show resolved Hide resolved
pooch/core.py Show resolved Hide resolved
pooch/core.py Outdated Show resolved Hide resolved
pooch/core.py Outdated Show resolved Hide resolved
pooch/core.py Outdated Show resolved Hide resolved
pooch/tests/utils.py Outdated Show resolved Hide resolved
pooch/utils.py Outdated Show resolved Hide resolved
}
return registry


def add_hash_algs(registry):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this be a useful function to have in pooch/core.py? The same code is used in Pooch.__init__.

@leouieda
Copy link
Member

@Xarthisius thanks for making those changes. Looks like this is almost good to go! Only need some text in the docstrings about specifying the hash algorithms and that one comment about the add_hash_algs function.

If you have the time, it would also be nice to have a short section in doc/usage.rst page about specifying different hashes.

@leouieda
Copy link
Member

It would also be nice to have at least 1 test using a different hash algorithm.

@Xarthisius Xarthisius force-pushed the hash_algs branch 2 times, most recently from 44d993f to e994336 Compare January 29, 2020 17:13
@Xarthisius Xarthisius changed the title [WIP/RFC] Allow to use different hashing algorithms Allow to use different hashing algorithms Jan 29, 2020
@Xarthisius
Copy link
Contributor Author

@leouieda I believe I addressed all the comments. I'd appreciate a 2nd review.

@leouieda
Copy link
Member

leouieda commented Feb 5, 2020

@Xarthisius looks good to me 👍 Merging this in!

Thank you for the contribution to Pooch! It's much appreciated. Please feel free to open another PR adding yourself to the AUTHORS.md file (if you want, of course).

@leouieda leouieda merged commit 548fac0 into fatiando:master Feb 5, 2020
@welcome
Copy link

welcome bot commented Feb 5, 2020

🎉🎉🎉 Congrats on merging your first pull request and welcome to the team! 🎉🎉🎉

Please open a new pull request to add yourself to the AUTHORS.md file. We hope that this was a good experience for you. Let us know if there is any way that the contributing process could be improved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants