Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add signing and verification CLI scripts. #240

Merged
merged 2 commits into from
Aug 21, 2024

Conversation

susperius
Copy link
Contributor

Summary

This PR adds scripts to sign and verify FileLevelManifest. It combines the existing functionality to create a fully working proof of concept.

Signing workflow:

  1. The model folder is hashed using the SimpleFileHasher and FilesSerializer
  2. The resulting FileLevelManifest is converted into an in-toto statement. This is only done because sigstore does not support any other payloads in DSSE envelopes.
  3. The in-toto statement is signed and wrapped into an DSSE envelope, which in turn is wrapped into a sigstore bundle.
  4. The sigstore bundle is converted to json and stored to disk

Verification workflow:

  1. The sigstore bundle is loaded from disk
  2. The bundle's signature is checked
  3. The local model files are serialized into a FileLevelManifest using the SimpleFileHasher and FilesSerializer.
  4. The in-toto statement is converted to a FileLevelManifest
  5. The hashes are compared

Release Note

  • CLI scripts sign.py and verify.py to sign and verify models
  • Conversion functionality from FileLevelManifest to in-toto statement and the other way around
  • Adjusted README to reflect the new scripts.
  • Removed old main.py and model.py

@mihaimaruseac
Copy link
Collaborator

Hmm, this is slightly working against the support needed for the internal support (BCID) :-/

@susperius
Copy link
Contributor Author

susperius commented Jul 17, 2024

Hmm, this is slightly working against the support needed for the internal support (BCID) :-/

Is it about the conversion? the changes to the manifest or the CLI tool itself?

PLMK if I can help and if it's worth to try and fix the unit tests. I'd like to avoid wasting time on that if the final call is going to be that we can't merge this change.

@mihaimaruseac
Copy link
Collaborator

Let's wait a little, as I'm still testing those parts (mostly the serialization of the manifest need to support multiple formats, so it cannot be done from FileManifest itself).

@susperius
Copy link
Contributor Author

Let's wait a little, as I'm still testing those parts (mostly the serialization of the manifest need to support multiple formats, so it cannot be done from FileManifest itself).

This is just the start to finally bring the functionality of the PoC into the repo. Afterwards we can work on supporting the other manifest types.

@susperius susperius force-pushed the feat/new_manifest_cli branch 2 times, most recently from 8242bbb to a168f07 Compare July 19, 2024 13:14
model_signing/README.md Outdated Show resolved Hide resolved
model_signing/manifest/in_toto.py Outdated Show resolved Hide resolved
model_signing/manifest/manifest.py Outdated Show resolved Hide resolved
model_signing/manifest/manifest.py Outdated Show resolved Hide resolved
model_signing/manifest/manifest.py Outdated Show resolved Hide resolved
model_signing/sign.py Outdated Show resolved Hide resolved
model_signing/sign.py Outdated Show resolved Hide resolved
model_signing/sign.py Outdated Show resolved Hide resolved
model_signing/serialization/serialize_by_file.py Outdated Show resolved Hide resolved
mihaimaruseac added a commit to mihaimaruseac/model-transparency that referenced this pull request Jul 24, 2024
THIS IS DRAFT, WIP. Will split into separate PRs once it works. But
posting publicly to show what the plans are (sigstore#224, sigstore#248, sigstore#240, sigstore#111).

Signed-off-by: Mihai Maruseac <mihaimaruseac@google.com>
mihaimaruseac added a commit to mihaimaruseac/model-transparency that referenced this pull request Jul 24, 2024
THIS IS DRAFT, WIP. Will split into separate PRs once it works. But
posting publicly to show what the plans are (sigstore#224, sigstore#248, sigstore#240, sigstore#111).

Signed-off-by: Mihai Maruseac <mihaimaruseac@google.com>
mihaimaruseac added a commit to mihaimaruseac/model-transparency that referenced this pull request Jul 24, 2024
THIS IS DRAFT, WIP. Will split into separate PRs once it works. But
posting publicly to show what the plans are (sigstore#224, sigstore#248, sigstore#240, sigstore#111).

Signed-off-by: Mihai Maruseac <mihaimaruseac@google.com>
mihaimaruseac added a commit to mihaimaruseac/model-transparency that referenced this pull request Jul 24, 2024
THIS IS DRAFT, WIP. Will split into separate PRs once it works. But
posting publicly to show what the plans are (sigstore#224, sigstore#248, sigstore#240, sigstore#111).

Signed-off-by: Mihai Maruseac <mihaimaruseac@google.com>
@mihaimaruseac
Copy link
Collaborator

Sorry, just saw this was sent at the same time as #276. Can you rebase again, please? There are several changes here that are no longer needed.

While here, can we use the existing main.py for the CLI instead of 2 files for signing and verification? argparse supports command modes, and this would also match sigstore-python's CLI (which would be nice, since then we can say that signatures can be verified with sigstore-python too, but not at the same level of detail (and not always))

@susperius
Copy link
Contributor Author

susperius commented Aug 6, 2024

Sorry, just saw this was sent at the same time as #276. Can you rebase again, please? There are several changes here that are no longer needed.

While here, can we use the existing main.py for the CLI instead of 2 files for signing and verification? argparse supports command modes, and this would also match sigstore-python's CLI (which would be nice, since then we can say that signatures can be verified with sigstore-python too, but not at the same level of detail (and not always))

That's very unfortunate.

Then I'm going to rebase. I decided against a single main script because it becomes way too cluttered so I prefer the one script per function approach.

Copy link
Collaborator

@mihaimaruseac mihaimaruseac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would still prefer just one CLI script instead of two. A lot of the changes would no longer be needed if there is just one script.

Also, it seems you're redefining existing machinery.

model_signing/model.py Outdated Show resolved Hide resolved
model_signing/model.py Outdated Show resolved Hide resolved
model_signing/model.py Outdated Show resolved Hide resolved
model_signing/model.py Outdated Show resolved Hide resolved
model_signing/model.py Outdated Show resolved Hide resolved
sign_model.py Outdated Show resolved Hide resolved
sign_model.py Outdated Show resolved Hide resolved
model_signing/signature/__init__.py Outdated Show resolved Hide resolved
model_signing/README.md Outdated Show resolved Hide resolved
model_signing/README.md Outdated Show resolved Hide resolved
@mihaimaruseac mihaimaruseac added this to the V1 release milestone Aug 6, 2024
model_signing/model.py Outdated Show resolved Hide resolved
verify_model.py Outdated Show resolved Hide resolved
@laurentsimon
Copy link
Collaborator

laurentsimon commented Aug 14, 2024

Just catching up on this PR. What remains to be updated before we get a chance to merge?

@mihaimaruseac
Copy link
Collaborator

Just catching up no this PR. What remains to be updated before we get a chance to merge?

Most importantly, moving all the experimental IntotoSignature/IntotoSigner/IntotoVerifier to signature/ (#240 (comment)) as the place where they are right now require a large amount of copybara changes (via flaky rules that remove text if found in file -- as soon as the text changes the rule does not trigger). The same would then happen to the changes in the test (model_signing/signing/in_toto_test.py).

I'd still say we should have sign/verify in a single script, to match sigstore-python (#240 (comment)). The last iteration of the code here already uses subcommands to select the signer, but that should be via a flag, with a default, so we don't force users to always pass sigstore/pki/etc. Instead, sign/verify, which are related modes, with related arguments, should be unified.

There's also the fact that the README has unrelated changes (https://github.com/sigstore/model-transparency/pull/240/files#r1705707987) but I have to edit it anyway in/after #283 so this is not high priority/blocker.

@susperius
Copy link
Contributor Author

Just catching up no this PR. What remains to be updated before we get a chance to merge?

Most importantly, moving all the experimental IntotoSignature/IntotoSigner/IntotoVerifier to signature/ (#240 (comment)) as the place where they are right now require a large amount of copybara changes (via flaky rules that remove text if found in file -- as soon as the text changes the rule does not trigger). The same would then happen to the changes in the test (model_signing/signing/in_toto_test.py).

I'd still say we should have sign/verify in a single script, to match sigstore-python (#240 (comment)). The last iteration of the code here already uses subcommands to select the signer, but that should be via a flag, with a default, so we don't force users to always pass sigstore/pki/etc. Instead, sign/verify, which are related modes, with related arguments, should be unified.

There's also the fact that the README has unrelated changes (https://github.com/sigstore/model-transparency/pull/240/files#r1705707987) but I have to edit it anyway in/after #283 so this is not high priority/blocker.

I'm still arguing that it isn't experimental code. I'm not sure whether you get the irony but you are complaining in a sigstore project about importing sigstore.
For script splits I already made my position clear.

BTW if you haven't seen I restored most of the README a week ago.

@laurentsimon
Copy link
Collaborator

The last PR from @mihaimaruseac got merged so last rebasing needed.
Is there a way to create tracking issues for some of the unsettled points, so we can fix them async?

@mihaimaruseac
Copy link
Collaborator

I'd say the first two (moving the code to signature -- maybe rename it to pki?) and merging the CLI scripts into a single one are blockers for merging this.

@susperius
Copy link
Contributor Author

I'd say the first two (moving the code to signature -- maybe rename it to pki?) and merging the CLI scripts into a single one are blockers for merging this.

I laid out the technical reason in the lengthy thread above for why it's implemented like it is as well as why two scripts should be preferable.

Furthermore, the implementation shows a technically solid way to provide multiple raw_signers producing sigstore bundles in a single class responsible for intoto dsse payloads. Since, it is not experimental I'd like to keep there. It also aligns with the general API design outlined by @laurentsimon.

@mihaimaruseac
Copy link
Collaborator

mihaimaruseac commented Aug 20, 2024

This got to be too long, so let's summarize here the remaining points.

Location of IntotoSignature, IntotoSigner and IntotoVerifier classes

To summarize: the PR adds several classes that depend on missing features from sigstore-python, and which would result in skipping validation. The classes are added to remove a data format round-trip, which is a good benefit. However, it would either require us to duplicate validation from sigstore-python or ask sigstore-python to give more support in handling these scenarios. This is the first reason to move.

The second reason to move is that having these classes where they are would block importing the code into the Google monorepo in a reliable way, given that it would require Copybara changes that are known to be flaky.

Since these classes can be imported in exactly the same way regardless of where they are located, I'd still say they should be in signature/.

We might rename signature/ to pki/ and have it contain all the deviations from Sigstore to support other signature formats. So, another benefit for moving them.

Number of CLI tools/scripts

To summarize: the PR removes a single CLI script for signing and verifying and instead creates 2 separate ones. This results in duplicated code and also forces users to always specify a signer/verifier instead of having one by default.

Code can be written in a way that reduces duplication and clutter.


This is where we are before we can make progress here. I'd be willing to compromise and accept 2 CLI scripts for now, but we definitely cannot continue with the layering breakage, those classes have to move.

@susperius
Copy link
Contributor Author

@mihaimaruseac I hope this solves your import issues.

Renaming signature to PKI does not make sense it supports signing with sigstore, pki and ec keys.

wrt two scripts:
You can have default values with two scripts too. Overall, it shows an easy example of how to use the library on a use case basis.

@mihaimaruseac
Copy link
Collaborator

mihaimaruseac commented Aug 20, 2024

@mihaimaruseac I hope this solves your import issues.

Sure, but since you created the file, you could have created it in the signature/ directory as said in the PR review. I won't insist on it though, just like the other PRs we might have to do cleanup after.

Renaming signature to PKI does not make sense it supports signing with sigstore, pki and ec keys.

There is signing with sigstore under signing, testing with a mocked Sigstore, 100% coverage.

wrt two scripts: You can have default values with two scripts too. Overall, it shows an easy example of how to use the library on a use case basis.

Can you have signing/verifying with Sigstore be default and show this in the README, please?

Please make sure to have good test coverage and fix the failing checks.

@susperius
Copy link
Contributor Author

@mihaimaruseac I hope this solves your import issues.

Sure, but since you created the file, you could have created it in the signature/ directory as said in the PR review. I won't insist on it though, just like the other PRs we might have to do cleanup after.

Renaming signature to PKI does not make sense it supports signing with sigstore, pki and ec keys.

There is signing with sigstore under signing, testing with a mocked Sigstore, 100% coverage.

Awesome, then it can be used in a later PR. It also looks like the tests are flaky. I haven't touched the code and the tests are failing sometimes in py3.11 and sometimes in py3.12.

wrt two scripts: You can have default values with two scripts too. Overall, it shows an easy example of how to use the library on a use case basis.

Can you have signing/verifying with Sigstore be default and show this in the README, please?

We can do it in a later PR.

Please make sure to have good test coverage and fix the failing checks.

Please Mihai stop moving the goal posts. All checks worked before I was forced to rebase again and you continue to add new requests on top of it.
We can talk about tests afterwards if you want to. As you know all too well tests can also be added in a separate PR.

@mihaimaruseac
Copy link
Collaborator

mihaimaruseac commented Aug 20, 2024

We want to have good quality of code in here. I'm not moving the goalposts and the tests are definitely not flaky. But if you show me two runs on the same code where a test fails then passes I can revisit it.

You said that we can have a default mode with the 2 scripts in the CLI, I wanted to see that in this PR.

Regarding tests: in all of your PRs you've been asked to write them. Now we have a checker for coverage, but this is not moving the goalposts. We always asked for tests.

@susperius
Copy link
Contributor Author

We want to have good quality of code in here. I'm not moving the goalposts and the tests are definitely not flaky. But if you show me two runs on the same code where a test fails then passes I can revisit it.

You said that we can have a default mode with the 2 scripts in the CLI, I wanted to see that in this PR.

It fails here in py3.11: https://github.com/sigstore/model-transparency/actions/runs/10475773857/job/29013190889
And py3.11 here: https://github.com/sigstore/model-transparency/actions/runs/10477683799/job/29019352610
It fails here in py3.12: https://github.com/sigstore/model-transparency/actions/runs/10477482257/job/29018678803

without me touching the code.

I won't add any major things to this PR.
We can do follow up PRs with tests. I don't see any reason why you can do that but I can't.

README.model_signing.md Show resolved Hide resolved
README.model_signing.md Show resolved Hide resolved
README.model_signing.md Show resolved Hide resolved
README.model_signing.md Show resolved Hide resolved
README.model_signing.md Outdated Show resolved Hide resolved
src/verify.py Outdated Show resolved Hide resolved
src/verify.py Outdated Show resolved Hide resolved
tests/signing/in_toto_signature_test.py Outdated Show resolved Hide resolved
tests/signing/in_toto_signature_test.py Outdated Show resolved Hide resolved
tests/signing/in_toto_signature_test.py Show resolved Hide resolved
@mihaimaruseac
Copy link
Collaborator

mihaimaruseac commented Aug 20, 2024

We want to have good quality of code in here. I'm not moving the goalposts and the tests are definitely not flaky. But if you show me two runs on the same code where a test fails then passes I can revisit it.
You said that we can have a default mode with the 2 scripts in the CLI, I wanted to see that in this PR.

It fails here in py3.11: https://github.com/sigstore/model-transparency/actions/runs/10475773857/job/29013190889 And py3.11 here: https://github.com/sigstore/model-transparency/actions/runs/10477683799/job/29019352610 It fails here in py3.12: https://github.com/sigstore/model-transparency/actions/runs/10477482257/job/29018678803

Thank you. I'll look into those. They are all on Windows, fail with the same error message. Windows is not required to merge.

I won't add any major things to this PR. We can do follow up PRs with tests. I don't see any reason why you can do that but I can't.

The only tests I moved for later was for mocking sigstore and that was pushed for later since you wanted that PR to land to continue on your pushes. This is not the case here.

@di
Copy link
Member

di commented Aug 20, 2024

We want to have good quality of code in here. I'm not moving the goalposts and the tests are definitely not flaky. But if you show me two runs on the same code where a test fails then passes I can revisit it.
You said that we can have a default mode with the 2 scripts in the CLI, I wanted to see that in this PR.

It fails here in py3.11: https://github.com/sigstore/model-transparency/actions/runs/10475773857/job/29013190889 And py3.11 here: https://github.com/sigstore/model-transparency/actions/runs/10477683799/job/29019352610 It fails here in py3.12: https://github.com/sigstore/model-transparency/actions/runs/10477482257/job/29018678803

Thank you. I'll look into those. They are all on Windows, fail with the same error message. Windows is not required to merge.

Two possible culprits here are a) TUF filepath lengths being too long for Windows to handle and b) files being moved/renamed while a file handler still has them open. Not sure if either are relevant here but thought I'd offer a pointer. (see also theupdateframework/python-tuf#454), h/t @woodruffw).

@mihaimaruseac
Copy link
Collaborator

I think it is because I didn't mock it properly, testing in #297

@mihaimaruseac
Copy link
Collaborator

I think it is because I didn't mock it properly, testing in #297

That seemed to be the reason, after more mocking 5 consecutive CI runs passed without problems. There should be no merge conflict and there is no requirement to rebase this on top of #297.

The flakyness raised an issue on the future e2e test though: it seems we run into the issue b) that Dustin mentioned. We'll solve that when we have e2e tests.

Copy link
Collaborator

@mihaimaruseac mihaimaruseac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost there, please change type of ignored_paths. The others could be considered as nits, but easy to fix in this PR too, since they are one-line fixes each.

Sad that this makes CLI work with anything but sigstore.

src/model_signing/model.py Outdated Show resolved Hide resolved
src/model_signing/model.py Outdated Show resolved Hide resolved
README.model_signing.md Show resolved Hide resolved
tests/signing/in_toto_signature_test.py Outdated Show resolved Hide resolved
src/sign.py Outdated Show resolved Hide resolved
@mihaimaruseac
Copy link
Collaborator

Oh, one of your commits fails DCO.

mihaimaruseac
mihaimaruseac previously approved these changes Aug 21, 2024
src/sign.py Outdated Show resolved Hide resolved
src/model_signing/model.py Outdated Show resolved Hide resolved
@susperius
Copy link
Contributor Author

susperius commented Aug 21, 2024

Oh, one of your commits fails DCO.

Ok I think I broke it completely

Edit: fixed it (?)

Signed-off-by: Martin Sablotny <msablotny@nvidia.com>
Signed-off-by: Martin Sablotny <msablotny@nvidia.com>
Copy link
Collaborator

@mihaimaruseac mihaimaruseac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@mihaimaruseac mihaimaruseac merged commit e18b1d5 into sigstore:main Aug 21, 2024
17 checks passed
@susperius susperius deleted the feat/new_manifest_cli branch August 21, 2024 17:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants