Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PoC: Tap 14 #2114

Closed
wants to merge 40 commits into from
Closed

PoC: Tap 14 #2114

wants to merge 40 commits into from

Conversation

mnm678
Copy link
Contributor

@mnm678 mnm678 commented Sep 22, 2022

supersedes #2049

This pr is the proof of concept for TAP 14 for the TAP approval process. It should not be merged until that TAP is finalized.

Please fill in the fields below to submit a pull request. The more information
that is provided, the better.

Fixes #2040

Description of the changes being introduced by the pull request:

Please verify and check that the pull request fulfills the following
requirements
:

  • The code follows the Code Style Guidelines
  • Tests have been added for the bug fix or new feature
  • Docs have been added for the bug fix or new feature

abs007 and others added 16 commits September 22, 2022 10:26
Removed the folders "metadata.staged" and "metadata" and
added all the metadata inside the "1.0.0" and "2.0.0"

Signed-off-by: Abhisman Sarkar <abhisman.sarkar@gmail.com>
Updated the spec version inside the metadata files inside
the "2.0.0" folder

Signed-off-by: Abhisman Sarkar <abhisman.sarkar@gmail.com>
Added the all the metadata files alongside the "1.0.0",
"2.0.0" and "targets" folders

Signed-off-by: Abhisman Sarkar <abhisman.sarkar@gmail.com>
Reverted the directory structure to keep all metadata
inside "metadata.staged" and "metadata" folders

Signed-off-by: Abhisman Sarkar <abhisman.sarkar@gmail.com>
Created a new "TAP 14" folder wherein I added all the
metadata inside the "1.0.0" and  "2.0.0" folders and
added the "targets" folder

Signed-off-by: Abhisman Sarkar <abhisman.sarkar@gmail.com>
Added the test case which checks for the TAP 14 folder

Signed-off-by: Abhisman Sarkar <abhisman.sarkar@gmail.com>
Added a test function that check the contents inside the
TAP 14 folder.

Signed-off-by: Abhisman Sarkar <abhisman.sarkar@gmail.com>
Added a function to select specification version folder
to download metadata from

Signed-off-by: Abhisman Sarkar <abhisman.sarkar@gmail.com>
Added the logic for the new client update process inside
tuf/ngclient/updater.py. Added test functions for the new
process inside tests/test_updater_ng.py. Also made changes
to some test files.

Signed-off-by: Abhisman Sarkar <abhisman.sarkar@gmail.com>
Worked upon adding better tests in test_updater_ng.py
and worked on the code structure for updater.py

Signed-off-by: Abhisman Sarkar <abhisman.sarkar@gmail.com>
Made changes to _get_repository_versions() to read the
supported-versions file and also removed _look() from
fetcher.py and requests_fetcher.py

Signed-off-by: Abhisman Sarkar <abhisman.sarkar@gmail.com>
Made changes to updater.py and config.py

Signed-off-by: Abhisman Sarkar <abhisman.sarkar@gmail.com>
Signed-off-by: Abhisman Sarkar <abhisman.sarkar@gmail.com>
Added some more changes adhering to the reviews. Also
worked on cleaning the code in the test file.
This is the last pull request as part of GSoC'22

Signed-off-by: Abhisman Sarkar <abhisman.sarkar@gmail.com>
Signed-off-by: Marina Moore <mnm678@gmail.com>
Signed-off-by: Marina Moore <mnm678@gmail.com>
@coveralls
Copy link

coveralls commented Sep 22, 2022

Pull Request Test Coverage Report for Build 3362649524

  • 94 of 97 (96.91%) changed or added relevant lines in 4 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.2%) to 97.926%

Changes Missing Coverage Covered Lines Changed/Added Lines %
tuf/ngclient/updater.py 82 85 96.47%
Totals Coverage Status
Change from base Build 3089293137: -0.2%
Covered Lines: 1435
Relevant Lines: 1454

💛 - Coveralls

Signed-off-by: Marina Moore <mnm678@gmail.com>
Signed-off-by: Marina Moore <mnm678@gmail.com>
Signed-off-by: Marina Moore <mnm678@gmail.com>
Signed-off-by: Marina Moore <mnm678@gmail.com>
Signed-off-by: Marina Moore <mnm678@gmail.com>
Signed-off-by: Marina Moore <mnm678@gmail.com>
Signed-off-by: Marina Moore <mnm678@gmail.com>
Signed-off-by: Marina Moore <mnm678@gmail.com>
Signed-off-by: Marina Moore <mnm678@gmail.com>
Signed-off-by: Marina Moore <mnm678@gmail.com>
Signed-off-by: Marina Moore <mnm678@gmail.com>
Signed-off-by: Marina Moore <mnm678@gmail.com>
@mnm678 mnm678 changed the title Tap14 PoC: Tap 14 Sep 23, 2022
[
Root.type,
Snapshot.type,
"spec_version",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

obligatory "we should probably make a type for spec_version" 🙂

@@ -122,12 +131,80 @@ def refresh(self) -> None:
RepositoryError: Metadata failed to verify in some way
DownloadError: Download of a metadata file failed in some way
"""
try:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably worth refactoring into its own method

@@ -122,12 +131,80 @@ def refresh(self) -> None:
RepositoryError: Metadata failed to verify in some way
DownloadError: Download of a metadata file failed in some way
"""
try:
version_bytes = self._load_local_metadata("spec_version")
self._spec_version = json.loads(version_bytes.decode("utf-8"))[

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move this line into an else clause?

target_bytes = self._fetcher.download_bytes(
url, self.config.supported_versions_max_length
)
repository_versions = json.loads(target_bytes)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would love to pull the parsing/types/etc. out into a SupportedVersions type that could be tested independently

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's probably a more natural home for e.g. _get_spec_versions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also thought about moving some of this into the metadata api, especially if we finalize the decision to sign this file in theupdateframework/taps#158

self._load_timestamp()
self._load_snapshot()
self._load_targets(Targets.type, Root.type)

def _get_repository_versions(self) -> List[str]:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

List[str] may be an out-of-date type now


self._load_root()
def filter_fn(a):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we give this a more meaningful name?


self._load_root()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Preference to pull this out into its own method.

repository_versions: List[str],
spec_version: str,
supported_versions: List[str],
) -> Tuple[str, Optional[str]]:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be nice to introduce a SpecVersion class (basically a version/path tuple)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would make the int conversion lines a little more bulletproof

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I agree with @znewman01 if there are other use cases where you will need that tuple, but it seems as here the tuple doesn't mean versions/path, but version/warning where warning is even optional.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, not sure why we return the warning and not directly print it.
This will save us from the need to return it, make a variable, etc.

# 404/403 means current root is newest available
break
for next_version in range(lower_bound, upper_bound):
for version_repo in supported_version_repos:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that as-written this would accept /1.root.json, /2/2.root.json, 3.root.json which feels wrong.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we want to say:

for next_version in range(lower_bound, upper_bound):
    while supported_version_repos:
        if it's in supported_version_repos[0]:
            handle it!
            break
        else:
            supported_version_repos.pop()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just added a test, this does ignore 3.root.json in your example. The complication is that both the supported_version_repos look and the next_version loop have to break early in some (different) cases, which is where the additional complexity comes in. I'm happy to simplify if there's an easier way to represent this.

json.dumps({"version": spec_version}).encode("utf-8"),
)

self._load_root(ordered_version_paths)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this: it's totally possible that v2 root format will look totally different, so it seems wrong to try to load them the same way.

I've thought about implementing TAP-14 as a wrapper around the Updater. Basically, it would expose the same interface and keep a dict of {MajorVersion: Updater}.

Then, its job is to (1) parse the supported-versions and (2) delegate appropriately. This keeps the complexity of TAP-14 out of the Updater itself, and provides a more natural way to support a very-different v2.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree in general, but _load_root is unique in that it has to look in directories for multiple supported versions in order to ensure that no root metadata is missed. If the root format is changed, then this function could be updated as part of the client update to use the correct format for the current root file that is being checked.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've thought about implementing TAP-14 as a wrapper around the Updater. Basically, it would expose the same interface and keep a dict of {MajorVersion: Updater}.

This is a great idea. Legacy python-tuf implemented TAP 4: Multiple repository consensus on entrusted targets (mapping file) support with a wrapper around the updater (see MultiRepoUpdater). It could be useful to experiment with multiple wrappers around the updater and how they would interact, so that we can determine whether we need to come up with another pattern so that TAP implementations aren't mutually exclusive.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a great point. We're hoping to take that approach in go-tuf for TAP 4 as well!

Interactions sound a little tricky...I think we're okay, we'd put the TAP-4 wrapper as the outermost layer?

We can cross that bridge when we come to it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, my hunch is that ordering with TAP 4 at the outermost layer makes sense for TAP 4 + TAP 14.

Signed-off-by: Marina Moore <mnm678@gmail.com>
@mnm678
Copy link
Contributor Author

mnm678 commented Sep 27, 2022

Thanks for the feedback! I made some changes. I'm going to do a larger refactor that creates types for SpecVersion and RepositoryVersions pending the discussion in theupdateframework/taps#158

@JustinCappos
Copy link
Member

JustinCappos commented Sep 28, 2022 via email

@mnm678
Copy link
Contributor Author

mnm678 commented Sep 28, 2022

Does it make sense to have each piece of ?root? metadata on a repo always contain a list of the other spec_versions of metadata that are available on that repo? This way a client can better avoid being downgraded...

I don't want to require anything of existing v1 implementations, but this is something we could explore for v2++. Another, similar idea discussed in theupdateframework/taps#158 is to have the supported-versions.json, which lists the supported versions on the repo file signed by root keys.

All of this is just to ensure that clients can find new versions as soon as they are available. The client won't downgrade spec versions once they use a higher spec version with a given repo.

@trishankatdatadog
Copy link
Member

I'm still not convinced that there is a use case for clients dynamically choosing which major spec version they want to support.

@JustinCappos
Copy link
Member

I'm still not convinced that there is a use case for clients dynamically choosing which major spec version they want to support.

I read the linked comment. Are you arguing that a client should support exactly one spec version or that a client should not decide per repo?

I would say that a client supporting multiple spec versions may be necessary if it interacts with different repos that use different versions. I do agree with the latter, that a client should always use the latest version.

@trishankatdatadog
Copy link
Member

I would say that a client supporting multiple spec versions may be necessary if it interacts with different repos that use different versions.

Is there such a client? Why design for a need that does not yet exist? I think we should be careful not to overgeneralise.

Copy link
Collaborator

@MVrachev MVrachev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reviewed the core TAP 14 logic in updater.py and in config.py.
I will review the test files next week.

@@ -85,6 +91,8 @@ def __init__(
fetcher: Optional[FetcherInterface] = None,
config: Optional[UpdaterConfig] = None,
):
self._spec_version = None # spec_version is the last used version by the client to get metadata
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Please provide a type for _spec_version as I suspect that mypy won't be happy.
Something like:
self._spec_version: Optional[str] = None

self._load_timestamp()
self._load_snapshot()
self._load_targets(Targets.type, Root.type)

def _set_spec_version(self) -> List[str]:
try:
version_bytes = self._load_local_metadata("spec_version")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where can we have a look into an example spec_version.json file?

"""Returns a list of all the repository versions and paths."""

try:
url = f"{self._metadata_base_url}supported-versions.json"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there recommendations or documentation somewhere on what format will the supported-versions.json file have?

return ordered_version_paths


def _get_repository_versions(self) -> List[object]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, you know the exact format of your return type (as we see below in the except cause) and I think it will be better if you describe it as:
List[Dict[Any]]


repository_versions_and_paths = self._get_repository_versions()

repository_versions = [
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, but still makes it more readable:

Suggested change
repository_versions = [
repository_versions: List[str] = [

repository_versions: List[str],
spec_version: str,
supported_versions: List[str],
) -> Tuple[str, Optional[str]]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I agree with @znewman01 if there are other use cases where you will need that tuple, but it seems as here the tuple doesn't mean versions/path, but version/warning where warning is even optional.

repository_versions: List[str],
spec_version: str,
supported_versions: List[str],
) -> Tuple[str, Optional[str]]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, not sure why we return the warning and not directly print it.
This will save us from the need to return it, make a variable, etc.

ordered_version_paths = []

for version in sorted(repository_versions):
path = [i["path"] for i in repository_versions_and_paths if version <= int(spec_version) and i["version"] == version]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is too complicated to be a one-liner.

@@ -296,7 +376,7 @@ def _persist_metadata(self, rolename: str, data: bytes) -> None:
pass
raise e

def _load_root(self) -> None:
def _load_root(self, supported_version_repos) -> None:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Annotation is needed for supported_version_repos.

)
self._trusted_set.update_root(data)
self._persist_metadata(Root.type, data)
save_repo_version_dir = self._spec_version_dir
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see why you save the original content of self._spec_version_dir because you change self._spec_version_dir multiple times later, but I had to read it a couple of times.

I suggest instead passing the version_repo variable to download_metadata and with this you remove:

  • save_repo_version_dir variable
  • to_break flag
  • the if check for to_break at the end of the inner cycle and you can directly return after you add to supported_version_repos list

mnm678 added 4 commits October 3, 2022 11:46
Signed-off-by: Marina Moore <mnm678@gmail.com>
Signed-off-by: Marina Moore <mnm678@gmail.com>
Signed-off-by: Marina Moore <mnm678@gmail.com>
Signed-off-by: Marina Moore <mnm678@gmail.com>
@mnm678
Copy link
Contributor Author

mnm678 commented Oct 3, 2022

I updated this pr to reflect theupdateframework/taps#158, and to address the mypy type errors.

There is one remaining mypy error that I can't seem to resolve, so I'd appreciate some advice from reviewers :)

Copy link

@znewman01 znewman01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Enjoy the comments I left before I realized this hadn't been updated to match the new TAP 😛

I think for mypy you can manually annotate the type of test_cases:

test_cases: List[Tuple[List[str], str, List[str], str]] = ...

It kinda seems like a bug in mypy, though. Not sure where it's getting that shape for test_cases...

@@ -205,10 +205,15 @@ def _fetch(self, url: str) -> Iterator[bytes]:
if path.startswith("/metadata/") and path.endswith(".json"):
# figure out rolename and version
ver_and_name = path[len("/metadata/") :][: -len(".json")]
# inside a version folder
if "/" in ver_and_name:
ver_and_name = ver_and_name.split("/")[1]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if there's multiple nested folders? ver_and_name = "/foo/bar.json"

I like

>>> "foo/bar/baz.json".partition("/")
('foo', '/', 'bar/baz.json')
>>> "foo/bar/baz.json".rpartition("/") 
('foo/bar', '/', 'baz.json')

For making intent a little clearer there.

You can also assert len(ver_and_name.split("/") <= 2

self.root.consistent_snapshot and ver_and_name != Timestamp.type
self.root.consistent_snapshot
and ver_and_name != Timestamp.type
and ver_and_name != "supported-versions"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't a thing anymore, right?

Signed-off-by: Marina Moore <mnm678@gmail.com>
Signed-off-by: Marina Moore <mnm678@gmail.com>
Signed-off-by: Marina Moore <mnm678@gmail.com>
Signed-off-by: Marina Moore <mnm678@gmail.com>
Signed-off-by: Marina Moore <mnm678@gmail.com>
@mnm678
Copy link
Contributor Author

mnm678 commented Oct 24, 2022

Thanks @znewman01, this now reflects the latest version of the TAP.

Signed-off-by: Marina Moore <mnm678@gmail.com>
Signed-off-by: Marina Moore <mnm678@gmail.com>
@lukpueh lukpueh mentioned this pull request Nov 17, 2022
@jku
Copy link
Member

jku commented Apr 23, 2024

TAP is now deferred.

I'll close this since there has been no action and it seems unlikely to get merged without major work. Please reopen if I'm being too hasty

@jku jku closed this Apr 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement TAP 14
9 participants