Tap 14 draft pr #2049
Removed the folders "metadata.staged" and "metadata" and added all the metadata inside the "1.0.0" and "2.0.0" Signed-off-by: Abhisman Sarkar <abhisman.sarkar@gmail.com>
Updated the spec version inside the metadata files inside the "2.0.0" folder Signed-off-by: Abhisman Sarkar <abhisman.sarkar@gmail.com>
Added the all the metadata files alongside the "1.0.0", "2.0.0" and "targets" folders Signed-off-by: Abhisman Sarkar <abhisman.sarkar@gmail.com>
Reverted the directory structure to keep all metadata inside "metadata.staged" and "metadata" folders Signed-off-by: Abhisman Sarkar <abhisman.sarkar@gmail.com>
Created a new "TAP 14" folder wherein I added all the metadata inside the "1.0.0" and "2.0.0" folders and added the "targets" folder Signed-off-by: Abhisman Sarkar <abhisman.sarkar@gmail.com>
Added the test case which checks for the TAP 14 folder Signed-off-by: Abhisman Sarkar <abhisman.sarkar@gmail.com>
Added a test function that check the contents inside the TAP 14 folder. Signed-off-by: Abhisman Sarkar <abhisman.sarkar@gmail.com>
Added a function to select specification version folder to download metadata from Signed-off-by: Abhisman Sarkar <abhisman.sarkar@gmail.com>
Pull Request Test Coverage Report for Build 2621327940
Warning: This coverage report may be inaccurate. This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.
💛 - Coveralls
Should I squash the commits?
Currently @mnm678, @znewman01, and I are implementing the changes to how a client downloads metadata according to the format in the TAP 14 specification. This means tinkering with the
@@ -34,7 +34,7 @@
]
},
"expires": "2030-01-01T00:00:00Z",
"spec_version": "1.0.0",
Please keep in mind that you can't just replace the value of any field in the "signed" dictionary without re-signing the files, or the files will fail verification.
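The point about re-signing can be illustrated with a small sketch. It uses an HMAC over canonical JSON purely as a stand-in for the public-key signatures that TUF metadata carries; the key, the `canonical`, `sign`, and `verify` helpers, and the metadata dictionary are all hypothetical and are not python-tuf APIs.

```python
import hashlib
import hmac
import json

SECRET = b"demo-key"  # hypothetical key; real TUF roles sign with private keys


def canonical(signed: dict) -> bytes:
    """Serialize deterministically so the same content gives the same bytes."""
    return json.dumps(signed, sort_keys=True, separators=(",", ":")).encode()


def sign(signed: dict) -> str:
    """Compute a signature over the canonical form of the signed portion."""
    return hmac.new(SECRET, canonical(signed), hashlib.sha256).hexdigest()


def verify(metadata: dict) -> bool:
    """Check that the stored signature matches the current signed content."""
    return hmac.compare_digest(sign(metadata["signed"]), metadata["signature"])


signed = {"expires": "2030-01-01T00:00:00Z", "spec_version": "1.0.0"}
metadata = {"signed": signed, "signature": sign(signed)}
ok_before = verify(metadata)

# Editing a signed field without re-signing breaks verification.
metadata["signed"]["spec_version"] = "2.0.0"
ok_after_edit = verify(metadata)

# Re-signing the edited content makes it verify again.
metadata["signature"] = sign(metadata["signed"])
ok_after_resign = verify(metadata)
```

The same reasoning applies to the copied test metadata: any file whose "spec_version" was edited by hand would need fresh signatures before a client could verify it.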
I believe this was made as a temporary change, and the original state of repository_data/repository was brought back in 774a039.
That being said, all of the metadata was copied into the repository_data/TAP 14/1.0.0 and repository_data/TAP 14/2.0.0 folders, and the metadata inside the 2.0.0 folder had its "spec_version" changed to 2.0.0 in a similar manner (just to show that this metadata belongs to 2.0.0).
That seems like something I'll have to take a look at, thanks.
Added the logic for the new client update process inside tuf/ngclient/updater.py. Added test functions for the new process inside tests/test_updater_ng.py. Also made changes to some test files. Signed-off-by: Abhisman Sarkar <abhisman.sarkar@gmail.com>
8c5d8e0 to e77d93e
I know this is still a draft but I'll leave a quick comment: I don't understand why the client needs this functionality, and before this PR is made ready I would like to see a comment in #2040 that explains the real-world use case: how is this useful for python-tuf users?
Background for my confusion: If a repository starts offering multiple versions of the repository, why can't clients decide to "upgrade" using a plain old software update on the client, so that the client just switches from the old to the new repository URL? (I'm obviously assuming that the python-tuf version used supports both versions of the spec, but so does this proposal, I think.)
In practice, if a future TUF-enabled pypi.org made available a second repository using newer TUF spec features, I think it would be totally appropriate for pip, the client, to simply change the used repo URL to the new repository in a pip software update. What advantage is there in making this repository selection dynamic?
Thanks for the question, will get back to you soon. But in the meantime, can we move this discussion back to the associated Issue from this PR, since you’re raising questions about the utility of the feature itself?
… On Aug 7, 2022, at 1:10 PM, Jussi Kukkonen ***@***.***> wrote:
I know this is still a draft but I'll leave a quick comment: I don't understand why the client needs this functionality and before this PR is made ready I would like to see an issue opened that explains the real world use case: how is this useful for python-tuf users?
Background for my confusion: If a repository starts offering multiple versions of the repository, why can't clients decide to "upgrade" using a plain old software update on the client, so that the client just switches from old to new repository URL? (I'm obviously assuming that the python-tuf version used supports both versions of the spec -- but so does this proposal I think)
In practice, if a future TUF-enabled pypi.org made available a second repository using a newer TUF spec features I think it would be totally appropriate for pip, the client, to simply change the used repo URL to the new repository in a pip software update. What advantage is there in making this repository selection dynamic?
Worked upon adding better tests in test_updater_ng.py and worked on the code structure for updater.py Signed-off-by: Abhisman Sarkar <abhisman.sarkar@gmail.com>
Made changes to _get_repository_versions() to read the supported-versions file and also removed _look() from fetcher.py and requests_fetcher.py Signed-off-by: Abhisman Sarkar <abhisman.sarkar@gmail.com>
Great work! I have a lot of comments, but you're making progress
tuf/ngclient/updater.py
Outdated
@@ -85,6 +88,7 @@ def __init__(
        fetcher: Optional[FetcherInterface] = None,
        config: Optional[UpdaterConfig] = None,
    ):
        self.spec_version = None  # spec_version is the last used version by the client to get metadata
Make this private: self._spec_version
tuf/ngclient/updater.py
Outdated
url = f"{self._metadata_base_url}supported-versions.json"

with self._fetcher.download_file(
    url, "length placeholder"
Probably want self.config.supported_versions_max_length in place of "length placeholder". Need to add a line to tuf/ngclient/config.py as well.
Considering my comment above, should supported_versions_max_length be expressed in bytes?
Yes, it should use 'bytes' as the unit
What could be a possible value for supported_versions_max_length in that case?
Think about an example supported-versions.json file:
{"supported_versions": ["1", "2", "3", "4"]}
That's about 45 bytes. So let's maybe round up to 1000 bytes? That's a pretty huge margin of error, but well short of anything that could cause issues.
What would the approximate len() of the list be at 1000 bytes?
Well, a list with 1000 elements would definitely serialize to have > 1000 bytes. The real threshold would probably be a couple hundred.
Why do you ask?
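The size estimates in this thread can be checked directly. The sketch below measures the example document as `json.dumps` would serialize it and counts how many consecutive versions fit under a limit. The helper name `serialized_size` and the 1000-byte value for `MAX_LENGTH` are assumptions for illustration; a file on disk may differ by whitespace or a trailing newline.

```python
import json


def serialized_size(versions: list) -> int:
    """Byte length of a supported-versions document for the given versions."""
    return len(json.dumps({"supported_versions": versions}).encode("utf-8"))


example_size = serialized_size(["1", "2", "3", "4"])

# Largest count of consecutive versions "1", "2", ... that fits in the limit.
MAX_LENGTH = 1000  # hypothetical value for supported_versions_max_length
count = 0
while serialized_size([str(v) for v in range(1, count + 2)]) <= MAX_LENGTH:
    count += 1
```

With consecutive integer versions the limit is reached at roughly 150 entries, so a 1000-byte cap leaves a wide margin for any realistic number of specification versions.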
tuf/ngclient/updater.py
Outdated
) as target_file:
    repository_versions = json.loads(target_file)

return repository_versions["supported_versions"]
I think you should just return repository_versions. We don't need to put it inside a JSON dictionary.
From what I can understand, supported-versions.json is a JSON dictionary itself:
{ "supported_versions" : [VERSION, ...], //From the TAP14 page
So, if I return repository_versions instead, won't that return the dictionary itself? What I was trying to do was index the [VERSION, ...] list using the supported_versions key.
Oh, you're right! I misremembered the proposed update to the specification.
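A minimal sketch of the parsing step being discussed, assuming the file is a JSON object with a "supported_versions" list as quoted above. The function name `parse_supported_versions` and the validation it performs are illustrative, not the PR's code.

```python
import json
from typing import List


def parse_supported_versions(data: bytes) -> List[str]:
    """Return the version list stored under the "supported_versions" key."""
    document = json.loads(data)
    versions = document["supported_versions"]
    # Reject anything that is not a list of strings before using it.
    if not isinstance(versions, list) or not all(
        isinstance(v, str) for v in versions
    ):
        raise ValueError("supported_versions must be a list of strings")
    return versions


raw = b'{"supported_versions": ["1", "2"]}'
versions = parse_supported_versions(raw)
```

Returning the list rather than the whole dictionary keeps callers from having to know the key name.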
tuf/ngclient/updater.py
Outdated
    repository_versions: List[str],
    spec_version: str,
    supported_versions: List[str],
) -> str:
Oh, we need to return the actual spec version too!
Maybe make the return type Tuple[Optional[str], Optional[str]].
How about Tuple[[str], Optional[str]]? spec_version is supposed to be either "" or [versions, ...] whereas warning can be None.
I see return (str(spec_version), warning) below, which suggests Tuple[str, Optional[str]].
Ah, I meant that, oops. Will add that then.
tuf/ngclient/updater.py
Outdated
    spec_version: str,
    supported_versions: List[str],
) -> str:
    """Returns the specification version to be used."""
Make sure to mention the return type, which is a little funky. When does it return the message?
How about:
Returns the specification version to be used and displays a warning if the chosen spec_version is lower than the highest repository version.
Looks great! I might add "specification version to be used, following the rules of TAP-14"
Also, shouldn't I mention the return type in the function definition above and keep the docstring to just an explanation?
Good point: you don't typically want to repeat the type signature in documentation. However, it's not obvious to me from reading what Tuple[str, Optional[str]] means.
There are two ways to fix that:
1. Make two NewTypes:
SpecificationVersion = NewType('SpecificationVersion', str)
WarningMessage = NewType('WarningMessage', str)
so the return type becomes Tuple[SpecificationVersion, Optional[WarningMessage]]. That's self-documenting, so there's no need to repeat it in the docstring.
2. Explain what the str and Optional[str] values are in the docstring.
I might actually prefer (1), because I think in the medium term we want to make a full SpecificationVersion class to encapsulate ordering (so you don't have to convert to/from int in _get_spec_version), parsing, and conversion to URL parts. However, (2) is much easier so it's fine to do that for now.
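Option (1) could look roughly like the sketch below. The function `choose_version` and its selection rule (highest version common to both sides, with a warning when the repository offers something newer) are assumptions made for illustration; only the two NewType names come from the review.

```python
from typing import List, NewType, Optional, Tuple

SpecificationVersion = NewType("SpecificationVersion", str)
WarningMessage = NewType("WarningMessage", str)


def choose_version(
    repository_versions: List[str], supported_versions: List[str]
) -> Tuple[SpecificationVersion, Optional[WarningMessage]]:
    """Pick the highest version both sides support (hypothetical rule).

    Returns the chosen version and, if the repository offers something
    newer than the chosen version, a warning message; otherwise None.
    """
    common = set(repository_versions) & set(supported_versions)
    if not common:
        raise ValueError("no specification version in common")
    chosen = max(common, key=int)
    warning = None
    if int(chosen) < max(int(v) for v in repository_versions):
        warning = WarningMessage(
            "Not using the latest specification version available on the repository"
        )
    return SpecificationVersion(chosen), warning


version, warning = choose_version(["3", "5", "6"], ["1", "2", "3", "4"])
```

At runtime a NewType value is still a plain str, so the aliases document intent for readers and type checkers without changing behavior.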
I'm a bit confused. If SpecificationVersion is essentially derived from str, won't a TypeError be thrown when I compare it with latest_repo_version and perform int operations on it?
Right: if you just make SpecificationVersion a NewType, you can't compare them directly. It's the same as what you have now.
Eventually, you'd make a full class SpecificationVersion which doesn't require conversion to/from int.
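A rough sketch of what such a full class might look like, assuming versions are plain major numbers. The class body, the `url_part` method, and the attribute names are hypothetical. Note that comparing two strings does not raise a TypeError; it silently orders them lexicographically, which is wrong for numbers, while comparing a str with an int does raise.

```python
from functools import total_ordering


@total_ordering
class SpecificationVersion:
    """Major specification version that orders numerically (hypothetical)."""

    def __init__(self, text: str) -> None:
        self.major = int(text)  # raises ValueError on non-numeric input

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, SpecificationVersion):
            return NotImplemented
        return self.major == other.major

    def __lt__(self, other: "SpecificationVersion") -> bool:
        if not isinstance(other, SpecificationVersion):
            return NotImplemented
        return self.major < other.major

    def __hash__(self) -> int:
        return hash(self.major)

    def url_part(self) -> str:
        """Directory prefix for this version, such as "2/"."""
        return f"{self.major}/"


# Plain strings (and NewType wrappers around str) compare lexicographically.
string_order_is_numeric = "10" > "9"
class_order_is_numeric = SpecificationVersion("10") > SpecificationVersion("9")
latest = max(SpecificationVersion(v) for v in ["1", "9", "10"])
```

Keeping the int conversion inside the class means callers such as `_get_spec_version` would compare versions directly instead of converting lists back and forth.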
tests/test_updater_ng.py
Outdated
    os.path.isdir(os.path.join(self.tap14_directory, folder))
)

def test_get_spec_version1(self) -> None:
Call this test_get_spec_version_supported or something. And put the comment below in a docstring.
tests/test_updater_ng.py
Outdated
def test_get_spec_version1(self) -> None:
    # This uses the default SUPPORTED_VERSIONS variable from updater.py
    with self.assertRaises(exceptions.DownloadError):
When used as a context manager, assertRaises takes an optional msg keyword argument. This is a really nice way to tell readers of your test code what you're checking:
self.assertRaises(exceptions.DownloadError, msg="4 is not a supported version").
Go through and do that for every assertRaises or assertEqual call.
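A small sketch of attaching messages to assertions. In the context-manager form, assertRaises accepts the message through the msg keyword; a second positional argument would be treated as a callable to invoke. The `DownloadError` class and the `pick` function here are local stand-ins, not the python-tuf names.

```python
import unittest


class DownloadError(Exception):
    """Local stand-in for tuf.api.exceptions.DownloadError."""


def pick(version: str) -> str:
    """Hypothetical function under test: only version "1" is supported."""
    if version != "1":
        raise DownloadError(f"{version} is not a supported version")
    return version


class TestPick(unittest.TestCase):
    def test_unsupported(self) -> None:
        # msg is shown only if the block fails to raise DownloadError.
        with self.assertRaises(DownloadError, msg="4 is not a supported version"):
            pick("4")

    def test_supported(self) -> None:
        # assertEqual takes the message as an optional third argument.
        self.assertEqual(pick("1"), "1", "1 is selected as the spec version")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestPick)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```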
FYI, test_get_spec_version1 (now test_get_spec_version_supported) doesn't pass because SUPPORTED_VERSIONS isn't defined correctly.
That's fine. In fact, test_get_spec_version should really only handle major version 1, so let's update the test cases.
tuf/ngclient/updater.py
Outdated
@@ -54,6 +56,7 @@
from tuf.ngclient.fetcher import FetcherInterface

logger = logging.getLogger(__name__)
SUPPORTED_VERSIONS = ["1", "2", "3"]
This should actually be SUPPORTED_VERSIONS = [SPECIFICATION_VERSION] for now.
But I'm guessing that it'll have to change and be read from the disk?
No, SUPPORTED_VERSIONS is a property of the client library (that is, python-tuf).
The "last used specification version" is what we write to/read from disk.
Got it.
Can you push changes that address my comments before marking them as resolved? Makes it easier for me to track what we've been talking about.
Sure. Will do that
tests/test_updater_ng.py
Outdated
    ("3", None),
)

def test_get_spec_version2(self) -> None:
Call this just test_get_spec_version
def test_get_spec_version2(self) -> None:
    warningchecker = "Not using the latest specification version available on the repository"
    # Checks with different values
When you have this many test cases, I recommend putting them in a list:
test_cases = [
    (["1", "2", "3"], "3", ["3", "5", "6"], "3", False),
    (["3", "5", "6"], "3", ["1", "2", "3", "4"], "3", True),
]
for repo_versions, spec_version, supported_versions, expected_version, should_have_warning in test_cases:
    actual_version, warning = _get_spec_version(...)
    self.assertEqual(actual_version, expected_version)
    self.assertEqual(bool(warning), should_have_warning)
# Do something similar for error_test_cases
Then you can have a comment for each test case explaining why you have it.
So, should I add a msg or leave a comment? I guess doing both would be repetitive.
Yeah, if you're going to put a msg in the asserts you don't need a comment.
My bad actually. I'll have to put comments if I'll be using the for loop.
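A runnable version of the table-driven pattern, with a comment per case. `select_version` is a hypothetical stand-in for `_get_spec_version`: its selection rule is assumed, it omits the last-used spec_version argument, and it raises ValueError where the real code would presumably raise a download error. Wrapping each case in `subTest` is an addition that reports which row failed.

```python
import unittest
from typing import List, Optional, Tuple


def select_version(
    repository_versions: List[str], supported_versions: List[str]
) -> Tuple[str, Optional[str]]:
    """Hypothetical stand-in for _get_spec_version (selection rule assumed)."""
    common = set(repository_versions) & set(supported_versions)
    if not common:
        raise ValueError("no specification version in common")
    chosen = max(common, key=int)
    newer_exists = any(int(v) > int(chosen) for v in repository_versions)
    return chosen, ("newer version on repository" if newer_exists else None)


class TestSelectVersion(unittest.TestCase):
    def test_cases(self) -> None:
        test_cases = [
            # Highest common version is also the repository's latest.
            (["1", "2", "3"], ["3", "5", "6"], "3", False),
            # Repository offers 5 and 6, which the client cannot use.
            (["3", "5", "6"], ["1", "2", "3", "4"], "3", True),
        ]
        for repo, supported, expected, should_warn in test_cases:
            with self.subTest(repo=repo, supported=supported):
                actual, warning = select_version(repo, supported)
                self.assertEqual(actual, expected)
                self.assertEqual(bool(warning), should_warn)

    def test_error_cases(self) -> None:
        # No overlap between repository and client versions.
        with self.assertRaises(ValueError):
            select_version(["4"], ["1"])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestSelectVersion)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```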
tuf/ngclient/updater.py
Outdated
url = f"{self._metadata_base_url}supported-versions.json"

with self._fetcher.download_file(
If this fails, we probably want to return ["1"]
Or [""]
Also, should I use download_bytes() instead? As far as I can see, download_bytes() just uses download_file() inside fetcher.py.
Oh, good call—do that!
Made changes to updater.py and config.py Signed-off-by: Abhisman Sarkar <abhisman.sarkar@gmail.com>
Signed-off-by: Abhisman Sarkar <abhisman.sarkar@gmail.com>
Added some more changes adhering to the reviews. Also worked on cleaning the code in the test file. This is the last pull request as part of GSoC'22 Signed-off-by: Abhisman Sarkar <abhisman.sarkar@gmail.com>
try:
    self._spec_version = self._load_local_metadata("spec_version")
except OSError:
    self._spec_version = "1"
This should probably be: None
@@ -122,12 +126,50 @@ def refresh(self) -> None:
            RepositoryError: Metadata failed to verify in some way
            DownloadError: Download of a metadata file failed in some way
        """
        try:
            self._spec_version = self._load_local_metadata("spec_version")
        except OSError:
Maybe check the errno:
except OSError as err:
    import errno  # this should be at top, of course
    if err.errno != errno.ENOENT:
        raise
    self._spec_version = ...
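The errno check can be tried in isolation with a sketch like the following. `load_last_spec_version` and the file name are hypothetical; the fallback of None for a missing file follows the suggestion made earlier in this review.

```python
import errno
import os
import tempfile
from typing import Optional


def load_last_spec_version(path: str) -> Optional[str]:
    """Read the last used spec version, or None if it was never stored."""
    try:
        with open(path, encoding="utf-8") as f:
            return f.read().strip()
    except OSError as err:
        # Only a missing file means "no version stored yet"; anything
        # else (permissions, I/O failure) is re-raised to the caller.
        if err.errno != errno.ENOENT:
            raise
        return None


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "spec_version")
    missing = load_last_spec_version(path)
    with open(path, "w", encoding="utf-8") as f:
        f.write("1\n")
    stored = load_last_spec_version(path)
```

Catching FileNotFoundError directly would be an equivalent and shorter way to express the same check, since that exception corresponds to ENOENT.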
logger.warning(message)
self._spec_version = (
    f"{spec_version}/"
    if spec_version is not None or spec_version == "1"
IMO if your ternary operator spans 3 lines, better to just do an if/else
logger.warning(message)
self._spec_version = (
    f"{spec_version}/"
    if spec_version is not None or spec_version == "1"
spec_version is not None or spec_version == "1" <- doesn't the latter condition imply the former? If spec_version == "1" then it's definitely not None.
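The redundancy can be confirmed by comparing the two conditions over a few representative inputs. The function names below are illustrative only.

```python
from typing import Optional


def original_condition(spec_version: Optional[str]) -> bool:
    """The condition as written in the diff."""
    return spec_version is not None or spec_version == "1"


def simplified_condition(spec_version: Optional[str]) -> bool:
    """The same test with the redundant clause removed."""
    return spec_version is not None


samples = [None, "", "1", "2"]
equivalent = all(
    original_condition(s) == simplified_condition(s) for s in samples
)
```

Because the second clause can only be true when the first already is, dropping it changes nothing, which suggests the intended condition may have been something else.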
# If supported-versions.json is not found, then look through the root directory to find supported versions
except exceptions.DownloadHTTPError as e:
    return ["1"]
So, we need to decide how to represent:
1. spec version 1, everything lives in the root /*
2. spec version 1, everything lives under /1/*
These should be different. I think (1) should be None and (2) should be "1".
That means we should return [None].
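One way to keep the two layouts distinct is to let None mean "no version directory" when building URLs. This is a sketch under that assumption; `metadata_url`, the example base URL, and the file name are hypothetical.

```python
from typing import Optional


def metadata_url(base_url: str, version: Optional[str], filename: str) -> str:
    """Build a metadata URL; version None means files live in the root."""
    if not base_url.endswith("/"):
        base_url += "/"
    if version is None:
        return f"{base_url}{filename}"
    return f"{base_url}{version}/{filename}"


root_layout = metadata_url("https://example.com/metadata", None, "timestamp.json")
versioned_layout = metadata_url("https://example.com/metadata", "1", "timestamp.json")
```

Using None rather than an empty string avoids confusing "version 1 in the root" with "version 1 under /1/", which is the distinction the comment above asks for.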
return repository_versions["supported_versions"]

# If supported-versions.json is not found, then look through the root directory to find supported versions
except exceptions.DownloadHTTPError as e:
Maybe worth checking e.status_code == 404; otherwise it could be a real error.
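A sketch of the 404 check, using a local stand-in exception so the example runs without python-tuf installed. The stand-in mirrors an exception that carries a `status_code` attribute; `get_repository_versions` and the injected `fetch` callable are hypothetical, and the `[None]` fallback follows the suggestion elsewhere in this review.

```python
from typing import Callable, List, Optional


class DownloadHTTPError(Exception):
    """Local stand-in for tuf.api.exceptions.DownloadHTTPError."""

    def __init__(self, message: str, status_code: int) -> None:
        super().__init__(message)
        self.status_code = status_code


def get_repository_versions(
    fetch: Callable[[], List[str]]
) -> List[Optional[str]]:
    """Return the repository's versions, or [None] if the file is absent."""
    try:
        versions: List[Optional[str]] = list(fetch())
        return versions
    except DownloadHTTPError as e:
        if e.status_code != 404:
            raise  # a real failure, not a missing supported-versions.json
        return [None]


def missing() -> List[str]:
    raise DownloadHTTPError("not found", 404)


def broken() -> List[str]:
    raise DownloadHTTPError("server error", 500)


fallback = get_repository_versions(missing)
try:
    get_repository_versions(broken)
    reraised = False
except DownloadHTTPError:
    reraised = True
```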
)

def test_get_spec_version_supported(self) -> None:
    """This uses the default SUPPORTED_VERSIONS variable from updater.py"""
I think it's a good idea to check that SUPPORTED_VERSIONS == ["1"]; it won't change for a long time.
    "3 is selected as the spec version and no warning ensues",
)

def test_get_spec_version(self) -> None:
I think now you need to account for the None case for repository_versions.
This is actually starting to get a little confusing: there are "version/path pairs" (which come from the repository versions), and then there are just "versions" (from supported versions). _get_spec_version should pick a matching (version, path) pair.
It might be worth encoding that in the type system.
Or maybe even just have a function from "path" to version and use that in _get_spec_version.
SpecificationVersion = NewType("SpecificationVersion", str)
WarningMessage = NewType("WarningMessage", str)

repository_versions = [int(i) for i in repository_versions]
This is gonna be subtle when we have to worry about None too (just a warning).
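The subtlety is that int(None) raises a TypeError, so a list comprehension like the one above fails as soon as None appears in the list. A sketch of one way to handle it follows; `highest_version` is a hypothetical helper, and treating the unversioned root layout as older than any numbered version is an assumption.

```python
from typing import List, Optional


def highest_version(repository_versions: List[Optional[str]]) -> Optional[str]:
    """Highest numbered version, or None if only the root layout exists.

    None (unversioned root layout) is treated as older than any numbered
    version; that ordering is an assumption made for this sketch.
    """
    numbered = [v for v in repository_versions if v is not None]
    if not numbered:
        return None
    return max(numbered, key=int)


only_root = highest_version([None])
mixed = highest_version([None, "1", "2", "10"])
```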
#2114 claims that it supersedes this PR. Can we close here?
Sure
• Commits fc5bd5c to 88c140a are based around setting up a sample metadata structure inside the repository folder (which is inside repository_data) for testing
• 774a039 is for reverting repository back to its original state and 8845ebd stores the metadata inside a new TAP 14 folder
• df46e38 and 0f50945 add test functions that check inside the TAP 14 folder
• Currently working on implementing changes to the client update process inside the updater.py file