Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix crash when uploading a package through scoped API key #647

Merged
merged 2 commits into from
Jul 31, 2023

Conversation

gabm
Copy link
Contributor

@gabm gabm commented Jun 9, 2023

fixes #642

Problem

  • the uploader_id gets set to the user_id of the uploader when uploading package files
    • in case a scoped API key is used, that user_id points to an "anonymous user"
  • when listing packages, that anonymous user does not have a name (null) and provokes a crash

Solution

  • when uploading through an API key, always retrieve the owner_id and use that as uploader_id

@gabm gabm changed the title Fix crash when uploading a channel through scoped API key Fix crash when uploading a package through scoped API key Jun 9, 2023
@janjagusch
Copy link
Collaborator

@gabm, can you please rebase with the latest commit on main?

@gabm gabm force-pushed the fix-uploader-handling branch from dcbc366 to 3691ed6 Compare June 20, 2023 13:41
@gabm
Copy link
Contributor Author

gabm commented Jun 20, 2023

done

@janjagusch janjagusch added the bug Something isn't working label Jun 20, 2023
@janjagusch
Copy link
Collaborator

@gabm, can you fix the lint, please?

@gabm
Copy link
Contributor Author

gabm commented Jun 22, 2023

actually the linter was very helpful.. I fixed the problems.. pls have a look

@gabm
Copy link
Contributor Author

gabm commented Jun 22, 2023

I am actually not sure whether we also want to use the owner in other places... the theory goes like this

  • the user might be a "virtual user" corresponding to "downgraded" privileges
  • the owner is the "physical user" that issued the API key and therefore the "parent" of the "virtual user" above

In some places we want to use one and in other places the other. For example when uploading a package we want to set the owner as the owner of the pkgs, but in other places we need the reduced privileges of the virtual user...

I am not familiar enough to review all the other uses or get_user and assert_user, but some might be replaced the owner method in some places. But then again its "dangerous" to touch, as it also escalates the privileges again..

Copy link
Collaborator

@AndreasAlbertQC AndreasAlbertQC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gabm thanks for the PR. I appreciate the clarity of your fix. On the code itself, I mostly have nitpicky suggestions, I hope you don't mind.
One actually important thing though: Would you mind also writing a test that captures the bug (==fails with the old implementation, but works with your fix?)

quetz/authorization.py Outdated Show resolved Hide resolved
quetz/main.py Outdated Show resolved Hide resolved
quetz/main.py Outdated Show resolved Hide resolved
quetz/main.py Outdated Show resolved Hide resolved
quetz/main.py Show resolved Hide resolved
@AndreasAlbertQC
Copy link
Collaborator

I am actually not sure whether we also want to use the owner in other places... the theory goes like this

  • the user might be a "virtual user" corresponding to "downgraded" privileges
  • the owner is the "physical user" that issued the API key and therefore the "parent" of the "virtual user" above

In some places we want to use one and in other places the other. For example when uploading a package we want to set the owner as the owner of the pkgs, but in other places we need the reduced privileges of the virtual user...

I am not familiar enough to review all the other uses or get_user and assert_user, but some might be replaced the owner method in some places. But then again its "dangerous" to touch, as it also escalates the privileges again..

My intuition here would be that the current implementation before your fix, where an anonymous user is used, is the most robust in the sense that it never leaks permissions. If anything, it is problematic because you sometimes get too few permissions. Therefore, unless we are fixing a specific problem (as you did in this PR), I would not change the existing system for fear of accidentally leaking permissions for no gain.

@gabm gabm force-pushed the fix-uploader-handling branch 3 times, most recently from a5650af to 069e0ed Compare July 4, 2023 04:20
@gabm gabm force-pushed the fix-uploader-handling branch 2 times, most recently from 7a83441 to c3c1812 Compare July 17, 2023 06:14
@gabm
Copy link
Contributor Author

gabm commented Jul 17, 2023

thanks for those hints @AndreasAlbertQC . I added the tests as well.. pls have a look.

P.S.: I had a hard time working through the pytest fixtures, they are all over the place :/

@gabm gabm force-pushed the fix-uploader-handling branch from c3c1812 to 9db1f58 Compare July 17, 2023 06:20
Copy link
Collaborator

@AndreasAlbertQC AndreasAlbertQC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the additional work and tests @gabm ! Tests look good to me, I would just add a a try/finally around the yields in the fixtures to headaches if a test throws an exception. Apart from that, this is ready to go IMO

quetz/tests/api/test_main_packages.py Show resolved Hide resolved
quetz/tests/api/test_main_packages.py Show resolved Hide resolved
quetz/tests/api/test_main_packages.py Show resolved Hide resolved
quetz/tests/api/test_main_packages.py Show resolved Hide resolved
@beenje
Copy link
Contributor

beenje commented Jul 19, 2023

Hi! Thanks for working on this. I have the same issue :-)

I did a quick test and this branch indeed works for new uploaded packages.
What about existing packages? Should this PR include a migration script to fix the database?

Also just wondering if an alternative solution could not have been to show "anonymous/none" for the uploader.
There must be a reason for the user_id to be set to this user?

Copy link
Collaborator

@AndreasAlbertQC AndreasAlbertQC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As @ivergara kindly pointed out to me, pytest already takes care of the exception handling for yielding fixtures such that the try/except block I asked for is not needed. My bad!

@AndreasAlbertQC
Copy link
Collaborator

The CI is fixed on main (pydantic dependency pin). @gabm rebasing your branch on main should fix the CI here.

@gabm gabm force-pushed the fix-uploader-handling branch from 9db1f58 to 8a4ae18 Compare July 24, 2023 04:02
@gabm
Copy link
Contributor Author

gabm commented Jul 24, 2023

done

@gabm
Copy link
Contributor Author

gabm commented Jul 24, 2023

What about existing packages? Should this PR include a migration script to fix the database?

For us its not a huge problem, we fixed it manually so far. Currently there is no straight forward way to fix it in the DB because the uploader information is lost iirc.

Also just wondering if an alternative solution could not have been to show "anonymous/none" for the uploader. There must be a reason for the user_id to be set to this user?

The full details are outlined here #642. In short: the current user model mixes "identity" and "authorization" to some extent. When you create an API key with a scope, then a new user without username is created and added as a member/maintainer to the packages/channels etc... When uploading a package with that API Key, it correctly resolves to that new, anonymous user that has the restricted authorization - thus preventing privilege escalation. Unfortunately, as this anonymous user has no identity (no name etc), the uploader field remains unset and that causes the error. The uploader field should be set with the identity of the "owner of the API Key". This identity should imho not be "anonymous"...

@beenje
Copy link
Contributor

beenje commented Jul 24, 2023

Also just wondering if an alternative solution could not have been to show "anonymous/none" for the uploader. There must be a reason for the user_id to be set to this user?

The full details are outlined here #642. In short: the current user model mixes "identity" and "authorization" to some extent. When you create an API key with a scope, then a new user without username is created and added as a member/maintainer to the packages/channels etc... When uploading a package with that API Key, it correctly resolves to that new, anonymous user that has the restricted authorization - thus preventing privilege escalation. Unfortunately, as this anonymous user has no identity (no name etc), the uploader field remains unset and that causes the error. The uploader field should be set with the identity of the "owner of the API Key". This identity should imho not be "anonymous"...

Thanks for the additional information about restricted authorization. I agree using the owner_id makes sense.

What about existing packages? Should this PR include a migration script to fix the database?

For us its not a huge problem, we fixed it manually so far. Currently there is no straight forward way to fix it in the DB because the uploader information is lost iirc.

I think it's possible to have a migration script:

  • we can get null users from users table
  • in the api_keys table, we have the link user_id / owner_id
  • In package_members table, we want to replace the user_id of null users with owner_id from the api_keys
  • In package_versions table, we also have to replace the uploader_id with the same owner_id

Or I guess we don't want to change package_members, right? Only package_versions.

@AndreasAlbertQC
Copy link
Collaborator

Thanks @gabm!

@beenje Do you want to break out the migration followup into a separate issue or PR? In the interest of somewhat timely turnaround, I'd like to merge this PR now so that we at least have the fix in the main.

@beenje
Copy link
Contributor

beenje commented Jul 31, 2023

Thanks @gabm!

@beenje Do you want to break out the migration followup into a separate issue or PR? In the interest of somewhat timely turnaround, I'd like to merge this PR now so that we at least have the fix in the main.

Yes, go ahead! I will try to create another PR for the migration.

@AndreasAlbertQC AndreasAlbertQC merged commit e2a5845 into mamba-org:main Jul 31, 2023
beenje added a commit to beenje/quetz that referenced this pull request Aug 2, 2023
Update package_versions uploader with API keys owner instead of the key anonymous user.
This is linked to mamba-org#647
and allow to update existing databases.
AndreasAlbertQC pushed a commit that referenced this pull request Aug 23, 2023
* Add migration script for scoped API keys

Update package_versions uploader with API keys owner instead of the key anonymous user.
This is linked to #647
and allow to update existing databases.

* Fix issue with sqlalchemy >=2.0

SQL query needs to be wrapped in sqlalchemy.text.
Using raw string as been removed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unable to list package version after uploading via channel-scoped API token.
4 participants