Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add authorization layer to server request handlers #165

Merged
merged 13 commits into from
Feb 9, 2022

Conversation

Zsailer
Copy link
Member

@Zsailer Zsailer commented Jan 8, 2020

Adds an authorization layer to Jupyter Server request handlers.

Background (inspired by the current authentication layer)

Currently, Jupyter Server has a simple way to enable authentication. It provides a thin layer of logic that can be easily configured to implement various different authentication methods above the server. It remains flexible and useful at the same time.

Specifically, each request handler is wrapped with an authenticated decorator which checks if the current_user is known by the server. If the user is not known, they can be redirected to a login handler and asked to sign-in. By default, authentication is turned off—i.e. no login handler is configured, all users are anonymous, and anyone can make requests to the server. To turn-on authentication, one simply adds a login handler, then authenticates users through the get_current_user() method. This method can be patched to implement whatever authentication system one would like.

JupyterHub, for example, leverages Jupyter Server's thin authentication layer to build a more robust authentication system on top of Jupyter Server. Hub serves its own login page+handler and patching the get_current_user() method in Jupyter Server handlers to check whether incoming requests came from users known by JupyterHub's authenticator. The single-user notebook servers inside Hub are basically Jupyter Servers with Hub's own authentication system patched-in.

Details (the "how"... adding a thin authorization layer)

This PR adds a thin layer of logic that makes authorization possible in Jupyter Server—allowing operators to plugin their specific authorization system into the server. It follows a similar approach to the authentication layer already in Jupyter Server.

Specifically, each request handler is wrapped with an authorized decorator that checks whether the current_user is authorized to make the following request. This is done by calling a user_is_authorized(user, action, resource) method before each JupyterHandler (i.e. the base class for all Jupyter server request handlers). Each request is labeled as either a "read", "write", or "execute" action:

  • "read" wraps all GET and HEAD requests.
  • "write" wraps all PUT, PATCH, POST, and DELETE requests.
  • "execute" wraps all requests to ZMQ/Websocket channels.

If user_is_authorized(...) returns True, the request is made; otherwise, a HTTPError(401) (401 means "unauthorized") error is raised, and the request is blocked.

By default, authorization is turned off—i.e. user_is_authorized() always returns True and all authenticated users are allowed to make all types of requests. To turn-on authorization, patch the user_is_authorized(...) method with your desired authorization method. Your patch should have the following signature:

def user_is_authorized(self, user, action, resource):
    """A method to determine if `user` is authorized to perform `action` 
    (read, write, or executed) on the `resource` type. 
    
    Parameters
    ------------
    user : usually a dict
        a user model with group, role, or permissions information.

    action : str
        the category of action for the current request: read, write, or execute.

    resource : str
        the type of resource (i.e. contents, kernels, files, etc.) the user is requesting.

    Returns True if user authorized to make request; otherwise, returns False.
    """
    ...

# Example logic where we patch in the method
from jupyter_server.base import JupyterHandler.

JupyterHandler.user_is_authorized = user_is_authorized

I've added a few examples of simple implementations of authorization on local, single-user servers in the examples/authorization folder.

Implications (the "why"...)

In short, authorization allows operators to implement finer-grained permissions/access control to running Jupyter Servers. For example, JupyterHub could leverage this layer to build group/role-based access to a shared Jupyter Server and even set different types of permissions for each user to that shared server.

My goal here was to build a layer that is flexible for all forward-looking use-cases. Looking ahead, I think this layer impacts discussions around real-time collaboration in Jupyter, shared Jupyter servers/services (i.e. hubshare, nbgrader, jupyterhub "projects", etc.), and security. I want to make sure we design this layer to be as flexible as possible. I think this remains flexible enough for these use-cases while also being useful.

To dos

  • add tests
  • add documentation

@ellisonbg @blink1073 @minrk @captainsafia @SylvainCorlay @yuvipanda @rgbkrk @echarles @kevin-bates @saulshanabrook @jhamrick

@blink1073
Copy link
Contributor

This looks great @Zsailer, very well thought out!

Copy link
Member

@kevin-bates kevin-bates left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really cool Zach - thanks for starting this!

jupyter_server/base/handlers.py Outdated Show resolved Hide resolved
examples/authorization/README.md Outdated Show resolved Hide resolved
jupyter_server/services/config/handlers.py Outdated Show resolved Hide resolved
@Zsailer Zsailer closed this Jan 10, 2020
@Zsailer Zsailer deleted the authorization branch January 10, 2020 17:34
@Zsailer Zsailer restored the authorization branch January 10, 2020 18:02
@Zsailer Zsailer reopened this Jan 10, 2020
@Zsailer
Copy link
Member Author

Zsailer commented Jan 10, 2020

Oops. Closed this on accident!

@Zsailer Zsailer force-pushed the authorization branch 2 times, most recently from 1343a1e to fd7f44a Compare January 10, 2020 21:02
@Zsailer Zsailer added this to the Future milestone Apr 15, 2020
@Zsailer Zsailer marked this pull request as draft April 29, 2020 17:07
@meeseeksmachine
Copy link
Contributor

This pull request has been mentioned on Jupyter Community Forum. There might be relevant details there:

https://discourse.jupyter.org/t/hubshare-progress/4184/2

@SylvainCorlay
Copy link
Contributor

This is really cool!

I wonder if we could also use an authorization layer for the allowed kernel message types . For example, the owner of a single-server notebook may be able to launch notebooks that take execution requests, while others may only be able to e.g. use Voilà dashboards, which don't require execution requests.

@kevin-bates
Copy link
Member

@SylvainCorlay - that seems possible if we had a mapping from message types to "action" (read, write, execute), then you could modify this code with a call like:

if user_is_authorized(user, get_msgtype_action(mt), "kernel messages"):
    stream = self.channels[channel]
    self.session.send(stream, msg)
else:
    self.log.warning('Received message of type "%s", which is not allowed. Ignoring.' % mt)

jupyter_server/utils.py Outdated Show resolved Hide resolved
Copy link
Member

@kevin-bates kevin-bates left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In looking at this again, it would be helpful to have a table of resource names indexed by action that describes the semantics of what authorization enables on the resource. E.g., read/write/execute semantics for kernels differ for contents, etc. (I guess we'll want these defined as enumerations somewhere so extensions know what's available?)

resource read write execute
kernels receive results execute code control lifecycle
contents read content write content n/a

@davidbrochart davidbrochart force-pushed the authorization branch 2 times, most recently from 9d8f806 to ad2fe46 Compare April 8, 2021 15:51
@Zsailer Zsailer changed the title [WIP] Add authorization layer to server request handlers Add authorization layer to server request handlers May 3, 2021
@davidbrochart
Copy link
Contributor

I started working on authentication in https://github.com/davidbrochart/jupyterlab-auth, which allows to authenticate to JupyterLab with a GitHub account. It plays well with RTC by plugging into Yjs awareness, using https://github.com/davidbrochart/jupyterlab/tree/yjs_awareness. This means that you can see who's doing what in your notebook in real-time.
I think the next step is giving different permissions to these users, which brought me here. I'd like to revive this PR and start a discussion on how we can connect authentication with authorization. As a minimal starting point, I was thinking we could have a file where each allowed user would be given a permission. Thoughts?

@codecov-commenter

This comment has been minimized.

@davidbrochart
Copy link
Contributor

Maybe we should try and align with the work being done in JupyterHub, about Role Based Access Control (JupyterHub RBAC documentation and PR).

@minrk
Copy link
Contributor

minrk commented Jan 14, 2022

Having put some more into applying authorization in .get_current_user, I don't think a 'strict' mode can really work, at least not in a remotely backward-compatible way. It cannot be assumed that calling .get_current_user means a handler is authenticated - it can be done e.g. purely for logging purposes on non-authenticated handlers (we do this on all requests already, even static files). There are also too many reasonable cases that should be protected by authentication without necessarily needing to define a resource for every page. I think even warning about it is going to be hard to not fill with false positives.

So I think the only really feasible goal is to provide an explicit tool like jupyter server check-authorization that produces a report on the registered Handlers and their authorization scopes that a deployment can review. I think that should be a separate PR.

I also put together the resources and action combinations into a table in the docs, adapted from #540.

With that said, I think this PR is ready to review again, and I don't have any plans for further implementation here right now until there's another round of feedback.

@minrk minrk mentioned this pull request Jan 20, 2022
6 tasks
minrk added 2 commits January 31, 2022 13:40
since it's a public API packages should import,
let's not nest it deep in services.auth.authorizer
@Zsailer
Copy link
Member Author

Zsailer commented Feb 4, 2022

Thank you, Min. I did a deep review this morning—this looks good to go.

We discussed this in yesterday's Jupyter Server meeting. We are planning to merge next Tuesday, giving people a couple more days for final review (I know @echarles was planning on doing a final pass). If nothing major pops up before then, we'll send it! 🚀

Thank you for carrying this across the finish line, @minrk. Great stuff here and I'm excited to see this land!

@Zsailer
Copy link
Member Author

Zsailer commented Feb 4, 2022

So I think the only really feasible goal is to provide an explicit tool like jupyter server check-authorization that produces a report on the registered Handlers and their authorization scopes that a deployment can review. I think that should be a separate PR.

Let's open an issue to track adding a check-authorization tool when this PR is merged.

@echarles
Copy link
Member

echarles commented Feb 4, 2022

I was planning to do the review today, but other things have taken me. Will do that tomorrow.

@echarles
Copy link
Member

echarles commented Feb 5, 2022

I have tested the PR with the examples and my own custom config, this works great.

I have scanned the annotated methods, and they all seem covered, but a tool like would bring more certainty, I have opened #692 to track such a check-authorization tool as discussed above.

Diff LGTM

@minrk There is a merge conflict. Once you resolve it I can approve these changes.

@minrk
Copy link
Contributor

minrk commented Feb 9, 2022

Thanks! Conflicts resolved.

@minrk
Copy link
Contributor

minrk commented Feb 9, 2022

Some tests have started failing, but the same tests are failing in main as well

@minrk
Copy link
Contributor

minrk commented Feb 9, 2022

I opened #697 to track defining actions for individual kernel messages, as proposed by @SylvainCorlay and @kevin-bates.

@echarles
Copy link
Member

echarles commented Feb 9, 2022

I had just relaunched. We will see.

@echarles
Copy link
Member

echarles commented Feb 9, 2022

+++++++++++++++++++++++++++++++++++ Timeout ++++++++++++++++++++++++++++++++++++
tests/services/kernels/test_api.py::test_kernel_handler_startup_error_pending[jp_server_config2] <- ../../../../../opt/hostedtoolcache/Python/3.9.10/x64/lib/python3.9/site-packages/jupyter_server/tests/services/kernels/test_api.py
Error: Process completed with exit code 1.

@echarles echarles mentioned this pull request Feb 9, 2022
@echarles
Copy link
Member

echarles commented Feb 9, 2022

Still failing. That specific CI issue is already tracked on #677 (comment)

Copy link
Member

@echarles echarles left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Tested locally and works great. CI failure is not relevant and tracked in #677 (comment).

I will wait until tomorrow before merging if anyone has more feedbacks.

Kudos to @Zsailer and @minrk and all others who contributed and tested.

@Zsailer
Copy link
Member Author

Zsailer commented Feb 9, 2022

Yeah, these tests failures are unrelated. Our test suite is a bit flakey after changes from pending kernels. That will be addressed in a separate PR.

Thank you, @minrk. I'm thrilled to see this work finally land in main (2 years later!). Merging away 🚀 !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.