Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New Auth spec #375

Merged
merged 5 commits into from
Nov 28, 2024
Merged

New Auth spec #375

merged 5 commits into from
Nov 28, 2024

Conversation

AlexGodbehere
Copy link
Contributor

@AlexGodbehere AlexGodbehere commented Nov 14, 2024

Specify the data model used for the new Auth service. This defines ACLs in terms of templates, allowing a single grant to expand to multiple related grants on different services. The template language is able to look up information from other services allowing support for dynamically-defined ACLs.

docs/auth/sexpr.md Show resolved Hide resolved
Comment on lines +200 to +207
* `let` (*): `["let" bindings ...body]`. The `bindings` must be a list
with an even number of items; odd numbered items must be strings and
name new bindings. Each is followed by a value. A new set of bindings
is constructed starting from the current set. The first value is
evaluated using the current bindings and a binding for the first name
is added with the result. Each subsequent value is evaluated using the
new bindings and added to the set. Finally `body` is list-evaluated
using the new bindings and the result returned. (List)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain this in different terms? I don't entirely understand and as such can't work out what line 294 is doing below.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is 'variable' assignment (scare quotes because they aren't actually variable). It's equivalent to a block in Javascript with a set of const assignments at the top. Given [let [a VALUE] ...] the ... is evaluated, and anywhere inside we replace [a] with VALUE. So:

[let [a 42] [a]] -> [42]
[let [a 42 b "abc"] [b] [a]] -> ["abc" 42]
[let [a 42 b "abc"] [format "%s/%s" [a] [b]]] -> ["42/abc"]

In the example on L294, node is a variable created for the scope of the let. [id [principal] sparkplug] is the value assigned to that variable, which evaluates to an object like { group: "Core", node: "Node" } giving our assigned Sparkplug address. The body of the let contains a map, which iterates over a list built using the node variable we assigned to; the map creates another variable addr, and the expansion below shows the various stages in evaluating it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose given this is JSON we could actually use the syntax [let {a: 42} [a]]. I wonder if that would be less confusing?

Comment on lines +17 to +19
* Sparkplug Nodes are principals; their Sparkplug address information is
moved into the Auth service to sit alongside the Kerberos identity as
an alternative name.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate on what this means? Does this contradict the UNS direction, should we choose to pursue it? With the UNS concept Sparkplug is only one of potentially many channels of data and we could expect other top-level namespaces (and their ingester pods) sending data into the /UNS/v1 namespace.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This definitely doesn't contradict anything from UNS. All Sparkplug Nodes are principals, but not all principals are Sparkplug Nodes; a Node address is one of several identifiers a principal might have. Securing the UNS may mean introducing additional forms of identifier, I'm not sure.

The original driver for this was cmdesc over MQTT: given an escalation request, we need to identify the principal it originated from. Who is that? It's the Sparkplug Node which published the packet. (We have to trust that the MQTT ACLs are appropriate, obviously.) Subsequently it became clear to me that if we want any sort of auditing of 'is this Node allowed to publish a Device using the My_Important_Robot UUID?' then we again have to treat the Sparkplug Node address as a principal identifier analogous to the Kerberos UPN.

@amrc-benmorrow amrc-benmorrow changed the title Bmz/auth spec New Auth spec Nov 15, 2024
@amrc-benmorrow amrc-benmorrow self-assigned this Nov 15, 2024
If this is to be implemented we need a definite spec. This is a
beginning but it needs a lot more work.
I think this is the idea most likely to achieve what we want, but the
devil will be in the details. We must

* Avoid giving the Auth service too much work to do.
* Avoid making it impossible to implement a consuming service,
  remembering HiveMQ must use Java.
* Avoid bootstrap problems.
After careful consideration, I don't believe it is possible to require
consuming services to handle their own ACL expansions. This is because:

* We can only apply ReadPermission permission after full expansion. If
  we pass a consuming service an unexpanded ACL we are allowing them to
  see detail of other permission grants. The service would also have to
  explicitly ignore these unnecessary ACEs.

* Passing a consuming service a fully-expanded ACL is the minimum
  information that service needs to do its job. Requiring the service to
  have permission to perform its own lookups appears more secure, but
  isn't; passing the full ACL gives the service exactly the information
  needed and no more, without any additional configuration.

Additionally, requiring every consuming service to perform its own ACL
expansion makes the job much more complicated. The primary reason for
this was to avoid bootstrap problems with MQTT change-notify, but if we
are now using direct WebSockets instead this is no longer a problem. In
fact, a single WS to the Auth service is now simpler than a web of WSs
to each individual service required for ACL expansion.
* Remove `get` in favour of [binding key]. This means an indexed object
  must be let-bound, which I don't see as a problem. I really don't want
  to introduce evaluation of the call slot, it kills static analysis.

* Remove partial evaluation, and require base permissions to be
  explicitly declared (in the ConfigDB). We cannot defer evaluation to
  the consuming service as this would prevent us from implementing ACLs
  on ACLs.

* Start looking into static analysis of template definitions so we can
  track which sources of information apply to a given request from a
  consuming service, and which ACEs will need expanding.
@amrc-benmorrow amrc-benmorrow changed the base branch from main to feature/auth-rework November 21, 2024 09:59
@amrc-benmorrow amrc-benmorrow marked this pull request as ready for review November 21, 2024 10:00
@amrc-benmorrow
Copy link
Contributor

@AlexGodbehere I can't request a review from you... are you happy with this as the spec?

@amrc-benmorrow amrc-benmorrow merged commit 0027dc1 into feature/auth-rework Nov 28, 2024
1 check passed
@amrc-benmorrow amrc-benmorrow deleted the bmz/auth-spec branch November 28, 2024 08:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants