Claims-based authorization #419
I've handled this out-of-band, in the transport layer. If communication occurs via an HTTP API, then the access token can be included in an Authorization header. If communication occurs over a web socket, there isn't a standardized way to include credentials, but I've done it by including the token as a field (in JWT form) in the data packet. In either case, the recipient can validate the token prior to accepting the changes included in the payload. This implies the presence of some kind of authorization service whose role is to validate the credentials associated with each change, which in turn suggests some kind of client-server setup. I can't imagine that you'd want to be broadcasting tokens around a peer-to-peer network.
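A minimal sketch of the web-socket case, assuming the `jsonwebtoken` package, a made-up packet shape `{ token, changes }`, and the Automerge 1.x `applyChanges` signature:

```js
const jwt = require('jsonwebtoken')
const Automerge = require('automerge')

// Hypothetical packet shape: { token: '<JWT>', changes: [Uint8Array, ...] }
function onMessage(doc, packet, senderPublicKey) {
  try {
    // jwt.verify throws if the signature is invalid or the token has expired
    jwt.verify(packet.token, senderPublicKey)
  } catch (err) {
    return doc // invalid credentials: ignore the payload, keep the doc unchanged
  }
  // Credentials check out; accept the changes that accompanied the token
  const [newDoc] = Automerge.applyChanges(doc, packet.changes)
  return newDoc
}
```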
@MeneDev I think your approach is really interesting but there are a few problems to solve. First is, as @Steve-OH points out, that you might not need to actually change Automerge to do this, but can simply unbox the changes, check their signatures, and then decide whether to pass them into `applyChanges`. I think more interestingly the question is what to do if you get a change that is no good. Right now, every change in Automerge depends on every change that was locally visible when the change was made, and each local node has only a single linear history. That means that if you reject a change from a source, you won't be able to apply any further changes from it ever, nor will you be able to accept changes from people who do. That might not be a problem in your case -- perhaps quarantine / excision of misbehaving nodes is desirable? -- but it does limit the flexibility of the approach.
The "decide whether to pass them into Therefore finding solutions to the problem of changes which depend on changes which we don't want to accept would be very useful. Even if there's no way to allow that it would be useful to be able to roll back a single change - that is, to expunge it from the backend and emit a diff which removes it from the frontend. At the moment to check a change you have to apply the change, examine the result and if it is invalid you have to throw the whole document away and rerun the history up to the offending change. |
I don't see a problem with authentication (who did it), but with authorization (is that "who" allowed to do it). I can sign each change and validate the resulting signature before passing it to `applyChanges`. The problem is decentralized authorization.

Say I have a document for a party I'm organizing. I invite both of you, @Steve-OH and @pvh, and Malcom. Each of you gets a key to sign changes. @Steve-OH adds a change to the notes: "I'll bring my new favourite board game", @pvh adds "I'll organize food". Malcom changes the time. He's not allowed to do that, but he produces the corresponding change nonetheless and signs it with his valid key – he has a valid key, because I also invited him and he's allowed to change the notes field.

I believe this could be accomplished with something akin to JWT: a token that contains claims, claims that become permissions when the token has a valid signature. In this case one claim could be a list of document ids whose date field the holder may change. There are certainly other ways, but claims-based authorization allows decentralized validation: whenever anyone receives a change, the signature is checked; when it's valid, a callback can decide if the change is authorized by the token. In the example: when the change concerns the date field, require the id of the document to be in the corresponding claim.

Since we can pass around changes in a peer-to-peer fashion, we need to keep the tokens and the signatures with the changes, so we can send them again.

There are implications that are interesting and beyond my current understanding of the problem, though. For example, a token usually has an expiry date. Once the token has exceeded it, it becomes invalid, and in turn changes should be rejected – but rejecting a change just because it's old would mean that we don't have a CRDT anymore. However, that problem also applies to signatures. Things would also get very complicated once we'd allow changing tokens (and thereby permissions) for a document.

So yes, this approach certainly has limitations, but it would also open up many possibilities that wouldn't be possible without a change to automerge. And it would only add options, not take away anything.
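As a rough illustration of that callback, here is a sketch with invented claim names (`mayChangeDate`, `mayChangeNotes`); only the `decodeChange` call is real Automerge API:

```js
const { decodeChange } = require('automerge')

// Hypothetical claims, e.g. decoded from an already-verified JWT:
// { mayChangeDate: ['party-doc'], mayChangeNotes: ['party-doc'] }
function isAuthorized(docId, binChange, claims) {
  const change = decodeChange(binChange)
  return change.ops.every(op => {
    if (op.key === 'date') {
      // Only a token whose claim lists this document may touch the date field
      return (claims.mayChangeDate || []).includes(docId)
    }
    // Everyone invited may edit the notes
    return (claims.mayChangeNotes || []).includes(docId)
  })
}
```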
Yes, I think this is a very interesting line of reasoning to work down and I'm excited about the promise of the approach you're exploring here. I do want to suggest that as a foundational concept we should always allow local editing of any information and make agreement optional (though convenient and by default). Consider, in your example, what we would do if you lost your signing keys. Now the document's time would become immutable! There would be no way for us to agree to your new key. This would be no big deal, perhaps, for a short-lived document, but could be a big problem if the document was a shared calendar for a workplace. The extremely shorthand way I describe this change in perspective is "attention, not permission". I can scribble all over my copy of the building codes, but the inspector won't care when she comes to check if I've got the wiring right. Still, it is useful and important for me to do just that, and it's hard to predict who will participate in a system in the future: perhaps I might want to work on a proposal for a new version of that document with some collaborators. My edits may (or may not) ever be accepted upstream but there's no reason I shouldn't be able to edit, annotate, or otherwise alter my version.
Since you've invited @pvh and me to your network... The more I think about it, the more I come to the conclusion that authorization in a true peer-to-peer network is equivalent to the Byzantine generals problem. And the solution to that problem is well known: a two-peer network cannot tolerate any "traitors"; a three-, four-, or five-peer network can tolerate one "traitor"; six through eight peers can tolerate two "traitors"; and so on. In general, the number of traitors cannot be more than 1/3 of the number of peers. You can make the system more robust against traitors only by adding more peers.
@Steve-OH you're right, to make that work you either need a byzantine quorum (as far as I know that's what usually happens in blockchains, but I'm by no means an expert) or a trusted authority. In this approach that trusted authority does two things: issue the certificates that are used for signing, and sign the tokens that contain the claims. The benefit of this "almost peer-to-peer" network is that the requirements on availability of the server-side infrastructure are much lower. Even if all central servers are down, no one would even notice unless you want to invite someone new to your party. It degrades a little, but the core keeps working. The difference for users between 95% uptime and 99.9…% becomes much less relevant. You can also have "true" microservices that don't require any centralized communication that may become a bottleneck. From my minimal understanding of blockchains, I think this approach is something like a middle ground between a client-server architecture and a blockchain.
@pvh good thoughts, I need to think about them a bit more. Off the top of my head:
That is correct. I'm not sure if I would consider that a problem that automerge has to solve – in fact, I would like automerge to not know about the concept of signing keys at all. By extension, this also covers that scenario. If it's important that this must not happen, there are ways: multiple owners to increase redundancy, or a central server that stores the keys. But I think that really depends on the domain of the concrete project that uses automerge.
I think that would be domain-specific, too. The problem here is that there is a canonical "upstream" version, the one that everyone, including the inspector, has agreed upon. Once we start building, that's what we'll do. At the same time you work on a draft version. You can make suggestions, but you cannot decide if these suggestions are legal. However, the verification process is done on a concrete version of a document, not on single changes, so I think it's a slightly different use case.

Still, I can imagine a scenario where this approach is beneficial: when you are working on a big building and have several departments that work on different parts of the building, and each of those departments has a supervisor. Inside a department everyone may change things; however, when the department wants to commit to the upstream version, the supervisor must sign each change accordingly. But the token of each supervisor only allows them to perform changes in their respective part of the building. If you – as a non-supervisor – want to, you can change anything. But if you want your changes in a different department to make it to the upstream, you have to talk to / give your changes to a different supervisor. The team that is responsible for assembling the final version of the document from each department can now rely on the incoming changes from each supervisor to only relate to their part of the building, a guarantee that could help to prevent mistakes or malicious behavior. This could be implemented in a project with a claim that restricts each supervisor's token to their part of the building.

And one thing not to forget regarding your concern that documents become immutable: the documents / the changes are not encrypted. It's still possible to create a copy with new keys. It is, however, not the same document anymore.
I think it's quite promising to have a callback in `applyChanges`, though with some caveats.
If these caveats are acceptable, I'd welcome proposals for what the API should look like.
Apologies for weighing in six weeks after the fact (I was offline for a while), and also apologies for repeatedly spamming this repo to talk about a different library! 😬 I've been working almost exclusively on this problem for the last 18 months or so. Some thoughts: I don't think it makes sense to try to solve this within Automerge. Instead, I'd suggest determining identity (authentication) and permissions (authorization) elsewhere, and then using that information to inform who you're willing to talk to, and what changes from them you're willing to accept.* The best solutions I've found for purely peer-to-peer authentication & authorization are applied in a library I've been working on that I think is almost ready to release: localfirst/auth. In brief,
There's lots more interesting stuff: How do you securely invite someone to your group? How do you handle potentially conflicting concurrent membership changes? How do you enforce read permissions? For answers to these questions and so much more, please see this (draft) blog post.

*If someone makes an invalid change (e.g. one they don't have permission to make), the most straightforward solution is to refuse to merge with them until their document is back in a valid state. To continue @pvh's analogy, I'm free to edit my copy of the city's building codes on my own computer. However, I shouldn't reasonably expect anyone to pay attention to my copy, and I also shouldn't expect to be able to cleanly merge in an updated version of the official codes.
Thanks for spamming 😅 I read the blog article and I think I agree with everything in it. Especially the wording of attention vs privilege seems to be an improvement in the semantics.
That's not what I have in mind. All I want is a callback that allows logic outside of automerge to ignore changes – an attention-callback, to use your wording. Whether you use claims for that or "a chain of permission grants" is an implementation detail. What is important is that the logic outside of automerge understands the operation that will be applied. And for that I see two ways:
to determine if we want to pay attention or ignore the operation. I looked at the code and I could only find authorization with regards to the administration of the team, or infrastructural actions like changing keys. (Where "team-wide document" means a copy everyone has that we expect to be mergeable with all the other copies inside the team.) From what I understand, that also isn't the scope of localfirst/auth.
@MeneDev — I agree, and I don't object to adding this callback API to Automerge. That would be useful and probably sufficient for many use cases.

I personally would hesitate to use that approach for an application of any complexity. Yes, you could intercept Automerge's low-level operations, and then guess the implicit semantics of the change based on JSON paths, but that feels messy and brittle to me. I prefer to use changes with explicit semantics — Redux-style named actions + payloads — that I can then reason about in a very straightforward way. So instead of detecting that Bob wants to change a particular JSON path, I can see directly which named action Bob is trying to perform and decide whether to allow it. (There's a similar problem when it comes to using …)

You could then use Automerge to store your named actions + payloads — I went a long way down that path with a library called localfirst/state (formerly "cevitxe"). But what I've only recently come to realize is that a Redux-style event stream isn't that far off from being a full-fledged CRDT itself — all you need to add is a way of recording and resolving concurrent actions. (easy! 😅) The good news is that I've built that exact machinery for localfirst/auth — that's the "signature chain", a git-style DAG of signed, hash-chained actions — and I've refactored it out into a standalone library called CRDX that you can use to create a CRDT with your own semantics, validation rules, and conflict-resolution logic.⚠

This is maybe where I'm at with Automerge: I think its biggest selling point is that it works with plain old JSON objects. And I think its biggest limitation is that it works with plain old JSON objects. To the extent that a big JSON blob works for you as a way of storing state, Automerge is a good choice. If event streams or Redux or the like would make more sense for your app, using something like CRDX might put you on more solid footing. Or you could roll your own simple custom CRDT; it turns out there's no magic involved. This is the approach that @jlongster took with Actual and that @vjeux et al took with Excalidraw.

⚠CRDX is unreleased at the time of writing, docs are patchy, API may change, use at your own risk
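For illustration, a minimal sketch of such a Redux-style rule table; all action names, the rule shape, and the actor strings are invented for this example:

```js
// Authorization stated directly against explicit action semantics
const rules = {
  CHANGE_MEETING_TIME: actor => actor === 'owner',
  ADD_NOTE: () => true, // any invited member may add a note
}

function reducer(state, action) {
  const allowed = rules[action.type]
  if (!allowed || !allowed(action.actor)) return state // ignore unauthorized actions

  switch (action.type) {
    case 'CHANGE_MEETING_TIME':
      return { ...state, time: action.payload }
    case 'ADD_NOTE':
      return { ...state, notes: [...state.notes, action.payload] }
    default:
      return state
  }
}

// Bob's attempt to move the meeting is ignored; Alice's note is kept
let state = { time: '18:00', notes: [] }
state = reducer(state, { type: 'CHANGE_MEETING_TIME', actor: 'bob', payload: '06:00' })
state = reducer(state, { type: 'ADD_NOTE', actor: 'alice', payload: "I'll bring a game" })
```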
Based on some of my research and conversations with @ept and @HerbCaudill, it seems that conditionally rejecting / ignoring writes (i.e. to preserve application invariants or to deny an unauthorized write) is fundamentally at odds with eventual consistency. One solution is to coordinate such writes using consensus or a "lead actor". But if you really want to follow Automerge's eventual consistency model, all writes causally dependent on a rejected write must also be rejected (as explained by @pvh above).

Unfortunately, actors that have not synchronized in a long time may cause a very old write to be rejected, and to ensure eventual consistency, any actor who has "seen" the rejected write must subsequently reject all causally dependent writes. Thus, it is possible that days after a given state has settled, a single actor can cause a vast number of writes to be rejected due to a single rejected write. In my opinion, this trade-off is simply not acceptable for most applications; the entire reason for convergence and consistency is to provide the user with a reasonably stable application state. Somewhat ironically, preserving eventual consistency while rejecting writes would very likely confuse users and either cause data loss or a large number of conflicting writes.

All of that said, it appears that decentralized authorization can be achieved with Automerge if you never want to revoke access to a document in the future.

Finally, in my experience, there are many applications where the vast majority of writes can be eventually consistent, but unfortunately, there are few applications where all writes can be eventually consistent. In my view, Automerge is a tool to augment performance and provide a well-designed "offline mode" for eventually consistent writes. Still, for most apps, another tool or database will be required to provide durability and strong consistency guarantees.

I would love to be wrong about any of this. Sadly, if I am correct, it seems that rejecting writes is not something that Automerge should do unless it embraces some form of consensus for those writes. What do you think? Is this outside the scope of Automerge? Should this issue be closed?
I think, broadly, what we're seeing is a pull to transition from "CRDT-as-mechanism-for-guaranteed-consensus" to "CRDT-as-engine-of-document-versioning". All of us around the lab have been thinking about this for a while now, and I think that once the performance & sync work clears off our plate this is going to be an area of major focus. I'm not entirely sure what the trade-offs are for your application, but in a general sense, I think the short-term solutions are: 1) don't sync with people you don't trust — as with traditional documents, if you let a stranger write to your files you're going to have a bad time; and 2) recognize that late-arriving data probably shouldn't blow away work that depended on it, but we could at least detect it and surface the problem: "Hey, you probably want to audit this set of changes since they all depend on work done by someone we now know to be suspicious." Do either of those help in your situation, @bminer?
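A sketch of what that detection in (2) could look like. It assumes decoded Automerge changes exposing `actor`, `deps`, and `hash`, visited in causal order (dependencies before dependents, which is the order a full change history is applied in):

```js
// Collect every change that transitively depends on work by a now-suspicious
// actor, so the user can be asked to audit them rather than losing them.
function changesToAudit(decodedChanges, suspiciousActor) {
  const tainted = new Set()
  for (const change of decodedChanges) {
    if (change.actor === suspiciousActor || change.deps.some(d => tainted.has(d))) {
      tainted.add(change.hash)
    }
  }
  return tainted // hashes of changes worth surfacing to the user
}
```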
@pvh - Thanks for your response! I find it really interesting to discuss this stuff. Regarding (1): sometimes you sync with people you trust, and then after some time passes, you don't want to trust them anymore (i.e. an employee who leaves to work for a competitor, or a customer who stopped paying their bill). If you can afford to always allow reads and writes, then I think Automerge works well. Regarding (2): if late-arriving data conflicts with a large number of writes, what do you do about it? If you can afford to always accept the write, then Automerge again works well. I like your suggestion of detecting a suspicious write and perhaps asking the user to undo it if needed. Still, this does not prevent future writes and requires manual user intervention. I suppose if a malicious person continues to write to a document, actors could coordinate to copy the document from an agreed-upon snapshot to a new document and continue writing it from there. Still, this requires consensus among actors, which @ept has expressed might be outside the scope of Automerge. At a high level, this issue is about authorization, which implies that unauthorized writes need to be rejected. Maybe this is something Automerge should not support? I don't know. Thoughts?
All other ideas I have about rejecting writes involve consensus in some form or another. For example, (perhaps outside of Automerge) actors can form a consensus about a document and agree to ignore writes from an actor after a particular logical timestamp. 🤷
Right -- what I'm saying is that right now the way you can solve this is to stop syncing with them. In the future we may be able to get clever-er, but maybe this solution is good enough? I think on the whole that if you think about this as a git repository you'd have most of the same dynamics. If someone has made commits and you have decided you don't want them around anymore, you just have to stop pulling from their repo / stop letting them write to yours. If someone else you know already accepted writes from them and then built on them... you'll have to audit those commits to make sure there wasn't any malfeasance, but you can't really just drop them. We've got some upcoming research into this kind of thing (though not necessarily exactly down this vector) which might provide the necessary infrastructure to do what you want to do, but on the whole the more we discuss this, the more it seems to me that the best place to solve these problems is up in the sync layer. I think the peer-to-peer document-identified-by-public-key approach typified by Hypermerge is a bust in this regard. I think what you want is something that feels more like git remotes except... not... bad.
As elaborated by Herb Caudill in his blog post (still unpublished, I believe) at https://draft.herbcaudill.com/words/20210228-establishing-trust-in-a-world-without-servers, the concept of "authorization" doesn't fit at all well with local-first distributed collaboration. You either have to replace authorization with some other concept (e.g., you can simply disregard the contributions of someone who is "unauthorized"), or continue using a centralized authorization server, which is necessarily outside the scope of Automerge.
@pvh - Yeah, I agree; this problem can be solved at the sync layer, but...
This is exactly my point. How should one handle this case? Some actors could accept a sync from an unauthorized actor (perhaps even "honestly" due to timing / network issues) and then propagate those changes through to other actors. It seems to me that a consensus among actors is required to prevent further syncing with an unauthorized actor. I suppose if you used a set of trusted, centralized sync servers instead of peer-to-peer syncing, this problem fades away, as in your example of a central Git repository. The more I think about this, the more I feel like writes should simply never get rejected. You can try to block actors by attempting to stop syncing with them, but this is not foolproof. Maybe this is good enough? Thoughts?
The nice thing about automerge is you always have the full history of every document. As with a physical document, anyone can write anything they want... you just don't have to pay attention.
@ept Does authorization belong in Automerge? Isn't it solvable by public/private key cryptography outside of Automerge? What is the current vision? Thank you.
@steida Cryptography is good for authenticating the author of a change, by signing it with the author's private key. That can easily be done outside of Automerge, but it's only the first step. What we're discussing here is a more difficult issue: once you've authenticated the user who made a change, how do you determine whether that user is actually allowed to make that change? If they are not allowed to make a change, we need to ignore the change. This seems to require some degree of application-specific "business logic" (e.g. some fields are editable but others are not), which is harder to do outside of Automerge without some API support.
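For reference, the authentication half really can live entirely outside Automerge. A minimal sketch using Node's built-in Ed25519 support (the authorization question above is exactly what this cannot answer):

```js
const { generateKeyPairSync, sign, verify } = require('crypto')

// Each author holds an Ed25519 key pair; the public key is shared with peers
const { publicKey, privateKey } = generateKeyPairSync('ed25519')

// binChange is one binary change, e.g. as returned by Automerge.getChanges
function signChange(binChange) {
  return sign(null, Buffer.from(binChange), privateKey) // Ed25519 takes algorithm = null
}

function verifyChange(binChange, signature, authorPublicKey) {
  return verify(null, Buffer.from(binChange), authorPublicKey, signature)
}
```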
Thank you for the explanation. As I see it, if someone gets access to Automerge data, they can do whatever they want. The only thing we need to know is whether allowed changes come from specified users, I suppose.
If someone makes a change that they are not supposed to make, remove their public key from the encrypted document so they lose access to it. But I just started thinking about this, so I am likely overlooking something.
That doesn't prevent a malicious user corrupting the document, and would require manual intervention. It seems reasonable to me that app authors would want a way to specify which changes they want to accept from collaborators and which they want to ignore. Just like in a centralised web app, where the server can perform validation of user input and reject the request if the user does not have permission to do the thing it's trying to do.
Users can update their own copy of a document in whatever way they want, and we can't stop that. But we can allow users to have rules that determine whether to accept or ignore such changes.
Great input so far, thank you all for that! My current stance is that there are 3 possibilities:
(3) would be great, but it's a huge change to automerge and I don't intend to pursue it in this issue.
Initially I only wanted to do (1), but it would be quite limiting and I think that (2) is more interesting. However, I also think that (1) and (2) require the same callback, and that anything regarding (2) above is a domain-specific implementation detail and out of scope for this ticket.
I've looked into this again and found that automerge now has a `patchCallback`. My current proof of concept looks like this:

```js
function verifyingPatchCallback(patch, doc, newDoc, isLocal, binChanges) {
  for (let change of binChanges.map(decodeChange)) {
    let ops = change.ops
    for (let op of ops) {
      // TODO resolve full path
      if (op.key === 'restricted') {
        throw new Error("Can't touch this")
      }
    }
  }
}
```

I would favour something like

```js
function verifyingPatchCallback(patch, doc, newDoc, isLocal, binChanges) {
  for (let change of binChanges.map(decodeChange)) {
    let ops = change.ops
    for (let op of ops) {
      if (op.key === 'restricted') {
        return ROLLBACK
      }
    }
  }
  // optional in js, mandatory in rust
  return OK
}
```

And the places actually calling the callback would become something like this:

```js
if (patchCallback) {
  const result = patchCallback(patch, doc, newDoc, false, changes)
  if (result === ROLLBACK) {
    // return the unmodified document
    return doc
  }
}
```

In the rust implementation, the callback's return value would decide whether the transaction is committed or rolled back. Do you think that makes sense? What about the wording, is it a "transaction"?
Hi @MeneDev, unfortunately the patch callback is called after the internal state of Automerge has already been updated, so by this point it's too late to stop the update from happening. Moreover, while Automerge's external API is immutable, its internal state uses mutable data structures (it would be too slow otherwise), so the rollback is not quite as straightforward as discarding the updated state. The Rust implementation does have a rollback feature, but I believe this is primarily for locally originated changes, and I'm not sure it can be used to roll back remote changes (someone correct me if I'm wrong). Therefore, while the general idea of having an authorisation callback makes sense, some groundwork in the internals probably needs to be laid before it can be used in practice.
Ok, thanks for clarifying!
I'm also interested in rejecting "bad" changes. In the Rust implementation we could achieve this by adding a return type to the various methods of the `OpObserver` trait.
Sorry to ping, but several months have gone by. @ept - Any ideas / new research that might address this issue?
@bminer Nothing new on this particular issue, I'm afraid — we have been focussing on other areas.
I watched Brooklyn Zelenka presenting UCAN at Local-First Conf, and that brought me back here. It has a similar idea to what has been outlined in this issue. Back when I was experimenting with that idea, I hit a wall at some point, and I just wanted to share the problem and my solution.

The idea of separating the tokens/certificates from the payload opens the system to something I'd call a "foresight attack". From the outside the attacker plays nice for a while, but may perform the attack at any time after gaining trust, even when the trust is lost later on. Examples would be a seemingly good employee who secretly prepares malice, maybe an XZ Utils co-maintainer. With a "true microservice" to authorize a change, the attack is trivially possible after permissions have been revoked from the attacker.

Preparation
Attack

Even when the position of trust has been lost, for example after the working contract has ended and all permissions have been revoked, the attacker can still revert any of the backups with malicious changes and sync them.

Prevention

I've played around with some ideas like forcing key rotation each time permissions are changed, or using wall-clock or logical time (I think that was just pointless) to add a due date to the signature, or revoking changes, but they all either break the CRDT contract, don't work at all, are hard to implement, expose end users to odd UX, or a combination of those.

A simple solution: if an authority (central server or quorum) grants a permission, it must also distribute the payload, and once a payload is signed, it is eternally valid. Note that this still allows the payload to be encrypted.

Sidenote: I haven't looked into the details of UCAN yet. The video doesn't seem to mention something like an expiration; the website uses a unix timestamp like JWT. But my notes on this are from 2023-01-20 and I've been meaning to write this down here for some time 🙈
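A sketch of that simple solution, with a hypothetical `isPermittedNow` check standing in for the application's permission logic:

```js
const { sign, verify } = require('crypto')

// isPermittedNow is a hypothetical application-level permission check
function countersign(binChange, author, authorityPrivateKey) {
  if (!isPermittedNow(author, binChange)) return null // checked at signing time
  // By signing, the authority has necessarily seen (and can distribute) the
  // payload, and its countersignature stays valid forever afterwards
  return sign(null, Buffer.from(binChange), authorityPrivateKey)
}

// Peers only accept changes carrying a valid countersignature. A change the
// attacker prepared in secret was never countersigned, so revoking the
// attacker's permissions doesn't invalidate anything already accepted.
function acceptChange(binChange, countersignature, authorityPublicKey) {
  return verify(null, Buffer.from(binChange), authorityPublicKey, countersignature)
}
```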
I'm thinking about implementing a system based on automerge with claims-based authorization (like JWT). While I can sign changes from `getChanges` and verify the signature before calling `applyChanges` to ensure authenticity, I don't see a way to implement authorization with the current API.

The basic idea is to augment each change with a signed token that proves that the actor (or associated user) is allowed to perform that change. I think automerge should stay agnostic to the concrete implementation, but providing an API to enable that would open up interesting opportunities, especially for decentralized software.

From my current understanding I think this requires at least: a callback in `applyChanges` that gets called for each decoded change and can decide whether to apply or decline the change, based on the decoded change and the path from the root of the document to the object being updated.

In addition it might be interesting to attach meta-data to each change. This meta-data could be used to associate the change with the respective signed token. I think the meta-data might be optional, and an external association could be implemented by the consuming project. On the one hand it would make the API more convenient, but looking through the implementation, especially the columnar ones, that would require a lot of changes and require the client project to implement custom decoders.
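For the external-association variant, a minimal sketch keyed by change hash, using Automerge 1.x `getChanges` / `decodeChange`; the `Map` is an illustrative stand-in for whatever store is synced alongside the document:

```js
const { getChanges, decodeChange } = require('automerge')

// External association: map each change's hash to its signed token,
// kept outside the Automerge document itself
const tokensByChangeHash = new Map()

function recordTokens(oldDoc, newDoc, token) {
  for (const binChange of getChanges(oldDoc, newDoc)) {
    const { hash } = decodeChange(binChange)
    tokensByChangeHash.set(hash, token)
  }
}

function tokenFor(binChange) {
  return tokensByChangeHash.get(decodeChange(binChange).hash)
}
```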