private dm events #17

Closed
wants to merge 6 commits into from

Conversation

Giszmo
Member

@Giszmo Giszmo commented Jul 9, 2022

@Giszmo
Member Author

Giszmo commented Jul 10, 2022

Several issues with the nip so far:

  1. forbidding tags was meant to avoid referencing events or pubkeys as those would leak data but tags may also contain POW and other stuff that isn't relevant to privacy.
  2. the persistent shared secret derived account means that timing analysis (which two accounts were active with other kind events at times this DM conversation was advanced) would allow attribution to specific users.
  3. relays can still match who submits events and who asks for them

For (1), the tags could be left without mention by the nip.

For (2), I extended my POC implementation with a message counter that gets XORed on top of the shared secret before hashing. I chose XOR so that omitting this logic is equivalent to applying it with counter = 0. But that leaves the client having to query more and more pubkeys.

For (3), the client would have to query these messages from new TOR endpoints per message or not query by pubkey and filter locally on receive.
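The counter idea for issue (2) can be sketched as below. This is only an illustration, not the POC code: SHA-256 as the hash, a 32-byte shared secret, the big-endian encoding of the counter, and the name `channel_key` are all assumptions.

```python
import hashlib

def channel_key(shared_secret: bytes, counter: int = 0) -> bytes:
    """Hash the shared secret after XORing a message counter into it.

    Assumed details (not from the thread): SHA-256, 32-byte secret,
    counter treated as a 256-bit big-endian integer.
    """
    if len(shared_secret) != 32:
        raise ValueError("expected a 32-byte shared secret")
    if not 0 <= counter < 2**256:
        raise ValueError("counter out of range")
    mixed = int.from_bytes(shared_secret, "big") ^ counter
    return hashlib.sha256(mixed.to_bytes(32, "big")).digest()

secret = bytes(range(32))
# With counter = 0 the XOR is a no-op, so old and new logic agree.
print(channel_key(secret, 0) == hashlib.sha256(secret).digest())
```

Each new counter value yields a different key, which is why the set of pubkeys a client has to query keeps growing.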

@Giszmo
Member Author

Giszmo commented Jul 10, 2022

All mentioned issues are addressed by a massive simplification of the protocol. The new protocol is:

  • Pretend it's nip-04
  • Put the recipient only on first messages, or omit it if they already follow you. False recipients are encouraged on subsequent messages.
  • Query all your follows' kind-18 events regardless of the advertised recipient.
  • Query all kind-18 events that advertised you as recipient.
  • Query all events from pubkeys that previously sent you events using some privacy tools such as TOR.
  • Or ... query all kind-18 events as long as it's low volume.

@fiatjaf
Member

fiatjaf commented Jul 10, 2022

Would this work if the kind was still 4? That would increase privacy even further by mixing these recipientless events into the existing NIP-04 environment, and also provide a smoother transition from NIP-04 to NIP-18.

@Giszmo
Member Author

Giszmo commented Jul 10, 2022

> Would this work if the kind was still 4? That would increase privacy even further by mixing these recipientless events into the existing NIP-04 environment, and also provide a smoother transition from NIP-04 to NIP-18.

I thought about that but ... how does the sender know what the recipient will support? I was about to write a nip to extend kind 0 to show supported nips.

@Giszmo
Member Author

Giszmo commented Jul 10, 2022

Actually ... a client supporting both nip4 and 18 would not know if kind-18 would work, so ... I guess using the same kind is totally valid. The transition will be the problem. We could do the initial handshake with nip4 and secretly signal nip18 support? So if the recipient understands that weird and hacky <!--nip18--> at the end of my content, it will reply without a recipient? Or should it go straight to sending a serialized object, where the nip-4 recipient might also get confused?

Giszmo added 2 commits July 10, 2022 19:31
using kind 4 allows for a smoother transition
@Giszmo
Member Author

Giszmo commented Jul 10, 2022

So TIL that comments in markdown work using this syntax [//]: # (Use nip-18 next, please). Astral does not render it.

@fiatjaf
Member

fiatjaf commented Jul 11, 2022

I like the <!--nip18--> hack, it's cool, doesn't interfere if the peer doesn't support NIP-18, and also serves as an ad for something the other peer might be interested in looking at, or complaining about to their client development team: "add support for nip-18 please!" -- but I think it's better to ignore my opinions on this topic.

@Giszmo
Member Author

Giszmo commented Jul 11, 2022

@fiatjaf if I ignore your opinion, I'm all alone here :D

So I just learned that decryption with wrong keys isn't guaranteed to fail. You may also get gibberish. Having this nip18 comment in the message would make it possible to test for it, to tell gibberish from content.

Also, if decrypt might work with gibberish, the client might spend a lot of time and RAM decrypting long messages. Maybe it's possible to peek into the beginning of the message to see if it's the nip18 comment hack. Once clients support advertising supported nips, we can remove this hack.
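A rough sketch of the gibberish test: it assumes decryption already happened elsewhere and only inspects the resulting bytes. The function name `looks_like_nip18` is made up, and checking the end of the content follows the earlier suggestion of putting the marker there; peeking at the beginning instead would need the marker moved to the front.

```python
NIP18_MARKER = "<!--nip18-->"  # marker text as proposed in this thread

def looks_like_nip18(plaintext: bytes) -> bool:
    """Return True if decrypted bytes are valid UTF-8 ending in the marker."""
    try:
        text = plaintext.decode("utf-8")
    except UnicodeDecodeError:
        # Decrypting with the wrong key usually yields bytes that are not valid UTF-8.
        return False
    return text.rstrip().endswith(NIP18_MARKER)

print(looks_like_nip18(b"see you at 8 <!--nip18-->"))
print(looks_like_nip18(b"\xff\xfe\x00 wrong-key output"))
```

A message that decodes cleanly but lacks the marker is reported as not nip18, so ordinary NIP-04 content still needs its own handling.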

@jb55
Contributor

jb55 commented Jul 11, 2022

> Clients should query for all relevant kind 4 events. In the beginning, this might be simply all {"kinds":[4]} and later only those from follows.

This doesn't seem reasonable. This could be an insanely huge amount of data in the future.

@Giszmo
Member Author

Giszmo commented Jul 11, 2022

> > Clients should query for all relevant kind 4 events. In the beginning, this might be simply all {"kinds":[4]} and later only those from follows.
>
> This doesn't seem reasonable. This could be an insanely huge amount of data in the future.

It's actually quite limited. If you follow some 5k users, yes, you get on average 5k messages for each relevant message, but that's very manageable and most users have way fewer follows. One problem is cold calls, where you get a message from somebody you are not following. Those can still advertise you as a recipient via the tag for a first message. And if your client doesn't care about the users' privacy, it can remain leaky and enjoy the extra privacy from others advertising false recipients, giving your users a bit of plausible deniability.

@Giszmo
Member Author

Giszmo commented Jul 11, 2022

nip-18 clients could even create pretend-conversations! Say a spammer cold-called me. Now, on my next message to my brother, my client could advertise sending to the spammer. If the spammer spams me again and I keep advertising them as recipient, outside observers would assume I was chatting with them, not with my brother, and would not know I'm dropping their events.

As a rule, clients could keep their last nip-4 recipient sticky when sending to nip-18 recipients.
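The sticky-recipient rule might look like this in a client. It is a sketch under assumptions: the class and method names are invented, and how a client learns that a peer supports nip-18 is left out entirely.

```python
from typing import List, Optional

class RecipientAdvertiser:
    """Chooses the "p" tags to publish; hypothetical helper, not part of any NIP."""

    def __init__(self) -> None:
        self.last_nip4_recipient: Optional[str] = None

    def p_tags(self, recipient: str, recipient_supports_nip18: bool) -> List[List[str]]:
        if not recipient_supports_nip18:
            # Plain nip-4 peers need the real recipient; remember it as the decoy.
            self.last_nip4_recipient = recipient
            return [["p", recipient]]
        if self.last_nip4_recipient is None:
            # No decoy available yet, so advertise nobody.
            return []
        # nip-18 peer: keep advertising the last nip-4 recipient instead.
        return [["p", self.last_nip4_recipient]]

advertiser = RecipientAdvertiser()
advertiser.p_tags("spammer", recipient_supports_nip18=False)
print(advertiser.p_tags("brother", recipient_supports_nip18=True))
```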

@jb55
Contributor

jb55 commented Jul 11, 2022 via email

@Giszmo
Member Author

Giszmo commented Jul 11, 2022

> How is a global query on kind 4 quite limited? You're asking each client to sync all DMs for all users across all relays. Maybe I am misunderstanding this.

It would not be a global query. It would not be {"kinds":[4]} once that becomes uncomfortably big but I would use this for now, as it's just the most private way and would make alternatives available later.

In the future it would be

  • {"kinds":[4],"#p":[myPubKey]} for cold-callers' first messages. Once I accept their message request, my client should add them to my (private) correspondence.
  • {"kinds":[4],"authors":[allMyFollows]} for my regular correspondence
  • {"kinds":[4],"authors":[oneColdCaller]} submitted via TOR, one at a time for non-follows. This one can be slapped on the second query if you trust the relay to not share your queries.
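The three queries in the list above could be assembled roughly as follows. This is a sketch only: `dm_filters`, `req_message` and their parameter names are made up for illustration, while the `["REQ", <subscription id>, <filter>...]` framing is the usual NIP-01 request shape. Sending the isolated filters over separate TOR circuits is not shown.

```python
import json

def dm_filters(my_pubkey, follows, cold_callers):
    """Return (shared, isolated) kind-4 filters; all names are illustrative."""
    shared = [
        {"kinds": [4], "#p": [my_pubkey]},           # cold-callers' first messages
        {"kinds": [4], "authors": sorted(follows)},  # regular correspondence
    ]
    # One filter per accepted cold-caller, meant to be sent one at a time.
    isolated = [{"kinds": [4], "authors": [pk]} for pk in sorted(cold_callers)]
    return shared, isolated

def req_message(subscription_id, filters):
    """Serialize filters into a single REQ message."""
    return json.dumps(["REQ", subscription_id, *filters])

shared, isolated = dm_filters("me", {"bob", "alice"}, {"carol"})
print(req_message("dms", shared))
for index, single in enumerate(isolated):
    print(req_message(f"cold-{index}", [single]))
```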

What you are proposing is almost the same as what I had implemented and proposed here, too. I went with

channelPrivKey = H(shared_sec, counter)
channelPubKey = pk(channelPrivKey)
tag = ["p", channelPubKey] // obsolete as both query for authors: [channelPubKey]

Clients have to query an increasing amount of pubKeys this way and fiatjaf's main criticism is that the sender is being faked, making it hard to keep spammers at bay. If I understand you correctly, your proposal has both those issues, too.

Sending from your publicly known (real) ID only reduces the anonymity set somewhat: not everyone sends DMs, and fewer still do so every day or in any given hour. Rotating accounts, however, are a burden for clients and relays, with queries getting really big.

It also does not remove the need to trust your relay, since both ends would be the only ones sending and querying the same events. In this regard, my approach is better.

@rcoder

rcoder commented Jul 15, 2022

👋 No one explicitly invited me here, but I've been corresponding with at least some of y'all on-network anyway, so:

Alternate semi-wild idea:

What if folks interested in receiving DMs actually announced a "mailbox" channel uuid tag in their account metadata? They could subscribe to said messages and handle accordingly (validate, drop, time-delay, etc.) and filters could drop anything that didn't include the tag.

Those channels could also be handed out selectively as "privileged" mailbox identifiers for friends-of-friends or any other audience that wasn't a) global, and b) revocable w/o invalidating a public key. ("Revocation" in this sense just means ignoring the new messages including that tag, though relays could be given informational notice that a revoked topic tag was invalid.)

@Giszmo
Member Author

Giszmo commented Jul 15, 2022

@rcoder everybody is invited to comment on this. Thanks for sharing your thoughts, although I am not sure I understand you right. Announcing a "mailbox channel uuid tag" doesn't look very private if done through "account metadata" aka a kind 0 event. How can we scrub the public record of who's conversing with whom? Can we do that with your proposal? If so, please explain in more detail.

@rcoder

rcoder commented Jul 15, 2022

My intent with the UUID mailbox was simply to avoid global DM listening: people who were interested in an initial "cold call" could subscribe to that UUID, and rotate/ignore it as needed.

Once a channel key is established you can choose another arbitrary UUID for ongoing messages; those subscriptions, like any that you don't want a relay to be able to deterministically attach to only two nodes, would need to have chaff and/or a probabilistic filter in the mix.

I'm inclined to agree with the assertion that simply dragging a net with every pubkey you've ever followed as a potential DM sender is not particularly scalable, or privacy-preserving.

Broadcast is better from a privacy POV, but I'm also interested in using Nostr over low-bandwidth/high-latency channels. That means anything that cuts down on excessively-wide filter matches is a win.

@Giszmo
Member Author

Giszmo commented Jul 16, 2022

But ... the "UUID mailbox" would be linked to your profile. Is this just about signalling readiness? What's the point of it, and of rotating it? Network observers would learn about all my inboxes and who wrote to those, so for the first contact, it's as good as them writing to my pubkey. It's just making data more obscure, not secret.

By "channel key", do you mean the initial approach or what jb55 said? Those channel keys, especially if you rotate them with the ongoing correspondence, are even more keys to listen for than all your follows ever plus some actual correspondence. Think about restoring your account from only the privkey.

Probabilistic filters are no gain at all for privacy.

Discourse: Why are Probabilistic Filters (PF) not a privacy improvement here? Let's say you follow 200 pubkeys and have 200 private correspondences. You could use a PF with 3% false positives (fp). Now, of the 100k pubkeys in the network, 3k match your filter as fp and 400 because you added them. Then you recalculate your filter to add another 20 pubkeys. If you are not careful, you now match a completely different set of 3k fp, and anybody who knows all your PFs can trivially learn your actual pubkeys of interest. If, on the other hand, you are careful to keep the same fp as before, you could just as well have chosen a PF with an fp rate of 1e-10 and 3000 dummy pubkeys added. Or you could simply have queried a plain list of those actual and dummy pubkeys. A PF doesn't change the privacy aspects; it only helps to reduce the data transferred. For reducing data, adding 3000 keys and lowering the fp rate is bad, but that's another issue.
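The leak described above can be reproduced with a toy experiment. Everything here is an assumption made for illustration: a hand-rolled salted Bloom filter stands in for "a PF", the network is scaled down to 20k made-up pubkeys, and the sizes are picked to give a false-positive rate of roughly 3%.

```python
import hashlib

class SaltedBloom:
    """Tiny Bloom filter; a new salt models recalculating the filter."""

    def __init__(self, salt: bytes, members, bits: int = 3200, hashes: int = 3):
        self.salt, self.bits, self.hashes = salt, bits, hashes
        self.field = set()  # indexes of the bits that are set
        for member in members:
            self.field.update(self._positions(member))

    def _positions(self, item: str):
        for i in range(self.hashes):
            digest = hashlib.sha256(self.salt + bytes([i]) + item.encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def __contains__(self, item: str) -> bool:
        return all(p in self.field for p in self._positions(item))

network = [f"pubkey{i:05d}" for i in range(20000)]  # scaled-down network
real = set(network[:400])        # follows plus private correspondences
added = set(network[400:420])    # the 20 pubkeys added later

old_filter = SaltedBloom(b"filter-1", real)
new_filter = SaltedBloom(b"filter-2", real | added)

old_hits = {pk for pk in network if pk in old_filter}
new_hits = {pk for pk in network if pk in new_filter}
in_both = old_hits & new_hits    # what an observer of both filters can compute

print(len(old_hits - real), "false positives in the old filter")
print(len(in_both - real), "non-members left after intersecting both filters")
```

Because the two filters draw mostly unrelated false positives, the intersection keeps every real pubkey while most of the cover traffic drops out.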

@rcoder

rcoder commented Jul 16, 2022

Re-reading the full thread I think it's fair to say that my goals for private messaging channels are fairly different from what's being discussed. Given that, it's probably best I pull back and not further complicate things. I'm happy to kick the tires on whatever emerges here, and have no commitment to NIP-04 or my half-baked ephemeral channel idea.

I will say that I'm at least as concerned with deliverability and chatter/overhead in highly-constrained networks as I am with ideal metadata security. A bit of payload protection is enough for my purposes because I need to be able to make routing and retention decisions based on metadata, even for messages whose contents are private.

But that's my use case, not the general one, and I don't need to drag this thread sideways.

@Giszmo Giszmo mentioned this pull request Apr 25, 2023
@AsaiToshiya
Collaborator

@paulmillr Is this also related to #658?

@paulmillr
Contributor

yes

@staab
Member

staab commented Dec 14, 2023

Closing this in favor of #746 and #686

@staab staab closed this Dec 14, 2023
7 participants