CodexQR and Codex32 QR for computer readable Codex32 strings #66
-
I think 3 bits is fine. If we're dropping the codex32 checksum anyway it already requires post-processing to get the original string back so I think there's no harm in messing up the characters.
I like the idea of having a "standard" and "compact" version. I think "standard" should max everything out and "compact" should minimize them. And clarify that the "compact" version is only for ephemeral uses; it shouldn't be printed etc. But TBH I rarely use QR codes and don't have a good intuition for them.
-
...
Yeah. I didn't read your whole post, and am trusting you when you say that the QR checksum is much stronger than codex32 even at low settings (which isn't that surprising to me; QR codes seem to work really well in pretty hostile environments... at least, scanning them works well). But I agree that the header, or at least the identifier and maybe the threshold, is fine to just write in plaintext somewhere on the QR device. Then the user would need to manually type it when scanning the QR code. Which sucks... but sucks much less than making them create a far larger QR.
-
If we put the digest in the share, should this be a share we expect the user to have? We can't really do that, reliably, so I assumed we wanted to put it in a share we don't expect the user to have (and you'd need threshold-many shares to recompute it to check it). We still can't guarantee that the user won't have a particular share, but it's an easier thing to make probable.
I think we agreed elsewhere that the padding should be random, under normal circumstances. When generating by hand it will be random, and somewhat difficult/annoying to skew because of the way the dice worksheet works. Then, if we want the ID to be something specific, we again have trouble in the re-sharing case where the ID is supposed to change.
-
Digest Flag
Besides "special" padding or IDs, changing the ... For the QRs, however, there is 1 free bit that can flag a digest and still encode the threshold if it is restricted to 2 or 3. It's not intuitive, but the threshold could be written as A, C, D, E, F... for 2, 3, 4, 5, 6... when a digest and BIP32 fingerprint ID are present.
Digest share index
A reason to put the digest in a share the user may have: if you did this by hand (roll random data, put the characters on a computer to create a digest, write it into the digest share), it's more labor to interpolate another share to destroy the digest share. A 30, 35 or 40-bit digest would be less labor than 4 bytes, as random data and digest won't mix in the same character. 40 is easiest as it's a direct 5-byte conversion.
Does it matter if they have the digest share or not? The digest share's index could even depend on the secret for 5 more bits of security. Schrödinger's digest share.
If it matters, then extend "R" to include all the randomly generated share data, not just the m - 32 random bits of the digest share. Now it always requires T - 1 share values to help a brute-force search. From SLIP39:
The digest could be hardened by a KDF.
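To make the "direct 5-byte conversion" concrete, here is a minimal sketch of truncating a SHA-256 hash to 40 bits and writing it as 8 bech32 characters. The function name `digest_chars` is made up for illustration, and what exactly gets hashed (which shares, in what order) is not settled in this thread, so `data` is only a placeholder.

```python
import hashlib

# The bech32 alphabet used by codex32 (5 bits per character).
BECH32_CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"


def digest_chars(data: str, n_bytes: int = 5) -> str:
    """Truncate SHA-256 of `data` to n_bytes and render it as bech32 characters.

    Hypothetical helper: the input string is an assumption, not a spec.
    """
    n_bits = n_bytes * 8
    if n_bits % 5:
        raise ValueError("digest length must be a whole number of 5-bit characters")
    truncated = hashlib.sha256(data.encode("ascii")).digest()[:n_bytes]
    bits = int.from_bytes(truncated, "big")
    n_chars = n_bits // 5
    # Read the bit string most-significant group first, 5 bits at a time.
    return "".join(
        BECH32_CHARSET[(bits >> (5 * (n_chars - 1 - i))) & 31]
        for i in range(n_chars)
    )
```

With the default of 5 bytes the result is always 8 characters, so random and non-random data never share a character.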
Parity bit padding is hand computable. Generate (or have) 128 random bits, then count the ones and the even-position ones. It triples the (non-index and non-payload) bandwidth of compact QRs to 3 bits, so it is worth using if users may ever want to draw QRs of their shares. 3 bits allows encoding all thresholds plus some weak protection against combining wrong QR sets (even without a digest).
A non-random digest and non-random padding amount to the same thing, so let the padding be random. There may be an option with a 5-byte digest (8 characters) that avoids mixing random and non-random data in the same character, which would complicate the dice worksheet. It feels like it doesn't matter (??) which share bits are non-random; the total security drop will be the same. So they can move around across and within shares without changing the T - 1 brute-force situation. Ex: ordinary T=2 codex32 has 260 bits of security on a 128-bit secret, which is why padding could be non-random for no T - 1 privacy loss.
How about generating T * 26 - 8 characters randomly, with the final 8 of the T-th share optionally being the digest of the previous shares concatenated with the first 27 characters of this final share? No funny business: just type the complete and incomplete strings into sha256sum and convert the first 5 bytes to bech32.
Can the digest be codex32 instead of SHA256?
Can you tell me what's wrong with setting these 6-8 characters using a truncated codex32 checksum (with data set to all random characters generated) as the "digest"? It seems you lose len(digest) security at T - 1 no matter what, so why not make it a hand-computable digest? This seems like it protects against malicious tampering if the adversary lacks T shares? Or could structured errors be introduced to one share that the adversary knows won't change the digest? In that case, SHA256 is better, unless the ID is always the fingerprint.
Recovering the identifier from compact QRs and detecting invalid combinations
For compact QRs, to recover an ID it must be standardized, included in the digest and brute-forced, or encrypted by the digest. Perhaps the ID is XOR'd over the digest, using the known 12 bits of the 32 to confirm the set combination is valid and/or the shares are untampered, and the first 20 bits to decrypt and recover the ID. That way we get full 32-bit protection against bad combos and malicious tampering when using full shares, and 12-bit protection when using compact QRs. For encryption to work in the 8-character digest example above, the digested data would have to be limited to just index and payload characters so the QRs can compute it.
We decided the threshold and ID alone must not be the only nonce when re-sharing. The user gives a "unique string" and the wallet gives a monotonic counter + serial number / install date. Then the BIP32 fingerprint ID is encrypted by the unique string for the default ID. IDs may rarely collide, but the share payloads are unique and secure if either the wallet or the user created a nonce. Combining two shares that have the same ID but come from different sets is detected by the digest with overwhelming probability.
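A rough sketch of the "ID XOR'd over the digest" idea, under my own assumptions: the digest is the first 4 bytes of SHA-256 over whatever index and payload data the QRs can see, the 20-bit ID covers the top 20 bits, and the low 12 bits stay untouched as the check. The names `digest32`, `mask_id` and `unmask_id` are hypothetical, not part of any spec.

```python
import hashlib
from typing import Optional

ID_BITS = 20     # four bech32 characters of identifier
CHECK_BITS = 12  # the rest of a 32-bit digest, left in the clear


def digest32(share_data: bytes) -> int:
    """First 32 bits of SHA-256 over the data both encodings can see (assumed input)."""
    return int.from_bytes(hashlib.sha256(share_data).digest()[:4], "big")


def mask_id(identifier: int, share_data: bytes) -> int:
    """XOR the 20-bit ID over the top 20 digest bits; the low 12 bits are unchanged."""
    if not 0 <= identifier < (1 << ID_BITS):
        raise ValueError("identifier must fit in 20 bits")
    return digest32(share_data) ^ (identifier << CHECK_BITS)


def unmask_id(field: int, share_data: bytes) -> Optional[int]:
    """Recover the ID, or return None if the 12 check bits disagree."""
    diff = field ^ digest32(share_data)
    if diff & ((1 << CHECK_BITS) - 1):
        return None  # wrong set combination or tampered data
    return diff >> CHECK_BITS
```

A compact QR reader without the ID only gets the 12-bit check; a reader that already knows the ID can compare all 32 bits.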
-
My intuition is we want to replicate the codex32 share experience as closely as possible. Given the lack of space, that means knowing the threshold and some info about set membership from the first scan. This seems in line with the principle of least astonishment. Also remember: any codex32 features sacrificed to fit in 137.3 bits stay in the 25x25 "standard" QR, as these directly encode the string. We previously justified culling thresholds to 0, 2 or 3 by saying "higher can draw standard QR." Similar could be said for custom* IDs. Agreed?
A flag for digest or fingerprint ID is simple, but users may not expect that these have no immediate wrong-set rejection ability like the codex32 shares they came from do. It's unhelpful to learn you screwed up only after you've scanned QR 3 of 3 (and possibly flown for hours); a descriptor, PSBT, address or empty wallet can tell you this. So I think the early warning of the first 3 options is more "share-like" than flags and more likely to avoid pain and astonishment.
Although maybe a 60% detection rate of wrong QRs is so low it'd be better without. It can reach 80% by requiring a digest and dropping the threshold, but that is less share-like: software needs 2 scans and the fingerprint to know the threshold, and 1 QR can't convert back to a share (it needs 2 + ID or 1 + threshold + ID). Digest generation is more work to implement and incompatible with hand generation. Skip digest complexity and make do with 60%?
I think if users know they're supposed to rely on themselves to visually verify that the 4-character QR label matches the previous scan, then extra error detection when they goof up can only help. Right? The publicly available out-of-band fingerprint is the magic letting these detect wrong QRs without typing the ID.
*Custom IDs that are also fingerprint IDs can be found with seed grinding on desktop wallets; a few seconds finds the desired leading 20 bits in the fingerprint.
-
Stumbled across this:
https://help.blockstream.com/hc/en-us/articles/10426338118169-What-is-a-SeedQR-
It's easy to type a 128-bit codex32 share, but it may not be without a keyboard.
There's no reason codex32 shares cannot be encoded in compact QRs, but we also need to at minimum store the threshold, identifier and index.
BIP39 SeedQR has two levels:
Standard: Uses the index of each recovery phrase word (based on the BIP39 wordlist) and concatenates them into one long stream of digits. This specification can be scanned by any QR code reader and fairly easily converted back to your list of words.
Compact: Expresses the index of each word in binary instead, making the QR code 35-40% smaller. This specification is much harder to read and interpret without the proper software. Blockstream Jade allows export of this type of SeedQR due to these benefits.
For codex32 shares:
Standard: might as well be encoding the entire uppercase codex32 string in a low error correction QR, just like addresses.
Compact: is more interesting, here we have 3-bits for the threshold, 20-bits for the identifier and 5-bits for the index, plus 130-bits, 260-bits or 515-bits for the payload.
128-bit shares:
The most compact representation is 20 bytes if the codex32 checksum is dropped, which is safe to do since QR has its own.
20-bytes means it can be encoded by taking threshold+identifier+index+payload as a list of base32 values with length 32 and then converting to bytes.
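As a sketch of that "list of base32 values to bytes" conversion (not a spec): the 32 characters after the `ms1` prefix, with the checksum dropped, are read as 5-bit values and packed big-endian. `pack_chars` and `unpack_chars` are names I made up, and the zero padding on the right for lengths that are not a multiple of 8 bits is my assumption.

```python
# The bech32 alphabet used by codex32 (5 bits per character).
BECH32_CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"


def pack_chars(data: str) -> bytes:
    """Pack threshold+identifier+index+payload characters into bytes, 5 bits each."""
    acc = 0
    for ch in data.lower():
        acc = (acc << 5) | BECH32_CHARSET.index(ch)  # ValueError on a bad character
    n_bits = 5 * len(data)
    pad = -n_bits % 8  # zero bits appended on the right (assumed convention)
    return (acc << pad).to_bytes((n_bits + pad) // 8, "big")


def unpack_chars(raw: bytes, n_chars: int) -> str:
    """Inverse of pack_chars; n_chars must be known from the share length."""
    acc = int.from_bytes(raw, "big") >> (8 * len(raw) - 5 * n_chars)
    return "".join(
        BECH32_CHARSET[(acc >> (5 * (n_chars - 1 - i))) & 31]
        for i in range(n_chars)
    )


# 1 threshold + 4 identifier + 1 index + 26 payload characters = 32 characters.
example = "0tests" + "x" * 26
packed = pack_chars(example)
```

For the 32-character case there is no padding at all (160 bits), which is why the 128-bit share fits exactly in 20 bytes.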
256-bit shares:
3-bits threshold, 20-bits identifier, 5-bits index, 260-bits payload = 288 bits = 36 bytes
Here going from a list of base32 values per character to bytes ends up needing 37 bytes.
512-bit shares:
Here also, using 3 bits for the threshold allows encoding this in 68 bytes, while converting a base32 character list to bytes requires 69 bytes.
Do you prefer serializing the threshold as 3 bits or as 5 bits to keep the character mapping intact, knowing it adds 10 plus pixels to the 256-bit and 512-bit CodexQR?
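The byte counts above can be checked with a few lines of arithmetic. This is only a sanity check of the numbers in this post; `compact_bytes` is a hypothetical helper, and the payload lengths (26, 52 and 103 characters) are taken from the 130, 260 and 515-bit figures above.

```python
# Payload length in bech32 characters for each seed size (5 bits per character).
PAYLOAD_CHARS = {128: 26, 256: 52, 512: 103}

IDENTIFIER_BITS = 20
INDEX_BITS = 5


def compact_bytes(seed_bits: int, threshold_bits: int) -> int:
    """Bytes needed for threshold + identifier + index + payload, rounded up."""
    total_bits = threshold_bits + IDENTIFIER_BITS + INDEX_BITS + 5 * PAYLOAD_CHARS[seed_bits]
    return (total_bits + 7) // 8


for seed_bits in (128, 256, 512):
    print(seed_bits, compact_bytes(seed_bits, 3), compact_bytes(seed_bits, 5))
```

The 3-bit threshold only saves a byte for the 256-bit and 512-bit cases; for 128-bit shares both choices land on 20 bytes.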
Error Correction level:
The codex32 checksum allows restoring ~27% of the data for 128-bit seeds, 17% for 256-bit seeds and 11% for 512-bit seeds.
Using Level H for 128-bit, Level Q for 256-bit and Level M for 512-bit gives better redundancy than codex32.
I am unsure what SeedQR is using.
Should we copy SeedQR, use the levels above that are at least as good at correcting errors as the original codex32 or use the least error correction that easily scans?
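For the "at least as good as codex32" option, here is a small sketch of how the levels above fall out. The QR figures are the usual approximate recovery capacities per level (L 7%, M 15%, Q 25%, H 30%), the codex32 figures are the ones quoted in this post, and `lowest_sufficient_level` is a made-up name.

```python
# Approximate share of codewords each QR error correction level can restore.
QR_RECOVERY_PERCENT = {"L": 7, "M": 15, "Q": 25, "H": 30}

# Figures quoted above for the codex32 checksum, per seed size in bits.
CODEX32_RECOVERY_PERCENT = {128: 27, 256: 17, 512: 11}


def lowest_sufficient_level(seed_bits: int) -> str:
    """Pick the weakest QR level that still matches or beats codex32's figure."""
    needed = CODEX32_RECOVERY_PERCENT[seed_bits]
    for level in ("L", "M", "Q", "H"):  # weakest to strongest
        if QR_RECOVERY_PERCENT[level] >= needed:
            return level
    raise ValueError("no QR level reaches the requested recovery rate")


for seed_bits in (128, 256, 512):
    print(seed_bits, lowest_sufficient_level(seed_bits))
```

Choosing the least error correction that easily scans would instead be an empirical question that this arithmetic can't answer.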
Conclusion
While this probably shouldn't be a replacement for writing the codex32 string, it is much faster and less error prone to enter. And time spent drawing easily pays for itself after just a couple scans.
It looks like it's possible to decode QR codes without errors back to binary, as well as to draw one without any error correction from binary. I believe the Reed-Solomon coding is over GF(2^8), so it is not as paper-computer friendly (paper-computer impossible?).
Either way, whenever a share is imported, it'd be a nice option to get the QR displayed. Once a codex32 secret is recovered, all CodexQRs could be generated for supplementing the backups.
Should I add this functionality to Bails?
I looked into OCR and there were too many dependencies, but QR scanning is easy. It seems like a nice option to sidestep the unintelligible handwriting problem of importing other people's shares.