Adds a longer, 8,192-word word list as words2.json #18

Merged
perry-mitchell merged 5 commits into buttercup:master from words2-longer-list
May 29, 2023


sts10
Contributor

@sts10 sts10 commented May 12, 2023

As requested in #17 .

List information

List length               : 8192 words
Mean word length          : 7.07 characters
Length of shortest word   : 3 characters (add)
Length of longest word    : 10 characters (worthwhile)
Free of prefix words?     : false
Free of suffix words?     : false
Uniquely decodable?       : true
Entropy per word          : 13.000 bits
Efficiency per character  : 1.840 bits
Assumed entropy per char  : 4.333 bits
Above brute force line?   : true
Shortest edit distance    : 1
Mean edit distance        : 6.965
Longest shared prefix     : 9
Unique character prefix   : 10

Word samples
------------
causal revealed resulted habits radio eagles
plea hopes fertilizer demanded spot global
receptions forum arthritis symbols gives bowls
statements colleague implies recruiting struggled campaign
melody succeeded blocking skill proceeded salaries
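For anyone who wants to sanity-check the derived figures in the list information above, here is a rough sketch of how they seem to fit together. The formulas are my reading of the reported numbers (for example, 13 / 3 = 4.333 for "assumed entropy per char"), not code taken from the tool that produced them. The function name `list_stats` and the 26-letter brute-force line are assumptions.

```python
import math


def list_stats(words):
    """Derive a few summary figures for a word list (illustrative sketch only)."""
    entropy_per_word = math.log2(len(words))
    mean_length = sum(len(w) for w in words) / len(words)
    shortest = min(len(w) for w in words)

    # Assumption: the "brute force line" is log2(26), i.e. the entropy per
    # character of a uniformly random lowercase a-z string.
    brute_force_line = math.log2(26)

    # Assumption: the worst case is a passphrase built only from the
    # shortest words, so divide by the shortest word length.
    assumed_entropy_per_char = entropy_per_word / shortest

    return {
        "entropy_per_word": entropy_per_word,
        "efficiency_per_char": entropy_per_word / mean_length,
        "assumed_entropy_per_char": assumed_entropy_per_char,
        "above_brute_force_line": assumed_entropy_per_char <= brute_force_line,
    }


# Toy 8-word list for demonstration, not the real words2.json.
toy_stats = list_stats(
    ["add", "bowls", "forum", "global", "habits", "melody", "radio", "skill"]
)
```

Under this reading, a list is "above the brute force line" when guessing whole words is never harder for an attacker than guessing the same passphrase letter by letter, so the per-word entropy estimate stays honest.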

Why should Buttercup consider using/offering a longer word list? As argued, Buttercup's current 1,700-word list is a bit short compared to other password managers' lists. 8,192 words would bring Buttercup closer to the norm (e.g. KeePassXC and Bitwarden, which both use 7,776-word lists).

Using 8,192 words means that each word from this longer list adds 13 bits of entropy to a passphrase. Thus, a 4-word passphrase from this longer list will have 52 bits of entropy (13 * 4), compared to just 42.9 bits (4 * log2(1,700)) from the current 1,700-word list.
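The arithmetic is easy to reproduce. A minimal sketch, assuming each word is chosen uniformly and independently from the list (the function name `passphrase_entropy_bits` is made up for illustration):

```python
import math


def passphrase_entropy_bits(list_length, word_count):
    """Entropy in bits of a passphrase of word_count words drawn
    uniformly at random from a list of list_length words."""
    return word_count * math.log2(list_length)


print(passphrase_entropy_bits(8192, 4))            # 52.0
print(round(passphrase_entropy_bits(1700, 4), 1))  # 42.9
```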

Why 8,192 words, specifically? As discussed, a length of 8,192, or 2^13, words should work nicely with binary random number generators, which I'm assuming Buttercup uses. Also it gives exactly 13 bits of entropy per word, which makes entropy/strength calculations a little easier. And it's a few hundred words longer than the standard of 7,776 words.
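To illustrate the point about binary random number generators, here is a hedged sketch in Python. It is not how Buttercup actually picks words (I haven't checked that), and both function names are hypothetical. With a power-of-two list length, 13 random bits map directly onto a list index; with any other length, something like rejection sampling is needed to avoid bias.

```python
import secrets


def pick_index_power_of_two(bits=13):
    """Pick a uniform index into a list of exactly 2**bits words.

    secrets.randbits(k) returns a uniform integer in [0, 2**k), so every
    draw is usable and no modulo or retry step is needed.
    """
    return secrets.randbits(bits)


def pick_index_any_length(list_length):
    """Pick a uniform index into a list of arbitrary length (2 or more)
    by rejection sampling: draw enough bits to cover the range and
    retry whenever the draw falls outside it."""
    bits = (list_length - 1).bit_length()
    while True:
        candidate = secrets.randbits(bits)
        if candidate < list_length:
            return candidate


index_8192 = pick_index_power_of_two()    # always in range(8192)
index_7776 = pick_index_any_length(7776)  # may need more than one draw
```

In practice, Python's `secrets.randbelow` already handles the arbitrary-length case. The explicit loop is only there to show the extra step that a 2^13-word list makes unnecessary.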

Why not more words? We could of course go with a longer list: Enpass's word list is either 14k or 11k, 1Password's is around 18k, and NordPass uses at least 123k words(!). As mentioned elsewhere, I'd nominate my Orchard Street Long List (17,576 words) if we wanted 14+ bits per word.

License

This list uses words from Wikipedia, so it's licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License.

Disclaimer/things to check for

I haven't thoroughly checked this list for strange words, so let me know if you find any we should swap out.

@sts10
Contributor Author

sts10 commented May 24, 2023

I'll note here that #19 is probably a safer choice, as it proposes the KeePassXC 7,776-word list (which is based on the EFF long word list) as words2.json, rather than my own list. I'll leave both PRs open for now -- your call!

@perry-mitchell
Member

@sts10 Thanks so much for these!

While I understand the concepts discussed so far, I'm nowhere near knowledgeable enough on the matter to make the right choice. I'm happy to leave that call to you, as you've shown your expertise on the matter quite clearly.

Which do you see as the better option for Buttercup users, assuming that a stronger random phrase is always better (using defaults)? If the differences are negligible, I'd again suggest that it's within your right to pick.

@perry-mitchell
Member

I'll merge and release asap after the choice is made :)

@sts10
Contributor Author

sts10 commented May 27, 2023

Ooh, tough choice, but I'll back myself here and vote to merge this PR rather than #19.

A concrete advantage of this 8,192 list is that it contains prefix words, which allows it to have some shorter, more common words compared to the EFF/KeePassXC list proposed in #19 (this 8,192-word list is still uniquely decodable though).
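For readers who want to see what "contains prefix words but is still uniquely decodable" means in practice, here is a small sketch. It uses the Sardinas-Patterson test, which is one standard way to check unique decodability. I'm not claiming it is the exact procedure used to build this list. The function names and the toy word lists are made up for illustration.

```python
def has_prefix_words(words):
    """True if any word is a proper prefix of another word in the list."""
    ordered = sorted(set(words))
    # In sorted order, a word that prefixes any other word also prefixes
    # its immediate successor, so checking neighbours is enough.
    return any(b.startswith(a) for a, b in zip(ordered, ordered[1:]))


def _dangling_suffixes(heads, wholes):
    """Non-empty w such that head + w == whole for some head and whole."""
    return {
        whole[len(head):]
        for head in heads
        for whole in wholes
        if len(whole) > len(head) and whole.startswith(head)
    }


def is_uniquely_decodable(words):
    """Sardinas-Patterson test: the list is uniquely decodable if and only
    if no dangling suffix reachable from it is itself a word in the list."""
    code = set(words)
    current = _dangling_suffixes(code, code)
    seen = set()
    while current:
        if current & code:
            return False
        seen |= current
        following = _dangling_suffixes(code, current) | _dangling_suffixes(current, code)
        current = following - seen
    return True


# Toy lists for demonstration (not taken from words2.json).
safe = ["add", "addition", "radio"]    # "add" prefixes "addition"
ambiguous = ["car", "pet", "carpet"]   # "carpet" could also be "car" + "pet"
```

Here `safe` has a prefix word but is still uniquely decodable, while `ambiguous` is not. This double loop is quadratic per round, so it is fine for a demo but slow on thousands of words.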

@perry-mitchell
Member

Great, thank you!

@perry-mitchell perry-mitchell merged commit ba77b37 into buttercup:master May 29, 2023
@sts10 sts10 deleted the words2-longer-list branch May 29, 2023 19:37