Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Password Hashing API #21766

Closed
paragonie-scott opened this issue Jul 11, 2018 · 40 comments
Closed

Password Hashing API #21766

paragonie-scott opened this issue Jul 11, 2018 · 40 comments
Labels
crypto Issues and PRs related to the crypto subsystem. security Issues and PRs related to security.

Comments

@paragonie-scott
Copy link

paragonie-scott commented Jul 11, 2018

In a few words: The new scrypt API in v10.5 is dangerously designed and will cause people to make security mistakes. A new API is needed for the 98% problem (i.e. password hashing). Additionally, the documentation for the old API should make it abundantly clear that it's not meant for non-expert users, and to instead use whatever the new API is called.


Copying my comment (with slight corrections) from #20816


This is how PHP does password hashing:

// All in one go:
$hash = password_hash($userProvidedPassword, PASSWORD_DEFAULT);

// Later:
if (password_verify($userProvidedPassword, $hash)) {
    // Everything works out. Optionally, do this too:
    if (password_needs_rehash($userProvidedPassword, PASSWORD_DEFAULT)) {
        // Someone upgraded PHP and there's a new default algorithm. We should rehash the password.
    }
}

This is almost an optimal UX for PHP developers for password storage: Salts are explicitly generated by the kernel's CSPRNG, encoded, and included in the password hash.

A step further would be making PASSWORD_DEFAULT--, well, the default.

You should only need one input for generating a new password hash: The password.
You should only need two inputs for verification: The challenge and the attempt (or the hash and password, respectively).

All other parameters should be optional.

Note: Key derivation is a separate concern, where you don't want the salt stored alongside the password hash, but that's a smaller corner of the cryptography usability market than password hashing.

Let's look at how another library handles both corner cases:

Key Derivation

#define PASSWORD "Correct Horse Battery Staple"
#define KEY_LEN crypto_box_SEEDBYTES

unsigned char salt[crypto_pwhash_SALTBYTES];
unsigned char key[KEY_LEN];

randombytes_buf(salt, sizeof salt);

if (crypto_pwhash
    (key, sizeof key, PASSWORD, strlen(PASSWORD), salt,
     crypto_pwhash_OPSLIMIT_INTERACTIVE, crypto_pwhash_MEMLIMIT_INTERACTIVE,
     crypto_pwhash_ALG_DEFAULT) != 0) {
    /* out of memory */
}

Password Hashing

#define PASSWORD "Correct Horse Battery Staple"

char hashed_password[crypto_pwhash_STRBYTES];

if (crypto_pwhash_str
    (hashed_password, PASSWORD, strlen(PASSWORD),
     crypto_pwhash_OPSLIMIT_SENSITIVE, crypto_pwhash_MEMLIMIT_SENSITIVE) != 0) {
    /* out of memory */
}

if (crypto_pwhash_str_verify
    (hashed_password, PASSWORD, strlen(PASSWORD)) != 0) {
    /* wrong password */
}

Usable cryptography API design is a nontrivial undertaking, and getting it wrong will mean years (or even decades) of clean-up.

I would propose, at minimum, an alternative API that has a similar user experience to PHP's password_hash() that transparently wraps scrypt as part of the Node.js core, and then recommend that API for users who want to store password hashes (which is roughly 98% of the use case when someone reaches for bcrypt, scrypt, or Argon2).

Or you could always just incorporate the design of @joepie91's scrypt-for-humans into the Node.js core. Designing cryptography APIs for humans is a good idea.

@joepie91
Copy link
Contributor

(Continuing on from the previous thread.)

@ChALkeR While I understand that strictly speaking it cannot be removed due to stability concerns, this was merged less than a month ago, and the security impact of this issue is significant. It is very unlikely that the feature has seen any real-world adoption yet, especially considering that there have been userland options available for a long time, and the addition of this feature to core has not been widely publicized.

Taking into account these data points, I think removing the API is justifiable even under the stability constraints of Node.js core.

Documentation warnings, while better than nothing, are a substandard option; these warnings are often ignored, and they rarely get included in third-party documentation about a feature (eg. tutorials, screencasts, and so on). They are not likely to be an effective deterrent.

All things considered, removal-for-the-time-being of the feature seems the most reasonable course of action here.

@vsemozhetbyt vsemozhetbyt added the crypto Issues and PRs related to the crypto subsystem. label Jul 11, 2018
@vsemozhetbyt
Copy link
Contributor

cc @nodejs/crypto

@ChALkeR
Copy link
Member

ChALkeR commented Jul 11, 2018

@joepie91 Do you think those concerns could be resolved with non-breaking changes?

E.g.: warnings in docs, making salt optional and suggesting to avoid supplying it, introducing a safe verification API, even more doc fixes?

@ChALkeR ChALkeR added the security Issues and PRs related to security. label Jul 11, 2018
@joepie91
Copy link
Contributor

I don't see a reasonable way in which this could be fixed in a non-breaking manner, personally; documentation warnings 'rot away' in the copy-pasting mill of tutorials and such, and an optional salt parameter would likely still cause confusion as to whether it's "more secure" if provided; I've found the idea that "more explicit options means safer" to be rather common amongst developers.

But most critically: the current return format is totally incompatible with a safe verification API. For a safe verification API to work as suggested, all the options and the salt need to be encoded into the output of the hashing function (such that the verification function can reproduce the hash), but as far as I can tell it only returns a 'raw' derivedKey right now. Changing the return format would be a breaking change either way.

As an aside regarding the option of removing the unsafe API: 10.x is not yet in LTS, so the constraints on what can be changed API-wise are a little less strict, if I understand correctly.

Also, perhaps it's a good idea to ping the Security WG on this as well?

@ChALkeR ChALkeR added the tsc-agenda Issues and PRs to discuss during the meetings of the TSC. label Jul 11, 2018
@ChALkeR
Copy link
Member

ChALkeR commented Jul 11, 2018

/cc @nodejs/security-wg

@tniessen
Copy link
Member

tniessen commented Jul 11, 2018

I completely understand the issue, and the usability of the new API has already been criticized in the PR. However, this API was not designed to provide a comfortable way to hash passwords but to provide a full-featured scrypt implementation. If a crypto API claims to implement scrypt, I expect to be able to specify the salt and additional parameters and to get the raw, unfiltered hash out of it.

We can probably add a new API that is designed for easy-to-use password hashing similar to what PHP does, but it has downsides. For example, PHP includes metadata in the output, something you probably would not expect, and which can easily break interoperability with other libraries.

Many of these arguments apply to other crypto APIs: Users have to set certain parameters, derive IVs for cipher objects, pick cipher algorithms etc., and I still wouldn't want to remove all those features and just use defaults everywhere. If you look at the crypto commit log, you will see that we are doing a lot to prevent people from improperly using these APIs, but there are no guarantees. Again, we could add "simpler" APIs for each and every little crypto task (encrypt, encryptWithAuthentication, ...), but even having those does not justify removing the feature-complete ("complex") APIs.

Summary: I am not opposed to a new API that is designed for easy-to-use password-hashing, but having a full-featured scrypt API is justified.

@paragonie-scott
Copy link
Author

We can probably add a new API that is designed for easy-to-use password hashing similar to what PHP does, but it has downsides. For example, PHP includes metadata in the output, something you probably would not expect, and which can easily break interoperability with other libraries.

Specifically, they use a crypt(3)-compatible output.

Summary: I am not opposed to a new API that is designed for easy-to-use password-hashing, but having a full-featured scrypt API is justified.

I've started on an implementation of a password-hashing wrapper to submit in a pull request (which will use scrypt by default). I hope to have a first draft ready to be reviewed this evening.

@tniessen tniessen removed the security Issues and PRs related to security. label Jul 11, 2018
@tniessen
Copy link
Member

@ChALkeR I am removing the security label. Based on my understanding, the only security risk is improper use of this API, which applies to most if not all existing crypto APIs. Feel free to add the label again if the security risk extends beyond that.

@paragonie-scott
Copy link
Author

Based on my understanding, the only security risk is improper use of this API, which applies to most if not all existing crypto APIs.

Correct, but with a caveat worth noting: cryptography API design is very important to application security in a lot of ways that aren't always immediately obvious.

Node.js cryptography API security is, therefore, emphatically a security issue that affects Node.js.

Whether or not that fact is sufficient to qualify for the security label as is used internally by Node.js is something I will not speak to, because I do not know how your team manages labels/delegation.

As long as we're all clear on that note, let's carry on.

@deian
Copy link
Member

deian commented Jul 11, 2018

I couldn't agree more with @joepie91 and @paragonie-scott. "Experts" can always write their native add-on and use the underlying crypto spaceship that is OpenSSL (to quote Matthew Green), they can also get at low-level implementations indirectly (e.g., we can define something like crypto-subtle) but the front-and-center APIs should address the 99% use case and make it impossible to misuse.

Adding red tape warnings to the docs is a good start and maybe best be can do without breaking compat (unless there are usage metrics that say it's not really used), but we should really try to set up a process for crypto and security APIs to avoid repeating OpenSSL/WebCrypto pitfalls.

@drifkin
Copy link
Contributor

drifkin commented Jul 11, 2018

I opened an issue in the security-wg nodejs/security-wg#339 to discuss some of the broader issues that are being brought up here.

I personally think that we should be very careful adding features to crypto that are likely to be misused. I know that there's a lot of precedent here with the other exposed OpenSSL APIs, but if we want to continue in that direction, IMO we should strongly discourage average developers from using these low level APIs. Developers are commonly told to "just use scrypt", so by providing this API, I suspect this will lead to a lot of people unknowingly creating security vulnerabilities in their applications.

@ChALkeR
Copy link
Member

ChALkeR commented Jul 11, 2018

@tniessen in this repo, I use security label for everything that is security-related, raised security-related concerns (including those in API or user code) and/or can have significant security impact on the ecosystem. See https://github.com/nodejs/node/issues?q=label%3Asecurity.

Vulnerabilities in Node.js itself do not belong here and are discussed in a private repo, the label is not for those.

I.e. security label !== «vulnerability», it is needed to group issues that might have security impact (including that on the ecosystem via API design, e.g.: Buffer(), #5258, #6821, #7152, #8182, #9434, #9531, #11968, #14537, and many more).

I think that having a security label on this issue is justified, as the issue itself is about the fact that introduction of the scrypt API in #20816 could potentially have negative security impact on the ecosystem, and about ways to solve that.

tl;dr: the «security» label does not have a description «vulnerabilities in Node.js core», it has a description «Issues and PRs related to security». This issue is related to security.

@ChALkeR ChALkeR added the security Issues and PRs related to security. label Jul 11, 2018
@joepie91
Copy link
Contributor

Developers are commonly told to "just use scrypt", so by providing this API, I suspect this will lead to a lot of people unknowingly creating security vulnerabilities in their applications.

This is precisely what I'm worried about. I would be less concerned if there were eg. a high-level crypto.scrypt that was safe to use and an additional low-level crypto.scryptLowLevel or so; because in that case, the obvious choice ("the one that's named 'scrypt', right?") would be the one that people would pick.

This issue really isn't so much about the implementation specifics, as it is about the API that is presented to the user, and what options are made to look like the obvious ones. The pit of success is a relevant consideration here.

A compounding issue is that a lot of developers have the tendency to prefer things provided by the ES specs or by Node.js core over external modules, even if the external module is a justifiably better option; while for something like a HTTP server this will just mean that developers are wasting their time, for something security-critical like this it means that people will be writing insecure code.

It is therefore important to ensure that the core libraries are as safe as possible, to match the expectations that users have of them.

@mcollina
Copy link
Member

Almost none of the crypto APIs could be really considered safe for the crypto-unaware developer, but they are exposing underlining crypto functions. This is also true for APIs that might seems innocent but if misused could easily create DoS vectors. Node core is full of those: people advertises their usage (badly) all the time.

I think the doc must show a safe way to use this API, and this is fixable right now.

@joepie91
Copy link
Contributor

joepie91 commented Jul 12, 2018

"Showing a safe way to use this API", by which I suspect you mean "examples of using it safely", are not sufficient; this has been tried by other languages for decades and turned out to simply not work in practice (due to aforementioned 'tutorial rot', in part).

This is a large part of why eg. PHP is now just implementing safe-by-default APIs that cannot be misused this way. Safe-by-default APIs do not allow for the correct usage to become decoupled from the API.

@tniessen
Copy link
Member

tniessen commented Jul 12, 2018

I couldn't agree more with @mcollina, this problem is not restricted to scrypt or crypto. Improper use of APIs will always create security issues, our task is to help minimize such problems. As I said before, e.g. in nodejs/security-wg#339, I am not opposed to a safer/high-level API, but people can (and will) still use "unsafe" APIs, some of them improperly. And that still is not a reason to remove such APIs as @joepie91 suggested.

@joepie91 I would love to discuss a safer API with you via IRC, Slack or the like, let me know if you are up for it.

@antsmartian
Copy link
Contributor

Almost none of the crypto APIs could be really considered safe for the crypto-unaware developer, but they are exposing underlining crypto functions. This is also true for APIs that might seems innocent but if misused could easily create DoS vectors. Node core is full of those: people advertises their usage (badly) all the time.

I agree with @mcollina. We can't remove an API just because developers can use it bad. Rather, strong documentation is required, which should be good enough for this use case.

@bricss
Copy link

bricss commented Jul 18, 2018

Please do not bake new APIs like in a bakery. Its better to update documentation.

@paragonie-scott
Copy link
Author

paragonie-scott commented Jul 18, 2018

@bricss:

Its better to update documentation.

Interesting assertion. Why do you believe this to be true?

@joepie91 claimed the following above:

Documentation warnings, while better than nothing, are a substandard option; these warnings are often ignored, and they rarely get included in third-party documentation about a feature (eg. tutorials, screencasts, and so on). They are not likely to be an effective deterrent.

@afk11
Copy link

afk11 commented Jul 18, 2018

@bricss: it's precisely this reason why a fit-for-purpose API should be exposed ASAP - an unsafe option was added, and crypto primitives are extremely difficult to remove when people rely on the algorithm being available. https://xkcd.com/1172/

so we should always provide a fit for purpose API for people when the 99% of the use the API for password hashing, since developers frequently make up parameters when they come to use them (it's a foot gun)

Encapsulating such concepts in a simple API for password hashing (as PHP has done) means you can also transition to new algorithms easily.

edit: hit shift and enter, submitted before I was done

@ChALkeR
Copy link
Member

ChALkeR commented Jul 18, 2018

So, the proposal that I have, for the current branch (10.x):

  • Rename .scrypt to something else, e.g. .scryptUnsafe 😉 (or something else, this is just what first came up)
  • Add a safe API as suggested, call it .scrypt, review that through @nodejs/security-wg.
  • Have calls to the old API by the name .scrypt polymorphically redirect to .scryptUnsafe for 10.x only, but print deprecation warnings.
  • Related doc improvements.

Outright removal might be too breaking, but the above might be fine. /cc @nodejs/tsc — I believe that's somewhere among the lines that we discussed today (i.e. deprecating in 10.x, removing in 11.x), is that correct?

Also /cc @nodejs/security-wg (and specifically @drifkin) for input (nodejs/security-wg#339).

Also, please don't remove the tsc-agenda label yet, I want to bring this up again (with updates) next week.

@joepie91
Copy link
Contributor

@ChALkeR That seems like a reasonable solution to me. It would avoid hard-breaking while still making it very difficult to miss the unsafety of the API.

@paragonie-scott
Copy link
Author

@tniessen That's not as convincing of an argument as you may believe it to be.

The current use of createCipher really is orthogonal to our discussion here, but the I would encourage the same deprecate-and-rename approach in that situation too. We need a higher-level crypto API that doesn't require a cryptography engineer to implement safely, but the focus of this issue in particular is passwords/scrypt.

@mhdawson
Copy link
Member

Won't necessarily help for this specific issue, but in the discussion in the TSC meeting today we talked about requiring sign-off from the security working group for new security-related APIs (as mentioned earlier in this discussion). Not everybody was in attendance but from those who were there, no objections were voiced. I'll draft a PR to add something along these lines to our onboarding docs tomorrow and we can see what the broader collaborator community thinks.

@ChALkeR
Copy link
Member

ChALkeR commented Jul 18, 2018

@tniessen there is already an ongoing discussion about the crypto module API in general, in the Security Working Group repo (nodejs/security-wg#339). Renaming it could be discussed as an option there, but I personally think that it is not justified atm (for several reasons) and would prefer less disruptive changes. I will chime in and take a look at what could be done.

Also do note that we have a history of fixing things that were designed unsafe, including Buffer and the http module.

I would prefer to keep the discussion in this specific issue limited to what to do with the scrypt API, because it is special in the sense that it was just recently added, did not receive wide adoption yet, and could likely be fixed without major fallout. Also that's what caused this issue to be filed initially.

Re: crypto module in general – major changes are not possible until 11.0, so we have enough time to wait for the input from the Security Working Group.

@drifkin
Copy link
Contributor

drifkin commented Jul 19, 2018

👍@ChALkeR: I think given that reverting is considered semver major, this is a very good solution. As @joepie91 says above this will make it really unlikely that someone unknowingly uses the "unsafe" version.

Just a reminder that for the high-level API, we'll need more than just crypto.scrypt(), since we'll need a corresponding verification function in order to prevent timing attacks from users manually checking hash equality (originally brought up in #20816 (comment)). But we can work out the full API details once we decide whether this is the way to go. And from nodejs/security-wg#339 (comment), it sounds like @tniessen and @joepie91 have already been discussing this together, though in a slightly different context.

@lirantal
Copy link
Member

lirantal commented Jul 19, 2018

The unsafe kind of naming convention has also been employed by the React team with the dangerouslySetInnerHTML (ref: https://reactjs.org/docs/dom-elements.html#dangerouslysetinnerhtml) but I believe this should be scoped to some methods and not the entire module (as in, not to rename the module itself to crypto-unsafe)

@bricss
Copy link

bricss commented Jul 19, 2018

And I still believe that it's better to update the documentation, than rename or spawn methods in API.
If by default we assume that developers do not read the documentation and ignore all warnings in it, then probably we don't need documentation at all.

Documentation should lead to clarity. Developers should read and understand what is going on behind the wheel, in any way, and not blindly use some magic wand API.

Just take a look at Haskell example: http://hackage.haskell.org/package/scrypt/docs/Crypto-Scrypt.html

Plus we already have high-level lib for it: https://www.npmjs.com/package/scrypt-for-humans

@paragonie-scott
Copy link
Author

If by default we assume that developers do not read the documentation and ignore all warnings in it, then probably we don't need documentation at all.

Developers aren't homogenous. Some will read the official documentation directly (and keep themselves up to date), others will understand it filtered through podcasts, vlogs, blogs, and technical discussions on websites like Reddit or HN.

What mechanism do you have to ensure all the secondary and tertiary sources of developer understanding beyond the official documentation get updated in a timely fashion? How do you inform every Node.js developer on the planet that they need to refresh their current understanding of the language and recent changes?

Because arguing for "just update the documentation" just kicks the can down the road, so that the other developers can be "blamed" for writing insecure code, and doesn't actually improve security.

Plus we already have high-level lib for it: https://www.npmjs.com/package/scrypt-for-humans

The author of the high-level library is @joepie91 who I am quite certain disagrees with your argument in this discussion. (He can speak for himself on this point, of course.)

@joepie91
Copy link
Contributor

And I still believe that it's better to update the documentation, than rename or spawn methods in API.
If by default we assume that developers do not read the documentation and ignore all warnings in it, then probably we don't need documentation at all.

This argument doesn't make a lot of sense. The point of documentation is to explain, and I never called into question the effectiveness of using documentation for that purpose. What I'm calling into question is the effectiveness of using it to warn, which is pretty well-known to just.. not work.

Therefore, people ignoring warnings in documentation (which they frequently do, especially when writing their own material, since it is in their eyes not necessary for understanding the documented thing), does not in any way translate to "we don't need documentation at all"; it still fulfills the purpose of explanation just fine. We just shouldn't rely on it for the purpose it's bad at, ie. "warning people".

Documentation should lead to clarity. Developers should read and understand what is going on behind the wheel, in any way, and not blindly use some magic wand API.

I fully agree. Unfortunately, that's not what developers actually do, with copious security issues as a result (that people other than the developer suffer from!), and there is no reason to believe that this will ever change given human nature. Therefore everybody in the chain, including the core team, has a responsibility to strive for harm reduction even for the cases where developers just plain aren't paying attention.

Essentially, "developers should just pay attention" is an ideological argument, not a pragmatic one.

Plus we already have high-level lib for it: https://www.npmjs.com/package/scrypt-for-humans

As @paragonie-scott already pointed out, I'm the author of that library. It currently wraps node-scrypt, and is no substitute for a safe API in core, given that many developers assign special importance to implementations provided by core and/or standards (even when that is not justified).

In fact, the way I found out about this addition to core in the first place, was because somebody said "it's in core now!" when I suggested to somebody that they should use scrypt-for-humans. Just to illustrate the point made above.

@Trott
Copy link
Member

Trott commented Aug 13, 2018

@ChALkeR Should this still be left on the TSC agenda? If so, will you be able to attend this week's meeting or is this likely to need to be discussed at a future meeting instead?

@fhinkel
Copy link
Member

fhinkel commented Aug 29, 2018

Ping @ChALkeR

@mhdawson
Copy link
Member

@ChALkeR Should this still be left on the TSC agenda? If so, will you be able to attend this week's meeting or is this likely to need to be discussed at a future meeting instead?

@targos
Copy link
Member

targos commented Sep 17, 2018

Ping @ChALkeR

@ChALkeR ChALkeR removed the tsc-agenda Issues and PRs to discuss during the meetings of the TSC. label Sep 19, 2018
@ChALkeR
Copy link
Member

ChALkeR commented Sep 19, 2018

@Trott @fhinkel @mhdawson @targos
I don't think this needs to be discussed at the meeting now, so I de-labeled it.
I think that I will add the label again when this will need further discussion.
Sorry for the lack of response here :-/.

@bnoordhuis
Copy link
Member

I'm closing out this issue because the discussion seems to have stalled without reaching consensus. One way forward is to propose a generic, higher-level API - i.e., covering more than just scrypt.

crypto.scrypt() isn't going anywhere at this point: it's been out for some time now and it's in use by applications and libraries.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
crypto Issues and PRs related to the crypto subsystem. security Issues and PRs related to security.
Projects
None yet
Development

No branches or pull requests