feat/remove-pydantic-dependency #195

MasterKale · 2024-01-05T19:32:30Z

An experiment in simplifying the list of this package's dependencies. PR is currently a WIP while I continue to explore this.

Fixes #196.

MasterKale · 2024-01-08T19:35:40Z

OK - I can work with UTF-8 as user_handle...

I'll be sure to document this behavior too, I promise 🙏

webauthn/helpers/parse_authentication_credential_json.py

jwag956 · 2024-01-09T01:26:50Z

I am still struggling to get this to work with my sample front-end. I don't understand this code in def options_to_json():

        if isinstance(options.user.id, bytes):
            _user["id"] = bytes_to_base64url(options.user.id)
        else:
            _user["id"] = options.user.id

generate_registration_options already has done:

user=PublicKeyCredentialUserEntity(
            id=user_id.encode("utf-8"),
            name=user_name,
            display_name=user_display_name,
        ),

So it will always be 'bytes' and then convert it to base64?

MasterKale · 2024-01-09T01:51:33Z

So it will always be 'bytes' and then convert it to base64?

Right, this is current behavior that's confusing and definitely not desirable as per another comment I left over in #180:

You know what, I just realized that I've been base64url-encoding user.id the entire time, which means userHandle is coming back as the base64url-encoded version, not the string passed in as user_id into generate_registration_options(). I thought I'd fixed that... 🤦‍♂️

I've stashed a couple of approaches I plan on taking after merging this PR to fix this. I'll either A) always treat user.id and user_handle as UTF-8 strings, or B) allow UTF-8-string or bytes as values for user.id, base64url-encode user.id if it's bytes, and then treat that user identifier as UTF-8 when handling it later during authentication as user_handle.

tl;dr: for this PR the behavior will remain as-is. I'll fix this behavior afterwards as it's very confusing.

jwag956 · 2024-01-09T03:46:08Z

Sorry to be a pain here - this affects the front-end API and changing that needs careful consideration and of course breaks backwards compatibility. So front-ends (mostly JS I presume) must take user.id and b64decode into a Uint8Array to pass into credentials.create - then take the response from authentication (user_handle) and send it back as plain utf-8? Or send it back as b64encoded and have RP code do the decode (which will break if you change it).

I suppose I am asking if this could be 'fixed' (either way) PRIOR to releasing. (different PR is fine - just prior to release) and and that you give this a major release change (2.0).

As always - thanks for all the effort and consideration you put into the package... we are all very very appreciative.

MasterKale · 2024-01-09T06:38:04Z

So front-ends (mostly JS I presume) must take user.id and b64decode into a Uint8Array to pass into credentials.create - then take the response from authentication (user_handle) and send it back as plain utf-8? Or send it back as b64encoded and have RP code do the decode (which will break if you change it).

To get the most out of userHandle during authentication an RP using this library would have to know to base64url-decode user.id during registration after calling options_to_json(generate_registration_options(...)) and before calling navigator.credentials.create().

If a user of this library does base64url-decode user.id before calling WebAuthn's navigator.credentials.create(), then the bytes they get later for userHandle after a call to navigator.credentials.get() would be the same UTF-8 bytes they specified as the string argument user_id when calling generate_registration_options().

If they do not base64url-decode user.id before calling WebAuthn's navigator.credentials.create(), then the bytes they get later for userHandle after a call to navigator.credentials.get() would always be the base64url-encoded UTF-8 bytes they specified as the string argument user_id when calling generate_registration_options(). The RP would need to base64url-decode userHandle to bytes, and then encode those bytes to UTF-8 to get the same UTF-8 bytes they specified as the string argument user_id when calling generate_registration_options().

This...is potentially going to make migrating from v1.x to v2.0 of py_webauthn a little messy. I think the simplest migration path would involve base64url-encoding the string they pass in for user_id when calling generate_registration_options(). This is anticipating my following up this PR with a subsequent change to always treat the string value user_id as UTF-8 bytes and stop base64url-encoding it - it's not necessary, the value is already a string.

To reiterate, a goal of the upcoming v2 is for the UTF-8 bytes that an RP specifies for user_id (as a basic str, i.e. user_id="user_id_here") when calling generate_registration_options() be the same bytes that get returned for userHandle after a call to navigator.credentials.get(). I could then advise users of py_webauthn to UTF-8 encode these bytes on the front end when preparing JSON to submit to the server. That would at least make it much easier to talk about how one value becomes the other.

All of this said, I honestly don't think userHandle should have much value to RPs. Even during conditional UI, when no username is provided, the credential ID that corresponds to the public key that can successfully verify the signature coming out of WebAuthn's .get() should then be used internally to figure out which user to sign in. userHandle on the front end isn't signed over and so it can't really be trusted. I don't think this gets me off the hook to figuring out a path forward around this, admittedly...

I suppose I am asking if this could be 'fixed' (either way) PRIOR to releasing. (different PR is fine - just prior to release) and and that you give this a major release change (2.0).

I don't think I want to put much effort into trying to solve this with the current v1.x version of py_webauthn as I think it's kind of a breaking change all in its own. I'd rather bundle it with a more "drastic" change like this PR and break many things at once instead of coordinating one per major release.

jwag956 · 2024-01-09T22:46:36Z

As always - thank you for your detailed response..Part of the confusion for me is that string (unicode) != UTF-8 in python - and in python, converting a string to UTF-8 ("my string").encode() results in bytes - which isn't JSON encodable (you have to cast it back to string). So your goal of having user.id be a 'string' seems good but does anything care if it encoded as long as what is returned as part of authentication is the same as was passed into registration?

All that aside - I have everything working with this PR - with just the one modification to b64decode in my signin code.
Thanks.

MasterKale · 2024-01-10T03:38:00Z

Part of the confusion for me is that string (unicode) != UTF-8 in python

I went digging into Python's Unicode support and it's my understanding "Unicode support" in Python commonly refers to UTF-8, based on https://docs.python.org/3/howto/unicode.html#encodings:

UTF-8 is one of the most commonly used encodings, and Python often defaults to using it. UTF stands for “Unicode Transformation Format”, and the ‘8’ means that 8-bit values are used in the encoding. (There are also UTF-16 and UTF-32 encodings, but they are less frequently used than UTF-8.)

As for the results of your testing...

All that aside - I have everything working with this PR - with just the one modification to b64decode in my signin code.

Thank you for taking the time to test this PR. I'll go ahead with it, but it won't go out immediately as v2.0. I'll be making a couple of other breaking changes (like the user ID stuff we've talked about in here) before rolling all of that into v2.0.

jwag956 · 2024-01-10T20:59:26Z

In python 3 all strings are unicode. It is true that utf-8 is the default encoding (e.g. "hi there".encode()) will convert the unicode string "hi there" to a UTF-8 encoded series of bytes. One thing I noticed was the W3C spec calls user.id "A user handle is an opaque byte sequence with a maximum size of 64 bytes, and is not meant to be displayed to the user."

Since unicode may or may not be 1-1 with bytes - I wonder if it would be better for the py_webauthn API to take bytes - not string - in order to avoid an issue with upon encoding it is too long. Then I suppose encoding that using b64 encoding in order to create a JSON serializable value and having the front-end convert to a uint8array (as I know mine does today) . The same in the reverse upon authentication response. Of course - this behavior is pretty much exactly what py_webauthn and pydantic 1.0 did :-)

I don't really care either way - my user_handle is a 64 bytes hex string. At a minimum it is probably worth documenting the restrictions on passing in an arbitrary unicode string.

MasterKale added 30 commits January 5, 2024 10:49

Remove Pydantic as dependency

4432d9e

Expand upon VS Code settings

c6d5fa1

Refactor structs into dataclasses

cf7a3e9

Remove references to Pydantic in code

cbed385

Run Black over everything

b134c58

Rewrite auth credential parsing

fc60fe8

Restore typing_extensions for mypy

b7d7631

Fix mypy config

5b43364

Remove unused type

f5dd32c

Fix incorrect instance check

8bf3c71

Don't initialize an Optional field

bd28d8a

Update VS Code settings

9535b9c

Tweak exception messages

2c3357e

Appease mypy with this one weird trick

5b6af12

Use InvalidJSONStructure in parse_client_data_json

e9cbfc9

Create encode_cbor to abstract cbor2 use

9dd6d53

Abstract more cbor2 use

b8ff3b6

Add comment

61b6044

Update tests

dc3bb68

Parse registration responses

d58f95e

Refactor options_to_json

ad5be88

Create byteslike_to_bytes for bytes shenanigans

8ef29d3

Handle things like memoryviews

0bb1201

Remove print statement

5c86c3d

Remove Pydantic from examples

dbb22a9

Allow user.id and user_handle to be str

d403696

Don't assume base64url on user_handle

48c94b1

Update token binding test

9db31a6

Refactor check for empty attStmt on None

05ac887

Refactor SafetyNet JWS parsing

d5ea72b

MasterKale added 7 commits January 8, 2024 11:46

Properly raise on non-dict JSON parse

0f0ea3a

Raise credential type verification up

21433cb

Tweak error messages

2a2ac63

Add registration response parsing tests

defb54c

Clean up unused imports

c6f30b7

Add tests for parsing authentication response

043aa42

Add comment about user_handle

c3c53dd

jwag956 reviewed Jan 8, 2024

View reviewed changes

webauthn/helpers/parse_authentication_credential_json.py Show resolved Hide resolved

MasterKale added 4 commits January 8, 2024 14:44

Remove json_loads_base64url_to_bytes

4bf9cf5

Raise when required string values are empty

8f971be

Revert PublicKeyCredentialUserEntity id Union

963ba2e

Revert typing on user_handle

6c0c544

MasterKale marked this pull request as ready for review January 8, 2024 23:34

MasterKale added this to the v2.0.0 milestone Jan 8, 2024

MasterKale added 4 commits January 8, 2024 15:51

Add .git-blame-ignore-revs

f65c30d

Run Black on setup.py

64a0579

Update to the latest cryptography and pyOpenSSL

6fd8ee6

Ignore another formatting commit

939c486

MasterKale merged commit a3fb0a5 into master Jan 11, 2024
4 checks passed

MasterKale deleted the feat/remove-pydantic-dependency branch January 11, 2024 00:02

This was referenced Jan 11, 2024

Fix validation of credantials with pydantic v1 #186

Closed

Fix handling of optional fields such as user_handle. #180

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat/remove-pydantic-dependency #195

feat/remove-pydantic-dependency #195

MasterKale commented Jan 5, 2024 •

edited

Loading

MasterKale commented Jan 8, 2024

jwag956 commented Jan 9, 2024

MasterKale commented Jan 9, 2024

jwag956 commented Jan 9, 2024

MasterKale commented Jan 9, 2024

jwag956 commented Jan 9, 2024

MasterKale commented Jan 10, 2024

jwag956 commented Jan 10, 2024

feat/remove-pydantic-dependency #195

feat/remove-pydantic-dependency #195

Conversation

MasterKale commented Jan 5, 2024 • edited Loading

MasterKale commented Jan 8, 2024

jwag956 commented Jan 9, 2024

MasterKale commented Jan 9, 2024

jwag956 commented Jan 9, 2024

MasterKale commented Jan 9, 2024

jwag956 commented Jan 9, 2024

MasterKale commented Jan 10, 2024

jwag956 commented Jan 10, 2024

MasterKale commented Jan 5, 2024 •

edited

Loading