UTF-8 encoded data should not be encoded with latin-1 #476

Draft
wants to merge 1 commit into
base: main

jmadotgg

I had problems with non-ASCII letters such as umlauts and ß in the request's URL path, even though they were URI-encoded and also decoded properly before the latin-1 encode step in line 120.
I know this might be a naive fix; maybe you can come up with a better overall approach or guide me towards it. I also don't know whether UTF-8 encoded URIs are out of scope for this library.

@jmadotgg jmadotgg marked this pull request as draft December 13, 2024 14:38
@mmerickel
Member

mmerickel commented Dec 14, 2024

Can you show an example URL and a stack trace?

Generally speaking, a UTF-8 path that is properly encoded will work fine with WebOb.

The Latin-1 encoding is used within PEP 3333 to marshal bytes around as Unicode code points, but they are rarely interpreted that way. That being said, in HTTP a lot of things like headers don't support UTF-8.
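
For readers following along, here is a minimal sketch of the PEP 3333 round trip described above, using only the standard library. The example path and the variable names are assumptions made for illustration; this is not WebOb's own code and does not reproduce the line the PR touches.

```python
from urllib.parse import unquote_to_bytes

# Hypothetical request path containing percent-encoded UTF-8 (ß and ü).
raw_path = "/stra%C3%9Fe/%C3%BCber"

# Percent-decoding yields the raw UTF-8 bytes of the path.
path_bytes = unquote_to_bytes(raw_path)

# PEP 3333 carries those bytes in a str by mapping each byte to the
# code point of the same value, which is what decoding as latin-1 does.
path_info = path_bytes.decode("latin-1")

# The str above looks like mojibake if printed, but no information is lost:
# encoding back to latin-1 restores the bytes, which then decode as UTF-8.
recovered = path_info.encode("latin-1").decode("utf-8")

print(repr(path_info))
print(recovered)
```

The latin-1 step is a lossless byte transport here, so a failure with umlauts would more likely come from text that was already decoded as UTF-8 before the latin-1 encode, which is why an example URL and stack trace would help narrow it down.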

@jmadotgg
Author

@mmerickel Thank you for your quick response. I will get back to this in the upcoming week.
