Fix crash on multipart/form-data post #1743
Merged
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -67,6 +67,7 @@ Georges Dubus | |
Greg Holt | ||
Gregory Haynes | ||
Günther Jena | ||
Hu Bo | ||
Hugo Herter | ||
Igor Pavlov | ||
Ingmar Steen | ||
|
I think in the case of `None`, you really cannot be sure whether it is safe to decode or not. Better to leave the data as is.
Yes, I have been thinking about this too, but most post() use cases do not return bytes. When a user calls post(), they probably want the same kind of value returned for both multipart and url-encoded data.
If the user cares about the raw data (bytes), they can call multipart() directly and process the post data themselves.
When you post a form with fields like textboxes in a browser like Firefox, e.g.
The browser usually does not set a Content-Type for the sub-parts of the post.
Files are not affected by this commit.
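To make the situation concrete, here is a minimal sketch of such a request body, parsed with Python's standard `email` package rather than with aiohttp. The boundary string, the field names `title` and `upload`, and the helper `describe_parts` are all made up for illustration; the point is only that the text field arrives with no Content-Type header of its own, while the file part declares one.

```python
import email

# Hypothetical boundary and field names, chosen only for this illustration.
BOUNDARY = "hypothetical-boundary"

raw_body = (
    f"--{BOUNDARY}\r\n"
    'Content-Disposition: form-data; name="title"\r\n'
    "\r\n"
    "héllo\r\n"
    f"--{BOUNDARY}\r\n"
    'Content-Disposition: form-data; name="upload"; filename="a.bin"\r\n'
    "Content-Type: application/octet-stream\r\n"
    "\r\n"
    "raw\r\n"
    f"--{BOUNDARY}--\r\n"
).encode("utf-8")

# Prepend the request-level Content-Type so the stdlib parser can split the parts.
request_header = f'Content-Type: multipart/form-data; boundary="{BOUNDARY}"\r\n\r\n'
message = email.message_from_bytes(request_header.encode("ascii") + raw_body)


def describe_parts(msg):
    """Return (field name, declared Content-Type or None, raw payload bytes) per part."""
    described = []
    for part in msg.get_payload():
        name = part.get_param("name", header="content-disposition")
        declared = part.get("Content-Type")  # None when the sub-part has no such header
        described.append((name, declared, part.get_payload(decode=True)))
    return described


for name, declared, payload in describe_parts(message):
    print(name, declared, payload)
```

The text field reports `None` for its declared type, which is exactly the case under discussion.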
I agree with @kxepal: we should not decode data if we do not know the content type.
It would be very hard to reason about an exception if one is raised from this code.
It is a hard decision, and I have an open mind about this. There are three ways to handle this situation:
Each has its own advantages and disadvantages. I'm looking at the code that processes application/x-www-form-urlencoded data, and it is:
Notice that this piece of code assumes the charset is utf-8 when no charset is provided through the Content-Type header (note that a %NN-encoded character is really a byte). It always decodes the data into a string. So I suggest using the same strategy for the multipart/form-data format.
As I have said, a lot of browsers do not send a Content-Type header for the sub-parts of the form data; most of the time, they are indeed encoded in utf-8. There is nothing a developer can do about this. If multipart/form-data post data is parsed into bytes, a developer is forced to check the data type returned by post() every time if they want to accept both formats. Deciding not to decode a bytes object is easy, but the user may be surprised to see that the return types for multipart/form-data and application/x-www-form-urlencoded are so different. They would also have a hard time when some tools or browsers actually do provide the Content-Type header.
After we reach a conclusion, maybe we should add it to the documentation.
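The url-encoded behaviour described in this comment can be illustrated with the standard library. This is not the aiohttp code referred to above, only a sketch of the same strategy: the helper name `parse_urlencoded` is hypothetical, and the utf-8 default mirrors what the comment says happens when no charset is given.

```python
from urllib.parse import parse_qsl


def parse_urlencoded(body: bytes, charset: str = "utf-8"):
    """Decode an application/x-www-form-urlencoded body into (name, value) str pairs.

    Each %NN escape stands for one byte; the bytes are then decoded with
    `charset`, falling back to utf-8 when the client declared none.
    """
    return parse_qsl(body.decode("ascii"), keep_blank_values=True, encoding=charset)


# %C3%A9 is the two-byte utf-8 encoding of "é", %C3%BC that of "ü".
pairs = parse_urlencoded(b"name=h%C3%A9llo&city=M%C3%BCnchen")
print(pairs)
```

Every value comes back as `str`, never `bytes`, which is the consistency argument made for the multipart case.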
Will be fine in all cases. Browsers are just another kind of HTTP client with their own specifics.
They actually do this for simple input fields, not file inputs. I'm worried about the "in most times" part of your post, but in any case, there is no reason here to give browsers any preference.
Well, according to RFC 7578
It really SHOULD be considered as "text/plain"... And if "text/plain" is decoded to unicode with utf-8 as the default encoding, the same should apply to content without a Content-Type header.
I also tested a simple HTML page with Firefox, Internet Explorer and Edge; they all send the text without a Content-Type header, even when the input field contains non-ASCII characters.
Anyway, if you do not change your mind, I don't mind changing the logic to what you have in mind.
@kxepal
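A minimal sketch of the rule being argued for here, assuming (as proposed in this thread) that a missing Content-Type is treated as "text/plain" and that utf-8 is the fallback charset. The function `decode_field` is hypothetical and is not the code merged in this pull request; it uses only the standard `email.message.Message` class to parse the header value.

```python
from email.message import Message
from typing import Optional, Union


def decode_field(payload: bytes, content_type: Optional[str]) -> Union[str, bytes]:
    """Decode a form-data part the way the thread proposes.

    A missing Content-Type is treated as "text/plain" (the RFC 7578 default);
    text parts are decoded with their declared charset, or utf-8 if none is
    declared (an assumption taken from the discussion); anything else is
    returned untouched as bytes.
    """
    header = Message()
    header["Content-Type"] = content_type or "text/plain"
    if header.get_content_maintype() != "text":
        return payload
    charset = header.get_content_charset("utf-8")
    return payload.decode(charset)


print(decode_field("héllo".encode("utf-8"), None))
print(decode_field(b"\x00\xff", "application/octet-stream"))
```

With this rule, text inputs posted by a browser come back as `str` in both form encodings, while parts that declare a non-text type stay as `bytes`.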
FYI
https://tools.ietf.org/html/rfc7578#page-5
and also these chapters
Love RFC references! Thanks for them. I guess RFC 7578 section 4.4 states pretty clearly what to do in this case, so we can follow it.
@fafhrd91 are you ok with this as well?
I am ok