-
Notifications
You must be signed in to change notification settings - Fork 7.3k
Strange behavior in querystring parsing #60
Comments
decodeURIComponent is an ECMAScript method that tries to decode the %xx hex values into UTF-8 characters. Not every byte is a valid UTF-8 character, so that's probably the wrong method for your purposes. |
WIzKid: Hey, QS author here. You can already change the escape/unescape functions if necessary. Just do this:
For most use cases, (en|de)codeURIComponent is what's needed, so it's a pretty sensible default. As to trimming, I think you're right, that's probably unnecessary. I don't recall the case that prompted that, but I think it was backward compatibility with some other broken awful server-side language (php or ruby or perl or python, probably). I'm ok with taking it out. For your case, though, if you just replace the escape/unescape functions, you'll get %20 or + chars instead of whitespace. |
Ok, what definitely IS a bug, though, is that decodeURIComponent can throw, and it's not obvious that it needs a try/catch around any URL parse. Crashing your server due to an ugly request URL is very much Doing It Wrong. Adding that to my todo list. |
isaacs: I've added a raw-argument to the parse-function to make it not trimming and converting stuff to integers. If you want to use it you can find it at: http://github.com/WizKid/node/commit/26262f205900f5ab122a5365486139bf9cd40333 |
I'm not entirely convinced that converting to Number is a bad thing in any case, since it toString()s just fine if you treat it like a string. The trimming can be taken out completely, IMO, but I'll run it by the list to see if anyone is depending on that behavior for some reason. If you just swap out the encode/decode functions in your module you'll get what you want, and I'd prefer to do that rather than add a "raw" param to the parser. (Three args is already a lot, and this is an edge case.) I definitely also need to fix the throwing problem, because that's going to unexpectedly bite folks. I'd really like to have it parse to a ByteString, but node doesn't have one of those yet. |
I just couldn't find anything in the manual that parse did convert stuff to integers. So as a user of the API I would be pretty surprised if I parsed the querystring containing username=1234 and then wanted to check if the length of the username was valid by using params['username'].length which would return undefined. I just feels like everyone would need to use toString on everything all the time to make sure it is a string. I don't say that you should change it. I just wanted to tell you that it feels strange for me. |
Ok, that's a valid point. I'm convinced. Numericizing numbers will be removed. So, to close this issue, we've got:
Anything else? |
Sounds perfect for me. Then I could just locally overload the escape/unescape metod to use escape/unescape instead of using escapeURIComponent/unescapeURIComponent. |
Fixed in isaacs/node-v0.x-archive@d27d47f. Not the ideal fix, since it'd be better to use a binary string, but hey, at least it's not gonna crash the server. |
isaacs, LGTM but could you add a test case? |
I've had a similar problem where I'm using (fab) and it breaks in the beginning of the url routing due to the above issues. But since I'm using latest version from git, shouldn't the breaking issue be solved by now? Also, since I'm a n00b to node, can I require in and override the encode/decode stuff anywhere, or do I have to put it inside the module that explicitly use it? Cheers, |
This issue is fixed, afaik |
Thanks! |
From what I can see it hasn't been put in the main node repo. Just in Isaacs private one. At least http://github.com/ry/node/blob/master/lib/querystring.js haven't got the change in it. |
Are there any disadvantages to using 'unescape' instead of 'decodeURIComponent' in the querystring module? That seems to fix this problem |
No, you're probably right. I was just showing my ignorance :) But as stated, everything now works fine at lower levels and the issue doesn't come up anymore. |
I'm playing around with node.js and trying to make a simple bittorrent tracker.
The problem I've ran into is that the querystring parser don't like the requests a bittorrent client does.
One part of the url is info_hash=%92%5e%1b%94%98yx%f3%97%ea%db%fa%10d%be%fe%0a5%8c%0a
The first problem I have is that decodeURIComponent in the querystring lib don't like %92 and some other escaped characters. encodeURIComponent(unescape("%92")) gives %C2%92 so maybe it is that bittorrent clients don't encode stuff properly. I tried to look around to see if %92 is correct or wrong but I couldn't find any good sources. But this problem I can work around.
My second problem is val = val.replace(trimmerPattern, ''); in the querystring lib. Why trim querystring parameter keys and values? If I send in a string with a whitespace in the beginning or end I probably want that. Or I think there at least should be a way to say that you want the raw string and no conversion should be done (no whitespace cleanup and no conversion to integers).
If you think it would be a good idea to add a raw option I could add it.
The text was updated successfully, but these errors were encountered: