-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix userinfo in case of some special characters #144
base: master
Are you sure you want to change the base?
Changes from 1 commit
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5,19 +5,25 @@ use warnings; | |
|
||
use parent 'URI::_generic'; | ||
|
||
use URI::Escape qw(uri_unescape); | ||
use URI::Escape qw(uri_unescape uri_escape); | ||
|
||
our $VERSION = '5.29'; | ||
|
||
sub _uric_escape { | ||
my($class, $str) = @_; | ||
if ($str =~ m,^((?:$URI::scheme_re:)?)//([^/?\#]*)(.*)$,os) { | ||
my($scheme, $host, $rest) = ($1, $2, $3); | ||
my $ui = $host =~ s/(.*@)// ? $1 : ""; | ||
my $port = $host =~ s/(:\d+)\z// ? $1 : ""; | ||
if (_host_escape($host)) { | ||
$str = "$scheme//$ui$host$port$rest"; | ||
} | ||
if ($str =~ m,^((?:$URI::scheme_re:)?)//(.*:.*@)?([^/?\#]*)(.*)$,os) { | ||
my $scheme = $1; | ||
my $ui = $2 || ''; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could we have a more descriptive name for this variable? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I realize this variable was not your creation. :) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sure thing, I have renamed to |
||
my $host = $3; | ||
my $rest = $4; | ||
my $port = $host =~ s/(:\d+)\z// ? $1 : ""; | ||
if ($ui) { | ||
# escape /?# symbols as they are used | ||
# in subsequent regex for path parsing | ||
$ui = uri_escape($ui, '/?#'); | ||
} | ||
_host_escape($host); | ||
$str = "$scheme//$ui$host$port$rest"; | ||
} | ||
return $class->SUPER::_uric_escape($str); | ||
} | ||
|
@@ -26,8 +32,8 @@ sub _host_escape { | |
return if URI::HAS_RESERVED_SQUARE_BRACKETS and $_[0] !~ /[^$URI::uric]/; | ||
return if !URI::HAS_RESERVED_SQUARE_BRACKETS and $_[0] !~ /[^$URI::uric4host]/; | ||
eval { | ||
require URI::_idna; | ||
$_[0] = URI::_idna::encode($_[0]); | ||
require URI::_idna; | ||
$_[0] = URI::_idna::encode($_[0]); | ||
}; | ||
return 0 if $@; | ||
return 1; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a discussion with @genio on IRC:
It would be nice to avoid
(.*@)
or(.*:.*@)
as it's greedy and could be prone to bugs.We could take inspiration from https://metacpan.org/dist/Mojolicious/source/lib/Mojo/URL.pm#L52-64
After gathering the host and port part, ^([^@]+@) should be fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, but what if user wanna use '@' in password, using
[^@]+@
will cause that the everything before first '@' symbol will be matched and password parsed incorrectly.like the similar approach is already used in LWP/Protocol/http.pm.
And I already mentioned what issue it might cause in my PR.
For my standpoint
(.*:.*@)
is pretty lazy, but okay, I agree that it could be better.@oalders, what you think about
([^:]+:.*@)
for matching user info?This will match everything except
:
as username cannot contain colon, then match:
as separator, match everything after first colon as password or nothing as according to rfc3986#section-3.2.1 it might be empty, and finally match@
as second separator.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, the regex which Mojo/URL.pm#L52-64 is used also cannot parse username/password if it contains any of
/#?
symbols :)I know, it's really rarely usecase, but unfortunately I faced with it already two times :d
For instance, this regex with username and password which contain all special characters in unescaped form.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then it should be URL-encoded as
%40
. I'm not sure if there's any way around that. Something has to serve as a stopping point. If the RE is too greedy, it might end up stopping with a@
inside of the path, likehttp://host/path/directory@foobar/file
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm.
Now I see, it'll match this path incorrectly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, actually I can revert this change as the most important part is in lib/URI/_server.pm#14.
There are another issue, if password contain multiple '@' symbols before any of '/#?' symbol URI parsing fails again :d
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the password was inserted directly into the URL that way without percent-encoding, that's not really the URI module's fault. The password has to be encoded properly first. That's the point of escaping characters.