-
-
Notifications
You must be signed in to change notification settings - Fork 120
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GeoIP2 trys to write characters outside of ISO-8859 #127
Comments
Forgot about loading: Of course we also need to unescape the characters on loading/updating the data file. So e.g. in |
Here is a better approach without the need of decoding and HTML::Entities as extra dependency. We simply store the escaped name in the hash: --- b/wwwroot/cgi-bin/plugins/geoip2_city.pm.orig 2019-04-19 12:36:02.026552366 +0200
+++ a/wwwroot/cgi-bin/plugins/geoip2_city.pm 2019-04-19 12:35:52.906632209 +0200
@@ -29,6 +29,11 @@
#use strict;
no strict "refs";
+sub html_escape {
+ my $s = shift;
+ $s =~ s/([^\x00-\x7F])/sprintf "&#x%X;", ord($1)/ge;
+ return $s;
+}
#-----------------------------------------------------------------------------
@@ -369,7 +374,7 @@
my $countrycity=$record->country()->iso_code().'_'.$record->city()->name();
$countrycity=~s/ /%20/g;
if ($record->most_specific_subdivision()->name()) { $countrycity.='_'.$record->most_specific_subdivision()->name(); }
- $_city_h{lc($countrycity)}++;
+ $_city_h{html_escape(lc($countrycity))}++;
} else {
$_city_h{'unknown'}++;
}
@@ -409,7 +414,7 @@
my $countrycity=($record->country()->iso_code()).'_'.$city;
$countrycity=~s/ /%20/g;
if ($record->most_specific_subdivision()->name()) { $countrycity.='_'.$record->most_specific_subdivision()->name(); }
- $_city_h{lc($countrycity)}++;
+ $_city_h{html_escape(lc($countrycity))}++;
} else {
$_city_h{'unknown'}++;
} |
manuelm
added a commit
to manuelm/awstats
that referenced
this issue
Apr 26, 2019
* Correctly convert dns names to ip4 and ip6 address using getaddrinfo (fixes eldy#120, eldy#121, obsoletes eldy#115) * Only lookup if the IP is of type public (fixes eldy#122) and catch further lookup errors (obsoletes eldy#123) * Store and display the GeoIP City output HTML escaped (fixes eldy#127) * Code to perform and cache the actual lookup is consolidated * General code improvement and readability
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
This is related to #69 but targets GeoIP2 modules. Mainly GeoIP2::City. GeoIP2 names are UTF-8 encoded. Therefor they can contain characters outside of the ISO-8859 range. Since AWStats data files are stored in ISO the GeoIP output has to be escaped. As in #69 I suggest html escaping them, either using
HTML::Entities
or a stripped down version which only escapes high bytes:Regarding
geoip2_city.pm
the function call should be added inSectionWriteHistory_geoip2_city
so thatXMLEncodeForHisto($_)
becomesXMLEncodeForHisto(html_escape($_))
(orXMLEncodeForHisto(encode_entities_numeric($_))
).For
geoip2.pm
this isn't strictly necessary as I'm not aware of any international countries names that contain characters outside the ISO range.195.38.12.0
is an ipaddress for testing. GeoIP2::City tries to storePL_Zamość_Lublin
. With escaping this gets converted toPL_Zamość_Lublin
.The text was updated successfully, but these errors were encountered: