turn off unicode #77
Comments
Michael Bayer (@zzzeek) wrote: Hi there, I'm reviewing your patches, thanks for them! So far this particular one I can't accept:
Anonymous wrote: Because strings in the compiled source code are unicode literals, like u'\xxxx', just removing "unicode" from default_filters does not work: it causes a DecodeError if the data is a multibyte string. So strings must stay as they are in the template source, such as "我们", and "# -*- encoding:utf-8 -*-" must be added to the compiled source code. lexer.py tries to decode all source code into Unicode, so we need a parameter to turn that off. Then removing "unicode" from default_filters will not cause a DecodeError. Not using Unicode makes things more complicated, but speeds them up a bit. I have used it this way and it works fine. If you are interested, I will refine the code and submit it again.
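The encoding declaration mentioned in this comment can be illustrated outside Mako. The following is a minimal Python 3 sketch, not the patch itself: the module source, the `CHARSET` value, and the `load_generated` helper are hypothetical stand-ins for a generated template module. It only shows that source saved as raw bytes in some charset is read correctly when its first line declares that charset. Under Python 2, which this thread concerns, the literal would remain a byte string in that charset rather than becoming text.

```python
# Hypothetical generated-module source, saved as bytes in a non-UTF-8 charset.
# The first line is a PEP 263 style encoding declaration.
CHARSET = "gbk"
module_source = (
    "# -*- encoding:%s -*-\n"
    "TEXT = '我们'\n" % CHARSET
).encode(CHARSET)


def load_generated(source: bytes) -> dict:
    """Compile and run module source given as raw bytes.

    The compiler reads the encoding declaration on the first line and
    decodes the rest of the source with that codec.
    """
    namespace = {}
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace


ns = load_generated(module_source)
print(ns["TEXT"] == "我们")
```

Without the declaration line, the same bytes would be read as UTF-8 and the non-ASCII literal would fail to compile, which is why the declaration has to match the charset the module was saved in.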
Michael Bayer (@zzzeek) wrote: Can you please attach a template file illustrating what you're referring to? If the idea is just "unicode is too slow, just pass through utf-8 directly without processing", that historically has not worked with our particular approach (we tried). Like I pointed out in my example, the patch does not work.
Anonymous wrote: I have updated the patch, and it passes all the test cases, including two Chinese templates: one using unicode, the other using utf-8 directly for better performance. If unicode is not necessary, can Mako turn off unicode by default, or not use unicode at all?
Michael Bayer (@zzzeek) wrote: this part of the patch:
should be calling upon the
Michael Bayer (@zzzeek) wrote: Oh, also, can we call the flag "disable_unicode=True"?
Michael Bayer (@zzzeek) wrote: ...which would also replace default filters with
Anonymous wrote: I have updated the patch: added default filters to the %call tag, renamed the flag to "disable_unicode", and set default_filters to ["str"] when disable_unicode is True.
Michael Bayer (@zzzeek) wrote: Thanks. Committed a modified version in d5f83e6 which retains identical Mako behavior if the flag is off, which is the default setting for both Template and TemplateLookup. Also added new documentation for this mode. Since not using unicode is against Mako's general philosophy, the docs warn against using this flag unless users are absolutely sure they want it (if anyone reports UnicodeDecode errors with this flag, they're using it wrong and will be urged to stop using it), and it's almost certain that this feature will not be available in the Python 3000 version since Py3K standardizes on unicode strings everywhere.
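The UnicodeDecode errors mentioned here typically come from mixing byte strings with unicode strings. The sketch below is a Python 3 emulation under stated assumptions, not Mako code: `py2_style_concat` is a hypothetical helper that spells out the implicit ASCII decode Python 2 performed when a byte string was joined to a unicode string.

```python
def py2_style_concat(text: str, raw: bytes) -> str:
    """Emulate Python 2's implicit coercion of a byte string to unicode.

    Python 2 decoded the byte string with the ASCII codec before joining.
    Python 3 refuses to mix the two types, so the decode is explicit here.
    """
    return text + raw.decode("ascii")


# ASCII-only bytes coerce cleanly.
print(py2_style_concat("hello ", b"world"))

# Multibyte UTF-8 data cannot be decoded as ASCII, so the join fails.
try:
    py2_style_concat("hello ", "我们".encode("utf-8"))
except UnicodeDecodeError as exc:
    print("failed:", exc.reason)
```

This is the failure the warning anticipates: once the flag is on, every string reaching the template is expected to be a byte string in one consistent encoding.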
Migrated issue, originally created by Anonymous
If the input and output are not unicode, then decoding and encoding cause some overhead; adding an option to turn unicode off could improve performance a bit.
Add an argument to Lookup and Template:
... ,using_unicode = True, ...
When unicode is turned off, the compiled module source is saved with the proper charset, with an encoding declaration comment added at the head, so escaping is not needed.
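The overhead described in this report can be measured with a rough sketch like the one below. It does not use Mako: `round_trip` and `pass_through` are hypothetical stand-ins for a pipeline that decodes and re-encodes its data versus one that hands the bytes on untouched, and the sample data is made up for illustration.

```python
import timeit

# Hypothetical stand-in for template data that is already UTF-8 encoded.
data = "我们".encode("utf-8") * 1000


def round_trip(raw: bytes) -> bytes:
    """Decode to text and encode back, as a unicode-based pipeline would."""
    return raw.decode("utf-8").encode("utf-8")


def pass_through(raw: bytes) -> bytes:
    """Hand the bytes on untouched, as the proposed option would."""
    return raw


# Both paths yield the same bytes; only the amount of work differs.
print(round_trip(data) == pass_through(data))

# Timings vary by machine and interpreter, so none are quoted here.
print("round trip:  ", timeit.timeit(lambda: round_trip(data), number=200))
print("pass through:", timeit.timeit(lambda: pass_through(data), number=200))
```

The comparison only covers the codec work itself; how much of a real template render that accounts for depends on the template.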
Attachments: unicode.patch