Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

py3: several string conversion fixes #24223

Closed
embray opened this issue Nov 16, 2017 · 52 comments
Closed

py3: several string conversion fixes #24223

embray opened this issue Nov 16, 2017 · 52 comments

Comments

@embray
Copy link
Contributor

embray commented Nov 16, 2017

This fixes several (hardly exhaustive) string conversions around Sage such that functions and methods that returned str on Python 2 also return str on Python 3 (and the same for functions and methods that take str as arguments) using the new string utilities from #24222.

This also incorporates/replaces the fixes from #23812.

CC: @fchapoton @jdemeyer

Component: python3

Author: Erik Bray

Branch/Commit: e9a582e

Reviewer: Jeroen Demeyer

Issue created by migration from https://trac.sagemath.org/ticket/24223

@embray embray added this to the sage-8.1 milestone Nov 16, 2017
@embray
Copy link
Contributor Author

embray commented Nov 16, 2017

comment:1

With this, plus a few other small fixes (including fixes from other existing tickets merged in) I was able to get the Sage REPL up and 1 + 1 working.

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Nov 20, 2017

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

34ef6dcAdd HAVE_GMPY2 compile-time constant
158d984Force absolute import in have_module()
7f06e71Fix documentation
3ac146cAdd a Cython compile-time constant for Python major version.
f71fd41Add new string conversion utilities.
a5d5219Add support for optional error handling.
627d4c7Add new string conversion utilities.
25d6c32Numerous string conversions from bytes to str and from str to bytes for Python 3

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Nov 20, 2017

Changed commit from 44a5540 to 25d6c32

@embray
Copy link
Contributor Author

embray commented Nov 20, 2017

comment:3

Rebased on current version of #24222.

@jdemeyer
Copy link

comment:4

Something is wrong: there are still the files src/sage/misc/string.*

@jdemeyer
Copy link

comment:5

This is again a ticket with many small independent changes where a "partial positive review" like what I proposed on #24025 would make sense. I didn't like how we handled that, so how should we proceed here?

@embray
Copy link
Contributor Author

embray commented Nov 21, 2017

comment:6

Replying to @jdemeyer:

Something is wrong: there are still the files src/sage/misc/string.*

Oops--part of the problem here is I'm moving between two different working trees, one for Python 2 and one for Python 3, so it can get confusing. I'll clean this up.

@embray
Copy link
Contributor Author

embray commented Nov 21, 2017

comment:7

Replying to @jdemeyer:

This is again a ticket with many small independent changes where a "partial positive review" like what I proposed on #24025 would make sense. I didn't like how we handled that, so how should we proceed here?

I'd be fine if you proposed your changes and tossed it back to me as "needs review", rather than going immediately to "positive review".

@jdemeyer
Copy link

comment:8

It seems strange to me to use locale.getpreferredencoding() for strings which do not depend on the locale. I would propose:

  1. Use locale.getpreferredencoding() for strings which actually depend on the locale.

  2. Use sys.getfilesystemencoding() for strings which actually depend on the file system encoding.

  3. Use UTF-8 otherwise.

@embray embray modified the milestones: sage-8.1, sage-8.2 Dec 12, 2017
@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Dec 21, 2017

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

64567aaMake type checks explicit
a358aeaRequire a const char*--this is more consistent for use in cases where
64a1c2dUse a less unusual character in the tests
1bf6709Use r-strings for the docstrings since they contain slashes
2bb8396Clean up these declarations:
6d73a41Use more specific type declarations here.
56f1c02Numerous string conversions from bytes to str and from str to bytes for Python 3

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Dec 21, 2017

Changed commit from 25d6c32 to 56f1c02

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Dec 21, 2017

Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:

a285af4Add new string conversion utilities.
5b3719aAdd support for optional error handling.
bf5f893Make type checks explicit
1bd27fcRequire a const char*--this is more consistent for use in cases where
bd2d931Use a less unusual character in the tests
b32f000Use r-strings for the docstrings since they contain slashes
ccf7bcdClean up these declarations:
dec9f3aUse more specific type declarations here.
a9ebae6Numerous string conversions from bytes to str and from str to bytes for Python 3

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Dec 21, 2017

Changed commit from 56f1c02 to a9ebae6

@jdemeyer
Copy link

comment:12

Is this needs_review again?

@embray
Copy link
Contributor Author

embray commented Jan 2, 2018

comment:13

Yeah, IIRC this was just a bunch of miscellaneous string conversions from some other tickets, but redone using the sage.cpython.string utilities.

@fchapoton
Copy link
Contributor

comment:14

3 failing doctests in integer.pyx

@embray
Copy link
Contributor Author

embray commented Jan 5, 2018

comment:16

Interesting--when I added str_to_bytes I predicted we might later want to allow Python 2 unicode objects to be supported somehow, but put it off because I didn't think there was much code in Sage that already was expected to handle unicode (except in a few places like unicode art, etc.). But the tests that are failing here clearly show that there are some places it's expected to work.

So I see at least three options:

  1. Modify str_to_bytes to treat unicode appropriate on Python 2
  2. Add some new function like unicode_to_str that's analogous to str_to_bytes (I'm not crazy about this)
  3. Just special case unicode on Python 2 where we do want to support it.

I lean towards 1. but I'm open to ideas.

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Jan 28, 2018

Changed commit from 00f2776 to 7d68d8c

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Jan 28, 2018

Branch pushed to git repo; I updated commit sha1. New commits:

7d68d8csome details of imports

@jdemeyer
Copy link

Reviewer: Jeroen Demeyer

@embray
Copy link
Contributor Author

embray commented Jan 30, 2018

comment:34

This does not work:

-from sage.cpython.string import FS_ENCODING, str_to_bytes
+from sage.cpython.string cimport FS_ENCODING, str_to_bytes

FS_ENCODING is can't be cimported.

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Jan 30, 2018

Branch pushed to git repo; I updated commit sha1. New commits:

80b0d1ffix import

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Jan 30, 2018

Changed commit from 7d68d8c to 80b0d1f

@fchapoton
Copy link
Contributor

comment:36

import corrected

@embray
Copy link
Contributor Author

embray commented Jan 30, 2018

Changed branch from public/python3-trac24223 to u/embray/python3/string-fixes

@embray
Copy link
Contributor Author

embray commented Jan 30, 2018

comment:37

I rebased and squashed some of these changes.


New commits:

cc214fdNumerous string conversions from bytes to str and from str to bytes for Python 3
f7e1153Use str_to_bytes for 'unicode' (which will be str on Python 3 and unicode on Python 2)
a4d6fe3A few additional string conversion fixes for sage.libs.ecl
d03a8d4py3: fix pickling of integers and other minor bytes/str fixes
e9a582eimport -> cimport where appropriate

@embray
Copy link
Contributor Author

embray commented Jan 30, 2018

Changed commit from 80b0d1f to e9a582e

@vbraun
Copy link
Member

vbraun commented Jan 31, 2018

comment:38
[sagelib-8.2.beta4] Error compiling Cython file:
[sagelib-8.2.beta4] ------------------------------------------------------------
[sagelib-8.2.beta4] ...
[sagelib-8.2.beta4] from sage.rings.finite_rings.finite_field_prime_modn import FiniteField_prime_modn
[sagelib-8.2.beta4] from sage.rings.finite_rings.finite_field_givaro import FiniteField_givaro
[sagelib-8.2.beta4] from sage.rings.finite_rings.finite_field_ntl_gf2e import FiniteField_ntl_gf2e
[sagelib-8.2.beta4] from sage.libs.pari.all import pari
[sagelib-8.2.beta4] from sage.libs.gmp.all cimport *
[sagelib-8.2.beta4] from sage.cpython.string cimport FS_ENCODING, str_to_bytes
[sagelib-8.2.beta4] ^
[sagelib-8.2.beta4] ------------------------------------------------------------
[sagelib-8.2.beta4] 
[sagelib-8.2.beta4] sage/libs/singular/singular.pyx:40:0: 'sage/cpython/string/FS_ENCODING.pxd' not found
[sagelib-8.2.beta4] 
[sagelib-8.2.beta4] Error compiling Cython file:
[sagelib-8.2.beta4] ------------------------------------------------------------
[sagelib-8.2.beta4] ...
[sagelib-8.2.beta4]     lib = SINGULAR_SO
[sagelib-8.2.beta4] 
[sagelib-8.2.beta4]     if not os.path.exists(lib):
[sagelib-8.2.beta4]         raise ImportError("cannot locate Singular library ({})".format(lib))
[sagelib-8.2.beta4] 
[sagelib-8.2.beta4]     lib = str_to_bytes(lib, FS_ENCODING, "surrogateescape")
[sagelib-8.2.beta4]                            ^
[sagelib-8.2.beta4] ------------------------------------------------------------
[sagelib-8.2.beta4] 
[sagelib-8.2.beta4] sage/libs/singular/singular.pyx:778:28: 'FS_ENCODING' is not a constant, variable or function identifier
[sagelib-8.2.beta4] Traceback (most recent call last):
[sagelib-8.2.beta4]   File "/home/buildbot/slave/sage_git/build/local/lib/python2.7/site-packages/Cython/Build/Dependencies.py", line 1179, in cythonize_one_helper
[sagelib-8.2.beta4]     return cythonize_one(*m)
[sagelib-8.2.beta4]   File "/home/buildbot/slave/sage_git/build/local/lib/python2.7/site-packages/Cython/Build/Dependencies.py", line 1161, in cythonize_one
[sagelib-8.2.beta4]     raise CompileError(None, pyx_file)
[sagelib-8.2.beta4] CompileError: sage/libs/singular/singular.pyx

@fchapoton
Copy link
Contributor

comment:39

Volker, did you use the latest branch ?

@embray
Copy link
Contributor Author

embray commented Feb 1, 2018

comment:40

He didn't. Setting this back to positive review since this otherwise fixes Jeroen's issue.

@vbraun
Copy link
Member

vbraun commented Feb 2, 2018

Changed branch from u/embray/python3/string-fixes to e9a582e

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants