Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Port to Python3 #97

Closed
touilleMan opened this issue Mar 6, 2015 · 14 comments
Closed

Port to Python3 #97

touilleMan opened this issue Mar 6, 2015 · 14 comments

Comments

@touilleMan
Copy link

Hi there,

Diesel seems a really interesting project, such a shame it is not available for python3 !
So I've forked the repo (https://github.com/touilleMan/diesel/tree/py3k) and started to port the code ;-)

So far, all the tests are passing for Python 2.6, 2.7, 3.3 and 3.4 using the same codebase, but it seems those tests are not really exhaustive and there is still some work to do !

Now I'm trying to make work all the examples available.

My main point of concern is how to handle the strings:

  • in Python2, str is an array of bytes (bytes type is an alias of str), to use unicode we must is unicode type
  • in Python3, str is an array of unicodes characters and only bytes an array of bytes

Considering Diesel, all the code is using classical str types (i.e. array of bytes), this seems pretty logical to me given there is no standard way to determine the encoding of data coming from a socket.

The downside of this is to make the code compatible with Python3, all the strings needs to be explicitly converted to bytes, leading to a lot of small changes, an code more complex (numerous conversions unicode => bytes).

Example:

# Python2 only
send('AUTH', self.password)
send("HTTP/%s %s %s\r\n" % (('%s.%s' % version), resp.status_code, resp.status))
# Python2&3
send(b'AUTH', self.password.encode())
send(("HTTP/%s %s %s\r\n" % (('%s.%s' % version), resp.status_code, resp.status)).encode())

I've started to see if the core.send/core.receive function could be used to handle encoding, but - although it would be a nice solution - I don't believe this could lead to a easy drop-in replacement.

def send(data, priority=5, encoding='ascii'):
    if isinstance(data, builtins.str):
        data = data.encode(encoding)
    return current_loop.send(data, priority=priority)

def receive(spec=None, encoding='ascii'):
    data = current_loop.input_op(spec)
    if isinstance(data, builtins.bytes):
        data = data.decode(encoding)
    return data

Do you have any advice/idea about the best way to handle this ?

@dowski
Copy link
Member

dowski commented Mar 15, 2015

Hi, thanks for your interest in diesel! Python 3 is definitely on my radar, although I've been very much distracted from diesel work lately.

You're right about test coverage - it's far from exhaustive. I think we have some dependencies that aren't Python 3 friendly too (e.g. twiggy).

The string/bytes change is definitely a significant one. I think diesel only dealing in bytes is the right general approach. I think we should start with that and not include encoding in send/receive just yet.

Thanks again for the work you've been doing! I'll take a closer look at it, but I'm already more hopeful about diesel's Python 3 future than I was!

@dowski
Copy link
Member

dowski commented Mar 15, 2015

@msabramo pinging you here in case you are interested in this effort.

@dowski
Copy link
Member

dowski commented Mar 15, 2015

@touilleMan one thing I just realized looking at your branch - it looks like your work is branched off of master. There's a diesel4 branch (https://github.com/dieseldev/diesel/tree/diesel4) where I had started doing some forward looking work. I'm hoping to incorporate Python 3 support there. If it's too much trouble to base your work there, I'm sure we can figure something out.

@touilleMan
Copy link
Author

Thank you for your answer... and a big thank you for this awesome project ! ;-)

Considering the bytes/string approach, I've been considering using unicode everywhere like what werkzeug does (see http://werkzeug.pocoo.org/docs/0.10/unicode/): basically you define an encoding (default is "utf-8" of course) werkzeug should use for encoding output and decode input. On top of that werkzeug internally consider as bytestring the bytes (str on python2 or bytes on python3) AND the 'latin-1' encodable unicode (unicode on python2 or str on python3).

The pros for the upper layers are simplicity and python2/3 compatibility (it "just works").
But this obliged the core to handle the encoding on it own and create strange behavior involving str, newstr, bytes, newbytes etc...
In fact during my refactoring of the diesel's http/wsgi modules, I've opened two issues about that on the werkzeug project : pallets/werkzeug#704 and pallets/werkzeug#705 !

Besides, services built on top of diesel already have their own way to deal with encoding (mongodb's bson handle the encoding, pyredis -I've patch diesel redis service to do the same - uses encoding, encoding_errors and decode_responses params to configure the behavior) etc...

This comfort me thinking diesel should only deal with bytes and be considered as a "socket with benefits".

@touilleMan
Copy link
Author

Yeah, I noticed about the diesel4 branch, but I wasn't sure about it current state (stable/work in progress etc...). I'd be interested if you can tell me more about this branch and the goals of diesel4.

For my part, my p3k branch seems pretty encouraging: the core work is done and I'm now trying to make run each example (those example are really precious to understand diesel and I'm thinking about turning some of them into unit tests). Here is an array of my current advancement:

test working
chat.py ok
chat-redis.py ok
child_test.py ok, unit_test
clocker.py ok, unit_test
combined.py ok
combined_tls.py ok
consolechat.py ok
convoy.py ### KO ###
crawler.py working on it
dhttpd*
dispatch.py
dreadlocks.py
echo.py
event.py
fanout.py
fire.py ok, unit_test
forker.py
http_client.py
http_pool.py
http.py
keep_alive.py
kv.proto
newwait.py
nitro_echo_service.py
nitro.py
queue_fairness_and_speed.py
queue_simple.py ok
redis_lock.py ok, unit_test
redispub.py
resolve_names.py
santa.py ok
signals.py
sleep_server.py
snakeoil-cert.pem
snakeoil-key.pem
stdin.py
stdin_stream.py
synchronized.py
test_dnosetest.py
thread_pool.py ok
thread_simple.py ok
timer_bench.py
timer.py ok
udp_echo.py
web.py ok
zeromq_echo_service.py
zeromq_first.py
zeromq_receiver.py
zeromq_sender.py
zeromq_test.py

As you said, the best would be to add python3 support in diesel4 considering the amount of changes involved. I should finish my work on the http/wsgi module first then port my work on the diesel4 branch.

@dowski
Copy link
Member

dowski commented Mar 15, 2015

I just read through the diff of all of your changes so far. Excellent work!

Regarding diesel 4, I have some of the goals on the wiki.

https://github.com/dieseldev/diesel/wiki

As you can see, Python 3 support is right at the top! I was last very active with moving forward with diesel 4 back around the holidays when I had more free time available. It's great to see someone else jumping in and moving things forward when I'm not.

@touilleMan
Copy link
Author

That was quicker than expected ! I've merged my changed into diesel4 branch.
Now I will continue my work on this branch.

Considering your roadmap, I saw you're seems to consider dropping python 2 support ("Python 3 (only?)"), I think it would be better to keep python 2 (maybe stick to python 2.7 only to get rid of the of the annoying lacks of python 2.6, e.g. https://github.com/touilleMan/diesel/blob/diesel4/diesel/pipeline.py#L13) given pypy's python3 support is still experimental and jit compilation along with async framework is a real argument for people seeking performances.

After all, the best way to draw attention for this project is to release benchmarks where diesel breaks nodejs ;-)

@dowski
Copy link
Member

dowski commented Mar 16, 2015

Fantastic! Feel free to submit a pull request whenever you are ready.

Re: Python 2 - that was mostly due to me not knowing what it would look like to support both versions and having limited time to work on the project. I was willing to sacrifice Python 2 support for the future of Python 3, but you are right that it's ideal to support both. I'm definitely in favor of only supporting 2.7+ though.

@touilleMan
Copy link
Author

Hi again,

I had to slow down a bit those days (let say my daytime job fulfilled my needs in coding !), but I manage to finish converting the examples.

However, I didn't touch the examples involving nitro and zeromq given I have no knowledge in those thechnologies... Given I didn't touch there service implementation neither, I'm pretty sure the conversion to python3 has broke them. You should have a look there whenever you have the time.

Another bug I've found is in the example/dhttpd, with python3 it crashes at startup:

dhttpd
Traceback (most recent call last):
  File "dhttpd", line 18, in <module>
    app = DieselFlask(__file__)
  File "/home/emmanuel/projects/diesel3/diesel/web.py", line 35, in __init__
    Flask.__init__(self, name, *args, **kw)
  File "/home/emmanuel/projects/diesel3/venv3/lib/python3.4/site-packages/flask/app.py", line 319, in __init__
    template_folder=template_folder)
  File "/home/emmanuel/projects/diesel3/venv3/lib/python3.4/site-packages/flask/helpers.py", line 741, in __init__
    self.root_path = get_root_path(self.import_name)
  File "/home/emmanuel/projects/diesel3/venv3/lib/python3.4/site-packages/flask/helpers.py", line 631, in get_root_path
    loader = pkgutil.get_loader(import_name)
  File "/usr/lib/python3.4/pkgutil.py", line 467, in get_loader
    return find_loader(fullname)
  File "/usr/lib/python3.4/pkgutil.py", line 488, in find_loader
    return spec.loader
AttributeError: 'NoneType' object has no attribute 'loader'

According to this pallets/flask#1011, it seems to be a bug between flask and python3.4 (the issue says it is solved with python3.4.1, but I can't confirm given I didn't test that)

Anyway, I guess I'm not far from sending you my pull request ;-)

@dowski
Copy link
Member

dowski commented Apr 1, 2015

Yeah, I can work on the nitro/zeromq stuff. In fact I was starting to do some work on some of the zeromq code the other night and wound up merging in your branch locally so I was up to speed on the Python 3 changes.

Looking forward to the pull request when you get to it. I totally understand how work can leave not much left for programming when evening comes around though!

@Kojoley
Copy link

Kojoley commented Apr 5, 2015

Hello there,
Good news for Diesel. Twiggy Python 3.x compatibility wearpants/twiggy#26 is in their master branch wearpants/twiggy@88f1da5 (even if https://caniusepython3.com/project/Twiggy still reports that it is not)

@msabramo
Copy link
Contributor

msabramo commented Apr 5, 2015

Yep. I worked on that PR. IIRC, @wearpants put out a release on PyPI recently with that support. See https://pypi.python.org/pypi/Twiggy

I think the caniusepython3.com thing is a bug, which I reported at jezdez/caniusepython3.com#40

@wearpants
Copy link

Yup latest release of twiggy has python 3 support
On Apr 5, 2015 10:54 AM, "Marc Abramowitz" notifications@github.com wrote:

Yep. I worked on that PR. IIRC, @wearpants https://github.com/wearpants
put out a release on PyPI recently with that support. I think the
caniusepython3.com thing is a bug, which I reported at
jezdez/caniusepython3.com#40
jezdez/caniusepython3.com#40


Reply to this email directly or view it on GitHub
#97 (comment).

@dowski
Copy link
Member

dowski commented Apr 14, 2015

Thanks to the great work by @touilleMan, Python 3 support has come to diesel!

@dowski dowski closed this as completed Apr 14, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants