PyCall + Distributed: cannot pickle 'traceback' object #897

Open
Djoop opened this issue Apr 26, 2021 · 3 comments

Djoop commented Apr 26, 2021

Hi, I posted this on Discourse but have had no answer so far, so maybe it is better suited here.
I have trouble getting the error type and message when running PyCall with @distributed and an exception is raised in Python (the minimal example below uses @spawnat). The stack trace seems to be what causes the problem, but I don't know whether this is a known limitation or a bug, and whether it comes from PyCall or from Distributed.

julia> using Distributed

julia> addprocs(1)
1-element Vector{Int64}:
 2

julia> @everywhere using PyCall

julia> fetch(@spawnat 2 py"""
       raise ZeroDivisionError
       """)
┌ Error: Fatal error on process 2
│   exception =
│    PyError ($(Expr(:escape, :(ccall(#= /home/me/.julia/packages/PyCall/BD546/src/pyfncall.jl:43 =# @pysym(:PyObject_Call), PyPtr, (PyPtr, PyPtr, PyPtr), o, pyargsptr, kw))))) <class 'TypeError'>TypeError("cannot pickle 'traceback' object")
│    
│    Stacktrace:
│      [1] pyerr_check
│        @ ~/.julia/packages/PyCall/BD546/src/exception.jl:62 [inlined]
│      [2] pyerr_check
│        @ ~/.julia/packages/PyCall/BD546/src/exception.jl:66 [inlined]
│      [3] _handle_error(msg::String)
│        @ PyCall ~/.julia/packages/PyCall/BD546/src/exception.jl:83
│      [4] macro expansion
│        @ ~/.julia/packages/PyCall/BD546/src/exception.jl:97 [inlined]
│      [5] #107
│        @ ~/.julia/packages/PyCall/BD546/src/pyfncall.jl:43 [inlined]
│      [6] disable_sigint
│        @ ./c.jl:458 [inlined]
│      [7] __pycall!
│        @ ~/.julia/packages/PyCall/BD546/src/pyfncall.jl:42 [inlined]
│      [8] _pycall!(ret::PyObject, o::PyObject, args::Tuple{PyObject}, nargs::Int64, kw::Ptr{Nothing})
│        @ PyCall ~/.julia/packages/PyCall/BD546/src/pyfncall.jl:29
│      [9] _pycall!
│        @ ~/.julia/packages/PyCall/BD546/src/pyfncall.jl:11 [inlined]
│     [10] #pycall#112
│        @ ~/.julia/packages/PyCall/BD546/src/pyfncall.jl:80 [inlined]
│     [11] pycall
│        @ ~/.julia/packages/PyCall/BD546/src/pyfncall.jl:80 [inlined]
│     [12] serialize(s::Distributed.ClusterSerializer{Sockets.TCPSocket}, pyo::PyObject)
│        @ PyCall ~/.julia/packages/PyCall/BD546/src/serialize.jl:14
│     [13] serialize_any(s::Distributed.ClusterSerializer{Sockets.TCPSocket}, x::Any)
│        @ Serialization /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:649
│     [14] serialize(s::Distributed.ClusterSerializer{Sockets.TCPSocket}, x::Any)
│        @ Serialization /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:628
│     [15] serialize(s::Distributed.ClusterSerializer{Sockets.TCPSocket}, ex::CapturedException)
│        @ Distributed /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Distributed/src/clusterserialize.jl:205
│     [16] serialize_any(s::Distributed.ClusterSerializer{Sockets.TCPSocket}, x::Any)
│        @ Serialization /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:649
│     [17] serialize(s::Distributed.ClusterSerializer{Sockets.TCPSocket}, x::Any)
│        @ Serialization /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Serialization/src/Serialization.jl:628
│     [18] serialize_msg(s::Distributed.ClusterSerializer{Sockets.TCPSocket}, o::Distributed.ResultMsg)
│        @ Distributed /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Distributed/src/messages.jl:78
│     [19] #invokelatest#2
│        @ ./essentials.jl:708 [inlined]
│     [20] invokelatest
│        @ ./essentials.jl:706 [inlined]
│     [21] send_msg_(w::Distributed.Worker, header::Distributed.MsgHeader, msg::Distributed.ResultMsg, now::Bool)
│        @ Distributed /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Distributed/src/messages.jl:174
│     [22] send_msg_now
│        @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Distributed/src/messages.jl:118 [inlined]
│     [23] send_msg_now(s::Sockets.TCPSocket, header::Distributed.MsgHeader, msg::Distributed.ResultMsg)
│        @ Distributed /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Distributed/src/messages.jl:113
│     [24] deliver_result(sock::Sockets.TCPSocket, msg::Symbol, oid::Distributed.RRID, value::RemoteException)
│        @ Distributed /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Distributed/src/process_messages.jl:95
│     [25] macro expansion
│        @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Distributed/src/process_messages.jl:286 [inlined]
│     [26] (::Distributed.var"#105#107"{Distributed.CallMsg{:call_fetch}, Distributed.MsgHeader, Sockets.TCPSocket})()
│        @ Distributed ./task.jl:406
└ @ Distributed /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Distributed/src/process_messages.jl:99
Worker 2 terminated.
ERROR: ProcessExitedException(2)
Stacktrace:
  [1] try_yieldto(undo::typeof(Base.ensure_rescheduled))
    @ Base ./task.jl:705
  [2] wait
    @ ./task.jl:764 [inlined]
  [3] wait(c::Base.GenericCondition{ReentrantLock})
    @ Base ./condition.jl:106
  [4] take_buffered(c::Channel{Any})
    @ Base ./channels.jl:389
  [5] take!(c::Channel{Any})
    @ Base ./channels.jl:383
  [6] take!(::Distributed.RemoteValue)
    @ Distributed /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Distributed/src/remotecall.jl:599
  [7] #remotecall_fetch#143
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Distributed/src/remotecall.jl:390 [inlined]
  [8] remotecall_fetch(f::Function, w::Distributed.Worker, args::Distributed.RRID)
    @ Distributed /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Distributed/src/remotecall.jl:386
  [9] #remotecall_fetch#146
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Distributed/src/remotecall.jl:421 [inlined]
 [10] remotecall_fetch
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Distributed/src/remotecall.jl:421 [inlined]
 [11] call_on_owner
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Distributed/src/remotecall.jl:494 [inlined]
 [12] fetch(r::Future)
    @ Distributed /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Distributed/src/remotecall.jl:533
 [13] top-level scope
    @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Distributed/src/macros.jl:99
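
For reference, the TypeError in the log above is raised on the Python side, apparently when PyCall's serialize (frame [12]) tries to pickle a Python object that holds a traceback. The pure-Python sketch below reproduces that underlying limitation without Julia; the helper name `capture_exc_info` is made up for illustration, and the exact wording of the TypeError message can differ between Python versions.

```python
import pickle
import sys


def capture_exc_info():
    """Raise and catch a ZeroDivisionError, returning (type, value, traceback)."""
    try:
        1 / 0
    except ZeroDivisionError:
        return sys.exc_info()


exc_type, exc_value, exc_tb = capture_exc_info()

# The exception instance on its own survives a pickle round trip.
restored = pickle.loads(pickle.dumps(exc_value))
print(type(restored).__name__, restored)

# The traceback object does not: pickle refuses it with a TypeError.
pickle_error = None
try:
    pickle.dumps(exc_tb)
except TypeError as err:
    pickle_error = err
    print("TypeError:", err)
```

So the exception type and message can be pickled, and it is the attached traceback that cannot. Whether PyCall or Distributed should drop or stringify the traceback before sending the error back is the open question of this issue.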

@marcobonici

Any update on this @Djoop ?

@Engrammae

Any update on this? @Djoop @marcobonici

Djoop commented Sep 6, 2024

I just tried again with Julia 1.10 and PyCall v1.96.4. The stack trace has changed, but the call still ends in a ProcessExitedException(2), whereas without Distributed you get a PyError of class 'ZeroDivisionError'. The result seems to be the same even with a try/catch inside the @spawnat call. I assume this is not the expected behavior?
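
One possible workaround, sketched here in plain Python and not tested with PyCall + Distributed, is to catch the exception inside the Python code itself and hand back only plain strings, so that no exception or traceback object has to cross the process boundary. The names `run_and_describe` and `divide` are hypothetical, and it is an assumption that returning such a dictionary from the worker avoids the crash described above.

```python
import pickle
import traceback


def run_and_describe(func, *args, **kwargs):
    """Call func; on failure return plain strings instead of exception objects."""
    try:
        return {"ok": True, "value": func(*args, **kwargs)}
    except Exception as exc:
        return {
            "ok": False,
            "type": type(exc).__name__,           # e.g. "ZeroDivisionError"
            "message": str(exc),
            "traceback": traceback.format_exc(),  # traceback rendered as text
        }


def divide(a, b):
    # Stand-in for the real Python code that may raise.
    return a / b


result = run_and_describe(divide, 1, 0)

# The result holds only built-in types, so it pickles without complaint.
payload = pickle.dumps(result)
print(result["type"], "-", result["message"])
```

The caller then has to check the "ok" flag and re-raise or report the error itself, which loses the original exception object but keeps its type name, message, and formatted traceback.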
