Added Channels #12042
Conversation
cc: @andreasnoack, @jakebolewski, @JeffBezanson, @malmaud
Looks really good! I'm excited for this. What do you think about implementing a …
Another cool thing would be if channels could be backed by shared memory if all the workers are on the same machine. Then we might be able to approach Go speeds for most common uses of channels. Of course that could be implemented any time in the future, but I thought it might be worth considering now if we're going to talk about design.
Yes, Go's …
> Channels via shared memory

It should be easy enough to do, at least for bitstype channels. However, this would involve using a cross-platform library that provides portable synchronization primitives (or we roll our own). AFAIK libuv only provides mutexes within a process.
```julia
function put!{T}(c::Channel{T, true}, v::T)
    store = c.store
    d = store.take_pos - store.put_pos
    if (d == 1) || (d == -(store.szp1-1))
```
`push!` / `pop!` / `shift!` / `unshift!` should generally already take care of all of this.
I implemented a circular buffer to avoid unnecessary `memmove`s on highly active channels. The flip side is that currently the buffer does not shrink from its peak size, but that can be implemented.
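For reference, the full/empty test in the Julia snippet above (`d == 1 || d == -(szp1-1)`) is the standard "one slot left unused" circular-buffer trick. A minimal Go analogue of that logic (names are illustrative, not the PR's implementation):

```go
package main

import "fmt"

// ring is a minimal circular buffer of ints. One slot is left
// unused so that "full" and "empty" states are distinguishable.
type ring struct {
	buf       []int
	take, put int // next read position / next write position
}

func newRing(capacity int) *ring {
	return &ring{buf: make([]int, capacity+1)} // len(buf) plays the role of szp1
}

func (r *ring) empty() bool { return r.take == r.put }

// full mirrors the condition in the Julia snippet:
// (d == 1) || (d == -(szp1-1)) with d = take - put.
func (r *ring) full() bool {
	d := r.take - r.put
	return d == 1 || d == -(len(r.buf)-1)
}

func (r *ring) push(v int) bool {
	if r.full() {
		return false
	}
	r.buf[r.put] = v
	r.put = (r.put + 1) % len(r.buf)
	return true
}

func (r *ring) pop() (int, bool) {
	if r.empty() {
		return 0, false
	}
	v := r.buf[r.take]
	r.take = (r.take + 1) % len(r.buf)
	return v, true
}

func main() {
	r := newRing(2)
	fmt.Println(r.push(1), r.push(2), r.push(3)) // true true false
	v, _ := r.pop()
	fmt.Println(v) // 1
}
```

Because only the two position indices move, takes and puts never shift the stored elements, which is the `memmove` saving described above.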
When we have this merged, would it be reasonable to implement what we discussed when you were here? For the record:
All of this gets rid of the dependency of remoteref on gc finalizers. Finalizing a remoteref could then either decref, or error saying that you should fetch (or close) a remoteref explicitly. A lot of time is spent right now in the explicit gc() call in the remote ref constructor because of the memory pressure issue. Then, if you want a longer-lived thing that you can fetch multiple times, you use a channel and control the lifetime explicitly.
It might also be a good time to add a type parameter to remote refs and be able to specify it explicitly in remotecall for performance (it would default to Any).
RemoteRefs now have a type parameter. Will ensure …

I would like to keep the distributed gc mechanism that we have, for the simple reason that it is currently very user friendly. Just as folks do not worry about any gc issues in single-process mode, the distributed gc ensures the same for distributed mode. It is totally transparent, and there are lots of cases where …

The explicit call to …

The lifetime of channels must be explicitly managed only if they have been created with …
Well, we really can't have this be the default, IMO. It's extremely inefficient and useless for 99% of the use cases of remoteref, which is an RPC call you don't want to block on. I don't care if it's called a remoteref or something else, but the return type from remotecall cannot be something with a complicated lifetime implementation, since it is such a basic primitive of our system. Essentially, what people want most of the time is `r = @async remotecall_fetch(...)`, because this way it doesn't incur distributed gc and you are still not blocking. This should be the default way of doing things, that's all I'm saying. Complex lifetime should be opt-in and something you go out of your way for.

The problem with the …

Please correct me if I'm wrong, because I'm not very familiar with our rpc system.
Also, what are the use cases you have in mind for remoterefs that are fetched multiple times and would not be covered by a channel?
What if we had both a …
@andreasnoack's examples involved …
@malmaud not required! As I said, unless you are explicitly calling …
OTOH, …

Better documentation and more examples are required to explain these subtle differences between these calls and their effect on performance.
I don't think we should pretend that distributed gc is transparent if it is inefficient; people will just get bitten and use a workaround. That's not usable. Calling gc() manually is never fine: it is extremely slow (it runs gc twice, one of those being a full collection).

Maybe I didn't express myself clearly. In fact, what I don't believe should be the default is mutable remoterefs. If a ref is immutable, it is always a bad idea to fetch it twice (and we could actually just destroy the ref on fetch and cache the result to allow fetching twice). If the remoteref is immutable, then we can make the optimization I'm talking about and get efficient distributed gc without finalizers.

Yes, …

So my question is essentially: why do we need mutable remoterefs to be the default? We should at least have immutable ones for efficiency, make them the default, and have people ask for the mutable one when they want it. In my mind that's what I call a channel of length 1, but it could be called remoteref (the difference being explicit lifetime). In other words, remotecall is a nice short name, so it should be the good way of doing things, not a very general thing that does way too much for most use cases and thus ends up being inefficient.
We will remove the gc() call. Explicitly finalizing RemoteRefs will solve the lifetime issue and not trigger a full gc().

The other issue is the design of DistributedArrays and SharedArrays, where every time the distributed array or the shared array object is sent to any worker, all the remote refs of all the parts are sent over. Changing these implementations to use an explicitly managed channel which stores the references will drastically reduce the number of "add_client" messages that we saw on @andreasnoack's test code the other day.

I am uncomfortable having a …

As I understand it, the mutable/immutable difference is relevant only for use cases where the result of an RPC needs to be shared across processes. With a ref count implementation you save on a "delete_msg" from each worker, right?
Well, the problem is the gc() call was added for a reason, probably because otherwise those refs would pile up. So now users would have to be aware of it and finalize manually? Then there is no real point in having those finalizers as a convenience anyway.

When you say user code might fetch the same ref multiple times, is it expecting the same value every time (in most cases)? If yes, then we should have ImmutableRemoteRef be the default; we can then cache the result on the receiver side and decref on the first fetch, while still allowing multiple fetches without going over the network. So the immutable thing also "solves" the multiple-fetch problem.

I'm sure everything I want is doable now, but it's more about what is the default (and encouraged) behavior. I would argue that the only time you want a shared cell that holds some data everyone can modify, you are quite aware of its lifetime and it would be fine to have channels for this case.
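The cached-fetch idea for an immutable ref can be sketched as follows. This is a hypothetical Go analogue, not the PR's code: the network round trip is simulated by a callback, and `decref` stands in for the distributed-gc release message.

```go
package main

import (
	"fmt"
	"sync"
)

// immutableRef fetches its remote value at most once; subsequent
// fetches return the cached copy without going over the network,
// and the owner is told immediately that the ref can be collected.
type immutableRef struct {
	once   sync.Once
	cached string
	fetch  func() string // simulated network round trip
	decref func()        // simulated "this ref can be released" message
}

func (r *immutableRef) Fetch() string {
	r.once.Do(func() {
		r.cached = r.fetch() // round trip happens exactly once
		r.decref()           // owner may release the value right away
	})
	return r.cached
}

func main() {
	trips, decrefs := 0, 0
	ref := &immutableRef{
		fetch:  func() string { trips++; return "value" },
		decref: func() { decrefs++ },
	}
	fmt.Println(ref.Fetch(), ref.Fetch()) // value value
	fmt.Println(trips, decrefs)           // 1 1
}
```

This is why immutability matters for the optimization: because the value can never change, decref-on-first-fetch and local caching are observationally identical to keeping the ref alive remotely.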
A background task pushes "del_msgs" whenever local refs are collected by the local gc as a matter of course. The explicit gc was added to handle the case where the local gc is not collecting refs due to a lack of memory pressure (given that they are only 24 bytes each). You would explicitly call finalizers only when you feel that the local gc is not collecting ref objects in time. Yes, an immutable with a local cache would work.
I wanted to try out this branch, but I got an error:

```
julia> addprocs(1)
ERROR (unhandled task failure): TypeError: subtype: expected Type{T}, got Symbol
 in tmerge at ./inference.jl:1156
 in abstract_call_gf at ./inference.jl:726
 in abstract_call at ./inference.jl:882
 in abstract_eval_call at ./inference.jl:934
 in abstract_eval at ./inference.jl:961
 in abstract_interpret at ./inference.jl:1120
 in typeinf_uncached at ./inference.jl:1549
 in typeinf at ./inference.jl:1339
 in typeinf at ./inference.jl:1289
 in abstract_call_gf at ./inference.jl:725
 in abstract_call at ./inference.jl:882
 in abstract_eval_call at ./inference.jl:934
 in abstract_eval at ./inference.jl:961
 in typeinf_uncached at ./inference.jl:1622
 in typeinf at ./inference.jl:1339
 in typeinf_ext at ./inference.jl:1283
 in anonymous at task.jl:13
```
@malmaud Sorry about that, some issue with a few precompile lines that were working so far. I have commented them out for now; you ought to be able to try it now.
@JeffBezanson , could you please review this? I'll add documentation and can merge after a review.
Channels are about as fast as produce-consume for inter-task communication, with the added advantage of a) length > 1 and b) being type-aware.
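The produce/consume pattern the channel replaces looks like this in Go terms (a comparison sketch, not the Julia API): a typed channel of length > 1 shared between a producer and a consumer.

```go
package main

import "fmt"

func main() {
	// A typed channel with buffer length 4: the producer can run
	// ahead of the consumer by up to four items before blocking.
	ch := make(chan int, 4)

	go func() {
		for i := 1; i <= 4; i++ {
			ch <- i * i // produce
		}
		close(ch)
	}()

	total := 0
	for v := range ch { // consume until the producer closes
		total += v
	}
	fmt.Println(total) // 30
}
```

The two advantages named above map directly onto the `make(chan int, 4)` call: the buffer length is the "length > 1" part, and the `int` element type is the type awareness.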
The Channel type looks really good. We should probably remove …

On the RemoteRef side, things are starting to feel complicated: we have channels, remoterefs which include channels (plus RemoteValues), and remote channels. Something seems amiss here. This is also reflected in the two flavors of …

It might be simpler, and in line with @carnaval's suggestion, to make RemoteRefs immutable and only support put! and take! if they point to a Channel. This could make things more efficient when a RemoteRef only needs to point to one value, which is a common case. RemoteRefs could also have a no-DGC option, making them behave like ChannelRef, although I'm not sure if a manually-managed ref should be an option or a different type.
What's the motivation for the change in the default Channel size? That puts us at odds with Go, where channels are unbuffered by default.
Given the name "channel", I felt that a default size of 1 was non-intuitive. No other reason.
That does make sense, although if we ever implement same-node channels via shared memory, I think we'd have to allocate the maximum buffer size at the time the channel is created. Obviously that's not great if the default size is …
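For comparison, Go's default is an unbuffered (rendezvous) channel, and buffering is opt-in per channel:

```go
package main

import "fmt"

func main() {
	// Unbuffered: a send blocks until a receiver is ready, so the
	// send must happen in another goroutine here.
	unbuffered := make(chan int)
	go func() { unbuffered <- 1 }()
	fmt.Println(<-unbuffered) // 1

	// Buffered: up to cap(ch) sends complete without any receiver.
	buffered := make(chan int, 32)
	buffered <- 2
	fmt.Println(<-buffered, cap(buffered)) // 2 32
}
```

Note that an unbuffered Go channel (size 0, a pure rendezvous) is not the same as size 1, where one send can complete before anyone receives; that distinction is part of what the default-size question above is about.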
A saner default then? 32? |
Or 42. |
Any integer less than a million sounds good to me. |
I'm ok with adding Channels in 0.4, but the totality of this change feels like too much right now. We should use …

I find the name …

Is it possible for multiple processors to talk to the same remote channel? It's not clear to me how that works. For example, you could serialize a ChannelRef to another processor, but I don't see where references are added.
I see your point. Instead of …
For example, this will allow users to implement remote dictionaries as an …
Multiple processors can talk to the same remote channel, or in the modified form, a …
Have reworked the interface. Channel constructors are now …
and of course …

The remote references type hierarchy is now …
Registered Refs are created with …
A …

Objects pointed to by a …

Some of the changes in this PR are not directly related to Channels but came about while implementing them. Will submit them as separate PRs in order to make this a bit more easily reviewable.
One of my colleagues, Jim Cownie, who once worked on Transputers, noted that the …
MPI now recognises that …
I totally agree that …
A …

A solution would be to spawn individual tasks that each take from an individual reference and write to a common channel. The caller reads only from this channel and processes all values written to it.
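The workaround described above is the classic fan-in pattern. Sketched in Go (a comparison sketch; in Go itself, `select` makes the forwarding goroutines unnecessary, which is exactly why a built-in select is attractive):

```go
package main

import (
	"fmt"
	"sync"
)

// fanIn spawns one forwarding goroutine per source (one "task" per
// individual reference) and merges everything into a single output
// channel, which is closed once all sources are drained.
func fanIn(sources ...<-chan int) <-chan int {
	out := make(chan int)
	var wg sync.WaitGroup
	for _, src := range sources {
		wg.Add(1)
		go func(c <-chan int) {
			defer wg.Done()
			for v := range c {
				out <- v // forward into the common channel
			}
		}(src)
	}
	go func() { wg.Wait(); close(out) }()
	return out
}

func main() {
	a, b := make(chan int), make(chan int)
	go func() { a <- 1; close(a) }()
	go func() { b <- 2; close(b) }()

	sum := 0
	for v := range fanIn(a, b) { // caller reads only the merged channel
		sum += v
	}
	fmt.Println(sum) // 3
}
```

The cost of the workaround is one extra task per source plus an extra hop per value, which a first-class select/ALT construct avoids.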
+1 to Go-style select.
+1 on select (or something comparable)
Channels need a mention in NEWS. |
I added a mention a few hours ago.
Is this PR obsolete at this point?
Yes. Ideas discussed here should be incorporated in a new PR. Closing this one.
Edit: Current design iteration starts at: #12042 (comment)

`RemoteRef(T::DataType, pid::Integer, sz::Int=1)`

API

- `put!`, `take`, `fetch`, `isready`, `wait` have the same action as for `RemoteRef`s.
- `open(cref::ChannelRef, T::Type=Any, sz::Int=1)` creates a remote channel and returns a cref with the channel id filled in.
- `put!`, `take`, `fetch`, `isready`, `wait` work with `ChannelRef`s for remote access to a channel.
- Channels created via `open` will have to be explicitly closed with a `close(addr::ChannelRef)` to be garbage collected.
- `RemoteRef`s continue to be collected by the distributed gc mechanism we currently have.