-
Notifications
You must be signed in to change notification settings - Fork 17.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
proposal: spec: non-reference channel types #28366
Comments
So if these are non-reference, would that mean that its buffer gets copied if I do I like the idea of a close-only channel, though. |
@deanveloper, it would presumably be an error to copy a non-reference channel after its first use, in the same way that it is already an error to copy the types defined in the I'm not sure whether that's something we could easily enforce in the language spec, but we could certainly apply |
It feels a bit weird that the channel operators seem to operate on the channel value but really it seems like you're wanting semantics like sync.Mutex, which operates on the pointer value. You say:
The above statement seems to imply that the value doesn't need to be addressable, so logically this should work:
but I'm not sure how you'd make it work. I wonder whether it might be better to say that the operators work only on |
@rogpeppe, that's a good point. Following on the array analogy: today, you can read from an unaddressable array but not assign to its elements. The use-case for arrays is to index into a literal, but there is no such thing as a channel-buffer literal — and I am not proposing to add one here — so I suspect that it's best to require that the channels be addressable. |
Continuing the array analogy a little, the size argument is perhaps similar enough to the size of an array that square brackets would be justified instead of round. That is, I think it probably makes sense to allow directional specifiers too, so I worry that having another way to pass channels around could split conventions and make APIs less interoperable. |
That would be nice, but seems like it would conflict with existing syntax: |
Square brackets are more natural, but unfortunately ambiguous. Consider: chan[1] int vs. chan [1]int |
Re directions and having another way to pass channels around: I expect that public APIs would continue to use the existing reference channel types rather than pointers to statically-sized channels, for much the same reason that we tend to pass slices instead of pointers-to-arrays. If you only have a reference to a channel — or to just one direction of a channel — you have little reason to care about its allocation or buffering. (Plus, as a practical matter, passing channel references instead of pointers discourages callers from playing games with |
I feel like it's a bit clunky to not be able to copy a built-in type, and it'd be pretty easy to do without thinking, especially because they are so similar to regular channels (which can be copied, of course). Maybe if there were a defined behavior for when one of these gets copied? I really like the rest of this. |
There are no types in the language that work like that. You do mention But as far as I can see the only way to implement that would be to make non-reference channels be pointers, and in that case we no longer have an easy way to implement the zero value. |
Yep. I think that's probably the biggest drawback of this proposal.
We could address that with a complementary change to the language (see previously #23764 and #8005 (comment)), but of course a general language feature would raise the cost of this proposal considerably.
Since the non-reference channel types would be built in, we could define copying a value to be a compile-time error. (Essentially, we could treat these types as “not assignable” even to the same type.) A single non-copyable type family would be a (minor) wart on the language, but arguably (As a side benefit, making these channels uncopyable would provide |
Could copy create a new channel, with the same characteristics, but an empty buffer? I think that's what you'd expect if you copied a struct containing these... |
Then it wouldn't be binary-equivalent which is something I'd expect from copying a built-in type |
@networkimprov, sends to the copy would not be received via the original (and vice-versa). That seems likely to cause very subtle bugs. |
@bcmills that's what I expect in copying a container with this channel. Where the channel coordinates methods of its parent struct, a copy of the struct needs a channel with no link to the original. A channel is a mailbox with unique address, and transient senders, receivers, and buffered items. If you duplicate a mailbox, you obtain a new address. (A proposal for tee-ing a channel to clone sent data is #26282.) I think a type which blocks container copies sort-of defeats the goal of a make-less channel. |
Right, but we're talking about copying, not duplicating. If you copy a mailbox, I'd expect it to have all of the same data as the old one.
|
Channels are used for synchronization. Duplicating a channel's contents could cause a reference that is supposed to be unique to be duplicated. But failing to duplicate its contents would be even more surprising, like failing to copy the data in an array. Both of those options are subtle and arguably confusing. That's why I believe the clearest behavior is to disallow copying. |
Could you elaborate? The examples at the top of the proposal do not require copying: the goal is to enable use-cases like those examples, in which a channel is one of potentially many members of a larger structure. |
We instantiate structs via new & { } or assignment. For a struct with a channel, a chan(...) eliminates the need for a constructor to make the channel, but not the need for a copy-constructor if a chan(...) cannot be instantiated by assignment. A channel isn't analogous to an array, whose elements are readable ad infinitum. A channel item is read once, by a caller who has the unique address for the channel. EDIT: |
It might help to have literals...
|
And another alternative would be defined-type default initializers; see #28939 (comment) EDIT:
|
While you make some good points, the result of this proposal does not feel like Go. |
I do really like the "close-only" channel, which I've invented independently at least once as "this would be clearer semantically and probably dramatically faster and could be a single atomic op instead of a locking operation". I may have come up with this, and then forgotten it, more than once, but it keeps coming up. |
@seebs Clearer semantically, yes, but given that we would still want it to work with a select statement I don't see why it would be dramatically faster. |
Well, it wouldn't be dramatically faster for every use case, but I think it would for some common use cases. One fairly common usage of a close-only channel (even if there's no way to indicate those semantics in the source right now) is to check whether to abort an operation:
A close-only, non-reference channel can implement this as a lockless read. I don't know enough about the One of the #darkarts people on gopher slack had an observation a while back, that if you replace a select on two sources with a pseudorandom alternation between two nested single-source selects, this can improve performance:
Apparently this reduces select costs noticably. It would probably be even better if one or both of the selects didn't need a lock. |
Does the original select on two sources have the default case? If not you'd have to loop the alternating nested selects, yielding a busy-wait scenario. |
Yes, that transformation is only potentially sane when there's a default, so you'll exit if nothing's readable. I'm not entirely convinced it's valid even then, because arguably it's got slightly different timing, but I don't think the difference is actually observable without cheating. It seems to me that it might also turn out that it's possible to streamline life for selects that include both close-only and non-close-only channels, but I don't know how, because I don't know the code. |
yeah, but it can't do that if the channel's open, when determining whether or not to take an adjacent default. The big semantic difference is that if a channel can never be sent to, only closed, then reads from it can never change its state, and thus, don't need locking, they just need to check a state. If a channel could be sent to, then a read could unblock a sender. |
Background
I've started using 1-buffered channels pretty heavily as “selectable mutexes” (see also #16620). Unbuffered channels are also quite common in Go APIs (for example, as
cancel
channels in acontext.Context
; see also #28342 (comment)).I've noticed several issues that emerge with the existing
chan
types:A channel's buffering semantics are often semantically important: for example, a 1-buffered channel can be used as a mutex, but a 2-buffered or unbuffered channel cannot. The need to document and enforce these important properties leads to a proliferation of comments like
// 1-buffered and never closed
.Struct types that require non-
nil
channels cannot have meaningful zero-values unless they also include async.Once
or similar to guard channel initialization (which comes with losses in efficiency and clarity). WritingNew
functions for such types is a source of tedium and boilerplate.Currently, channels are always references to some other underlying object, typically allocated and stored on the heap. That makes them somewhat less efficient that a
sync.Mutex
, which can be allocated inline within astruct
whose fields it guards.Proposal
I propose a new family of channel types, with a relationship to the existing channel types analogous to the relationship between arrays and slices: the existing
chan
types are conceptually references to underlying instances of the proposed types.Syntax
Parsing this syntax has one caveat: if we read a selector expression in parentheses after a
chan
keyword, we don't know whether it is anExpression
or theElementType
until we see the next token after the closing parenthesis. I'm open to alternatives.Semantics
A
StaticChannelType
represents a non-reference channel with a buffer size indicated by theExpression
, which must be an integer constant expression. Achan(N, close) T
can be closed, while achan(N) T
(without theclose
token) cannot. The buffer size is equivalent to the size passed tomake
: a call tomake(chan T, N)
conceptually returns a reference to an underlying channel of typechan(N, close) T
.A
CloseOnlyChannelType
is a channel with element typestruct{}
that does not support send operations.An addressable
chan(N[, close]) T
can be used with the send and receive operators,range
loop,select
statement, andclose
,len
, andcap
builtins just like any other channel type (just as[N]T
supports the same index expressions as[]T
). A send expression on achan _
is a compile-time error. (Compare #21069.) Aclose
call on achan(N) T
(declared without theclose
token) is a compile-time error.chan(N) T
is not assignable tochan T
, just as[N]T
is not assignable to[]T
. However,*chan(N) T
(a pointer to a static channel) can be converted tochan T
or either of its directional variants, andchan _
can be converted tochan struct{}
or either of its directional variants. (Making*chan(N) T
assignable tochan T
also seems useful and benign, but I haven't thought through the full implications.)A
close
on achan T
that refers to a non-closeablechan(N) T
panics, much as aclose
on an already-closedchan T
panics today. (One already cannot, in general, expect toclose
an arbitrary channel.) A send on achan _
panics, much as a send on a closed channel panics today.Implementation
A
chan(N) T
is potentially much more compact that achan T
, since we can know ahead of time that some of its fields are not needed. (Achan _
could potentially be as small as a single atomic pointer, storing the head of a wait-queue or a “closed” sentinel.) The difficult aspect of a more optimized implementation is the fact that a*chan(N) T
can be converted tochan T
: we would potentially need to makechan T
a bit larger or more expensive to accommodate the fact that it might refer to a stripped-down statically-sized instance, or else apply something akin to escape analysis to decide whether a givenchan(N) T
should use a compact representation or a fullhchan
.However, we could already do something like that optimization today: we could, for example, detect that all of the functions that return a non-nil
*S
for some struct typeS
also populate it with channels of a particular size, and that those channels do not escape the package, and replace them with a more optimized implementation and/or fuse the allocations of the object and its channels. It probably wouldn't be quite as straightforward, but it could be done.I want to emphasize that the point of this proposal is to address the usability issues of reference channels: the ad-hoc comments about buffering invariants, lack of meaningful zero-values, and
make
boilerplate. Even if non-reference channels do not prove to be fertile ground for optimization, I think they would significantly improve the clarity of the code.Examples
CC: @ianlancetaylor @griesemer @josharian
The text was updated successfully, but these errors were encountered: