-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve default allocator wording #649
base: main
Are you sure you want to change the base?
Conversation
5c35b55
to
787d23b
Compare
For successful allocations of size greater than zero, the default allocator must | ||
return a pointer to a contiguous, exclusive region of memory of at least the | ||
requested size. | ||
For unsuccessful allocations, the default allocator must return [code]#nullptr#. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we instead removing this wording? The table below already says that the default allocator meets the C++ requirement for Allocator, and C++ defines the behavior of Allocator quite precisely.
If you ask me, the most confusing part of this section is this statement:
The user can implement an allocator that returns the same address as the one passed in the buffer constructor, but it is the responsibility of the user to handle the potential race conditions.
It would be good to figure out what we meant here and clarify this. Or, maybe it should be deleted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My best guess is that it's talking about the buffer
constructor overloads that accept both a host pointer and an allocator:
buffer(const T* hostData, const range<Dimensions>& bufferRange,
AllocatorT allocator, const property_list& propList = {});
The buffer
owns hostData
during its lifetime, so I guess it might be safe sometimes for allocator
to return hostData
(or other pointers within the range). But I don't know why that's useful, and I don't know how the user can avoid race conditions. Maybe use_mutex
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it might be safe sometimes for
allocator
to returnhostData
(or other pointers within the range)
Yes, that's my best guess too. However, this seems to assume that buffer
will only call allocator
in order to create a host buffer that contains hostData
, and it won't call allocator
for any other purpose. This is not specified anywhere, and it's contrary to the way the standard C++ library functions use allocators. When a class from the standard C++ library takes an allocator, I think it uses the allocator to allocate all the dynamic memory that it needs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we instead removing this wording?
I'd also prefer to just remove the sentence and not replace it with anything.
If you ask me, the most confusing part of this section is this statement:
I interpreted that as saying that a user could use an allocator for the user's own purposes (whatever those may be) and that same allocator could be passed to a buffer constructor. I.e., the act of giving an allocator to a buffer doesn't disallow the user from using the same allocator. Given that the SYCL runtime might choose to allocate memory using the allocator asynchronously, the user must put some locking mechanism into the allocator to avoid races.
Which seems reasonable to me, but I don't know if that's what was intended.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's not how I interpret that sentence, but I think we all agree that the sentence is confusing. Personally, I'd be happy to delete it.
The issue about thread safety is interesting. Is it legal for the implementation to call the user's allocator simultaneously from different threads even if the application has only one thread? (Or only one thread calling SYCL APIs?) For example, could an implementation have a background thread that potentially calls the allocator at the same time as the application's main thread?
If we think this is a valid implementation, the specification should make it clear that the allocator must be written to be thread safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I didn't know that. Thanks.
I'm growing convinced that the allocator passed to the constructor is probably supposed to be thread-safe. My current understanding is that std::allocator
is thread-safe, and that there's no way in C++ to query whether an allocator is thread-safe. If my understanding is correct, it seems like in order for a constructor to be thread-safe it would either have to rely on the thread-safety of the allocator it's using, or surround each use of a user-defined allocator with something like a mutex (because there's no way to check if the allocator is thread-safe). I don't think we'd want anybody to do the latter in practice, because it could degrade performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I skimmed the C++ spec, and I couldn't find anything on allocator thread safety one way or another. I could well have missed something, though.
I'm not convinced std::allocator
is intended to be thread-safe. It's specified as stateless, and I'd hazard a guess that any mutex or such inside the allocator could qualify as state. Also, std::allocator
s get rebound inside the standard library (and elsewhere), which implies constructing different allocator objects. A mutex data member wouldn't provide thread-safety. If std::allocator
does have a thread-safety guarantee, I'd expect it to be provided by whatever plumbing std::allocator
relies on, which could be implementation-defined.
For SYCL, I think there are two ways of looking at it:
- "SYCL is thread-safe so that the user doesn't need to worry about thread-safety. Therefore, the user shouldn't need to worry about providing a thread-safe allocator; it's the SYCL runtime that must provide the thread-safety."
- "SYCL is thread-safe. When a SYCL runtime relies on an entity provided by the user, the user-provided entity must be thread-safe so that the SYCL runtime can be thread-safe."
Both of these seem entirely reasonable to me, but only one is consistent with C++. Problem is, we don't know which one it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I skimmed the C++ spec, and I couldn't find anything on allocator thread safety one way or another. I could well have missed something, though.
It doesn't say "thread-safe", but allocator.members says:
Except for the destructor, member functions of the default allocator shall not introduce data races ([intro.multithread]) as a result of concurrent calls to those member functions from different threads. Calls to these functions that allocate or deallocate a particular unit of storage shall occur in a single total order, and each such deallocation call shall happen before the next allocation (if any) in this order.
For SYCL, I think there are two ways of looking at it:
I agree that both of these are reasonable interpretations. I have a preference for 2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think std::allocator
is guaranteed thread-safe (at least as far as acquiring/releasing memory) because it is specified to call ::operator new
which in turn is specified to call malloc
or aligned_malloc
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a preference for 2.
I agree. With option 1, a SYCL runtime needs to conservatively assume that any user-supplied allocator could race with any other, which hurts performance. Also, more similar to C++.
For successful allocations of size greater than zero, the default allocator must | ||
return a pointer to a contiguous, exclusive region of memory of at least the | ||
requested size. | ||
For unsuccessful allocations, the default allocator must return [code]#nullptr#. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For unsuccessful allocations, the default allocator must return [code]#nullptr#.
I think this is not the way allocators normally handle out-of-memory. A C++ allocator normally throws an exception in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that named requirement: Allocator implies that the only alternative an allocator has to providing the requested memory is throwing.
A few lines earlier in the SYCL spec (L 4892), it says:
When an allocator returns a `nullptr`, the runtime cannot allocate data on the host.
So I guess either the SYCL spec expect more C-like behavior (in which case default allocators are not actually named requirement Allocator
s), or default allocators throw on failure, in which case an allocator will actually throw when the runtime can't allocate data on the host. So one sentence or the other needs to change in the SYCL spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The SYCL spec should not invent some new way for C++ allocators to work. I think the previous statement about returning nullptr
was just a mistake.
I think we're nearing consensus on the required changes:
1 and 2 make sense in this PR. I'm not sure if 3 should be done here or in a separate PR, because it might imply more changes and need more careful review. |
Can't we just delete this sentence too? The C++ specification for Allocator should cover this, no? |
I don't have a strong opinion, but given how long it's taken us to get to the bottom of this, I'd lean toward mentioning it in the SYCL spec for the avoidance of doubt. It's an important requirement for both users and implementers: The user's allocator must throw on failure. The implementer must catch exceptions from allocators and report them in an appropriate asynchronous exception list. I'm now wondering about the CTS and what existing implementations do ... |
Personally, I think it's better not to duplicate this wording from the C++ specification. Allocators are a reasonably advanced feature, and I think most programmers who define their own will understand the C++ requirements. In addition, the original spec wording didn't say anything about the out-of-memory behavior. I don't see a need to add that wording now. It's better to rely on the C++ specification. |
The way this sentence is worded could be interpreted as saying that the default allocator must be monotonic --- i.e., even if a memory region is deallocated, the default allocator is not allowed to allocate it a second time.
I've taken a crack at improving the wording, but I'm unsure about the use of exclusive. The original wording seems to go out of its way to say that any memory address controlled by the allocator must belong to no more than one allocation, but I'm not sure this is necessary. Do we really have to tell people that an allocator must not make the same storage available to two different objects at the same time? If the spec does need to explain this in detail, the word exclusive is probably not a sufficiently complete explanation of the requirements.