Limited API support for Py_buffer #89622
Comments
Currently all APIs related to Py_buffer are excluded from the limited API. It is neither possible to use Py_buffer from a C extension built with the limited API, nor to define heap types with buffer support using the stable ABI. The lack of Py_buffer support prevents prominent projects like NumPy or Pillow from using the limited API and producing abi3 binary wheels. To be fair, it's not the only reason why these projects have not adopted the stable abi3 yet. Still, Py_buffer support is a necessary but not sufficient condition. Stable abi3 support would enable the NumPy stack to build binary wheels that work with any Python version >= 3.11, < 4.0. The limited API excludes any C API that references Py_buffer:
It should not be terribly complicated to add Py_buffer to the stable API. All it takes is an opaque struct definition of Py_buffer, an allocation function, a free function, and a set of getters and setters. The hard part is figuring out which getters and setters are needed and how many struct members must be exposed through them. I recommend getting feedback from NumPy, Pillow, and Cython devs first.
Prototype:
typedef struct bufferinfo Py_buffer;
/* allocate a new Py_buffer object on the heap and initialize all members to NULL / 0 */
Py_buffer*
PyBuffer_New(void)
{
    Py_buffer *view = PyMem_Calloc(1, sizeof(Py_buffer));
    if (view == NULL) {
        PyErr_NoMemory();
    }
    return view;
}
/* convenience function */
Py_buffer*
PyBuffer_NewEx(PyObject *obj, void *buf, Py_ssize_t len, Py_ssize_t itemsize,
               int readonly, int ndim, char *format, Py_ssize_t *shape, Py_ssize_t *strides,
               Py_ssize_t *suboffsets, void *internal)
{
    ...
}
/* release and free buffer */
void
PyBuffer_Free(Py_buffer *view)
{
    if (view != NULL) {
        PyBuffer_Release(view);
        PyMem_Free(view);
    }
} |
Py_buffer.shape requires a Py_ssize_t* pointer. It's not convenient. For example, the array module uses:
static int
array_buffer_getbuf(arrayobject *self, Py_buffer *view, int flags)
{
    ...
    if ((flags & PyBUF_ND)==PyBUF_ND) {
        view->shape = &((PyVarObject*)self)->ob_size;
    }
    ...
    return 0;
}
This code is not compatible with a fully opaque PyObject structure: |
shape is a pointer to an array of Py_ssize_t of size ndim. array and memoryview use a trick to avoid memory allocation, but _testbuffer.ndarray allocates it dynamically on the heap. We can add a small static buffer to Py_buffer to avoid additional memory allocation in common cases. |
IIRC shape, strides, and suboffsets are all arrays of length ndim. We could optimize allocation if we required users to specify the value of ndim up front and did not allow them to change it afterwards. PyBuffer_New(int ndim) would then allocate a view of size sizeof(Py_buffer) + (3 * ndim * sizeof(Py_ssize_t)). This would give us sufficient space to memcpy() the shape, strides, and suboffsets arguments into memory after the Py_buffer struct. |
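The single-allocation idea above can be sketched in standalone C. This is only an illustration under stated assumptions: `demo_view`, `demo_view_new()` and `demo_view_set_layout()` are hypothetical names, `ptrdiff_t` stands in for `Py_ssize_t`, and plain `calloc()` stands in for `PyMem_Calloc()`. It is not the API from any PR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for Py_buffer; ptrdiff_t stands in for Py_ssize_t. */
typedef struct {
    void *buf;
    int ndim;
    ptrdiff_t *shape;
    ptrdiff_t *strides;
    ptrdiff_t *suboffsets;
    ptrdiff_t storage[];   /* 3 * ndim elements follow the header */
} demo_view;

/* Allocate the header and the three layout arrays in one zeroed block. */
static demo_view *
demo_view_new(int ndim)
{
    assert(ndim >= 0);
    size_t n = (size_t)ndim;
    demo_view *view = calloc(1, sizeof(demo_view) + 3 * n * sizeof(ptrdiff_t));
    if (view == NULL) {
        return NULL;
    }
    view->ndim = ndim;
    view->shape = view->storage;
    view->strides = view->storage + n;
    view->suboffsets = view->storage + 2 * n;
    return view;
}

/* Copy caller-provided layout arrays into the trailing storage. */
static void
demo_view_set_layout(demo_view *view, const ptrdiff_t *shape,
                     const ptrdiff_t *strides, const ptrdiff_t *suboffsets)
{
    size_t bytes = (size_t)view->ndim * sizeof(ptrdiff_t);
    memcpy(view->shape, shape, bytes);
    memcpy(view->strides, strides, bytes);
    memcpy(view->suboffsets, suboffsets, bytes);
}
```

Because ndim is fixed at allocation time, a single `free(view)` releases the header and all three arrays together.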
ndim is not known before calling PyObject_GetBuffer(), so we will need a new API which combines PyObject_GetBuffer() and PyBuffer_New(). |
CC Antoine for his expertise in the buffer protocol.
Opaque Py_buffer and PyObject structs will require a different approach and prevent some optimizations. The consumer will have to malloc() a Py_buffer struct on the heap. In non-trivial cases the producer (exporter) may have to malloc() another blob and store it in Py_buffer.internal [1]. I'm not particularly worried about the performance of malloc here. [1] https://docs.python.org/3/c-api/buffer.html?highlight=pybuffer#c.Py_buffer.internal |
Py_buffer is often used for handling arguments when a function supports bytes, bytearray and other bytes-like objects, for example bytes.partition(). Any additional memory allocation would add significant overhead here. bytes.join() creates a Py_buffer for every item; it would be a deoptimization if it needed to allocate them all separately. We should allow allocating Py_buffer on the stack. Currently it has too complex a structure and we cannot guarantee its stability (although there have been no changes for years). I propose to split Py_buffer into transparent and opaque parts and standardize the transparent structure. It should include: obj, buf, len, possibly flags (to distinguish read-only from writable) and a pointer to opaque data. For bytes, bytearray, BytesIO, mmap and most other classes the pointer to opaque data is NULL. For array and memoryview objects the opaque data could be embedded into the object. |
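One possible reading of the transparent/opaque split, as a standalone sketch. Everything here is hypothetical: `demo_simple_view`, `DEMO_VIEW_READONLY`, and both helper functions are made up for illustration, `void *` stands in for `PyObject *`, and `ptrdiff_t` stands in for `Py_ssize_t`. The point is only that a consumer of simple byte buffers can keep the view on the stack and never touch the opaque part.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_VIEW_READONLY 0x1

/* Transparent part: fixed layout, safe to allocate on the stack. */
typedef struct {
    void *obj;        /* exporter object (PyObject * in CPython) */
    void *buf;        /* start of the data */
    ptrdiff_t len;    /* length in bytes */
    int flags;        /* e.g. read-only vs. writable */
    void *opaque;     /* layout details; NULL for simple byte buffers */
} demo_simple_view;

/* Exporter side: fill a caller-allocated view for a read-only byte string. */
static void
demo_fill_bytes_view(demo_simple_view *view, const char *data, ptrdiff_t len)
{
    view->obj = NULL;
    view->buf = (void *)data;
    view->len = len;
    view->flags = DEMO_VIEW_READONLY;
    view->opaque = NULL;  /* no shape or strides needed for 1-D bytes */
}

/* Consumer side: only the transparent fields are read. */
static ptrdiff_t
demo_count_byte(const demo_simple_view *view, unsigned char needle)
{
    const unsigned char *p = view->buf;
    ptrdiff_t count = 0;
    for (ptrdiff_t i = 0; i < view->len; i++) {
        if (p[i] == needle) {
            count++;
        }
    }
    return count;
}
```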
Could you split this into two PRs: one to add the new API, and another to add things to the limited set? There's no rush to add it to the limited API, esp. since it can't be tested with the intended consumers yet. And it's a hassle to remove mistakes from the limited API, even in alphas/betas. To try this out I suggest making the struct opaque when something like |
Maybe a PEP is needed to collect usages of the Py_buffer API and check whether the ABI is future proof. A PEP may help to discuss this with other projects which currently consume this API. I suggest starting with the smallest possible API and then slowly extending it. It's too easy to make mistakes :-( Once it's added to the stable ABI, it will be really hard to change it. For example, Py_buffer.format is a "char*", but who owns the string? For a stable ABI, I would suggest duplicating the string. For the shape, strides and suboffsets arrays, I would also suggest allocating them on the heap to ensure that they cannot be modified from the outside. In your PR, PyBuffer_GetLayout() indirectly gives access to the internal Py_buffer structure members and allows modifying them. One way to avoid this issue is to return a *copy* of these arrays. I would prefer to require calling "Set" functions to modify a Py_buffer, to ensure that a buffer always remains consistent.
This API looks like PyCode_New(), which was broken *often*, so it looks like a bad pattern for a stable ABI. Maybe PyBuffer_New() + many Set() functions would be more future proof. But I don't know which Py_buffer members are mandatory to have a "valid" buffer. What if we add new members tomorrow? Will it be possible to initialize them to a reasonable default value? |
All memory is owned by the exporter object. The exporter (aka producer) is the Python type that implements Py_bf_getbuffer and Py_bf_releasebuffer. In the majority of cases the exporter doesn't have to set shape, strides, and suboffsets. They are used in special cases, e.g. multidimensional arrays with custom layout. For example, they can be used to convert TIFF images from strided big-endian format to a NumPy array in little-endian format. It's up to the exporter's Py_bf_getbuffer to decide how it fills shape, strides, and suboffsets, too. For example an exporter could allocate format, shape, strides, and suboffsets on the heap and assign pointers in its getbuffer function and store a hint in the
It would be a bad idea to return copies in PyBuffer_GetLayout(). Consumers have to get the layout every time they access a specific item in the buffer in order to calculate the offset. I'd rather define the arguments as "const". The documentation already states that e.g. "The shape array is read-only for the consumer.".
It is highly unlikely that we will ever have to extend the Py_buffer interface. It is already extremely versatile and can encode complex formats. You can even express the layout of a TIFF image of float32 CMYK in planar configuration (one array of 32-bit floats for cyan, followed by an array of magenta, then an array of yellow, and finally an array of key (black)).
PS: I have removed the PyBuffer_NewEx() function. It did not make any sense. |
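The reason consumers need the layout on every item access is the offset computation itself. The sketch below follows the item-pointer algorithm described in the buffer protocol documentation, written as standalone C. `demo_item_pointer()` is a hypothetical name, and `ptrdiff_t` stands in for `Py_ssize_t`.

```c
#include <assert.h>
#include <stddef.h>

/* Return a pointer to the item at `indices` in an ndim-dimensional buffer.
   strides[i] is the byte step along dimension i. suboffsets may be NULL;
   a negative suboffset means "no pointer indirection in this dimension". */
static char *
demo_item_pointer(char *buf, int ndim, const ptrdiff_t *strides,
                  const ptrdiff_t *suboffsets, const ptrdiff_t *indices)
{
    char *pointer = buf;
    for (int i = 0; i < ndim; i++) {
        pointer += strides[i] * indices[i];
        if (suboffsets != NULL && suboffsets[i] >= 0) {
            /* PIL-style array: follow the stored pointer, then add offset. */
            pointer = *((char **)pointer) + suboffsets[i];
        }
    }
    return pointer;
}
```

For a C-contiguous `int data[3][4]`, the strides would be `{4 * sizeof(int), sizeof(int)}`, and indices `{2, 3}` resolve to `&data[2][3]`. Returning fresh copies of the arrays on each call would add an allocation to this hot path.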
A consumer will use the APIs:
Py_buffer *view;
int ndim;
const char *format;
const Py_ssize_t *shape, *strides, *suboffsets;
void *buf;
view = PyBuffer_New();
PyObject_GetBuffer(obj, view, flags);
ndim = PyBuffer_GetLayout(&format, &shape, &strides, &suboffsets);
buf = PyBuffer_GetPointer(view, [...]);
PyBuffer_Free(view); // also calls PyBuffer_Release()
The API functions PyBuffer_FillInfo(), PyBuffer_FillInfoEx(), and PyBuffer_GetInternal() are for exporters (producers) only. The exporter uses the PyBuffer_FillInfo*() functions in its Py_bf_getbuffer function to fill the view. It may use PyBuffer_GetInternal() in its Py_bf_releasebuffer function to access the internal field and to release additional resources. |
I do not like the requirement to allocate Py_buffer on the heap. It adds overhead. The common case in CPython code is:
Py_buffer view;
void *buf;
Py_ssize_t len;
PyObject_GetBuffer(obj, &view, PyBUF_SIMPLE);
buf = view.buf;
len = view.len;
// no other fields are used
PyBuffer_Release(&view);
And I want to keep it as simple and efficient as it can be. |
CPython internals can still use allocation on the stack. Only stable ABI extensions have to use allocation on the heap. |
That would be an unfair advantage. If we want people to use the limited API we should not make it much slower than the non-limited API. |
No. Limited API is generally not as performant as the non-limited one. It is even documented this way: https://docs.python.org/3/c-api/stable.html#limited-api-scope-and-performance We should not make it *much* slower, but code that can take advantage of implementation details can always be faster. Max speed should not be a concern in the limited API. |
Py_buffer *is* an ABI, and it hasn't changed from the start. Of course you can still try to collect feedback from third-party projects, but there is a very high probability that it won't need to change in the near future. |
I would advocate:
The rest is optional. |
I am someone who is interested in having this, but FWIW my motivation is slightly more narrow, I only really need abi3-friendly buffer support with contiguous 1d buffers. Not sure if there'd be interest in doing a smaller version before figuring out the entire Py_buffer API. |
Antoine has a good point. We can freeze the Py_buffer struct. If it needs to be extended in the future, it'll need a new set of functions and names -- and perhaps a versioning scheme. We'll know more about the problem when/if it comes up. |
After some consideration I also agree with Antoine. The Py_buffer API has been around for a long time without any changes to the Py_buffer struct. It is unlikely that the struct will ever change. I have created a new PR that exposes the Py_buffer struct, PyBuffer_*() API functions, PyBUF_* constants, Py_bf_* type slots, and PyMemoryView_FromBuffer(). We could consider exporting the PyPickleBuffer*() API, too. |
Would it make sense to add a "version" member to the structure? It would allow supporting an old stable structure for the stable ABI and a new structure with other changes. The problem is how to initialize the version member. On Windows, many structures have a member which is the size of the structure. I tried to implement such ABI compatibility for the PEP 587 PyConfig structure, but the idea was rejected (there was no need for a stable ABI when Python is embedded in an application): |
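The "size member" pattern mentioned above can be sketched as follows. This is a generic illustration, not the rejected PyConfig design: `demo_info_v1`, `demo_info_v2` and `demo_fill_info()` are hypothetical, and `ptrdiff_t` stands in for `Py_ssize_t`. The caller records the size of the struct it was compiled against, and the library writes only the fields that fit.

```c
#include <assert.h>
#include <stddef.h>

/* Version 1 of a hypothetical public struct. */
typedef struct {
    size_t struct_size;   /* caller sets this to sizeof() of its own struct */
    void *buf;
    ptrdiff_t len;
} demo_info_v1;

/* Version 2 keeps version 1 as a prefix and appends one field. */
typedef struct {
    demo_info_v1 base;
    int readonly;
} demo_info_v2;

/* Library side: fill only the fields the caller's struct has room for.
   Returns 0 on success, -1 if the caller's struct is too small. */
static int
demo_fill_info(demo_info_v1 *info, void *buf, ptrdiff_t len, int readonly)
{
    if (info->struct_size < sizeof(demo_info_v1)) {
        return -1;
    }
    info->buf = buf;
    info->len = len;
    if (info->struct_size >= sizeof(demo_info_v2)) {
        /* The caller was built against the newer layout. */
        ((demo_info_v2 *)info)->readonly = readonly;
    }
    return 0;
}
```

The weak point is the one raised in the comment: the scheme only works if every caller remembers to set `struct_size` before the first call.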
I thought of a version field, too. In the end it is going to cause more work and trouble than it would benefit us. Stack-allocated Py_buffer structs are typically initialized with Py_buffer data = {NULL, NULL}; . The code initializes Py_buffer.buf and Py_buffer.obj as NULL. The remaining fields are whatever random values happen to be on the C stack. If we appended a version field to the struct, then every project would have to initialize the field properly. |
In Python 3.5, I decided to rename the public "PyMemAllocator" structure to PyMemAllocatorEx when I added a new "calloc" member. C extensions using "PyMemAllocator" fail to build to force developers to set the calloc member. IMO it's unfortunate to have to rename a structure to force developers to update their C code :-( |
The Py_buffer struct has stayed the same for over a decade, since Python 2.6.0 and 3.0.0. It is unlikely that it will have to change in the near future. |
The C language sets other members to 0/NULL with this syntax, no? |
No, they are not set to 0/NULL. https://en.wikipedia.org/wiki/Uninitialized_variable |
Example:
struct Point { int x; int y; int z; };
int main(void)
{
    struct Point p = {1};
    return p.y;
}
gcc -O0 produces this machine code which sets p.y to 0 and p.z to 0:
gcc -O3 heavily optimizes the code: it always returns 0, it doesn't return a random value from the stack:
The "C99 Standard 6.7.8.21" says:
The C99 standard says that p.y and p.z must be set to 0. I'm talking about the specific C syntax of a structure static initialization: "struct MyStruct x = {...};". If "Py_buffer data = {NULL, NULL};" is allocated on the stack, all "data" Py_buffer members are set to 0 or NULL:
typedef struct bufferinfo {
    void *buf;
    PyObject *obj;        /* owned reference */
    Py_ssize_t len;
    Py_ssize_t itemsize;  /* This is Py_ssize_t so it can be
                             pointed to by strides in simple case.*/
    int readonly;
    int ndim;
    char *format;
    Py_ssize_t *shape;
    Py_ssize_t *strides;
    Py_ssize_t *suboffsets;
    void *internal;
} Py_buffer;
If we want to add a version member to this structure, I would suggest enforcing the usage of a static initialization macro or an initialization function, like:
The problem with the macro is that it is not usable in Python extensions which are not written in C or C++ (or more generally in extensions which cannot use macros).
--
A different approach is to use an API which allocates a Py_buffer on the heap, so if the structure becomes larger tomorrow, an old built C extension continues to work:
Py_buffer *data = PyBuffer_New();
// ... use *data ...
PyBuffer_Free(data);
PyBuffer_New() can initialize the version member and allocate the proper memory block size. The best is if the "... use *data ..." part is only done with function calls :-) |
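A minimal sketch of the heap-allocation approach, assuming the extension sees only an incomplete type and goes through function calls for everything else. All names (`demo_buffer`, `demo_buffer_new()`, and so on) are hypothetical stand-ins, `calloc()`/`free()` stand in for the PyMem allocator, and in a real library the struct definition would live in a private source file rather than next to the typedef.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Public header part: extensions only see an incomplete type. */
typedef struct demo_buffer demo_buffer;

/* Private library part: the layout may grow between releases without
   affecting extensions that were compiled earlier. */
#define DEMO_BUFFER_VERSION 1
struct demo_buffer {
    int version;
    void *buf;
    ptrdiff_t len;
};

/* The library, not the caller, chooses the block size and the version. */
static demo_buffer *
demo_buffer_new(void)
{
    demo_buffer *view = calloc(1, sizeof(*view));
    if (view != NULL) {
        view->version = DEMO_BUFFER_VERSION;
    }
    return view;
}

static void
demo_buffer_set(demo_buffer *view, void *buf, ptrdiff_t len)
{
    view->buf = buf;
    view->len = len;
}

static void *
demo_buffer_get_pointer(const demo_buffer *view)
{
    return view->buf;
}

static ptrdiff_t
demo_buffer_get_len(const demo_buffer *view)
{
    return view->len;
}

/* free(NULL) is a no-op, so this also accepts NULL. */
static void
demo_buffer_free(demo_buffer *view)
{
    free(view);
}
```

The trade-off is the one debated in this thread: the caller never depends on `sizeof(struct demo_buffer)`, but every view costs one heap allocation and every field access costs a function call.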
Thanks for the investigation. I didn't know about C99 Standard 6.7.8.21. That's a useful and sensible extension to the language. In my opinion it is neither useful to extend the Py_buffer struct with a version tag nor to force users to allocate the struct on the heap. The current design has worked for over 13 years. Any deviation from the established design risks breaking 3rd party software. I could be convinced to add PyBuffer_New() and PyBuffer_Free() as an additional feature, but their use should be optional. The functions were part of my first PR #73221. |
Thanks for the review, Petr! |
There are some side effects with the "buffer.h" inclusion in Panda3D when building against 3.11a5; the project manager's concerns are here #29991 (comment) |
Copy of rdb's message: Should it not be using a relative include, ie. #include "../buffer.h" ? I think otherwise this change will cause breakage for many projects given how common the header name "buffer.h" may be. In Python.h, buffer.h is included before object.h. But object.h includes buffer.h. I suggest to include buffer.h before object.h and remove #include "buffer.h" from Include/cpython/buffer.h. Also, I agree that renaming buffer.h to pybuffer.h would reduce issues like that. Moreover, this header file exposes the "Py_buffer" API, so "pybuffer.h" sounds like a better name ;-) |
pmp-p:
Thanks for the report. It has been fixed. I'm closing the issue again. |
clang doesn't like the typedef forward-decl:
In file included from ../cpython/Modules/_ctypes/_ctypes.c:108:
In file included from ../cpython/Include/Python.h:43:
../cpython/Include/object.h:109:3: warning: redefinition of typedef 'PyObject' is a C11 feature [-Wtypedef-redefinition]
} PyObject;
^
../cpython/Include/pybuffer.h:23:24: note: previous definition is here
typedef struct _object PyObject;
^
1 warning generated. |
Oh. I already met this error :-( That's why I proposed in #75384 to move all forward declarations to the top of Python.h to solve such issues. I wrote #75708 to do exactly that: add a new pytypedefs.h header file to move all forward declarations to the top of Python.h. I didn't move *all* "typedef struct xxx yyy;" there: only the ones which cause interdependency issues. |
I'm closing the issue again; the C API should now be fine :-) |