Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-98712: Clarify "readonly bytes-like object" semantics in C arg-parsing docs #98710

Merged
merged 2 commits into from
Dec 23, 2022
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
54 changes: 35 additions & 19 deletions Doc/c-api/arg.rst
Original file line number Diff line number Diff line change
Expand Up @@ -34,24 +34,39 @@ These formats allow accessing an object as a contiguous chunk of memory.
You don't have to provide raw storage for the returned unicode or bytes
area.

In general, when a format sets a pointer to a buffer, the buffer is
managed by the corresponding Python object, and the buffer shares
the lifetime of this object. You won't have to release any memory yourself.
The only exceptions are ``es``, ``es#``, ``et`` and ``et#``.

However, when a :c:type:`Py_buffer` structure gets filled, the underlying
buffer is locked so that the caller can subsequently use the buffer even
inside a :c:type:`Py_BEGIN_ALLOW_THREADS` block without the risk of mutable data
being resized or destroyed. As a result, **you have to call**
:c:func:`PyBuffer_Release` after you have finished processing the data (or
in any early abort case).

Unless otherwise stated, buffers are not NUL-terminated.

Some formats require a read-only :term:`bytes-like object`, and set a
pointer instead of a buffer structure. They work by checking that
the object's :c:member:`PyBufferProcs.bf_releasebuffer` field is ``NULL``,
which disallows mutable objects such as :class:`bytearray`.
There are three ways strings and buffers can be converted to C:

* Formats such as ``y*`` and ``s*`` fill a :c:type:`Py_buffer` structure.
This locks the underlying buffer so that the caller can subsequently use
the buffer even inside a :c:type:`Py_BEGIN_ALLOW_THREADS`
block without the risk of mutable data being resized or destroyed.
As a result, **you have to call** :c:func:`PyBuffer_Release` after you have
finished processing the data (or in any early abort case).

* The ``es``, ``es#``, ``et`` and ``et#`` formats allocate the result buffer.
**You have to call** :c:func:`PyMem_Free` after you have finished
processing the data (or in any early abort case).

* .. _c-arg-borrowed-buffer:

Other formats take a :class:`str` or a read-only :term:`bytes-like object`,
such as :class:`bytes`, and provide a ``const char *`` pointer to
its buffer.
In this case the buffer is "borrowed": it is managed by the corresponding
Python object, and shares the lifetime of this object.
You won't have to release any memory yourself.

To ensure that the underlying buffer may be safely borrowed, the object's
:c:member:`PyBufferProcs.bf_releasebuffer` field must be ``NULL``.
This disallows common mutable objects such as :class:`bytearray`,
but also some read-only objects such as :class:`memoryview` of
:class:`bytes`.

Besides this ``bf_releasebuffer`` requirement, there is no check to verify
whether the input object is immutable (e.g. whether it would honor a request
for a writable buffer, or whether another thread can mutate the data).

.. note::

Expand Down Expand Up @@ -89,7 +104,7 @@ which disallows mutable objects such as :class:`bytearray`.
Unicode objects are converted to C strings using ``'utf-8'`` encoding.

``s#`` (:class:`str`, read-only :term:`bytes-like object`) [const char \*, :c:type:`Py_ssize_t`]
Like ``s*``, except that it doesn't accept mutable objects.
Like ``s*``, except that it provides a :ref:`borrowed buffer <c-arg-borrowed-buffer>`.
The result is stored into two C variables,
the first one a pointer to a C string, the second one its length.
The string may contain embedded null bytes. Unicode objects are converted
Expand All @@ -108,8 +123,9 @@ which disallows mutable objects such as :class:`bytearray`.
pointer is set to ``NULL``.

``y`` (read-only :term:`bytes-like object`) [const char \*]
This format converts a bytes-like object to a C pointer to a character
string; it does not accept Unicode objects. The bytes buffer must not
This format converts a bytes-like object to a C pointer to a
:ref:`borrowed <c-arg-borrowed-buffer>` character string;
it does not accept Unicode objects. The bytes buffer must not
contain embedded null bytes; if it does, a :exc:`ValueError`
exception is raised.

Expand Down