Add `PyList_GetItemRef()` (a variant of `PyList_GetItem` that returns a strong reference) #9

colesbury · 2024-01-19T20:14:46Z

Issue: python/cpython#114329
PR: python/cpython#114504

EDIT: Updated signature based on feedback below.

The free-threaded builds need a variant of PyList_GetItem that returns a strong reference instead of a borrowed reference for thread-safety reasons. PEP 703 proposed PyList_FetchItem, but since then PyDict_GetItemRef and functions with similar signatures have been added.

This proposes PyList_GetItemRef with the following signature:

PyObject *PyList_GetItemRef(PyObject *list, Py_ssize_t index)

Return a strong reference to the object at position index in the list pointed to by list. If index is out of bounds (<0 or >=len(list)), return NULL and set an IndexError. If list is not a list instance, return NULL and set a TypeError.

The text was updated successfully, but these errors were encountered:

gvanrossum · 2024-01-19T21:44:43Z

Sounds good to me, given the precedent.

serhiy-storchaka · 2024-01-21T10:21:21Z

Why

int PyList_GetItemRef(PyObject *list, Py_ssize_t index, PyObject **result)

and not simply

PyObject *PyList_GetItemRef(PyObject *list, Py_ssize_t index)

?
Other functions that store the result by the provided address instead returning it, do it because the result is is more than a single C value. For example, PyObject_GetOptionalAttr() has three possible outcome: "success", "not found" and "error". There is PyObject_GetAttr() which does not distinguish "not found" from "error" (an exception is raised in both cases), so it just returns a pointer, and use NULL value as a signal of error. PyDict_GetItemRef() has no a couterpart that raises an exception for missed key: it is expected that the user either raise corresponding exception himself (it is a low-level API) or use more higher level abstract API PyObject_GetItem().

For PyList_GetItem() there is no a counterpart that distinguish "index of range" and "other error". Perhaps it was not needed, but also it is easy to check index before the call if you want to avoid raising an IndexError (it was true with GIL).

encukou · 2024-01-22T10:01:12Z

I agree with Serhyi. If there are only two return codes, the error should be indicated by NULL.

The precedent should be for cases where one kind of “failure” is very common, and we want to avoid allocating an exception in this common case.
For dict, containment check is about as expensive as retrieving the key, so it makes sense for the API to do both -- and so it needs a three-way return.

For list, users can easily check whether they have a valid index (e.g. they're getting items in a for loop), and they should get a IndexError to re-raise if e.g. another thread is resizing the list concurrently.

gvanrossum · 2024-01-22T16:14:31Z

Okay, I can live with that, too. It means PyDict_GetItemRef is the odd one out. I wonder if the signature should somehow be reflected in the name, though? It's confusing that PySpam_GetItemRef sometimes has an output parameter, sometimes returns object-or-null. Fortunately the compiler will tell you, so it's not insurmountable.

serhiy-storchaka · 2024-01-22T16:57:05Z

Yes, I felt that it would be better to use a different verb for such kind of API and asked about this before adding PyObject_GetOptionalAttr(). https://discuss.python.org/t/make-pyobject-lookupattr-public/29104

But PyObject_GetOptionalAttr() was accepted as good enough, so there where no reason to choose a different verb for PyDict_GetItemRef() (it always returns an "optional" value, so adding Optional would not help).

gvanrossum · 2024-01-22T19:16:57Z

Okay, I lost track of what we've decided here. Or is there still no conclusion?

encukou · 2024-01-23T08:10:33Z

AFAIK, we're agreeing on everything except a general naming scheme (which would be incompatible with current API).

One thing the OP doesn't cover is type checking. in issues#29 the consensus was that arguments should be checked at runtime. (Here, this is fast thanks to Py_TPFLAGS_LIST_SUBCLASS, so I hope that's not a problem. Comment on that issue if you disagree with run-time type checking in general.)

I hope it's time for a formal vote:

PyObject *PyList_GetItemRef(PyObject *list, Py_ssize_t index)

returning:

on success, a new strong reference to the requested item
on error, NULL with an exception set:
- IndexError if index is out of bounds
- TypeError if list is not a list instance

@colesbury, I hope that's fine implementation-wise.

vstinner · 2024-01-23T13:36:05Z

@colesbury: Can you please run a microbenchmark to measure the performance of PyList_GetItemRef() vs PySequence_GetItem()? I'm not sure that it's worth it to have a specialized C function for list.

colesbury · 2024-01-23T14:54:00Z

@encukou, that proposal is fine with me. @vstinner, PySequence_GetItem() is about 60% slower than PyList_GetItemRef() (~2.8 ns vs. 1.7 ns per call)

vstinner · 2024-01-23T15:41:57Z

+1: I'm fine with PyObject *PyList_GetItemRef(PyObject *list, Py_ssize_t index) API.

@colesbury:

PySequence_GetItem() is about 60% slower than PyList_GetItemRef() (~2.8 ns vs. 1.7 ns per call)

+1.1 ns is not significant as an absolute timing, but 60% slower is significant and make me sad :-( It's too large to be ignored.

Maybe there is a way to remove a few conditions in PySequence_GetItem() to reduce the overhead compared to PyList_GetItemRef() :-(

zooba · 2024-01-23T16:56:51Z

Maybe there is a way to remove a few conditions in PySequence_GetItem()

Presumably it's the memory indirections that hurt. It bypasses the PyList_Check that is in PyList_GetItem when you go through PySequence_GetItem and hence supports a much wider range of types, but otherwise just looks up the item slot and calls it.

I hope that such a minor perf difference doesn't encourage developers to prevent custom types being used with their functions, but it does seem difficult to justify not having a slightly faster path for those who already know they have a list [subclass].

Perhaps a function to get the item pointer once to the caller can call it directly would be more efficient? (I assume the code is repeatedly getting items out of the same object, or else the 1.1ns really doesn't matter for N=1. For N>1000 I can see the impact.)

Otherwise, yes fine with the same API but better semantics.

colesbury · 2024-01-23T21:04:02Z

Follow-up question: should this be part of the stable ABI? (I think so -- the similar functions PyDict_GetItemRef and PyWeakref_GetRef were added to the stable ABI)

colesbury · 2024-01-29T19:44:38Z

@iritkatriel, what's your opinion on the proposed API?

@gvanrossum, it sounded like you were in favor of this API. If so, would you please mark as such on Petr's vote checkboxes: #9 (comment)

iritkatriel · 2024-01-29T20:56:11Z

LGTM

colesbury · 2024-01-29T21:11:16Z

FYI, here's the PR that adds PyList_GetItemRef():

python/cpython#114504

vstinner · 2024-01-30T15:18:15Z

Follow-up question: should this be part of the stable ABI? (I think so -- the similar functions PyDict_GetItemRef and PyWeakref_GetRef were added to the stable ABI)

I'm fine with adding PyList_GetItemRef() to the stable ABI 3.13.

serhiy-storchaka · 2024-01-31T09:17:23Z

It was perhaps unnoticed, but there is yet one difference of the new C API from PyList_GetItem(), not only in ownership of the result. The latter, as well as most concrete type C API, raises SystemError if the argument is not a list. It is a programming error in the C extension and should be differentiate from a programming error in Python code.

vstinner · 2024-01-31T10:50:31Z

Sam's comment:

The TypeError behavior is explicitly stated in the C API WG #9 (comment). See also capi-workgroup/api-evolution#29 (comment) and PyWeakref_GetRef().

zooba · 2024-01-31T15:28:09Z

The transitive error behaviour (C extension that just passes a user value into the call) is going to be more consistent. If we're going to do the type check, we shouldn't force the user to also do it and generate a normal looking exception message themselves.

encukou · 2024-02-01T14:27:40Z

I can see the arguments for both exception types. But TypeError works for me.
Serhyi (or anyone else), if you feel strongly about this, please open a general issue about what gets raised from type checks.

The vote is over; I'm closing this issue. Follow-ups welcome.

colesbury mentioned this issue Jan 19, 2024

Add PyList_GetItemRef, a variant of PyList_GetItem that returns a strong reference python/cpython#114329

Closed

erlend-aasland mentioned this issue Jan 24, 2024

Make list objects thread-safe in --disable-gil builds python/cpython#112087

Closed

4 tasks

colesbury mentioned this issue Jan 30, 2024

gh-114329: Add PyList_GetItemRef function python/cpython#114504

Merged

encukou closed this as completed Feb 1, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add `PyList_GetItemRef()` (a variant of `PyList_GetItem` that returns a strong reference) #9

Add `PyList_GetItemRef()` (a variant of `PyList_GetItem` that returns a strong reference) #9

colesbury commented Jan 19, 2024 •

edited

Loading

gvanrossum commented Jan 19, 2024

serhiy-storchaka commented Jan 21, 2024

encukou commented Jan 22, 2024

gvanrossum commented Jan 22, 2024

serhiy-storchaka commented Jan 22, 2024

gvanrossum commented Jan 22, 2024

encukou commented Jan 23, 2024 •

edited by iritkatriel

Loading

vstinner commented Jan 23, 2024

colesbury commented Jan 23, 2024

vstinner commented Jan 23, 2024

zooba commented Jan 23, 2024 •

edited

Loading

colesbury commented Jan 23, 2024

colesbury commented Jan 29, 2024

iritkatriel commented Jan 29, 2024

colesbury commented Jan 29, 2024

vstinner commented Jan 30, 2024

serhiy-storchaka commented Jan 31, 2024

vstinner commented Jan 31, 2024 •

edited

Loading

zooba commented Jan 31, 2024

encukou commented Feb 1, 2024

Add PyList_GetItemRef() (a variant of PyList_GetItem that returns a strong reference) #9

Add PyList_GetItemRef() (a variant of PyList_GetItem that returns a strong reference) #9

Comments

colesbury commented Jan 19, 2024 • edited Loading

gvanrossum commented Jan 19, 2024

serhiy-storchaka commented Jan 21, 2024

encukou commented Jan 22, 2024

gvanrossum commented Jan 22, 2024

serhiy-storchaka commented Jan 22, 2024

gvanrossum commented Jan 22, 2024

encukou commented Jan 23, 2024 • edited by iritkatriel Loading

vstinner commented Jan 23, 2024

colesbury commented Jan 23, 2024

vstinner commented Jan 23, 2024

zooba commented Jan 23, 2024 • edited Loading

colesbury commented Jan 23, 2024

colesbury commented Jan 29, 2024

iritkatriel commented Jan 29, 2024

colesbury commented Jan 29, 2024

vstinner commented Jan 30, 2024

serhiy-storchaka commented Jan 31, 2024

vstinner commented Jan 31, 2024 • edited Loading

zooba commented Jan 31, 2024

encukou commented Feb 1, 2024

Add `PyList_GetItemRef()` (a variant of `PyList_GetItem` that returns a strong reference) #9

Add `PyList_GetItemRef()` (a variant of `PyList_GetItem` that returns a strong reference) #9

colesbury commented Jan 19, 2024 •

edited

Loading

encukou commented Jan 23, 2024 •

edited by iritkatriel

Loading

zooba commented Jan 23, 2024 •

edited

Loading

vstinner commented Jan 31, 2024 •

edited

Loading