Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-43574: Dont overallocate list literals #24954

Conversation

chadnetzer
Copy link

@chadnetzer chadnetzer commented Mar 21, 2021

Before v3.9, list literals weren't over-allocated, but this behavior regressed starting with v3.9:

Switched to initializing list literals w/ LIST_EXTEND
https://bugs.python.org/issue39320
#17984

Commit where over-allocation of list literals first appeared
https://bugs.python.org/issue38328
#17114
6dd9b64

This changes to the original behavior of not over-allocating lists for list literals, by not over-allocating any 0-length list that is extended. Technically this is a change in behavior, as lists that are initialized to length 0, and then immediately extended, will not be over-allocated on that initial extension. This change in behavior is likely innocuous, and possibly desirable, whereas the long-standing Python behavior of not using over-allocation for list literals is almost certainly desired.

Adds a test_overallocation test for lists to find regressions (assuming the regression doesn't alter the behavior of both the list-literal and list-initialization from a known length).

https://bugs.python.org/issue43574

This restores the behavior of Python versions before v3.9, where a list
initialized from a list literal allocates memory for exactly as many
elements as it contains (until a list append or other operation that
changes the list size).

The interpreter was changed in v3.9 to use the LIST_EXTEND bytecode to
initialize a new list from a literal, which caused it to overallocate.
By not performing this over-allocation on an empty list, we restored the
original behavior (though now explicitly creating an empty list and
extending it once will also not result in an over-allocation).
@the-knights-who-say-ni
Copy link

Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept this contribution by verifying everyone involved has signed the PSF contributor agreement (CLA).

CLA Missing

Our records indicate the following people have not signed the CLA:

@chadnetzer

For legal reasons we need all the people listed to sign the CLA before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.

If you have recently signed the CLA, please wait at least one business day
before our records are updated.

You can check yourself to see if the CLA has been received.

Thanks again for the contribution, we look forward to reviewing it!

Copy link
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The proposed approach looks fine to me. Two things:

  1. Can you sign the contributor's agreement?
  2. Can you add a NEWS item? See https://devguide.python.org/committing/?highlight=blurb#updating-news-and-what-s-new-in-python

Lib/test/test_list.py Outdated Show resolved Hide resolved
@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

Objects/listobject.c Outdated Show resolved Hide resolved
Objects/listobject.c Outdated Show resolved Hide resolved
Ensure that lists initialized from both list-literals and iterables of known
length are initialized without over-allocation, by comparing them with
allocation of lists initialized from an unknown size where over-allocation is
expected.
@chadnetzer
Copy link
Author

I have made the requested changes; please review again.

Let me know if the test_overallocation tests (Lib/test/test_list.py) seem reasonable, or overkill.

@bedevere-bot
Copy link

Thanks for making the requested changes!

@pitrou: please review the changes made to this pull request.

Copy link
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, thank you

@pitrou
Copy link
Member

pitrou commented Mar 22, 2021

I'll let other people review before merging this.

@chadnetzer
Copy link
Author

chadnetzer commented Mar 23, 2021

Hmmm. The list_resize() function is used elsewhere in Objects/listobject.c, and in particular is called on lists (empty or otherwise) when inserting, appending, etc. So this PR will have an effect on empty lists in those areas as well, not just for list_extend(). This will essentially delay overallocation for empty lists in those situations (until after the first insert, append, etc), and possibly affect the amount of overallocation afterwards.

In the interest of fairness, I'll gather some examples to show the behavior change, at least to help decide if any of those situations will suffer a negative consequence as a result. (I'll edit this comment to with the examples, when ready)

Edit: Okay, so there is an additional complication with this change; really it's meant to target list_extend(), which is now used to initialize new lists from literals of length 2 or more, but it's implemented by changing list_resize(), which is used by more than list_extend(). And in particular, it also changes the behavior of appending or inserting to an empty list, by also not overallocating on that first append.

However, the current behavior (unpatched 3.10alpha), for an empty list, is to overallocate on the first append/insert, to allow space for 4 elements, and then overallocate again (to space for 8 elements) when the 4 elements are added. With this patch to list_resize(), however, the first append will not overallocate and so when appending the second item to that list, it will overallocate directly to 8 elements (skipping the overallocation for 4 items entirely).

Overall this is likely undesirable, and certainly not intended by this fix, it's just a side-effect of doing the work in the common helper (list_resize()) for list extend/append/insert/etc.

Here's an example of the list append/insert behavior change, to clarify:

$ ./python.exe
Python 3.10.0a6+ (remotes/gh/master:20a5b7e986, Mar 23 2021, 17:07:20) [Clang 12.0.0 (clang-1200.0.32.29)] on darwin
>>> # Here's the current insert behavior without the proposed [bpo-43574](https://bugs.python.org/issue43574) patch
>>> l = []
>>> l.__sizeof__()
40
>>> l.append("eggs")  # first append overallocates by 3 slots (ie. 4 total, 1 used)
>>> l.__sizeof__()
72
>>> l.append("ham")  # additional appends don't overallocate until 4 slots are used
>>> l.__sizeof__()
72

but with the patch, which (unintentionally) delays the over-allocation of the first append to an empty list

# Now w/ [bpo-43574](https://bugs.python.org/issue43574) patch (no overallocation for extends on empty lists)
$ ./python.exe
Python 3.10.0a6+ (heads/bpo43574-dont-overallocate-list-literals:7436223a71, Mar 23 2021, 17:40:29) [Clang 12.0.0 (clang-1200.0.32.29)] on darwin
>>> l = []
>>> l.__sizeof__()
40
>>> l.append("eggs")  # No overallocation on first append
>>> l.__sizeof__()
48
>>> l.append("ham")  # So the second append skips 4 total items, and allocates space for 8 items
>>> l.__sizeof__()
104

So, I think that warrants a delay in merging and some further consideration of how to handle this; I can think of a few ways to "fix" this, but not without some additional complication. Basically because we want to signal that initializing a list from a known number of items shouldn't overallocate, but adding to an already created list should be able to do so, which isn't quite expressible in the helper functions we have currently, imo. Or possibly we don't care about over-allocating when initializing from a known number of items (though it's a regression), but maybe we want to ensure that a known 3 or 4 item list doesn't immediately over-allocate to 8 items (before any appends, etc.) is what we want to change...

Edit 2: I think I have 1 or 2 fairly straightforward ways to adjust this PR to remain on-target for restoring list-literal overallocation behavior, w/o also inadvertently affecting list append/insert over-allocation behavior, but will have to wait until the weekend to follow up w/ the details (fyi).

The list_resize() helper is used by single-item list appends and
inserts, which normally would overallocate (to a capacity of 4) on the
first added element, and then again to 8 when the capacity is filled.
But if this over-allocation is postponed on the first added element, the
current expansion formula would then over-allocate to a capacity of 8 on
the second append/insertion.

List-literals of length 1 and 2 are not created with the list_extend()
function (which then calls list_resize()), but instead built directly as
capacity 1 or 2 lists with the BUILD_LIST opcode.  By excluding the
special case for list_resize() of empty lists when newsize is greater
than 1, its allows list append/insert to continue to over-allocate
without skipping capacity 4.
@chadnetzer
Copy link
Author

I made a small but significant tweak to my original proposal, and added a lot of of commentary to the bug tracker issue. Basically, the latest tweak restores the existing list append/insert behavior for empty lists, while still reverting back to the long-standing behavior of not overallocating list-literals (which, for length 3 or more, are created by using the list_extend() helper)

sweeneyde
sweeneyde previously approved these changes Apr 4, 2021
Copy link
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update. I would welcome a bit more thorough testing.

for literal in test_literals:
# Ensure that both list literals, and lists made from an iterable
# of known size use the same amount of allocation. It will be
# verified later that no over-allocation occurs for list literals.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment says "verified later", but I see no specific test of no-over-allocation of list literals. There is SizeofTest in test_sys.py but it only tests two literal lengths.
I would imagine this kind of test:

prev_literal = None
for literal in [[] + test_literals]:
    if prev_literal is not None
        # Allocation of literal N uses one more pointer than allocation of literal N-1
        self.assertEqual(sizeof(literal), sizeof(prev_literal) + struct.calcsize('P'))
    prev_literal = literal

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I should have said "verified below", ie. in same test, after the loop. It's an implicit test, though, since with the current over-allocation strategy, the list capacity can sometimes equal list length, so it's not universally true that each list-literal will have less memory allocated than a normal list of equivalent length. So it only can test that at least sometimes the list-literals are created with less capacity then a grown list of equivalent length.

I was avoiding a direct test of memory usage, like you propose, just to keep things general. But I can certainly add such a direct test, which (afaict) is really the only way to test that all the list-literals are being created without over-allocation.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pitrou I ran with your idea, and added a direct test of the memory allocated for a list-literal, which also let me remove the confusing wording about verifying "later", since that's no longer the case. Thanks for the suggestion.

Confirm that list-literals aren't over-allocated by directly checking
their used memory, which is the combination of the size of an empty list
and the number of pointers needed for a list of exactly that length.
@gpshead gpshead added type-bug An unexpected behavior, bug, or error performance Performance or resource usage needs backport to 3.9 only security fixes labels Apr 11, 2021
Copy link
Member

@gpshead gpshead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall good, some test maintainability and debugging suggestions.


# bpo-38373: initialized or grown lists are not always over-allocated.
# Confirm that over-allocation occurs at least some of the time.
self.assertEqual(True, any(x>0 for x in overalloc_amts))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add msg=repr(overalloc_amts) to aid debugging if this fails.

# Direct check that list-literals do not over-allocate, by
# calculating the total size of used pointers.
total_ptr_size = len(literal) * struct.calcsize('P')
self.assertEqual(sizeof(literal), sizeof([]) + total_ptr_size)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a msg=repr(literal) arg to all of the asserts in this loop to aid debugging directly from a test failure log.

]

overalloc_amts = []
for literal in test_literals:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no action required: meta: this is where I really wish we had an absltest.paramtereize or pytest.mark.parameterize available in the stdlib. it'd run all of these as unique tests with appropriate names instead of a loop that prevents others from running upon first failure.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think unittest.subTest() displays parameters on the failing subtests and keeps running through each even on failure.

# Confirm that over-allocation occurs at least some of the time.
self.assertEqual(True, any(x>0 for x in overalloc_amts))

# Empty lists should overallocate on initial append/insert (unlike
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turn this bit into an additional test method.

# bpo-43574: Don't overallocate for list-literals
sizeof = sys.getsizeof

# First handle empty list and empty list-literal cases. Should have no
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make this set of initial assertions into its own test method.

@@ -0,0 +1,3 @@
``list`` objects don't overallocate when starting empty and then extended, or
when set to be empty. This effectively restores previous ``list`` memory
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

quantify what "previous" means. 3.8 and earlier I believe.

@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@github-actions
Copy link

github-actions bot commented Jun 3, 2021

This PR is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale Stale PR or inactive for long period of time. label Jun 3, 2021
@gpshead gpshead added the needs backport to 3.10 only security fixes label Jul 6, 2021
@gpshead gpshead self-assigned this Jul 6, 2021
@gpshead gpshead removed their assignment Apr 20, 2022
@ambv ambv removed the needs backport to 3.9 only security fixes label May 17, 2022
@ambv
Copy link
Contributor

ambv commented May 17, 2022

This missed the boat for inclusion in Python 3.9 which accepts security fixes only as of today.

@sweeneyde
Copy link
Member

Is this PR still relevant since #31816 was merged?

@methane
Copy link
Member

methane commented May 18, 2022

Is this PR still relevant since #31816 was merged?

I think no. This PR and the #87740 can be closed.

@methane methane closed this May 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting changes needs backport to 3.10 only security fixes performance Performance or resource usage stale Stale PR or inactive for long period of time. type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants