Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-106092: Fix use-after-free crash in frame_dealloc #106875

Merged
merged 1 commit into from
Aug 1, 2023

Conversation

andersk
Copy link
Contributor

@andersk andersk commented Jul 18, 2023

It was possible for the trashcan to delay the deallocation of a PyFrameObject until after its corresponding _PyInterpreterFrame has already been freed. So frame_dealloc needs to avoid dereferencing the f_frame pointer unless it first checks that the pointer still points to valid memory.

Fixes #106092. This should be backported to 3.11 and 3.12.

It was possible for the trashcan to delay the deallocation of a
PyFrameObject until after its corresponding _PyInterpreterFrame has
already been freed.  So frame_dealloc needs to avoid dereferencing the
f_frame pointer unless it first checks that the pointer still points
to valid memory.

Signed-off-by: Anders Kaseorg <andersk@mit.edu>
@markshannon
Copy link
Member

I think it better to ensure that f->f_frame always points to valid memory, rather than do the additional check.
Something like https://github.com/python/cpython/compare/main...faster-cpython:cpython:safe-clear-frame?expand=1

Do you have a reproducer that doesn't need third-party modules?

@andersk
Copy link
Contributor Author

andersk commented Jul 31, 2023

Something like https://github.com/python/cpython/compare/main...faster-cpython:cpython:safe-clear-frame?expand=1

+static const _PyInterpreterFrame cleared_frame = {
+    .owner = FRAME_CLEARED
+};

Note that FRAME_CLEARED has never been a valid member of enum _frameowner; see #106156.

+        f->f_frame = (_PyInterpreterFrame *)&cleared_frame;
         Py_DECREF(f);

Couldn’t that incorrectly skip the deallocation of f_executable, f_funcobj, and f_locals in frame_dealloc, if the frame was allocated by PyFrame_New and has owner == FRAME_OWNED_BY_FRAME_OBJECT?

Do you have a reproducer that doesn't need third-party modules?

Not at present—further minimization is probably possible but difficult, because the segfault requires a specific coincidence of the position of the stack pointer within the current stack chunk, the GC threshold, and the trashcan recursion depth.

@markshannon
Copy link
Member

A reproducer that causes an assertion failure would be good enough, I think.

if (f->f_frame->owner == FRAME_OWNED_BY_FRAME_OBJECT) {
assert(f->f_frame == (_PyInterpreterFrame *)f->_f_frame_data);
_PyInterpreterFrame *frame = (_PyInterpreterFrame *)f->_f_frame_data;
if (f->f_frame == frame && frame->owner == FRAME_OWNED_BY_FRAME_OBJECT) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks correct.

Note:
In theory, f->f_frame == frame and frame->owner == FRAME_OWNED_BY_FRAME_OBJECT should be equivalent, but in practice they are not.
We can't just check frame->owner == FRAME_OWNED_BY_FRAME_OBJECT because of the Trashcan mechanism, and we can't just check f->f_frame == frame because of this.
So we need both checks here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this check be put in other places too? We are currently investigating the root cause of segfaults in 3.11.5 with call stacks ending with

#0  0x0000000000635d9a in ?? ()
#1  0x00000000004d8e27 in PyFrame_FastToLocalsWithError ()
#2  0x00000000004d8e3e in ?? ()
#3  0x000000000054e5f6 in _PyObject_GenericGetAttrWithDict ()

What we are trying to access here is frame.f_locals. The frame object is obtained from looping over all thread states and calling PyThreadState_GetFrame on them (so we know we have a strong reference to it). Because of the inlining it's not immediately obvious where in PyFrame_FastToLocalsWithError we are segfaulting exactly, but our current hunch is that it is because the referenced _PyInterpreterState might have been deallocated. We're still investigating this.

@markshannon markshannon merged commit 557b05c into python:main Aug 1, 2023
21 checks passed
@markshannon markshannon added the needs backport to 3.12 bug and security fixes label Aug 1, 2023
@miss-islington
Copy link
Contributor

Thanks @andersk for the PR, and @markshannon for merging it 🌮🎉.. I'm working now to backport this PR to: 3.12.
🐍🍒⛏🤖

@markshannon markshannon added the needs backport to 3.11 only security fixes label Aug 1, 2023
@bedevere-bot
Copy link

GH-107532 is a backport of this pull request to the 3.12 branch.

@miss-islington
Copy link
Contributor

Thanks @andersk for the PR, and @markshannon for merging it 🌮🎉.. I'm working now to backport this PR to: 3.11.
🐍🍒⛏🤖

@bedevere-bot bedevere-bot removed the needs backport to 3.12 bug and security fixes label Aug 1, 2023
@bedevere-bot
Copy link

GH-107533 is a backport of this pull request to the 3.11 branch.

@bedevere-bot bedevere-bot removed the needs backport to 3.11 only security fixes label Aug 1, 2023
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Aug 1, 2023
…106875)

It was possible for the trashcan to delay the deallocation of a
PyFrameObject until after its corresponding _PyInterpreterFrame has
already been freed.  So frame_dealloc needs to avoid dereferencing the
f_frame pointer unless it first checks that the pointer still points
to the interpreter frame within the frame object.

(cherry picked from commit 557b05c)

Co-authored-by: Anders Kaseorg <andersk@mit.edu>
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Aug 1, 2023
…106875)

It was possible for the trashcan to delay the deallocation of a
PyFrameObject until after its corresponding _PyInterpreterFrame has
already been freed.  So frame_dealloc needs to avoid dereferencing the
f_frame pointer unless it first checks that the pointer still points
to the interpreter frame within the frame object.

(cherry picked from commit 557b05c)

Co-authored-by: Anders Kaseorg <andersk@mit.edu>
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Yhg1s pushed a commit that referenced this pull request Aug 1, 2023
… (#107532)

gh-106092: Fix use-after-free crash in frame_dealloc (GH-106875)

It was possible for the trashcan to delay the deallocation of a
PyFrameObject until after its corresponding _PyInterpreterFrame has
already been freed.  So frame_dealloc needs to avoid dereferencing the
f_frame pointer unless it first checks that the pointer still points
to the interpreter frame within the frame object.

(cherry picked from commit 557b05c)

Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Co-authored-by: Anders Kaseorg <andersk@mit.edu>
@andersk andersk deleted the frame_dealloc branch August 1, 2023 16:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

Successfully merging this pull request may close these issues.

Segmentation fault in 3.11.4, 3.12.0b3; _PyInterpreterFrame ownership issue
7 participants