Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

segmentation fault appeared in python 3.10.0b2 #88470

Closed
zzzeek mannequin opened this issue Jun 3, 2021 · 16 comments
Closed

segmentation fault appeared in python 3.10.0b2 #88470

zzzeek mannequin opened this issue Jun 3, 2021 · 16 comments
Labels
3.10 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) release-blocker type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@zzzeek
Copy link
Mannequin

zzzeek mannequin commented Jun 3, 2021

BPO 44304
Nosy @terryjreedy, @vstinner, @pablogsal, @miss-islington, @erlend-aasland
PRs
  • bpo-44304: Fix crash in the sqlite3 module when the GC clears Statement objects #26545
  • [3.10] bpo-44304: Fix crash in the sqlite3 module when the GC clears Statement objects (GH-26545) #26547
  • bpo-44304: Ensure the sqlite3 destructor callback is always called with the GIL held #26551
  • [3.10] bpo-44304: Ensure the sqlite3 destructor callback is always called with the GIL held (GH-26551) #26552
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2021-06-05.03:24:53.965>
    created_at = <Date 2021-06-03.18:23:07.494>
    labels = ['interpreter-core', '3.10', 'type-crash', 'release-blocker']
    title = 'segmentation fault appeared in python 3.10.0b2'
    updated_at = <Date 2021-06-05.23:13:31.988>
    user = 'https://bugs.python.org/zzzeek'

    bugs.python.org fields:

    activity = <Date 2021-06-05.23:13:31.988>
    actor = 'pablogsal'
    assignee = 'none'
    closed = True
    closed_date = <Date 2021-06-05.03:24:53.965>
    closer = 'pablogsal'
    components = ['Interpreter Core']
    creation = <Date 2021-06-03.18:23:07.494>
    creator = 'zzzeek'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 44304
    keywords = ['patch']
    message_count = 16.0
    messages = ['395028', '395029', '395121', '395128', '395145', '395146', '395147', '395148', '395150', '395151', '395155', '395162', '395175', '395181', '395186', '395187']
    nosy_count = 6.0
    nosy_names = ['terry.reedy', 'vstinner', 'zzzeek', 'pablogsal', 'miss-islington', 'erlendaasland']
    pr_nums = ['26545', '26547', '26551', '26552']
    priority = 'release blocker'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'crash'
    url = 'https://bugs.python.org/issue44304'
    versions = ['Python 3.10']

    @zzzeek
    Copy link
    Mannequin Author

    zzzeek mannequin commented Jun 3, 2021

    segmentation fault related to object deallocation and traceback objects, is extremely difficult to reproduce and definitely appeared as of 3.10.0b2, does not occur in 3.10.0b1. linux and osx platforms are affected.

    The issue requires "greenlet==1.1.0" to be installed, as well as that for me to reproduce it I have to use sqlite3 with some special APIs also, so while this issue might be in greenlet, or sqlite3, I have a feeling these are all factors that are combining together in a very random way to reveal something that's actually happening in the interpreter, but in any case would be great if core devs can see why it happens. The below script has been tested on various linux platforms, and the overall bug was revealed by an interaction in the SQLAlchemy test suite that's occurring in many places including OSX, linux. As noted, a whole bunch of very random things are needed for me to reproduce it.

    import greenlet
    import sqlite3
    
    
    class _ErrorContainer(object):
        error = None
    
    
    def _expect_raises_fn(fn):
        ec = _ErrorContainer()
        try:
            fn()
        except Exception as err:
            assert str(err) == "this is a test"
    
            # assign the exception context outside of the except
            # is necessary
            ec.error = err
    \# don't del the exception context is necessary
    

    # del ec

    def greenlet_spawn(fn, *args, **kwargs):
    
        # spawning a greenlet is necessary
        context = greenlet.greenlet(fn, greenlet.getcurrent())
    
        # assignment to "result" is necessary
        result = context.switch(*args, **kwargs)
    # raising exception is necessary
    raise Exception("this is a test")
    
    class OuterConnectionWrapper:
        def __init__(self, connection):
            self.connection = connection
    
        def go(self, stmt):
            sqlite_connection = self.connection
            cursor = sqlite_connection.cursor()
            cursor.execute("select 1")
            return cursor
    
        def execute(self, stmt):
            return greenlet_spawn(self.go, stmt)
    
        def _do_close(self):
            self.connection.close()
            self.connection = None
    
        def close(self):
            self._do_close()
    
    
    class InnerConnectionWrapper:
        def __init__(self, connection):
            self.connection = connection
    
        def create_function(self, *arg, **kw):
            self.connection.create_function(*arg, **kw)
    
        def cursor(self):
            return self.connection.cursor()
    
        def close(self):
            self.connection = None
    
    
    class ConnectionPool:
        def __init__(self):
            self.conn = sqlite3.connect(":memory:")
    
            def regexp(a, b):
                return None
    
            self.conn.create_function("regexp", 2, regexp)
    
        def connect(self):
            return InnerConnectionWrapper(self.conn)
    
    
    def do_test():
        pool = ConnectionPool()
    
        def go():
            c1 = pool.connect()
            conn = OuterConnectionWrapper(c1)
            try:
                conn.execute("test")
            finally:
                conn.close()
    
        _expect_raises_fn(go)
    
    
    do_test()

    @zzzeek zzzeek mannequin added 3.10 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Jun 3, 2021
    @zzzeek
    Copy link
    Mannequin Author

    zzzeek mannequin commented Jun 3, 2021

    if the issue is in greenlet this can be bounced back to https://github.com/python-greenlet/greenlet/issues/242

    @terryjreedy
    Copy link
    Member

    Coredevs are likely to assume that the problem is in greenlets, especially if it has any C code (I don't know). I suggest you first try to find the commit between releases that resulted in the problem. If it is not deterministicly reproducible, you likely need to do a manual binary search. (This is as much as I know.)

    If you get an OS 'traceback' longer than a dozen line or so, please attach it as a file rather than pasting in a message.

    @terryjreedy terryjreedy added the type-crash A hard crash of the interpreter, possibly with a core dump label Jun 4, 2021
    @zzzeek
    Copy link
    Mannequin Author

    zzzeek mannequin commented Jun 4, 2021

    yes, if I have time I will begin to undertake that, wanted to put it up here in case anyone has git bisect on speed dial for cpython.

    @pablogsal
    Copy link
    Member

    I did the bisection:

    ff359d7 is the first bad commit
    commit ff359d7
    Author: Miss Islington (bot) <31488909+miss-islington@users.noreply.github.com>
    Date: Mon May 31 02:12:27 2021 -0700

    bpo-42972: Fix sqlite3 traverse/clear functions (GH-26452) (GH-26461)
    
    (cherry picked from commit d1124b09e8251061dc040cbd396f35ae57783f4a)
    
    Co-authored-by: Erlend Egeberg Aasland <erlend.aasland@innova.no>
    

    :040000 040000 701f24eacbdf2b52b612dc33ae9a7dcf24a73826 100af9271a7e7c038a860388587b1cb096b8494f M Modules
    bisect run success

    @pablogsal
    Copy link
    Member

    The problem is that this is wrong:

    static int
    stmt_clear(pysqlite_Statement *self)
    {
    if (self->st) {
    Py_BEGIN_ALLOW_THREADS
    sqlite3_finalize(self->st);
    Py_END_ALLOW_THREADS
    self->st = 0;
    }
    Py_CLEAR(self->sql);
    return 0;
    }

    1. tp_clear should *not* do anything other than cleaning refs to python objects
    2. sqlite3_finalize is called without the GIL but that can trigger a call to the destructor of the session (_destructor()) that executes Python code

    @pablogsal
    Copy link
    Member

    Using the reproducer code by Mike, without this fix:

    ❯ ./python ../lel.py
    [1] 25344 segmentation fault ./python ../lel.py

    With this fix:

    ❯ ./python ../problem.py
    ❯ echo $?
    0

    @pablogsal
    Copy link
    Member

    This is a simpler reproducer:

            dest = sqlite.connect(':memory:')
            def md5sum(t):
                return
    
            dest.create_function("md5", 1, md5sum)
            x = dest("create table lang (name, first_appeared)")
            del md5sum, dest
    
            y = [x]
            y.append(y)
        del x,y
        gc.collect()
    

    @pablogsal
    Copy link
    Member

    New changeset fa106a6 by Pablo Galindo in branch 'main':
    bpo-44304: Fix crash in the sqlite3 module when the GC clears Statement objects (GH-26545)
    fa106a6

    @miss-islington
    Copy link
    Contributor

    New changeset ad2f3b7 by Miss Islington (bot) in branch '3.10':
    bpo-44304: Fix crash in the sqlite3 module when the GC clears Statement objects (GH-26545)
    ad2f3b7

    @erlend-aasland
    Copy link
    Contributor

    Thanks Mike for the report and reproducer, and Pablo for the fix!

    @zzzeek
    Copy link
    Mannequin Author

    zzzeek mannequin commented Jun 5, 2021

    great news!

    Based on how many random factors were needed to reproduce as well as that it seemed to be gc related and appeared very suddenly, I had an intuition this was on the cpython side, thanks so much for doing this Pablo!

    @erlend-aasland
    Copy link
    Contributor

    sqlit3.Cursor is prone to the same bug. Do you want me to create a new issue for it, or can I reuse this issue number, Pablo? I'll see if I can create a reproducer for that as well.

    @pablogsal
    Copy link
    Member

    sqlit3.Cursor is prone to the same bug.

    No, is not: it doesn't drop the GIL in tp_clear. There is technically no bug, but is true that tp_clear should only clean Python references. Although in this case there are some sematics with the cleanup that are not clear.

    @pablogsal
    Copy link
    Member

    New changeset 6e3b7cf by Pablo Galindo in branch 'main':
    bpo-44304: Ensure the sqlite3 destructor callback is always called with the GIL held (GH-26551)
    6e3b7cf

    @pablogsal
    Copy link
    Member

    New changeset 317e9ed by Miss Islington (bot) in branch '3.10':
    bpo-44304: Ensure the sqlite3 destructor callback is always called with the GIL held (GH-26551) (GH_26552)
    317e9ed

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.10 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) release-blocker type-crash A hard crash of the interpreter, possibly with a core dump
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants