Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Thread safety in python-flint #224

Open
oscarbenjamin opened this issue Sep 9, 2024 · 0 comments
Open

Thread safety in python-flint #224

oscarbenjamin opened this issue Sep 9, 2024 · 0 comments

Comments

@oscarbenjamin
Copy link
Collaborator

I have just tested python-flint out properly in a free-threaded build of CPython 3.13 ie. without the GIL:

https://peps.python.org/pep-0703/

Unfortunately some fairly basic operations can result in crashes when run without the GIL.

When running without the GIL this crashes reliably:

def test_python_threads():
    iterations = 10**5
    threads = 3 + 1
    size = 3
    M = flint.fmpz_mat([[0]*size for _ in range(size)])

    def set_values():
        for i in range(iterations // 5):
            i = random.randrange(M.nrows())
            j = random.randrange(M.ncols())
            if random.uniform(0, 1) > 0.5:
                # Bigger than 2**62:
                M[i,j] = 10**128
            else:
                # Smaller than 2**62:
                M[i,j] = 0

    def get_dets():
        for i in range(iterations):
            M.det()

    threads = [Thread(target=set_values) for _ in range(threads-1)]
    threads.append(Thread(target=get_dets))

    for t in threads:
        t.start()
    for t in threads:
        t.join()

What this does is create a few threads that share a 3x3 fmpz_mat matrix. In some threads values of the matrix are modified specifically alternating between small integers and large integers which causes Flint to allocate and deallocate the memory for the large integers.

Then there is another thread that repeatedly computes the determinant of the matrix (not strictly necessary but perhaps slightly more likely that a user would do that than modify a matrix in multiple threads simultaneously).

The problem here is mutation of objects and in particular mutations that can trigger reallocation. This can happen when:

  • Setting an element of a matrix or polynomial. It is okay with nmod but anything that has heap allocation is not (fmpz, fmpq, ...).
  • Resizing a polynomial by assigning a coefficient value like p[i] = c.
  • Performing any in-place operation (e.g. rref with inplace=True)

We would need to have some sort of per-object locking to make this memory safe. The problem then is that the locking would need to be applied to any operation that reads the matrix not just those that write to it. I expect that Cython will add support for these things but it does not seem to be there yet: cython/cython#6221

A simple way out is to have immutable types e.g. if polynomials were immutable then there would be no problems. Making matrices immutable seems more problematic than making polynomials immutable though. I think people often like to create a matrix by starting out with a matrix of zeros and then writing loops that set different values.

For the time being I suggest that if shipping binaries for free-threaded Python 3.12 then we should just document the fact that some operations are not thread-safe. As far as I know this is only an issue if mutating a polynomial or matrix when the object is shared among multiple threads. Mutating the object simultaneously from multiple threads is usually a bad idea anyway so we can just document it as such.

I am also unsure more generally which FLINT functions can be expected to be thread-safe in the sense that they can be safely called from multiple threads (rather than FLINT's internal use of threads which I assume is safe).

I assume that sharing multiple const references to a FLINT object is safe which is usually all that python-flint would do. The exceptions being like fmpz_mat_set_entry and fmpz_poly_set_coeff. There are however other places that non const references are used e.g. gr functions take non-const references to the gr_ctx_t as noted here:
https://flintlib.org/doc/gr.html#c.gr_ctx_t

Context objects are not normally passed as const in order to allow storing mutable caches, additional debugging information, etc.

I don't know whether such "mutable caches etc" are thread-safe although I would imagine so...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant