Concurrent read segfault #844

Open
ArnaudLevaufre opened this issue Sep 3, 2018 · 13 comments

@ArnaudLevaufre

Hello

Opening a netCDF file in read mode more than once in a ThreadPool results in a segfault. With a process pool there is no issue and the concurrent read works fine. This can be a problem in a web environment where the server uses threads to handle its clients and two or more clients make requests that need to read the same netCDF file.

I have a simple Python script and a dataset that reproduce the segfault:
netcdfSegfault.zip

The python script provided in the zip:

import netCDF4
from multiprocessing.pool import ThreadPool

def read_netcdf(path):
    ieast = 1750
    iwest = 1760
    inorth = 1380
    isouth = 1370

    with netCDF4.Dataset(path, 'r') as ncf:
        return ncf['/Depth/ndepths'][:][iwest:ieast, isouth:inorth]


if __name__ == "__main__":
    print("Netcdf4 version: %s" % netCDF4.__version__)
    path = "./max_depth_ndepth_quonops.nc"
    with ThreadPool(2) as p:
        print(p.map(read_netcdf, [path for i in range(2)]))

It outputs the following when executed:

Netcdf4 version: 1.4.0
Segmentation fault (core dumped)
@jswhit
Collaborator

jswhit commented Sep 3, 2018

This script gives me

Netcdf4 version: 1.4.2
Traceback (most recent call last):
  File "segfault.py", line 17, in <module>
    with ThreadPool(2) as p:
AttributeError: __exit__

with python 2.7.

@jswhit
Collaborator

jswhit commented Sep 3, 2018

May be related to #640

@jswhit
Collaborator

jswhit commented Sep 3, 2018

Works with python3.6 if you replace ncf['/Depth/ndepths'][:][iwest:ieast, isouth:inorth] with ncf['/Depth/ndepths'][iwest:ieast, isouth:inorth]. I think it's a memory issue, not a concurrency issue.

@shoyer
Contributor

shoyer commented Sep 3, 2018

You need to use a lock when using netCDF4 with multiple threads. Unfortunately, the underlying HDF5 library is not thread safe.

@dopplershift
Member

Yeah, my rule from experience is that if I'm using multiple threads, all access to netcdf4-python needs to be guarded by a lock.
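
A minimal sketch of the locking rule described in the two comments above, assuming a single module-level lock shared by every thread. The names `NETCDF_LOCK` and `with_netcdf_lock` are hypothetical and not part of netCDF4; the `variable` and `index` parameters are also illustrative additions to the original `read_netcdf`.

```python
import functools
import threading

# Hypothetical module-level lock; every thread that touches netCDF4 must share it.
NETCDF_LOCK = threading.Lock()


def with_netcdf_lock(func):
    """Return a wrapper that runs func while holding NETCDF_LOCK."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with NETCDF_LOCK:
            return func(*args, **kwargs)
    return wrapper


@with_netcdf_lock
def read_netcdf(path, variable, index):
    # Imported inside the function so the sketch loads even where netCDF4 is absent.
    import netCDF4

    with netCDF4.Dataset(path, "r") as ncf:
        # Open, slice and close all happen while the lock is held, and the
        # slice is read into memory before the file is closed.
        return ncf[variable][index]
```

With something like this in place, the threads in the original script would take turns inside `read_netcdf`, so the reads themselves run one at a time. The lock is there to avoid the crash, not to make the I/O faster.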

@jswhit
Collaborator

jswhit commented Sep 3, 2018

You can build HDF5 thread-safe, but it's not the default
https://portal.hdfgroup.org/display/knowledge/Questions+about+thread-safety+and+concurrent+access. You can read hdf5 files concurrently from multiple processes though (so using Pool instead of ThreadPool should be safe).
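
A rough sketch of the process-based alternative mentioned above, reusing the variable path and indices from the original script. The helper `read_in_processes` and its `reader` and `processes` parameters are hypothetical names introduced for illustration.

```python
from multiprocessing import Pool


def read_netcdf(path):
    # Imported inside the function so the sketch loads even where netCDF4 is absent.
    import netCDF4

    # Indices taken from the script in the original report.
    ieast, iwest, inorth, isouth = 1750, 1760, 1380, 1370

    # Each worker process opens its own handle to the file.
    with netCDF4.Dataset(path, "r") as ncf:
        return ncf["/Depth/ndepths"][iwest:ieast, isouth:inorth]


def read_in_processes(paths, reader=read_netcdf, processes=2):
    """Map reader over paths using worker processes instead of threads."""
    with Pool(processes) as pool:
        return pool.map(reader, paths)
```

On platforms where multiprocessing starts workers by re-importing the main module, the call to `read_in_processes` should sit under an `if __name__ == "__main__":` guard, as in the original script.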

@dopplershift
Member

Even if HDF5 is compiled with thread safety, the netcdf4 C library is not thread safe.

@shoyer
Contributor

shoyer commented Sep 4, 2018 via email

@ArnaudLevaufre
Author

Thanks for the answers. I didn't know the HDF5 library was not thread safe (I must admit I didn't look for this information and took it for granted). @jswhit removing the middle [:] works on the example but not reliably with the real implementation in the context of a web server. So for now I will stick to using processes for concurrent reads.

@jswhit
Collaborator

jswhit commented Sep 4, 2018

There was at least a plan to make the netcdf library thread safe (see https://www.unidata.ucar.edu/blogs/developer/entry/implementing-thread-safe-access-to). @DennisHeimbigner - was that ever implemented?

also https://github.com/Unidata/netcdf-c/projects/6

@DennisHeimbigner
Collaborator

Not yet. It is waiting on some other library changes. The priority is high, however.

@lanougue

Hi everyone,
@ArnaudLevaufre, have you experienced similar errors with concurrent reads even with processes?
I opened an issue in xarray (issue) since I am having serious problems reading netCDF files with several processes.
Looking for some help here too...
Thanks

@ArnaudLevaufre
Author

Hi. I had no issues with concurrent reads when using processes so I can't help you.
