run_function metadata error #315

Closed
reint-fischer opened this issue Mar 4, 2022 · 7 comments

Comments

@reint-fischer

Hi all,

While creating a recipe for SAMOS data (https://samos.coaps.fsu.edu/html/nav.php?s=2) with @hsosik and @SBS-EREHM, we ran into a problem executing run_function(): something goes wrong when consolidating the metadata (as in #300), producing FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmptzxt2o1f/Mnhnk1k0/.zmetadata'. I was using the Binder environment created for the OSM tutorial. Are we missing something in the code?

The specific ship data we were trying to write a recipe for is listed in the catalog here: http://tds.coaps.fsu.edu/thredds/catalog/samos/data/research/ZCYL5/catalog.html

Below is the full code and error message:

Code

import pandas as pd
import xarray as xr
from pangeo_forge_recipes.patterns import ConcatDim, FilePattern
from pangeo_forge_recipes.recipes import XarrayZarrRecipe, setup_logging

def make_url(time):
    year = time.strftime('%Y')
    year_month_day = time.strftime('%Y%m%d')
    return f'http://tds.coaps.fsu.edu/thredds/fileServer/samos/data/research/ZCYL5/{year}/ZCYL5_{year_month_day}v30001.nc'

dates = pd.date_range('2021-01-01','2021-01-03', freq='D')

time_concat_dim = ConcatDim("time", dates, nitems_per_file=1)

pattern = FilePattern(make_url, time_concat_dim)

recipe = XarrayZarrRecipe(pattern, inputs_per_chunk=30)

setup_logging()

recipe_pruned = recipe.copy_pruned()

run_function = recipe_pruned.to_function()

run_function()
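
For reference, a quick way to sanity-check the URLs a FilePattern generates (a minimal sketch; it assumes the pattern object defined above, and that FilePattern.items() yields index/URL pairs):

# Sketch: print each (index, url) pair the pattern will try to fetch
for index, url in pattern.items():
    print(index, url)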

Error

[03/04/22 18:40:31] INFO     Caching input 'Index({DimIndex(name='time', index=0, sequence_len=2, operation=<CombineOp.CONCAT: 2>)})'                                                        xarray_zarr.py:149
                    INFO     Caching file 'http://tds.coaps.fsu.edu/thredds/fileServer/samos/data/research/ZCYL5/2021/ZCYL5_20210101v30001.nc'                                                   storage.py:154
                    INFO     Copying remote file 'http://tds.coaps.fsu.edu/thredds/fileServer/samos/data/research/ZCYL5/2021/ZCYL5_20210101v30001.nc' to cache                                   storage.py:165
[03/04/22 18:40:32] INFO     Caching input 'Index({DimIndex(name='time', index=1, sequence_len=2, operation=<CombineOp.CONCAT: 2>)})'                                                        xarray_zarr.py:149
                    INFO     Caching file 'http://tds.coaps.fsu.edu/thredds/fileServer/samos/data/research/ZCYL5/2021/ZCYL5_20210102v30001.nc'                                                   storage.py:154
                    INFO     Copying remote file 'http://tds.coaps.fsu.edu/thredds/fileServer/samos/data/research/ZCYL5/2021/ZCYL5_20210102v30001.nc' to cache                                   storage.py:165
/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/recipes/xarray_zarr.py:111: RuntimeWarning: Failed to open Zarr store with consolidated metadata, falling back to try reading non-consolidated metadata. This is typically much slower for opening a dataset. To silence this warning, consider:
1. Consolidating metadata in this existing store with zarr.consolidate_metadata().
2. Explicitly setting consolidated=False, to avoid trying to read consolidate metadata, or
3. Explicitly setting consolidated=True, to raise an error in this case instead of falling back to try reading non-consolidated metadata.
  return xr.open_zarr(target.get_mapper())
                    INFO     Creating a new dataset in target                                                                                                                                xarray_zarr.py:451
                    INFO     Opening inputs for chunk Index({DimIndex(name='time', index=0, sequence_len=1, operation=<CombineOp.CONCAT: 2>)})                                               xarray_zarr.py:333
                    INFO     Opening input with Xarray Index({DimIndex(name='time', index=0, sequence_len=2, operation=<CombineOp.CONCAT: 2>)}):                                             xarray_zarr.py:249
                             'http://tds.coaps.fsu.edu/thredds/fileServer/samos/data/research/ZCYL5/2021/ZCYL5_20210101v30001.nc'                                                                              
                    INFO     Opening 'http://tds.coaps.fsu.edu/thredds/fileServer/samos/data/research/ZCYL5/2021/ZCYL5_20210101v30001.nc' from cache                                             storage.py:260
Traceback (most recent call last):
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/fsspec/mapping.py", line 135, in __getitem__
    result = self.fs.cat(k)
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/fsspec/spec.py", line 760, in cat
    return self.cat_file(paths[0], **kwargs)
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/fsspec/spec.py", line 670, in cat_file
    with self.open(path, "rb", **kwargs) as f:
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/fsspec/spec.py", line 1030, in open
    f = self._open(
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/fsspec/implementations/local.py", line 155, in _open
    return LocalFileOpener(path, mode, fs=self, **kwargs)
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/fsspec/implementations/local.py", line 250, in __init__
    self._open()
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/fsspec/implementations/local.py", line 255, in _open
    self.f = open(self.path, mode=self.mode)
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmptzxt2o1f/Mnhnk1k0/.zmetadata'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/xarray/backends/zarr.py", line 364, in open_group
    zarr_group = zarr.open_consolidated(store, **open_kwargs)
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/zarr/convenience.py", line 1183, in open_consolidated
    meta_store = ConsolidatedMetadataStore(store, metadata_key=metadata_key)
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/zarr/storage.py", line 2590, in __init__
    meta = json_loads(store[metadata_key])
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/fsspec/mapping.py", line 139, in __getitem__
    raise KeyError(key)
KeyError: '.zmetadata'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/recipes/xarray_zarr.py", line 442, in prepare_target
    ds = open_target(config.storage_config.target)
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/recipes/xarray_zarr.py", line 111, in open_target
    return xr.open_zarr(target.get_mapper())
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/xarray/backends/zarr.py", line 768, in open_zarr
    ds = open_dataset(
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/xarray/backends/api.py", line 495, in open_dataset
    backend_ds = backend.open_dataset(
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/xarray/backends/zarr.py", line 824, in open_dataset
    store = ZarrStore.open_group(
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/xarray/backends/zarr.py", line 381, in open_group
    zarr_group = zarr.open_group(store, **open_kwargs)
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/zarr/hierarchy.py", line 1168, in open_group
    raise GroupNotFoundError(path)
zarr.errors.GroupNotFoundError: group not found at path ''

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/xarray/backends/file_manager.py", line 199, in _acquire_with_cache_info
    file = self._cache[self._key]
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/xarray/backends/lru_cache.py", line 53, in __getitem__
    value = self._cache[key]
KeyError: [<class 'h5netcdf.core.File'>, ('/tmp/tmptzxt2o1f/hOz6eeMt/777be2b9214151be7e2c4f211c36a334-http_tds.coaps.fsu.edu_thredds_fileserver_samos_data_research_zcyl5_2021_zcyl5_20210101v30001.nc',), 'r', (('decode_vlen_strings', True), ('invalid_netcdf', None))]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/jovyan/untitled.py", line 25, in <module>
    run_function()
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/executors/python.py", line 46, in function
    stage.function(config=pipeline.config)
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/recipes/xarray_zarr.py", line 464, in prepare_target
    with open_chunk(chunk_key, config=config) as ds:
  File "/srv/conda/envs/notebook/lib/python3.9/contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/recipes/xarray_zarr.py", line 339, in open_chunk
    dsets = [stack.enter_context(open_input(input_key, config=config)) for input_key in inputs]
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/recipes/xarray_zarr.py", line 339, in <listcomp>
    dsets = [stack.enter_context(open_input(input_key, config=config)) for input_key in inputs]
  File "/srv/conda/envs/notebook/lib/python3.9/contextlib.py", line 448, in enter_context
    result = _cm_type.__enter__(cm)
  File "/srv/conda/envs/notebook/lib/python3.9/contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/recipes/xarray_zarr.py", line 303, in open_input
    with xr.open_dataset(f, **kw) as ds:
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/xarray/backends/api.py", line 495, in open_dataset
    backend_ds = backend.open_dataset(
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/xarray/backends/h5netcdf_.py", line 374, in open_dataset
    store = H5NetCDFStore.open(
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/xarray/backends/h5netcdf_.py", line 178, in open
    return cls(manager, group=group, mode=mode, lock=lock, autoclose=autoclose)
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/xarray/backends/h5netcdf_.py", line 123, in __init__
    self._filename = find_root_and_group(self.ds)[0].filename
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/xarray/backends/h5netcdf_.py", line 189, in ds
    return self._acquire()
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/xarray/backends/h5netcdf_.py", line 181, in _acquire
    with self._manager.acquire_context(needs_lock) as root:
  File "/srv/conda/envs/notebook/lib/python3.9/contextlib.py", line 119, in __enter__
    return next(self.gen)
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/xarray/backends/file_manager.py", line 187, in acquire_context
    file, cached = self._acquire_with_cache_info(needs_lock)
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/xarray/backends/file_manager.py", line 205, in _acquire_with_cache_info
    file = self._opener(*self._args, **kwargs)
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/h5netcdf/core.py", line 712, in __init__
    self._h5file = h5py.File(path, mode, **kwargs)
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/h5py/_hl/files.py", line 507, in __init__
    fid = make_fid(name, mode, userblock_size, fapl, fcpl, swmr=swmr)
  File "/srv/conda/envs/notebook/lib/python3.9/site-packages/h5py/_hl/files.py", line 220, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 106, in h5py.h5f.open
OSError: Unable to open file (file signature not found)

Thank you,

Reint

@reint-fischer changed the title from "metadata" to "run_function metadata error" on Mar 5, 2022
@rabernat
Contributor

rabernat commented Mar 7, 2022

Hi @reint-fischer. Thanks for reporting this. We will look into it and get back to you.

@cisaacstern
Member

@reint-fischer, thanks for raising this issue!

I was able to move past the error you encountered as follows:

  1. I ran the same code you posted above, but with setup_logging(level="DEBUG") to expose debugging logs. Here are the last few lines of debug logs before the error appears:

[Screenshot: debug logs showing the path of the cached input opened just before the error]

  2. This revealed that we hit the issue when xarray tries to open one of the cached inputs from the following path:
path = "/tmp/tmpwxl4qo75/1EyjKoGo/777be2b9214151be7e2c4f211c36a334-http_tds.coaps.fsu.edu_thredds_fileserver_samos_data_research_zcyl5_2021_zcyl5_20210101v30001.nc"
  3. Referring back to the traceback you posted, I noted that we appear to be deep in the h5py stack there. This led me to guess that the issue lay with the xarray backend used to open your cached input. Pangeo Forge defaults to the h5netcdf xarray backend, but we can customize the backend as follows:

import pandas as pd
import xarray as xr
from pangeo_forge_recipes.patterns import ConcatDim, FilePattern
from pangeo_forge_recipes.recipes import XarrayZarrRecipe, setup_logging

def make_url(time):
    year = time.strftime('%Y')
    year_month_day = time.strftime('%Y%m%d')
    return f'http://tds.coaps.fsu.edu/thredds/fileServer/samos/data/research/ZCYL5/{year}/ZCYL5_{year_month_day}v30001.nc'

dates = pd.date_range('2021-01-01','2021-01-03', freq='D')

time_concat_dim = ConcatDim("time", dates, nitems_per_file=1)

pattern = FilePattern(make_url, time_concat_dim)

recipe = XarrayZarrRecipe(
    pattern,
    inputs_per_chunk=30,
    xarray_open_kwargs=dict(engine="netcdf4"),  # <- the added kwarg
)

setup_logging()

recipe_pruned = recipe.copy_pruned()

run_function = recipe_pruned.to_function()

run_function()

With these xarray_open_kwargs added, I am able to get past the error you hit, but am now encountering a new error:

ValueError                                Traceback (most recent call last)
/tmp/ipykernel_369/1701840335.py in <module>
     25 run_function = recipe_pruned.to_function()
     26 
---> 27 run_function()

/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/executors/python.py in function()
     44                         stage.function(m, config=pipeline.config)
     45                 else:
---> 46                     stage.function(config=pipeline.config)
     47 
     48         return function

/srv/conda/envs/notebook/lib/python3.9/site-packages/pangeo_forge_recipes/recipes/xarray_zarr.py in prepare_target(config)
    497                         "ignore"
    498                     )  # suppress the warning that comes with safe_chunks
--> 499                     ds.to_zarr(target_mapper, mode="a", compute=False, safe_chunks=False)
    500 
    501     # Regardless of whether there is an existing dataset or we are creating a new one,

/srv/conda/envs/notebook/lib/python3.9/site-packages/xarray/core/dataset.py in to_zarr(self, store, chunk_store, mode, synchronizer, group, encoding, compute, consolidated, append_dim, region, safe_chunks, storage_options)
   2035             encoding = {}
   2036 
-> 2037         return to_zarr(
   2038             self,
   2039             store=store,

/srv/conda/envs/notebook/lib/python3.9/site-packages/xarray/backends/api.py in to_zarr(dataset, store, chunk_store, mode, synchronizer, group, encoding, compute, consolidated, append_dim, region, safe_chunks, storage_options)
   1404 
   1405     if mode in ["a", "r+"]:
-> 1406         _validate_datatypes_for_zarr_append(dataset)
   1407         if append_dim is not None:
   1408             existing_dims = zstore.get_dimensions()

/srv/conda/envs/notebook/lib/python3.9/site-packages/xarray/backends/api.py in _validate_datatypes_for_zarr_append(dataset)
   1299 
   1300     for k in dataset.data_vars.values():
-> 1301         check_dtype(k)
   1302 
   1303 

/srv/conda/envs/notebook/lib/python3.9/site-packages/xarray/backends/api.py in check_dtype(var)
   1290         ):
   1291             # and not re.match('^bytes[1-9]+$', var.dtype.name)):
-> 1292             raise ValueError(
   1293                 "Invalid dtype for data variable: {} "
   1294                 "dtype must be a subtype of number, "

ValueError: Invalid dtype for data variable: <xarray.DataArray 'flag' (time: 2879)>
dask.array<concatenate, shape=(2879,), dtype=|S35, chunksize=(1440,), chunktype=numpy.ndarray>
Coordinates:
  * time     (time) datetime64[ns] 2021-01-01 ... 2021-01-02T23:59:00
Attributes:
    long_name:                quality control flags
    A:                        Units added
    B:                        Data out of range
    C:                        Non-sequential time
    D:                        Failed T>=Tw>=Td
    E:                        True wind error
    F:                        Velocity unrealistic
    G:                        Value > 4 s. d. from climatology
    H:                        Discontinuity
    I:                        Interesting feature
    J:                        Erroneous
    K:                        Suspect - visual
    L:                        Ocean platform over land
    M:                        Instrument malfunction
    N:                        In Port
    O:                        Multiple original units
    P:                        Movement uncertain
    Q:                        Pre-flagged as suspect
    R:                        Interpolated data
    S:                        Spike - visual
    T:                        Time duplicate
    U:                        Suspect - statistial
    V:                        Spike - statistical
    X:                        Step - statistical
    Y:                        Suspect between X-flags
    Z:                        Good data
    metadata_retrieved_from:  ZCYL5_20210101v10001.nc dtype must be a subtype of number, datetime, bool, a fixed sized string, a fixed size unicode string or an object

Perhaps you or @rabernat have thoughts on how to resolve this error?
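
(One conceivable workaround, not proposed in this thread, would be to cast the byte-string flag variable to fixed-width unicode before it is written, via the recipe's process_input hook. A minimal sketch, assuming XarrayZarrRecipe accepts a process_input callable with an (xr.Dataset, str) -> xr.Dataset signature, and that the offending variable is named flag:)

def cast_flags(ds: xr.Dataset, fname: str) -> xr.Dataset:
    # Hypothetical: decode the |S35 byte-string flags to fixed-width
    # unicode, which xarray's zarr append dtype check accepts.
    if "flag" in ds.data_vars:
        ds["flag"] = ds["flag"].astype("U35")
    return ds

recipe = XarrayZarrRecipe(
    pattern,
    inputs_per_chunk=30,
    xarray_open_kwargs=dict(engine="netcdf4"),
    process_input=cast_flags,  # assumption: this hook is available in this version
)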

@SBS-EREHM

SBS-EREHM commented Mar 7, 2022 via email

@cisaacstern
Member

@SBS-EREHM, thanks for weighing in. I've just opened #320, which I believe describes the root of the problem that both you and @reint-fischer are seeing.

Manually setting xarray_open_kwargs=dict(engine="netcdf4") as I noted in #315 (comment) should resolve this, if you want it to work right now.

I'll make a PR today to make this the default behavior, however. I'll ping this thread as soon as that's merged and released, after which point this should just work out-of-the-box (without any manual xarray_open_kwargs workarounds needed).

@rabernat
Contributor

rabernat commented Mar 7, 2022

> I'll make a PR today to make this the default behavior,

I'm not sure we want engine='netcdf4' to be the default behavior. AFAIK, only the h5netcdf engine can open the file-like objects fsspec provides; using netcdf4 instead would require copy_input_to_local_file=True.
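
(For illustration, a minimal sketch of the pairing rabernat describes: since the netcdf4 engine cannot read fsspec file-like objects, each cached input would first be copied to a local file and opened by path. Assumes a copy_input_to_local_file option on XarrayZarrRecipe:)

recipe = XarrayZarrRecipe(
    pattern,
    inputs_per_chunk=30,
    xarray_open_kwargs=dict(engine="netcdf4"),
    # Copy each cached input to a local file so netcdf4 can open it by path
    copy_input_to_local_file=True,  # assumption: supported by this version
)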

@cisaacstern
Member

cisaacstern commented Mar 7, 2022

@reint-fischer and @SBS-EREHM, apologies for being too quick with a suggested fix in #315 (comment). Upon further reflection (and input from @rabernat), it looks like the source of this issue is deeper than I'd realized. We've opened h5netcdf/h5netcdf#157 to push these concerns upstream. I will report back on this thread when we have some progress on this.

Edit: The solution for this problem is now being tracked in #320. I will ping this thread when we have a fix.

@cisaacstern
Member

@reint-fischer and @SBS-EREHM, thanks for your patience as we've been working through a series of issues that your recipe surfaced. I've made considerable headway with this recipe, which I will summarize in a new comment on pangeo-forge/staged-recipes#120. AFAICT, this issue is duplicative of that one, and that thread is less cluttered with my prior thoughts, so I figured it would be a better place to reset the conversation on this recipe. I'll close this issue now and we can continue the conversation on the linked thread.
