
Use the correct fits image dtype #531

Merged: 3 commits, Dec 5, 2024

Conversation

bamford
Contributor

@bamford bamford commented Nov 29, 2024

Currently, when kerchunk.fits.process_file is used on a FITS image containing floating-point data, the dtype is not identified as big-endian. This PR corrects the lookup table so it specifies big-endian float dtypes.

A test is provided to illustrate the issue and prevent regression.
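To illustrate why the missing byte-order prefix matters: the FITS standard stores all binary data big-endian, so the numpy dtypes in the BITPIX lookup need an explicit `>` prefix. The mapping below is a hypothetical sketch of such a table (not kerchunk's actual code), followed by a demonstration of what goes wrong when big-endian bytes are read with the wrong byte order:

```python
import numpy as np

# Hypothetical BITPIX -> dtype lookup, illustrating the fix described in
# this PR: float entries must be explicitly big-endian (">f4"/">f8"),
# not native-endian ("f4"/"f8").
BITPIX_TO_DTYPE = {
    8: ">u1",    # unsigned byte (endianness irrelevant for 1 byte)
    16: ">i2",   # 16-bit signed integer
    32: ">i4",   # 32-bit signed integer
    64: ">i8",   # 64-bit signed integer
    -32: ">f4",  # 32-bit IEEE float; the ">" prefix is the fix
    -64: ">f8",  # 64-bit IEEE float; the ">" prefix is the fix
}

# FITS data on disk is big-endian. Reading it with a little-endian dtype
# scrambles the value; reading it with ">f4" recovers it.
raw = np.array([1.5], dtype=">f4").tobytes()   # bytes as stored in a FITS file
wrong = np.frombuffer(raw, dtype="<f4")[0]     # wrong byte order: garbage
right = np.frombuffer(raw, dtype=">f4")[0]     # correct big-endian read
print(wrong, right)
```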

@bamford bamford changed the title Fits image dtype Use the correct fits image dtype Nov 29, 2024
@bamford bamford marked this pull request as ready for review November 29, 2024 16:32
@martindurant
Member

Thanks for submitting this. Please let me know when it's ready.

@bamford
Contributor Author

bamford commented Dec 3, 2024

Hi @martindurant, I think the PR is ready. It is a tiny change, and the relevant tests pass. I guess no one has been trying kerchunk on floating point FITS images yet. The failing tests seem to be due to something else.

Just to let you know my use case... In the past, I've converted FITS images to zarr so I can process them as an on-disk, dask-chunked, xarray Dataset, with all the benefits that implies. I'm now dealing with a large dataset (the Euclid survey) and I was looking for a way to do the same thing without doubling up on disk space. Fortunately, I discovered kerchunk and it is perfect! It took a little while to work out the details, but the documentation was very helpful. I've now got it working for a pile of fairly complicated, multi-extension fits files, and have run a quick test to show I can do efficient calculations over large numbers of files. Many thanks to you and the team.

@martindurant
Member

I've now got it working for a pile of fairly complicated, multi-extension fits files, and have run a quick test to show I can do efficient calculations over large numbers of files

Would love to see a notebook or any sharable material out of this!

@martindurant
Member

@emfdavid , any idea why "s3://noaahrrr-bdp-pds/hrrr.20220804/conus/hrrr.t01z.wrfsfcf01.grib2" would no longer be publicly accessible?

@akrherz

akrherz commented Dec 3, 2024

noaahrrr-bdp-pds

Just a typo and should be noaa-hrrr-bdp-pds ?

@martindurant
Member

Oh, I see the context now: the URL is intentionally wrong, but it now raises PermissionError rather than FileNotFound. I think for the purposes of the test, either will do:

    def test_parse_grib_idx_no_file():
        with pytest.raises((FileNotFoundError, PermissionError)):
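The suggestion above relies on `pytest.raises` accepting a tuple of exception types, so the test passes regardless of which of the two errors S3 returns. A minimal self-contained demonstration (the raised errors here are simulated, not real S3 calls):

```python
import pytest

# pytest.raises with a tuple succeeds if ANY listed exception is raised,
# so the same test tolerates both S3 behaviours.
with pytest.raises((FileNotFoundError, PermissionError)):
    raise PermissionError("access denied")   # new S3 behaviour

with pytest.raises((FileNotFoundError, PermissionError)):
    raise FileNotFoundError("no such key")   # old S3 behaviour

print("both accepted")
```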

@emfdavid
Contributor

emfdavid commented Dec 4, 2024

The behavior may be different depending on whether the bucket or the blob doesn't exist?

If you are making an intentional not found test, put some answer in your question for your future self to debug and make the key something like "s3://noaahrrr-bdp-pds/hrrr.20220804/definitely_doesnt_exist_test/hrrr.t01z.wrfsfcf01.grib2"

@martindurant
Member

It may be a behaviour change in S3, but I suspect that the bucket does now exist, but we can't read from it, whereas before it didn't exist. Either way, changing the line in the test as I indicated will fix this.

@bamford
Contributor Author

bamford commented Dec 4, 2024

I've now got it working for a pile of fairly complicated, multi-extension fits files, and have run a quick test to show I can do efficient calculations over large numbers of files

Would love to see a notebook or any sharable material out of this!

Here is a gist of an example notebook. You won't be able to run it, but it illustrates what I needed to do to get a particular set of FITS files into a meaningful structure. _rename is a bit hacky (as is _fix_byte_order) and would be unnecessary with a bit more flexibility in kerchunk.fits.process_file.
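As a self-contained illustration of the on-disk layout that makes the byte-order fix necessary, the sketch below builds a tiny single-HDU FITS image by hand and reads it back. The `make_fits` helper is hypothetical (it is not kerchunk's code); the layout follows the FITS standard, where headers are 80-character cards padded to 2880-byte blocks and data is always big-endian:

```python
import numpy as np

def make_fits(data: np.ndarray) -> bytes:
    """Build a minimal single-HDU FITS image in memory.
    Hypothetical helper for illustration; not part of kerchunk."""
    cards = [
        "SIMPLE  =                    T",
        "BITPIX  =                  -32",   # 32-bit IEEE float data
        "NAXIS   =                    2",
        f"NAXIS1  = {data.shape[1]:>20}",
        f"NAXIS2  = {data.shape[0]:>20}",
        "END",
    ]
    # Header: 80-char cards, padded to one 2880-byte block.
    header = "".join(c.ljust(80) for c in cards).ljust(2880).encode("ascii")
    payload = data.astype(">f4").tobytes()        # big-endian on disk, per FITS
    payload += b"\x00" * (-len(payload) % 2880)   # pad data to a 2880-byte block
    return header + payload

data = np.arange(6, dtype="f4").reshape(2, 3)
blob = make_fits(data)

# Reading back with ">f4" (the dtype this PR fixes) recovers the values;
# a native little-endian "f4" would scramble them.
round_trip = np.frombuffer(
    blob[2880:2880 + data.nbytes], dtype=">f4"
).reshape(2, 3)
print(round_trip)
```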

See discussion in otherwise unrelated PR fsspec#531.
@bamford
Contributor Author

bamford commented Dec 4, 2024

It may be a behaviour change in S3, but I suspect that the bucket does now exist, but we can't read from it, whereas before it didn't exist. Either way, changing the line in the test as I indicated will fix this.

I've made this change, but not sure if you wanted me to fix it here. If not, I can revert.

@martindurant martindurant merged commit 013798e into fsspec:main Dec 5, 2024
5 checks passed