Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow custom field map to override default field types #149

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

jasper-tms
Copy link

Thanks for writing this library! The nrrd spec says that the space directions field must be a list of some combination of vectors (for spatial axes) and none (for non-spatial axes like color). It looks to me like pynrrd insists that space directions is a matrix and so it does not support the ability to add none entries. I made a small change to enable this.

With this commit, it becomes possible to override the default type of a header field. By overriding the default type of space directions with 'string', users can provide a string value for this that is directly put into the file header. Users will have to know that the proper format of the space directions field is something like

(0,0,0.8) (0,0.8,0) (2,0,0) none

to produce a valid .nrrd header, but at least this is now possible.

As a result, a command like this succeeds in producing a nrrd file that has a correctly specified header for a dataset that is xyz plus a color channel:

import nrrd
nrrd.write(
    'examplenrrd_3spatialdimspluscolor.nrrd',
    np.random.randint(0, 2**16, size=(2, 100, 256, 256), dtype=np.uint16),
    header={'encoding': 'raw',
            'dimension': 4,
            'space dimension': 3,
            'space directions': '(0,0,0.8) (0,0.8,0) (2,0,0) none',
            'space units': ['microns', 'microns', 'microns']},
    custom_field_map={'space directions': 'string'},
    index_order='C')

And get a proper header:

NRRD0005
# This NRRD file was generated by pynrrd
# on 2024-07-26 12:37:49(GMT).
# Complete NRRD file format specification at:
# http://teem.sourceforge.net/nrrd/format.html
type: uint16
dimension: 4
space dimension: 3
sizes: 256 256 100 2
space directions: (0,0,0.8) (0,0.8,0) (2,0,0) none
endian: little
encoding: raw
space units: "microns" "microns" "microns"

I have confirmed that this is a proper header according to the nrrd spec using the unu tool from teem, (which the devs have said here is the recommended way to confirm header conformance):

$ unu minmax examplenrrd_3spatialdimspluscolor.nrrd 
min: 0
max: 65535

Note that if I remove the none from the header, I get the appropriate complaint from unu:

$ unu minmax examplenrrd_3spatialdimspluscolor.nrrd 
unu minmax: trouble with "examplenrrd_3spatialdimspluscolor.nrrd":
[unu minmax] unu minmax: trouble loading "examplenrrd_3spatialdimspluscolor.nrrd"
[unu minmax] [nrrd] nrrdLoad: trouble reading "examplenrrd_3spatialdimspluscolor.nrrd"
[unu minmax] [nrrd] nrrdRead: trouble
[unu minmax] [nrrd] _nrrdRead: trouble reading NRRD file
[unu minmax] [nrrd] _nrrdFormatNRRD_read: trouble parsing space directions info |(0,0,0.8) (0,0.8,0) (2,0,0)|
[unu minmax] [nrrd] _nrrdReadNrrdParse_space_directions: trouble getting space vector 4 of 4
[unu minmax] [nrrd] _nrrdSpaceVectorParse: hit end of string before seeing (

I would guess this commit wouldn't cause any issues in existing code anywhere in the world, since before this the custom field map would just be ignored for any standard metadata fields, and the behavior is only changed when providing custom field types for the standard fields.

Feel free to merge this PR if you agree this change is a move in the correct direction. Thanks!

@jasper-tms
Copy link
Author

The next step to make life easier for users wanting to generate these sort of space dimensions fields with some number of none entries without the user having to make the string themselves is to support users passing in list arguments that look like [0.8, 0.8 2, None] (specifying the voxel size in each dimension) and this library's parser takes care of turning that into (0,0,0.8) (0,0.8,0) (2,0,0) none. I'm happy to work on this as well, either as part of this PR or a new one, if folks agree this would be useful.

@addisonElliott
Copy link
Collaborator

It's been awhile since I've used the library, but the docs state that any rows that are none will turn into a list of NaN (source: https://pynrrd.readthedocs.io/en/stable/background/datatypes.html#double-matrix)

Are you having issues with loading space directions that have a none row?

Fundamentally, I think we'll want to consider #148 where a list of a vectors datatype is added to better handle this field instead.

@jasper-tms
Copy link
Author

Thanks a lot, I had missed that aspect of the behavior. That does address my current needs, though:

  1. In format_optional_vector() It might be worth checking for python Nones in addition to checking for np.nan values – I had initially tried something that looked like np.array([(1, 0), (None, None)]), which gave an error, and I incorrectly concluded that I wasn't on the right track even though all I had to do was change None to np.nan. (Currently format_optional_vector() checks if the vector itself is None, but doesn't check if it's a vector containing all Nones.)
  2. I still would think the commit I suggested might constitute an improvement as it adds some flexibility for users who want to do custom things. (The downside is that users could then create malformed headers, but with power comes responsibility, so it's a design choice whether to grant users those powers.)

@addisonElliott
Copy link
Collaborator

Regarding point 1, I agree that handling None like NaN makes sense. I would be in favor of refactoring the formatting of ‘none’ to return a list of None instead of NaN. I’m not sure NaN is very intuitive. Could do a breaking release or make it an option that you can select between.

For point 2, I’m all for flexibility for the user but I don’t see the need to give flexibility for the predefined header fields in the NRRD spec. I’m not sure why anyone would want to customize that and go against the NRRD spec. In this case, the fundamental problem was poor handling of the fields in this library.

addisonElliott added a commit that referenced this pull request Nov 5, 2024
…ption (#157)

## Changes

* Add `int vector list` and `double vector list` datatypes that are lists of Numpy arrays or `None`.
    * These new types are similar to their `int matrix` and `double matrix` counterparts except they are **not** 2D Numpy matrices.
 * Add `nrrd.SPACE_DIRECTIONS_TYPE` to enable switching the datatype for the `space directions` field.
    * Valid options are `double matrix` or `double vector list`. The current default is `double matrix` for backwards compatibility but will be switched to `double vector list` in the next major release.
    * `double vector list` is superior over `double matrix` because it doesn't have the confusing row-of-NaN's representation and it doesn't imply an affine transform by being a matrix.
* Support row-of-None in addition to row-of-NaN for `parse_optional_matrix` & `format_optional_matrix` in addition to new vector list parsing/formatting functions

Fixes #148
Revises #149
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants