String-valued zVariables: Writing & reading yields wrong numpy array shape. #172

Open
ErikPGJ opened this issue Oct 31, 2022 · 4 comments
ErikPGJ commented Oct 31, 2022

Footnote: I sent an e-mail about a similar issue on 2022-10-27, but I cannot reproduce that exact problem now (I possibly confused two different installs with different cdflib versions).

I am trying to store a string-valued zVariable in a CDF, but when I read it back from the CDF, it has a different shape.
(cdflib 0.4.8, xarray 2022.10.0, numpy 1.23.0, Python 3.9.2).

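import numpy as np
import cdflib

PATH = 'test.cdf'  # hypothetical output path; not given in the original report

with cdflib.cdfwrite.CDF(PATH, delete=True, cdf_spec={'Compressed': 0}) as cdf: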
    NA_CHAR = np.array(['abc', 'de', 'g'])
    cdflib_data = NA_CHAR.tolist()     # Converting to LIST.
    cdf.write_var(
        {
            'Variable': 'ZV_CHAR',
            'Data_Type': cdf.CDF_CHAR,
            'Num_Elements': 3,
            'Rec_Vary': [True],
            'Dim_Sizes': (1,),
            'Var_Type': 'zVariable',
            'Compress': 0,
        },
        var_data=cdflib_data,
    )

cdf = cdflib.cdfread.CDF(PATH)
act_na = cdf.varget('ZV_CHAR')
act_xna = cdflib.cdf_to_xarray(PATH).get('ZV_CHAR').data

print(f'act_na.shape  = """"{act_na.shape}""""')
print(f'act_xna.shape = """"{act_xna.shape}""""')
print(f'act_na  = """"{act_na}""""')
print(f'act_xna = """"{act_xna}""""')

which generates the following output:

act_na.shape  = """"(3, 1, 1)""""
act_xna.shape = """"(3,)""""
act_na  = """"[[['abc']]

 [['de']]

 [['g']]]""""
act_xna = """"['abc' 'de' 'g']""""

I see the same behaviour for CDF_CHAR and CDF_UCHAR.
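The two results above differ only in axes of length 1. A minimal NumPy-only sketch of comparing them, assuming the `varget` result is reproduced here as a literal array (no CDF file is read, and nothing about cdflib's internals is implied):

```python
import numpy as np

# Stand-in for the array returned by varget() in the report; shape (3, 1, 1).
act_na = np.array([[['abc']], [['de']], [['g']]])
# The array that was originally written; shape (3,).
expected = np.array(['abc', 'de', 'g'])

# np.squeeze drops every axis of length 1, leaving one entry per record.
squeezed = np.squeeze(act_na)

print(act_na.shape, '->', squeezed.shape)
print(np.array_equal(squeezed, expected))
```

Note that `np.squeeze` would also drop the record axis if the variable held a single record, so this is only a way to compare values, not a general fix.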

ErikPGJ commented Oct 31, 2022

It may be that simply setting 'Dim_Sizes': () solves the problem, generating

act_na.shape  = """"(3,)""""
act_xna.shape = """"(3,)""""
act_na  = """"['abc' 'de' 'g']""""
act_xna = """"['abc' 'de' 'g']""""

ErikPGJ commented Oct 31, 2022

Also, I cannot get writing and reading 2D string-valued zVariables to work as I would expect.

with cdflib.cdfwrite.CDF(PATH, delete=True, cdf_spec={'Compressed': 0}) as cdf:
    NA_CHAR = np.array([['11', '12'], ['21', '22'], ['31', '32']])
    cdflib_data = NA_CHAR.tolist()     # Converting to LIST.
    print(f'NA_CHAR.shape = {NA_CHAR.shape}')
    print(f'cdflib_data = """"{cdflib_data}""""')
    cdf.write_var(
        {
            'Variable': 'ZV_CHAR',
            'Data_Type': cdf.CDF_CHAR,
            'Num_Elements': 3,
            'Dim_Sizes': (3,),
            'Rec_Vary': (True, True),  # Required, but the docs say it is only for rVariables?!
            'Var_Type': 'zVariable',
            'Compress': 0,
        },
        var_data=cdflib_data,
    )

cdf = cdflib.cdfread.CDF(PATH)
act_na = cdf.varget('ZV_CHAR')
act_xna = cdflib.cdf_to_xarray(PATH).get('ZV_CHAR').data

print(f'act_na.shape  = """"{act_na.shape}""""')
print(f'act_xna.shape = """"{act_xna.shape}""""')
print(f'act_na  = """"{act_na}""""')
print(f'act_xna = """"{act_xna}""""')

generates

NA_CHAR.shape = (3, 2)
cdflib_data = """"[['11', '12'], ['21', '22'], ['31', '32']]""""
act_na.shape  = """"(2, 3, 1)""""
act_xna.shape = """"(2, 3)""""
act_na  = """"[[['11']
  ['12']
  ['21']]

 [['22']
  ['31']
  ['32']]]""""
act_xna = """"[['11' '12' '21']
 ['22' '31' '32']]""""

cdfdump:

Variable Data:
  Record # 1: ["11","12","21"]
  Record # 2: ["22","31","32"]

Note that while I write a 3x2 array and get a 2x3 array back, the result is not a transposed array/matrix: the components are in the wrong locations.
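The observation can be checked with NumPy alone. The sketch below copies the written and read-back arrays from the output above as literals (an assumption: no CDF file is involved) and compares the read-back array against both a transpose and a row-major reshape. It only shows what the observed output is consistent with, not what cdflib does internally.

```python
import numpy as np

# Array that was written, shape (3, 2).
written = np.array([['11', '12'], ['21', '22'], ['31', '32']])
# Array that was read back (act_xna in the report), shape (2, 3).
read_back = np.array([['11', '12', '21'], ['22', '31', '32']])

# A transpose would pair '11' with '21' and '31' in the first row.
same_as_transpose = np.array_equal(read_back, written.T)

# A row-major reshape keeps the flattened order and regroups it three per row.
same_as_reshape = np.array_equal(read_back, written.reshape(2, 3))

print('matches transpose:', same_as_transpose)
print('matches reshape  :', same_as_reshape)
```

The read-back data matches the reshape rather than the transpose, which is consistent with the values being stored in flattened order and regrouped three per record, as 'Dim_Sizes': (3,) requests.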

Note: I get an error if I omit Rec_Vary, but https://pypi.org/project/cdflib/0.4.4/ (the last version with documentation at that location) says that Rec_Vary is only for rVariables, whereas I am writing a zVariable.

dstansby added the Bug label Nov 9, 2022
ErikPGJ commented Mar 3, 2023

Any progress on this?

bryan-harter (Collaborator) commented

Sorry for the delay getting back to you, and thanks for documenting the error so well! I think we'll take a look at this shortly.
