Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CMOR segfaults when further_info_url is specified as an empty or pure whitespace string #700

Closed
matthew-mizielinski opened this issue Apr 14, 2023 · 2 comments · Fixed by #701
Labels

Comments

@matthew-mizielinski
Copy link

I've found a minor issue that has been causing me a few issues.

If you specify the further_info_url string in the JSON file provided to cmor.dataset_json() and you provide a string that does not match the pattern in the CVs you'll get an appropriate error when cmor.write() is called;

! Error: The attribute "further_info_url" could not be validated. 
! The current input value is "incorrect_string", which is not valid. 

If you omit a further_info_url entry from the dataset JSON file altogether then CMOR will construct this field for you appropriately and everything works fine.

However, if you include a further_info_url entry which is either an empty string or only contains whitespace then CMOR segfaults when cmor.write() is called, giving no information on what is going on.

Would it be possible to augment the validation of this field?

@mauzey1
Copy link
Collaborator

mauzey1 commented Apr 14, 2023

I have identified where in CMOR this segfault is happening. Below is a section of cmor_CV_checkFurtherInfoURL that checks if the URL template has tokens separated by '<' and '>' characters.

cmor/Src/cmor_CV.c

Lines 379 to 385 in 8d45339

/* -------------------------------------------------------------------- */
/* If this is a string with no token we have nothing to do. */
/* -------------------------------------------------------------------- */
szToken = strtok(szFurtherInfoURLTemplate, "<>");
if (strcmp(szToken, cmor_current_dataset.furtherinfourl) == 0) {
return (0);
}

When passing an emtpy string to strtok, it will assign a null to szToken. This will cause strcmp to segfault when comparing szToken.

This lead to discovering another behavior of this function: Using the further info URL of the current dataset as a template.

cmor/Src/cmor_CV.c

Lines 373 to 377 in 8d45339

/* -------------------------------------------------------------------- */
/* Retrieve default Further URL info */
/* -------------------------------------------------------------------- */
strncpy(szFurtherInfoURLTemplate, cmor_current_dataset.furtherinfourl,
CMOR_MAX_STRING);

If you were to have the following value in your user input...

"further_info_url":       "https://furtherinfo.es-doc.org/<mip_era><source_id>"

then the function will treat that as a template to be turned into a URL for comparison.

It will then proceed to compare what is provided by the user input with the URL created from the template.

C Traceback:
In function: 

!!!!!!!!!!!!!!!!!!!!!!!!!
!
! Warning: The further info in attribute does not match the one found in your Control Vocabulary(CV) File. 
! We found "https://furtherinfo.es-doc.org/<mip_era><source_id>" and 
! CV requires "https://furtherinfo.es-doc.org/CMIP6.PCMDI-test-1-0" 
! Check your Control Vocabulary file "Tables/CMIP6_CV.json".
! 
!
!!!!!!!!!!!!!!!!!!!!!!!!!

So any string that begins with "https://furtherinfo.es-doc.org/" will be valid input according to the CV's further_info_url value, but not if they contain any of the template tokens such as the ones in cmor.h

#define CMOR_DEFAULT_FURTHERURL_TEMPLATE "https://furtherinfo.es-doc.org/<mip_era><institution_id><source_id><experiment_id><sub_experiment_id><variant_label>"

@durack1
Copy link
Contributor

durack1 commented Apr 17, 2023

@mauzey1 it would be great to catch this segfault, and rather issue an error message suggesting changes required to the user and quit, does that seems reasonable?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants