Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Support for '+=' in Namelists #1389

Merged
merged 9 commits into from
Apr 28, 2017
Merged

Conversation

erichlf
Copy link

@erichlf erichlf commented Apr 20, 2017

Namelists.pm had the ability to parse += in namelist files. This adds the
support for += into the python version.

Test suite: scripts_regression_tests.py --fast
and

./create_newcase -case test_nml_0420 -compset A -res f45_g37

with the following in user_nl_cpl

cplflds_custom = 'foo','bar'
cplflds_custom += 'baz'

Test baseline:
Test namelist changes:
Test status: bit for bit

Fixes #839

User interface changes?: Added support for '+=' to namelist parser.

Code review: @billsacks, @jgfouca

Namelists.pm had the ability to parse += in namelist files. This adds the
support for += into the python version.

Fixes #839
@billsacks
Copy link
Member

This doesn't seem to work.

I set up a case with

./create_newcase -case test_nml_0420 -compset A -res f45_g37 --run-unsupported

And put the following in user_nl_cpl:

cplflds_custom = 'foo','bar'
cplflds_custom += 'baz'

After running ./case.setup and ./preview_namelists, I expected that the final value of cplflds_custom in CaseDocs/drv_in would be 'foo','bar','baz', but in fact I got an error:

ERROR: Error in parsing namelist: expected literal value, but got 'cplflds_custom'

Also: Please fill out the PR template that is auto-created for you when you create a PR. Among other things, this should include what testing you have run on these changes. You need to run scripts_regression_tests before a PR can be merged.

@erichlf
Copy link
Author

erichlf commented Apr 21, 2017

@billsacks I can't even get your setup command to work. I get

create_newcase: error: unrecognized arguments: --run-unsupported

What am I missing here?

@jedwards4b
Copy link
Contributor

The --run-unsupported flag is a cesm only feature - acme users can just leave it out.

When checking for string literals I forgot to add '+=' to separators.
Additionally, it was necessary to keep track of added values when parsing.
@erichlf
Copy link
Author

erichlf commented Apr 21, 2017

@billsacks Not sure why but the unit tests were not catching this problem. I had used doctest as my main method for checking that this worked. Anyway, I ran your case after a few fixes and things now work. Not sure how to catch this exception in the literal parser though unit tests.

Copy link
Member

@billsacks billsacks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for getting this working!

In addition to my inline comments, please fill out the PR template that is auto-created for you when you create a PR. Among other things, this should include what testing you have run on these changes. You need to run scripts_regression_tests before this PR can be merged.

@@ -1440,12 +1440,15 @@ def _eat_comment(self):
self._advance()
return True

def _expect_char(self, chars):
def _expect_char(self, chars, RETURN=False):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes you made to _expect_char make its usage and implementation more confusing. I'd suggest reverting the changes to _expect_char, and where you have

        if self._expect_char("+", RETURN=True):

replace it with an inline check (see below).


self._eat_whitespace()
# check to see if we have a "+="
if self._expect_char("+", RETURN=True):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following on from my above comment: I'd replace this with an inline check:

if self._curr() == '+':

@@ -1659,6 +1666,29 @@ def _look_ahead_for_equals(self, pos):
break
return False

def _look_ahead_for_plusequals(self, pos):
r"""Look ahead to see if the next whitespace character is '='.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should say: if the next two non-whitespace characters are '+='

@@ -1782,6 +1824,7 @@ def _parse_literal(self, allow_name=False, allow_eof_end=False):
separators = [' ', '\n', ',', '/']
if allow_name:
separators.append('=')
separators.append('+=')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what this code is doing, but this doesn't look right to me: You end up with a list of strings like [' ', '\n', ',', '/', '=', '+=']. But then, in the next line, it looks like it's checking if a single character is in this list. That will never match the '+='. I'm wondering if you should just be appending '+' rather than '+=' here?

@@ -1919,63 +1965,70 @@ def _parse_name_and_values(self, allow_eof_end=False):
alternate file format in "groupless" mode.)

>>> _NamelistParser("foo='bar' /")._parse_name_and_values()
(u'foo', [u"'bar'"])
(u'foo', [u"'bar'"], False)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please modify the documentation of this function to indicate the new return value in the tuple.

if name in self._settings:
dsettings = self._settings[name]
values = merge_literal_lists(dsettings, values)
if name in self._settings and not addto:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would find this logic easier to read if it were refactored to:

if name in self._settings:
    if not addto:
        values = merge_literal_lists(self._settings[name], values)
    else:
        values = self._settings[name] + values

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, the logic was wrong and I have fixed this. Somehow I missed one of the unit tests failing.

name, values = self._parse_name_and_values(allow_eof_end=True)
if name in self._settings:
name, values, addto = self._parse_name_and_values(allow_eof_end=True)
if name in self._settings and not addto:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above for suggestion on refactoring this logic

@@ -2095,6 +2149,12 @@ def parse_namelist(self):
OrderedDict([(u'foo(2)', [u"'bar'"])])
>>> _NamelistParser("foo(2)='bar', foo(3)='bazz'", groupless=True).parse_namelist()
OrderedDict([(u'foo(2)', [u"'bar'"]), (u'foo(3)', [u"'bazz'"])])
>>> _NamelistParser("foo='bar', foo='bazz'", groupless=True).parse_namelist()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand the code correctly, the tests you added here cover the case where groupless is True, where the implementation is below. But I don't see similar tests for the implementation in _parse_namelist_group. Tests should be added to cover that code, too. Or, better yet, the duplicate code should be extracted into a shared routine called from both places; then it could just be covered by one set of tests.

I cleaned up some of the logic as per @billsacks.
Fixed some flawed logic which resulted in wrong value assignment.
restored original function call for _expect_char
Fixed #1397
@erichlf
Copy link
Author

erichlf commented Apr 25, 2017

I believe I have addressed all your comments @billsacks. Additionally, I have added a fix for #1397.

@erichlf erichlf requested a review from jgfouca April 25, 2017 21:22
Copy link
Member

@billsacks billsacks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is better, but there are still some issues that need to be addressed. See my new inline comments. In addition, this comment from my last review still should be addressed, by adding tests to the other routine you modified:

If I understand the code correctly, the tests you added here cover the case where groupless is True, where the implementation is below. But I don't see similar tests for the implementation in _parse_namelist_group. Tests should be added to cover that code, too. Or, better yet, the duplicate code should be extracted into a shared routine called from both places; then it could just be covered by one set of tests.

Finally, please address this comment:

In addition to my inline comments, please fill out the PR template that is auto-created for you when you create a PR. Among other things, this should include what testing you have run on these changes. You need to run scripts_regression_tests before this PR can be merged.

and also note the now two fixed issues in your overall PR comment.

@@ -1446,6 +1454,9 @@ def _expect_char(self, chars):
Does not return anything, but raises a `_NamelistParseError` if `chars`
does not contain the character at the current position.

The RETURN optional is used to allow for checking of consecutive
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to remove the documentation of RETURN now

values = merge_literal_lists(dsettings, values)
if not addto:
values = merge_literal_lists(dsettings, values)
else:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't this block of code indented under the

if name in self._settings

conditional? I can imagine that it doesn't matter one way or another for

if not addto:
    values = merge_literal_lists(dsettings, values)

but for

else:
    values = self._settings[name] + values

I think this will raise an exception if (name in self._settings) is false.

I see that the similar block of code that you added in the following routine is all indented under the if name in self._settings: conditional, and that looks more correct to me.

@billsacks
Copy link
Member

billsacks commented Apr 26, 2017

I found one more use case that doesn't work but should: Set up a case the same as in #1389 (comment) but in user_nl_cpl put only:

cplflds_custom += 'baz'

This should result in cplflds_custom being set to just 'baz' (a += without a previous setting should be treated the same as =), but currently it gives an error:

   Calling drv buildnml
Traceback (most recent call last):
  File "./preview_namelists", line 64, in <module>
    _main_func(__doc__)
  File "./preview_namelists", line 61, in _main_func
    create_namelists(case)
  File "/Users/sacks/cime/scripts/Tools/../../scripts/lib/CIME/preview_namelists.py", line 86, in create_namelists
    mod.buildnml(case, caseroot, compname)
  File "/Users/sacks/cime/src/drivers/mct/cime_config/buildnml", line 340, in buildnml
    _create_drv_namelists(case, infile, confdir, nmlgen, files)
  File "/Users/sacks/cime/src/drivers/mct/cime_config/buildnml", line 61, in _create_drv_namelists
    nmlgen.init_defaults(infile, config)
  File "/Users/sacks/cime/scripts/Tools/../../scripts/lib/CIME/nmlgen.py", line 114, in init_defaults
    nml_dict = parse(in_file=file_, groupless=True)
  File "/Users/sacks/cime/scripts/Tools/../../scripts/lib/CIME/namelist.py", line 848, in parse
    namelist_dict = _NamelistParser(text, groupless).parse_namelist()
  File "/Users/sacks/cime/scripts/Tools/../../scripts/lib/CIME/namelist.py", line 2188, in parse_namelist
    self._parse_namelist_group()
  File "/Users/sacks/cime/scripts/Tools/../../scripts/lib/CIME/namelist.py", line 2116, in _parse_namelist_group
    values = self._settings[name] + values
KeyError: u'cplflds_custom'

This works correctly in the perl. This is important for testmods, because sometimes you don't want to rely on whether some inherited testmod has or hasn't already set a variable.

@jedwards4b
Copy link
Contributor

I think that this request is really muddying the waters with respect to the format of this file. += is not an allowed value in fortran namelist input and so I think that we should not allow it here.

@rljacob
Copy link
Member

rljacob commented Apr 26, 2017

But according to #839 this is needed for CLM test mods. And the perl version could do it.

@jedwards4b
Copy link
Contributor

That is true but I think that issue #1397 shows why it might be a bad idea. I don't think that it's at all clear what += should do.

@billsacks
Copy link
Member

@jedwards4b - I don't understand how #1397 shows that this is a bad idea. Perhaps your real concern is with #1397 ? -- in which case I'm happy to close that issue as a wontfix. I don't care too much about that one. I DO care about this one.

In the perl implementation, += meant "append these values to the existing list" (which is the same as the meaning of += for python lists), and I'd like that behavior to be maintained: CLM's testmods rely on this behavior to keep the testmods maintainable, since we have a lot of cases of one testmod building on another.

@jedwards4b
Copy link
Contributor

But if a list is initialized with 6 values this will then be the seventh? I'm worried that this forces assumptions on the way namelist variables are initialized.

@billsacks
Copy link
Member

But if a list is initialized with 6 values this will then be the seventh?

Yes.

I'm worried that this forces assumptions on the way namelist variables are initialized.

Can you explain your concerns in more detail?

@erichlf
Copy link
Author

erichlf commented Apr 26, 2017

The original logic in the parser allowed for

          >>> x = parse(text='&foo bar=1 /')                                       
          >>> x.get_variable_value('foo', 'bar')                                   
          [u'1']                                                                   
          >>> x.set_variable_value('foo', 'bar(2)', [u'3'], var_size=4)            
          >>> x.get_variable_value('foo', 'bar')                                   
          [u'1', u'3'] 

Which is essentially the same thing (expanding lists).

@jedwards4b
Copy link
Contributor

If a variable is initialized as
f = '1', '2', '3', '4'

and you in a user_nl_ do
f += '5'

you expect that to append f and add a fifth element (how would the average user know this), but now we want to change the default of f to be f = '1', '2', '3'
your f += '5' now changes the 4th element.

@billsacks
Copy link
Member

@jedwards4b - thanks for the example. I think I see your concern: You are concerned about the case where order matters, so you specifically want to set the Nth element of an array. In that case, you're right that this feature is not appropriate; something like

f(5) = 5

would be appropriate.

The case += is designed for is the case where order doesn't matter. This is the case for CLM's hist_fincl* variables, for example, and (I think) for cplflds_custom, and probably other variables as well. In these cases, the

f(5) = 5

syntax is NOT appropriate, but the += syntax IS appropriate and robust. So the two are complementary.

@billsacks
Copy link
Member

billsacks commented Apr 26, 2017

I was also thinking more about this point:

+= is not an allowed value in fortran namelist input and so I think that we should not allow it here.

I feel that we should not be so restrictive: The user_nl files are one of the primary aspects of our "user interface". Restricting their usage to what's available in the somewhat antiquated Fortran namelist implementation feels overly restrictive: We should be free to add value on top of existing Fortran namelist capabilities.

(And as an aside: While we currently use these user_nl files mainly to generate Fortran namelists, that is not the only case: We also use them to create CISM's config files, which are similar to MS Windows-style ini files (or files that can be parsed by python's configuration file parser module); and in the future there is discussion of using them to create netCDF-based parameter files. At that point, tying their format to Fortran namelists will make even less sense.)

*********1*********2*********3*********4*********5*********6*********7**
[ Description of the changes in this Pull Request. It should be enough
information for someone not following this development to understand.
Lines should be wrapped at about 72 characters. ]

Test suite: `scripts_regression_tests.py --fast` and unit tests
Test baseline:
Test namelist changes:
Test status: bit for bit

Fixes #839

User interface changes?: add '+=' to namelist

Code review: @billhicks, @jgfoucar
@billsacks
Copy link
Member

@erichlf - I see you pushed a new commit. Is this ready for re-review?

@gold2718
Copy link

@billsacks,

We should be free to add value on top of existing Fortran namelist capabilities.

I'm fine with doing this but not at the cost of breaking correct Fortran syntax and semantics. Therefore, I think #1397 should take precedence.

@billsacks
Copy link
Member

@gold2718 -

We should be free to add value on top of existing Fortran namelist capabilities.

I'm fine with doing this but not at the cost of breaking correct Fortran syntax and semantics. Therefore, I think #1397 should take precedence.

I'm confused about what you're arguing for. I thought your comment in #1397 was that we should leave that as is. I'm fine with that, and will close that issue if I'm understanding you correctly there. But now I'm confused by your statement that #1397 should take precedence.

So: Can you clarify your feelings on both #1397 and this use of +=?

This whole thing has turned into a much bigger argument than I ever anticipated. I don't really care how, exactly, we implement this, but CLM's tests need some way for one user_mods directory to add items to a list that was created by its parent user_mods directory (where that original list is of unknown length). If someone has an alternative suggestion for how to implement that, I'm certainly open to it.

@gold2718
Copy link

@billsacks,
Sorry for not being clear, I was just saying that all standard Fortran namelist syntax needs to work with the consequence being that += functionality cannot change any standard functionality.
I think this is just an implementation issue -- with any luck, += can be made to do what you want without breaking any standard functionality.

@billsacks
Copy link
Member

@gold2718 - thanks. I agree that standard Fortran namelist syntax should work. So, since += isn't supported at all in standard Fortran namelists, does that mean we're free to define it as we want, in your view?

And do I understand correctly that you feel we should close #1397 ?

@gold2718
Copy link

@billsacks, yes the definition is up to the CIME community (and should be carefully documented as an extension). Perhaps the documentation can provide some examples. Speaking of defining +=, what should this do?

item = 'foo', 'bar'
item += 'baz'
item(5) = 'mom'
item += 'qux'

Does the current PR do that?

@billsacks
Copy link
Member

what should this do?

item = 'foo', 'bar'
item += 'baz'
item(5) = 'mom'
item += 'qux'

Does the current PR do that?

As I mentioned in #1389 (comment) , it really feels like there are two classes of list variables: ones where position is important, and ones where position is unimportant. The item += syntax is useful for lists where position is unimportant, whereas the item(5) = syntax is useful for lists where position is important. So I'm having trouble coming up with a real use case where someone would want to do what you're asking about – and so, personally, I'd be fine with this behavior being undefined, until we have a real use case that suggests what the behavior should be.

That said, if I were forced to define this behavior, I'd say that item would have the following values:

item = 'foo', 'bar'
! Now item is 'foo', 'bar'
item += 'baz'
! Now item is 'foo', 'bar', 'baz'
item(5) = 'mom'
! Now item is 'foo', 'bar', 'baz', [undefined], 'mom'
item += 'qux'
! Now item is 'foo', 'bar', 'baz', [undefined], 'mom', 'qux'

@gold2718
Copy link

It seems that your solution to my use case is not ideal (as it may break the Fortran code that is dealing with these lists are is not expecting blanks).
I'm not sure how the implementation is going but if possible, I would suggest that += not allow 'holes'. In that case, my example above would result in an error (on the item += 'qux' line). I think this behavior would be in line with us getting to define += behavior to suit our use cases (which are these lists that grow like history field lists).

@billsacks
Copy link
Member

It seems that your solution to my use case is not ideal (as it may break the Fortran code that is dealing with these lists are is not expecting blanks).
I'm not sure how the implementation is going but if possible, I would suggest that += not allow 'holes'. In that case, my example above would result in an error (on the item += 'qux' line). I think this behavior would be in line with us getting to define += behavior to suit our use cases (which are these lists that grow like history field lists).

I'm confused. In your example, isn't the hole introduced by this line?:

item(5) = 'mom'

@ekluzek
Copy link
Contributor

ekluzek commented Apr 27, 2017

I agree with @gold2718 that as an extension to the standard FORTRAN namelist this feature should be well documented as an extension. One advantage of sticking only to the FORTRAN standard for namelist that are supported by the compilers we use -- is you don't have to document how the interface works you can point to any number of things that documents namelists. Making sure behavior matches an established interface means we don't have to document something, and our documentation is almost always lacking.

That said I agree with @billsacks that this is a useful and important feature for CLM. And we've already pointed out that we've moved beyond the standard in a few other areas. As for the specific case being thrown around. I'm pretty sure we already allow more than one namelist setting to appear and the second one overrides the first as that is what a FORTRAN namelist does.

So in the case with item(5) one resolution would be to treat "item(5)" as different from "item" and let FORTRAN resolve the following in the final namelist

item = 'foo', 'bar', 'baz', 'qux'
item(5) = 'mom'

So when it goes into FORTRAN it becomes: item = 'foo', 'bar', 'baz', 'qux', 'mom'

Another option which I think I prefer, but I'm not sure it can be easily done -- is to not allow positional arguments alongside "+=". But, you'd have to keep track if either are used.

My main point though is that this is an important feature that we need to keep. So as long as we have some resolution to this...

@gold2718
Copy link

Yes, the 'hole' is introduced by item(5) = 'mom' but that is a normal operator so we have to do what it says (but see below). However, if it is followed by a += operator, then the 'hole' check can be performed.
If a user manages to do something dumb like:

item = 'foo', 'bar'
item += 'baz'
item(5) = 'mom'

for a case where the Fortran code expects a compact list (no 'holes'), then they will not get expected behavior and I'm okay with that. The alternative is to try and implement @ekluzek's suggestion:

Another option which I think I prefer, but I'm not sure it can be easily done -- is to not allow positional arguments alongside "+=".

but, as he pointed out, that may be hard to do.

@billsacks
Copy link
Member

Another option which I think I prefer, but I'm not sure it can be easily done -- is to not allow positional arguments alongside "+=". But, you'd have to keep track if either are used.

One way to do this would be: For any namelist item that is a list, have an extra attribute in namelist_definition.xml that says whether it is a positional list or a non-positional list. Positional lists support assignment to a specific index but not +=; non-positional lists support += but not assignment to a specific index.

Personally, though, I feel that this is getting overly paranoid for something that I feel is unlikely to arise in practice.

And actually, if people are worried about this feature being misused, I'll throw out one more option: Don't document it at all. It seems reasonable to have some undocumented features that we use internally for our testing but don't expect users to use. However, the debate in this thread has gone on long enough that I don't want to start yet another debate with this suggestion, so if you don't like it, a simple "no" will suffice :-)

@erichlf
Copy link
Author

erichlf commented Apr 27, 2017

@gold2718 The way the code was before I touched it given your example:

item = 'foo', 'bar'
item(5) = 'mom'

would have given
[u'foo', u'bar', u'', u'', u'mom']
I have not changed this functionality. The only thing my PR would change is allowing for ease of adding a term to the end of a list using the '+=' operator. This way a user wouldn't have to know exactly what index was the last term.

@billsacks Yes, my code is ready for review.

@gold2718
Copy link

@erichlf, It sounds like the current functionality matches @billsacks' 'undocumented internal test functionality' option.
I move that we move forward with the current PR. Have enough tests been done? Is the PR comment field complete?

Copy link
Member

@billsacks billsacks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @erichlf . I think this is close. In addition to my one new line comment:

(1) It looks like this still needs tests of += that cover _parse_namelist_group (which I don't think is currently covered in this respect - correct me if I'm wrong).

(2) Also, please add tests of the case where the initial setting of 'foo' is foo += 'bar'. I think this would look like this (untested):

        >>> _NamelistParser("foo+='bar'", groupless=True).parse_namelist()
        OrderedDict([(u'foo', [u"'bar'"])])

and similarly for _parse_namelist_group.

(3) Finally, once you're done, please rerun scripts_regression_tests without --fast.

values = merge_literal_lists(dsettings, values)
if addto:
values = self._settings[name] + values
if not addto:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like it works correctly now, but I'm still wondering why it looks different from the analogous code below:

                if name in self._settings:
                    if addto:
                        values = self._settings[name] + values
                    else:
                        values = merge_literal_lists(self._settings[name], values)

My understanding is that the two blocks of code are doing the same thing, so I'd find it easier to read & understand this code if they looked identical (or called some shared function). I realize that they looked a little different in the baseline code, but now that this code block is getting more complicated, it would be good to make them uniform, so that the true differences between the two don't get lost in accidental differences like this.

This would amount to a small logic change, in that the 'if not addto' block would now only be executed if name in self._settings is true; but if I understand the code correctly, that shouldn't matter, because: if (not addto) is true but (name in self._settings) is false, then you're calling merge_literal_lists with dsettings = [], which looks to me like a no-op.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beats the heck out of me why it is different. I don't seem to understand it well enough, but I do know that it works.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I believe I addressed your issues in the latest push.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I guess I'm okay with your leaving this as is. However, particularly since the two versions look different, I feel that all of the += tests that you have for parse_namelist should be duplicated for _parse_namelist_group (because knowing that this code works right in one doesn't tell you that it works right in the other).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_parse_namelist_group only parses groups. If you look at the tests in that function there is only group testing without variable testing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that's right. My understanding from a quick read of the code is that _parse_namelist_group is for parsing variables that belong to a namelist group, which is actually the case for most variables. Indeed, I have confirmed that, when I remove these lines:

--- a/scripts/lib/CIME/namelist.py
+++ b/scripts/lib/CIME/namelist.py
@@ -2105,8 +2105,6 @@ def _parse_namelist_group(self):
             if self._groupless:
                 if name in self._settings:
                     dsettings = self._settings[name]
-                    if addto:
-                        values = self._settings[name] + values
                 if not addto:
                     values = merge_literal_lists(dsettings, values)
                 self._settings[name] = values

all doctests still pass, yet the behavior is incorrect when I have this in user_nl_cpl:

cplflds_custom = 'foo'
cplflds_custom += 'bar'

In this case, I get:

$ grep cplflds_custom CaseDocs/drv_in
  cplflds_custom = 'bar'

As far as I can tell, the critical code in _parse_namelist_group is doing the same thing as the analogous code in parse_namelist - it's just that one is used for variables that belong to a group and the other is used for variables that don't belong to a group (I didn't know there were any variables in the latter category, actually). This was the motivation for my suggestion of making them share code, or at least look identical. I still understand if you don't want to do that out of fear of breaking something, but if they don't share code, then it's important that there be tests that cover both blocks of nearly-identical code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that I can explain the group vs no-group thing: variables read from namelist definition xml files always have a group associated with them as it is defined in the xml file. However variables read from user_nl_* do not immediately have a group associated with them and so fall into that no group category. I was not the author of this code and admit to finding it less than intuitive...

I don't think that's completely right. I just tried introducing these diffs:

--- a/scripts/lib/CIME/namelist.py
+++ b/scripts/lib/CIME/namelist.py
@@ -2102,6 +2102,7 @@ def _parse_namelist_group(self):
         while self._curr() != '/':
             name, values, addto = self._parse_name_and_values()
             dsettings = []
+            print("Parsing a group: name, groupless = %s, %s"%(name, self._groupless))
             if self._groupless:
                 if name in self._settings:
                     dsettings = self._settings[name]
@@ -2173,6 +2174,7 @@ def parse_namelist(self):
         if self._groupless and self._curr() != '&':
             while self._pos < self._len:
                 name, values, addto = self._parse_name_and_values(allow_eof_end=True)
+                print("Parsing groupless: %s"%(name))
                 if name in self._settings:
                     if addto:
                         values = self._settings[name] + values

and when I ran ./preview_namelists with a user_nl_cpl file that sets cplflds_custom, I got:

Parsing a group: name, groupless = cplflds_custom, True
Parsing a group: name, groupless = cplflds_custom, True

I don't fully understand the operation of this code, but I am confident that _parse_namelist_group is being called to parse at least some entries in user_nl files, and so needs to be covered by tests.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I was thinking about _parse_namelist_group_name. I have now pushed that last change.

Added some unit tests in _parse_namelist_group and parse_namelist to
ensure '+=' works as expected.

Test suite: `scripts_regression_tests.py`
and `python -m doctest CIME/namelist.py`
Test baseline:
Test namelist changes: adds '+=' operator
Test status: bit for bit

Fixes #839

User interface changes?: Added support for '+=' to namelist parser.

Code review: @billsacks @jgfouca
@jedwards4b
Copy link
Contributor

I think that I can explain the group vs no-group thing: variables read from namelist definition xml files always have a group associated with them as it is defined in the xml file. However variables read from user_nl_* do not immediately have a group associated with them and so fall into that no group category. I was not the author of this code and admit to finding it less than intuitive...

There was no unit test to check '+=' in parse_namelist_group, so I have
added one.

Test suite: python -m doctest CIME/namelist.py
Test baseline:
Test namelist changes: '+=' support
Test status: bit for bit

Fixes #839

User interface changes?: Added support for '+=' to user_nl_*

Code review: @billsack
@billsacks
Copy link
Member

@erichlf - thanks for adding that test.

You're going to hate me, but as I've looked more at this code for the above discussion, I have one more question: Should += (i.e., addto) also be handled in this final block of code in _parse_namelist_group?

            else:
                group = self._settings[group_name]
                if name in group:
                    dsettings = group[name]
                values = merge_literal_lists(dsettings, values)
                group[name] = values

It looks like that code is doing basically the same thing as the two other blocks that you have changed, so I'm wondering why this block doesn't also handle '+='. I started trying to trace the logic of when groupless is true vs. false, but gave up. I feel that this block should either (a) handle addto, or (b) if it's not supposed to handle addto, then it should raise an exception if addto is True.

When groupless was False the '+=' operator did not work in
_parse_namelist_group. I have added a unit test for this case and
changed the logic of the code to handle this case.

Test suite:
Test baseline: python -m doctest CIME/namelist.py
Test namelist changes: '+=' support
Test status: bit for bit

Fixes #839

User interface changes?: '+=' support in user_nl_*

Code review: @billsacks
@erichlf
Copy link
Author

erichlf commented Apr 28, 2017

@billsacks No, that was a good catch. I have fixed this and added the appropriate test.

There was no test in _preview_namelist_group which test the case
foo+=bar without anything being set in foo before.

Test suite: python -m doctest CIME/namelist.py
Test baseline:
Test namelist changes: '+=' support
Test status: bit for bit

Fixes #839

User interface changes?: '+=' support in user_nl_*

Code review: @billsacks
@erichlf
Copy link
Author

erichlf commented Apr 28, 2017

@billsacks I just added one more test to _parse_namelist_group which test foo+='bar' without foo being set previously.

@billsacks
Copy link
Member

@erichlf - Okay, this looks ready to go! I added one last missing unit test that I noticed, and ran ./scripts_regression_tests.py A_RunUnitTests.

Can you please let me know when you have rerun scripts_regression_tests (without --fast) on the final version? At that point, I'll merge it to master.

Thank you very much for all of your work on this feature!

@erichlf
Copy link
Author

erichlf commented Apr 28, 2017

@billsacks All pass.

@billsacks billsacks merged commit 84b7bde into master Apr 28, 2017
@rljacob rljacob deleted the erichlf/namelist/plusEqual branch April 21, 2018 16:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants