Shallow-copy mutable containers in slot values #826

jbednar · 2023-08-30T04:53:19Z

Fixes #793.
Fixes #746.

Adds shallow copying of mutable slot values when instantiating Parameter objects or inheriting values for slots. Shallow copying is only done when per_instance is True (the default); if you don't want such copying you can share both Parameter instances and mutable slot values by setting per_instance=False.

import param

class A(param.Parameterized):
    p = param.Selector(objects=[1, 2], check_on_set=False)

class B(A):
    p = param.Selector(default=2)


b = B()
b.p = 3

print(A.param.p.objects, B.param.p.objects, b.param.p.objects)
# [1, 2] [1, 2, 3] [1, 2, 3]

(previously A's objects list was also [1, 2, 3]; now fixed to be [1, 2])

There are various issues still, as listed below.

Tests

Needs a test capturing the example above, but I've got to run, so if anyone wants to add that I'd be grateful, or I can do that tomorrow.

Previous handling of watchers

PR #306 added this logic to Parameters.__getitem__() for shallow-copying watchers:

try:
    # Do not copy watchers on class parameter
    watchers = p.watchers
    p.watchers = {}
    p = copy.copy(p)
except:
    raise
finally:
    p.watchers = {k: list(v) for k, v in watchers.items()}

This PR generalizes such copying to apply to all the slot values besides default, but I was unable to determine why the existing code was so complex. First, why p.watchers = {k: list(v) for k, v in watchers.items()} instead of just p.watchers = copy.copy(watchers.items())? copy.copy is used just a couple of lines before, and when I use it here it seems to work just the same, as I'd expect, so I can't tell why it was a list comprehension that appears to achieve the same as copy.copy.
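One difference that may be relevant here: `copy.copy` applied to a dict copies only the outer mapping, so the inner lists stay shared, while the comprehension also gives every key a fresh list. A minimal sketch using a plain dict and placeholder strings instead of real watcher objects (all names below are illustrative, not taken from param):

```python
import copy

# Placeholder registry: attribute name -> list of watchers.
watchers = {"value": ["watcher_a"]}

# New dict, but the inner lists are the same objects as in `watchers`.
shallow = copy.copy(watchers)

# New dict and new inner lists; the list elements are still shared.
semi_shallow = {k: list(v) for k, v in watchers.items()}

# A watcher appended to the original list shows up in the plain shallow copy only.
watchers["value"].append("watcher_b")

print(shallow["value"])       # ['watcher_a', 'watcher_b']
print(semi_shallow["value"])  # ['watcher_a']
```

Whether that extra level of independence is what PR #306 intended is not something this sketch can confirm.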

Second, what's with the try/except/finally? Deleting all that seems to work just fine, and I can't quite imagine a situation when the try block could fail but the watchers are still valid to reconstruct finally. Maybe something in Panel? For now I deleted it as it does not seem to achieve anything.

Third, I've kept the convoluted mechanism for zeroing-out the watchers in the original Parameter object when creating the new one. Simply shallow-copying the watchers after copying the Parameter object leaves an extra set of watchers that causes tests to fail, but I'm not at all sure why it's appropriate to move the watchers to the new Parameter object rather than where they were originally watching. For now I chose to be conservative, but I'm not confident why it was done this way, so if there's a reason, it should have a clear comment added.

What's mutable and when is it copied?

For now, mutable sequences, mappings, and sets are considered mutable, which covers the list and dict cases that I'm aware of us using in slots. Such values are shallow-copied for all slots except default, to ensure that any independent Parameter copies also have independent objects lists, etc. Parameterized objects are also mutable, but I don't know if they are ever used in slots. If they are, we should consider whether they should be shallow-copied, but my guess is that they should not, since it's largely containers that we want shallow-copied, not arbitrary instantiated objects. Maybe is_mutable should be renamed is_mutable_container to convey that?
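As a rough illustration of that definition, a check along these lines would treat list-, dict-, and set-like values as mutable containers while leaving other objects alone. The helper name and the constant are hypothetical and only sketch the idea; they are not claimed to match the code in this PR:

```python
from collections.abc import MutableMapping, MutableSequence, MutableSet

# Abstract base classes covering list-like, dict-like, and set-like values.
MUTABLE_CONTAINER_TYPES = (MutableSequence, MutableMapping, MutableSet)


def is_mutable_container(value):
    """Return True for mutable sequences, mappings, and sets (hypothetical helper)."""
    return isinstance(value, MUTABLE_CONTAINER_TYPES)


class Arbitrary:
    """A mutable object that is not a container."""


print(is_mutable_container([1, 2]))       # True
print(is_mutable_container({"a": 1}))     # True
print(is_mutable_container({1, 2}))       # True
print(is_mutable_container((1, 2)))       # False
print(is_mutable_container(Arbitrary()))  # False
```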

The reason for not copying the default value is to allow a specific list or dict to be shared across Parameters, which has always been supported. E.g. one could have a few global lists like search_paths, debug_search_paths, etc., whose values are curated and maintained independently of which particular Parameters have been set to those values. We can revisit whether mutable containers should be copied for default too, but if so, it should be a separate PR with its own justification.
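The intended split between default and the other slots can be sketched with a toy class. `ToyParameter` and `copy_with_independent_slots` are made-up names for illustration and do not reproduce param's implementation:

```python
import copy
from collections.abc import MutableMapping, MutableSequence, MutableSet


class ToyParameter:
    """Stand-in for a Parameter with a few slots."""
    __slots__ = ("default", "objects", "doc")

    def __init__(self, default=None, objects=None, doc=None):
        self.default = default
        self.objects = objects
        self.doc = doc


def copy_with_independent_slots(param):
    """Copy the object, then shallow-copy every mutable container slot except default."""
    new = copy.copy(param)
    for slot in ToyParameter.__slots__:
        value = getattr(new, slot)
        is_container = isinstance(value, (MutableSequence, MutableMapping, MutableSet))
        if slot != "default" and is_container:
            setattr(new, slot, copy.copy(value))
    return new


search_paths = ["/data", "/tmp"]  # a curated list meant to be shared
original = ToyParameter(default=search_paths, objects=[1, 2])
duplicate = copy_with_independent_slots(original)

duplicate.objects.append(3)
print(original.objects, duplicate.objects)  # [1, 2] [1, 2, 3]
print(duplicate.default is search_paths)    # True
```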

I currently check for default only when copying the Parameter object, not during inheritance. It might also be appropriate to skip shallow-copying for inheritance as well, but I haven't found an example where that would come up. Probably worth doing?

_update_state

After merging this PR, I believe we should be able to remove _update_state as discussed in #807 and #817. _update_state was needed because we had to wait for param inheritance before adding to the objects lists for Selectors, but now that values get shallow copied, it should be feasible to populate the objects lists (e.g. from the default value) immediately, during the constructor. Doing so should simplify the Selector logic, but I haven't yet tested that this PR's approach will achieve that goal.

My guess is that eliminating _update_state is required for fixing this remaining bad behavior:

import param

class A(param.Parameterized):
    p = param.Selector(objects=[1, 2], check_on_set=False, per_instance=True)
    
class B(A):
    i = param.Integer(8)

b = B()
b.p = 3

print(A.param.p.objects, B.param.p.objects, b.param.p.objects)
#[1, 2, 3] [1, 2, 3] [1, 2, 3]

Here the Parameter object isn't copied into B, and the new value 3 ends up in A's object list, which it should not. I think that may be from _update_state being called on the class's copy of the objects list, not the instance's, since it's called in the metaclass, but I haven't tracked that down.

maximlt · 2023-08-30T12:42:24Z

On "Previous handling of watchers": I agree we need to better understand what was going on there. Hopefully Philipp can remember some of it; otherwise we'll have to dig into it more.
On _update_state: I wish we could get rid of it, but I also know that the Selector Parameters are quite dynamic (some slot defaults are callables that depend on the state of other slots) and are difficult to set up correctly when the slots aren't populated in the expected order, which happens in an inheritance context.

What do you think about the examples below? In the first snippet the code accesses the Parameter before setting p.s to 3, to run __getitem__ and shallow-copy the objects. In the second case this is commented out, leading to the objects on the class Parameter being updated as the Parameter doesn't yet exist on the instance. This is similar to the last example you shared @jbednar, without any inheritance though.

import param

class P(param.Parameterized):
    s = param.Selector(objects=[1, 2], check_on_set=False)

p = P()
p.param.s.objects  # runs `__getitem__`
p.s = 3
assert P.param.s.objects == [1, 2]

class P(param.Parameterized):
    s = param.Selector(objects=[1, 2], check_on_set=False)

p = P()
# p.param.s.objects
p.s = 3
assert P.param.s.objects == [1, 2, 3]

jbednar · 2023-08-30T13:32:39Z

@maximlt, that's a good point:

Maybe we need to populate the Parameter objects more proactively. Any suggestions for where to make that change?

maximlt · 2023-08-30T15:09:33Z

> Any suggestions for where to make that change?

Maybe at the start of Parameters._setup_params which is called from Parameterized.__init__?

maximlt · 2023-09-01T12:36:26Z

Actually the behavior I reported in my previous comment is not tied to mutable containers.

Code

import param

class P(param.Parameterized):
    x = param.Number()

p = P()

P.param.x.bounds = (100, 200)

assert p.param.x.bounds == (100, 200)

class P(param.Parameterized):
    x = param.Number()

p = P()

p.param.x  # side-effects there

P.param.x.bounds = (100, 200)

assert p.param.x.bounds is None

Accessing a Parameter on the .param namespace creates a Parameter on the instance. I'm not sure it's really a bug, certainly it's confusing, but really the way to get a Parameter you can control at the class level should be with per_instance=False.
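The two snippets can be mimicked with a toy model in which the instance copies the class-level object on first access. `ToyParam`, `ToyOwner`, and its `param()` method are hypothetical stand-ins used only to illustrate lazy copying, not param's actual machinery:

```python
import copy


class ToyParam:
    def __init__(self, bounds=None):
        self.bounds = bounds


class ToyOwner:
    # Class-level parameter objects, shared until an instance asks for its own.
    class_params = {"x": ToyParam()}

    def __init__(self):
        self._instance_params = {}

    def param(self, name):
        """Return the instance's own copy, creating it on first access."""
        if name not in self._instance_params:
            self._instance_params[name] = copy.copy(self.class_params[name])
        return self._instance_params[name]


untouched = ToyOwner()
touched = ToyOwner()
touched.param("x")  # first access creates the private copy (the side effect)

ToyOwner.class_params["x"].bounds = (100, 200)

print(untouched.param("x").bounds)  # (100, 200): copied after the class change
print(touched.param("x").bounds)    # None: copied before the class change
```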

philippjfr · 2023-09-02T17:44:25Z

My reading of the watchers code above is that it's not fully correct. The problem being that we can't distinguish between watchers that are meant to be on the class parameter and watchers that are meant to be on the instance parameter since there is no difference until the instance parameter is created, i.e.:

param.Parameterized.param.watch(print, 'name', what='allow_None')

looks exactly the same as:

p = param.Parameterized()
p.param.watch(print, 'name', what='allow_None')

In general it seems very rare to have watchers for attributes on class parameters so this hasn't come up, but it's definitely not correct. I think this is another issue we can't easily fix without making instance parameter creation non-lazy.

> Second, what's with the try/except/finally? Deleting all that seems to work just fine, and I can't quite imagine a situation when the try block could fail but the watchers are still valid to reconstruct finally. Maybe something in Panel? For now I deleted it as it does not seem to achieve anything.

I do think it's important, at least if we decide not to fix the underlying issue yet. If there is somehow an error while creating the instance parameter (e.g. because some value is not safe to copy) then the watchers end up being lost entirely.

jbednar · 2023-09-03T19:58:53Z

> If there is somehow an error while creating the instance parameter (e.g. because some value is not safe to copy) then the watchers end up being lost entirely.

How would it help to have watchers copied correctly but not whatever slot value was about to be copied after the failing one? This try/except/finally seems like it would just cover up problems, not avoid them -- why would we want watchers copied correctly when other slots aren't? If we wanted to copy all the slots that we could, and put a warning for any that didn't work, wouldn't we do that in a loop where the try/except was per slot, not for the entire copy operation?
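The per-slot alternative mentioned here could look roughly like the following. This is only a sketch of the idea: `copy_slots_individually` and `Uncopyable` are invented for illustration, and falling back to the shared value after a failed copy is an assumption, not something proposed in this thread:

```python
import copy
import warnings
from types import SimpleNamespace


class Uncopyable:
    """A value whose copy attempt always fails."""

    def __copy__(self):
        raise TypeError("not safe to copy")


def copy_slots_individually(source, slot_names):
    """Shallow-copy each slot on its own, warning about any that fail."""
    copied = {}
    for name in slot_names:
        value = getattr(source, name)
        try:
            copied[name] = copy.copy(value)
        except Exception as exc:
            warnings.warn(f"could not copy slot {name!r}: {exc}")
            copied[name] = value  # assumption: keep sharing the original value
    return copied


source = SimpleNamespace(objects=[1, 2], hook=Uncopyable())
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = copy_slots_individually(source, ["objects", "hook"])

print(result["objects"] is source.objects)  # False: copied
print(result["hook"] is source.hook)        # True: copy failed, value shared
print(len(caught))                          # 1
```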

philippjfr · 2023-09-03T20:01:55Z

It's not about the object being copied to but about restoring the class that's being copied from.

jbednar · 2023-09-03T20:21:44Z

Hmm; that's weird. My understanding of that code is that if copy.copy fails, an exception will be raised, with p remaining the class Parameter object, except that p.watchers has been replaced with a copy of the original dict. Seems like a lot of hoops to be jumping back and forth through to achieve that outcome, i.e. emptying out the watchers and then replacing it with something just like the original.

And then if copy.copy succeeds, no exception will be raised, and p will become the instance's copy of the Parameter, where the instance p.watchers will have a copy of the original dict. But won't the original class Parameter now have an empty watchers dict? How is this working at all when there are multiple instances that need to have the same Parameter instantiated into it? Only the first one gets the watchers?

Unless I'm very confused, it seems like we have two defensible options here: (1) Leave watchers on class Parameter objects, and don't copy them to instances. Instances would need their own watchers, which would be specific to that instance. Watchers on class Parameter objects would be watching the class Parameter object, not instances. (2) Leave watchers on class Parameter objects, and copy them to all instances. Thus anything that watches the class Parameter will also be watching all the instances.

I am not seeing any argument for what it seems like the current code is doing, i.e. emptying out the watchers from the class Parameter object after instantiation. When is that ever appropriate?
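Both outcomes described in this comment can be reproduced by running the quoted snippet against a toy object. `ToyParameter` and `instantiate` are stand-ins made up for this sketch, and the no-op `except: raise` clause is omitted since it does not change the control flow:

```python
import copy


class ToyParameter:
    """Stand-in object that can be told to fail when copied."""

    def __init__(self, fail_on_copy=False):
        self.watchers = {}
        self.fail_on_copy = fail_on_copy

    def __copy__(self):
        if self.fail_on_copy:
            raise RuntimeError("value not safe to copy")
        new = ToyParameter.__new__(ToyParameter)
        new.__dict__.update(self.__dict__)
        return new


def instantiate(p):
    """Same structure as the quoted snippet."""
    try:
        watchers = p.watchers
        p.watchers = {}
        p = copy.copy(p)
    finally:
        # `p` is the copy if copying succeeded, otherwise still the original.
        p.watchers = {k: list(v) for k, v in watchers.items()}
    return p


class_param = ToyParameter()
class_param.watchers = {"value": ["w1"]}
instance_param = instantiate(class_param)
print(class_param.watchers)     # {}
print(instance_param.watchers)  # {'value': ['w1']}

failing = ToyParameter(fail_on_copy=True)
failing.watchers = {"value": ["w1"]}
try:
    instantiate(failing)
except RuntimeError:
    pass
print(failing.watchers)         # {'value': ['w1']}
```

On this toy object, a successful copy leaves the source with an empty watchers dict, while a failed copy restores an equivalent dict on the source.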

jbednar · 2023-09-08T05:06:40Z

@maximlt, I've pushed a few more commits to address the issues above:

In 9ba4aa1 I've restored the watchers logic to its original state using try/except/finally; if it's to be addressed, that should be done in a separate PR. It should now be unchanged from before this PR, just refactored differently.
In 7fcf8d4 I refactored the new Parameters.__getitem__ implementation to pull out the logic for copying Parameter objects per_instance into a new function _instantiated_parameter(). This commit should not change anything about the behavior; it just makes that logic reusable elsewhere.
In 6ce1ef3 I make Parameter.__set__ use _instantiated_parameter() to ensure that the Parameter object has been copied per_instance before any setting is done, so that e.g. _ensure_value_is_in_objects will affect the per_instance Parameter object instead of the class's Parameter object. In this commit I had to update one warning message because it's being generated another level deep in the stack trace, plus I deleted one newly-added line from the tests that was essentially enforcing that the objects on a Selector were stored only in the class, which this commit deliberately changes.

6ce1ef3 is the key one for addressing the problems above. After that commit, I believe these behaviors are now correct:

import param

class A(param.Parameterized):
    p = param.Selector(objects=[1, 2], check_on_set=False, per_instance=True)

class B(A):
    i = param.Integer(8)

b = B()
b.p = 3

print(A.param.p.objects, B.param.p.objects, b.param.p.objects)
# [1, 2] [1, 2] [1, 2, 3]

(b now gets its own per_instance copy of the objects list)

class P(param.Parameterized):
    s = param.Selector(objects=[1, 2], check_on_set=False)

p = P()
p.param.s.objects
p.s = 3
print(P.param.s.objects, p.param.s.objects)
#[1, 2] [1, 2, 3]

class P(param.Parameterized):
    s = param.Selector(objects=[1, 2], check_on_set=False)

p = P()
p.s = 3
print(P.param.s.objects, p.param.s.objects)
#[1, 2] [1, 2, 3]

(invoking __getitem__ is no longer required for p to get its own per_instance copy of the s Parameter object)

I'm ok with keeping the behavior you show in 3. above, i.e. that accessing a Parameter on the .param namespace creates a Parameter on the instance. I agree that if you want to control a Parameter at the class level, you can either change the class values before creating any instances, or you can set per_instance=False. If you change the Parameter at the class level after having instantiated some classes, there's no guarantee that the class-level changes will propagate down. As you show, sometimes they do, but really if you want such changes you have to make them on the instance or you have to set per_instance to False!

So far, I haven't messed with _update_state, because changing that wasn't necessary for anything to do with instantiating slots.

So this PR should be ready to review and merge now; it now does what it says: adds shallow-copying for Parameter slot values that happen to be mutable containers, whenever a Parameter object is copied to an instance or inherits its slot values. Parameter objects are still copied only lazily, as before, but now there is one more case where it gets copied, namely when a parameter value is set on the instance (to handle _ensure_value_is_in_objects).
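The effect of copying on set can be illustrated with a toy model. `ToySelector`, `ToyOwner`, and the `copy_on_set` flag are hypothetical and only mimic the idea of instantiating the per-instance object before the value is added to its objects list:

```python
import copy


class ToySelector:
    def __init__(self, objects):
        self.objects = objects


class ToyOwner:
    """Instance that lazily owns a private copy of a shared selector."""

    def __init__(self, class_param, copy_on_set):
        self.class_param = class_param
        self.copy_on_set = copy_on_set
        self.instance_param = None
        self.value = None

    def _instantiated_param(self):
        """Create the private copy, with its own objects list, on first use."""
        if self.instance_param is None:
            new = copy.copy(self.class_param)
            new.objects = list(new.objects)
            self.instance_param = new
        return self.instance_param

    def set(self, value):
        if self.copy_on_set:
            target = self._instantiated_param()
        else:
            target = self.instance_param or self.class_param
        if value not in target.objects:
            target.objects.append(value)  # analogous to adding an unseen value
        self.value = value


shared_before = ToySelector([1, 2])
ToyOwner(shared_before, copy_on_set=False).set(3)
print(shared_before.objects)  # [1, 2, 3]: the shared list was modified

shared_after = ToySelector([1, 2])
owner = ToyOwner(shared_after, copy_on_set=True)
owner.set(3)
print(shared_after.objects, owner.instance_param.objects)  # [1, 2] [1, 2, 3]
```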

maximlt

I left a question in my review. In addition to that, I would like to get your opinion, @jbednar, on the following examples:

import param

class A(param.Parameterized):
    p = param.Selector(objects=[1, 2])

class B(A):
    p = param.Selector(default=2)

B.param.p.objects.append(3)
print(A.param.p.objects)  # [1, 2]

class A(param.Parameterized):
    p = param.Selector(objects=[1, 2])

class B(A):
    pass

B.param.p.objects.append(3)
print(A.param.p.objects)  # [1, 2, 3]

jbednar · 2023-09-13T23:47:52Z

I assume everyone is happy with the behavior in case 1 (two separate Parameter objects, and two separate objects lists)?

Case 2 is trickier. On the one hand, the current behavior in 2 seems like the only way to define a Parameter object whose settings can easily be inherited across a hierarchy. I.e., not copying the Parameter object between A and B is what lets people mess with the one in A and set things up as they like, while knowing that it will all propagate nicely. Plus, if they do want separate Parameter objects in subclasses, it's easy enough to just put them there, as in 1.

On the other, I do agree that there's some argument for duplicating the Parameter automatically to make case 2 act like case 1, so that they behave independently. On balance I think it's ok that we aren't doing that; seems like we lose more than we gain.
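As an analogy only, ordinary Python class attributes behave the same way as the two cases: a subclass that does not redefine an attribute sees the very same object as its parent, while one that redefines it gets an independent value. The classes below are plain Python and do not use param:

```python
class A:
    objects = [1, 2]


class Inherits(A):
    pass  # like case 2: nothing redefined, so the attribute is looked up on A


class Redefines(A):
    objects = list(A.objects)  # like case 1: the subclass has its own list


Inherits.objects.append(3)   # mutates the list stored on A
Redefines.objects.append(4)  # mutates only the subclass's list

print(A.objects)          # [1, 2, 3]
print(Redefines.objects)  # [1, 2, 4]
print(Inherits.objects is A.objects)  # True
```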

philippjfr · 2023-09-15T14:53:15Z

I'll be addressing watcher related issues in a separate PR. I also think that on balance the behavior in the example above is defensible. So I'm going to consider this ready and merge.

jbednar added 2 commits August 29, 2023 23:41

Shallow-copy mutable slots when instantiating or inheriting

faed678

Set per_instance=False to avoid shallow-copying

b3e6da1

jbednar added this to the 2.0 milestone Aug 30, 2023

jbednar requested review from philippjfr and maximlt August 30, 2023 04:53

maximlt reviewed Aug 30, 2023

jbednar added 4 commits August 30, 2023 08:24

Reverted unnecessary test change

41e3e75

Moved is_mutable to _utils/_is_mutable_container

d715f0b

Make handling of default consistent between instantiation and inheritance

b7ec94f

Fixed flake

769623e

jbednar and others added 2 commits August 30, 2023 11:06

Restore semi-shallow copying of watchers

0de38ca

add tests

5747f2f

maximlt mentioned this pull request Sep 4, 2023

Class-level watchers and copying watchers behaviors #829

Closed

jbednar added 4 commits September 7, 2023 22:55

Refactor getitem (no change in logic)

7fcf8d4

Instantiate parameter object on instance when setting value

6ce1ef3

Restore watchers handling to deal with in a separate PR

9ba4aa1

File cleanup; no code change

326799e

jbednar mentioned this pull request Sep 8, 2023

Selector objects shared across instances and class #746

Closed

jbednar changed the title ~~Shallow-copy slot values~~ Shallow-copy mutable containers in slot values Sep 8, 2023

maximlt added 2 commits September 13, 2023 14:30

small refactoring

60daf93

add more tests

dada602

maximlt reviewed Sep 13, 2023

philippjfr merged commit ae8c6fb into main Sep 15, 2023

philippjfr deleted the shallowcopyslots branch September 15, 2023 14:53

maximlt mentioned this pull request Sep 15, 2023

Ensure Parameter re-validates default if slot changes #820

Merged

philippjfr mentioned this pull request Sep 15, 2023

Ensure class watchers are not inherited by instance parameter #833

Merged
