Parameter instances can compare equal while having different hash values #11654

wshanks · 2024-01-26T20:38:10Z

Environment

Qiskit Terra version: 0.45.2
Python version: 3.12
Operating system: Fedora Linux 39

What is happening?

Currently, two Parameter instances can compare equal while not having the same hash value. This condition violates the Python data model which is documented as requiring:

The only required property is that objects which compare equal have the same hash value

How can we reproduce the issue?

a = Parameter("a")
b = Parameter("b", uuid=a._uuid)

a == b  # True
hash(a) == hash(b)  # False

What should happen?

Here there is a decision to make. Parameters are essentially a tuple of a name string and a Python UUID. Currently, __eq__ uses only the UUID while __hash__ uses the name and the UUID. The two methods should use the same data.

In #11647 (comment), @jakelishman mentioned that the documentation states that the name is required for equality, so I created #11652 to add the name to the equality check (I think @jakelishman was referring to "Setting the uuid field when creating two parameters to the same thing (along with the same name) allows them to be equal." in the Parameter.__init__ docstring). However, in the pulse module, there is code that makes use of the fact that parameter names are not part of the equality test by redefining the name:

qiskit/qiskit/pulse/schedule.py

Lines 882 to 912 in ce02f06

    
               .. rubric:: Program Scoping 
        
               When you call a subroutine from another subroutine, or append a schedule block 
        
               to another schedule block, the management of references and parameters 
        
               can be a hard task. Schedule block offers a convenient feature to help with this 
        
               by automatically scoping the parameters and subroutines. 
        
               .. code-block:: 
        
                   from qiskit import pulse 
        
                   from qiskit.circuit.parameter import Parameter 
        
                   amp1 = Parameter("amp") 
        
                   with pulse.build() as sched1: 
        
                       pulse.play(pulse.Constant(100, amp1), pulse.DriveChannel(0)) 
        
                   print(sched1.scoped_parameters()) 
        
               .. parsed-literal:: 
        
                  (Parameter(root::amp),) 
        
               The :meth:`~ScheduleBlock.scoped_parameters` method returns all :class:`~.Parameter` 
        
               objects defined in the schedule block. The parameter name is updated to reflect 
        
               its scope information, i.e. where it is defined. 
        
               The outer scope is called "root". Since the "amp" parameter is directly used 
        
               in the current builder context, it is prefixed with "root". 
        
               Note that the :class:`Parameter` object returned by :meth:`~ScheduleBlock.scoped_parameters` 
        
               preserves the hidden `UUID`_ key, and thus the scoped name doesn't break references 
        
               to the original :class:`Parameter`.

So we have to decide which way to go:

Remove name from the hash calculation so that only the UUID matters for Parameter instances. In practice, this is probably fine since the UUID's should always be unique.
Add name to the equality test and rework the pulse code. It is just using the renaming for convenience of labeling parameters. Perhaps the labeling could be done differently (scoped_parameters could return a dict with the keys being the scoped parameter names?). I don't see scoped_parameters used anywhere here or qiskit-experiments currently.

I lean toward option 2. I think a key question is how tied is Parameter to ParameterExpression (technically it is a direct subclass so very tied 🙂). If Parameter just needs to be a variable that floats around through code objects and can later be assigned a value, allowing different names seems fine. However, if it is important that a Parameter can always be used in an expression and then later substituted for a different value, allowing different names is a problem because ParameterExpression maps the names directly to sympy/symengine symbols and wants to substitute them using the sympy/symengine mechanism. If Parameters are being passed around and manipulated, you could end up calling assign with a Parameter instance with a different name from the one that was used to create the underlying symbolic expression of a ParameterExpression. Or you could say that both use cases are allowed -- when doing simple value substitution renaming Parameters is okay; when doing symbolic expressions, you have to be careful not to reuse UUID's with different names.

@nkanazawa1989 What do you think as the author of the scoped_parameters code?

Any suggestions?

No response

The text was updated successfully, but these errors were encountered:

ihincks · 2024-01-26T20:50:35Z

Doesn't something like 1. need to happen in any case because of the Python data model violation? I'm having trouble imagining a case where implementing 1. would be worse than the current situation. Of course, this doesn't imply that 1. alone is the optimal route here.

wshanks · 2024-01-26T20:54:22Z

Sorry, I should have started 2 by saying "Add name to the equality test and rework the pulse code" (edited it to say that now). The choice is either make both eq and hash use only UUID or make them both use UUID and name.

ihincks · 2024-01-26T20:55:53Z

Ah okay, that makes more sense, thanks for clarifying.

jakelishman · 2024-01-26T22:48:17Z

Ah yeah, that's not ideal about the pulse code. I'd also lean in favour of 2, since it's the documented behaviour and imo the less surprising version of equality.

With the public uuid getter in place, if Pulse wants to continue using its current system, maybe it could mechanically change all the equality checks to a.uuid == b.uuid and it'd generally work?

wshanks · 2024-01-27T04:48:33Z

One thing that is interesting is that the pulse code actually suffers from this bug. The ScheduleBlock docstring says that

Note that the Parameter object returned by scoped_parameters() preserves the hidden UUID key, and thus the scoped name doesn’t break references to the original Parameter.

However, an important use case of parameter referencing is broken as demonstrated by this code:

from qiskit import pulse
from qiskit.circuit import Parameter


f = Parameter("f")
with pulse.build() as sched:
    pulse.set_frequency(f, pulse.DriveChannel(0))


scoped_f = next(iter(sched.scoped_parameters()))

print(f"{f=}")
print(f"{scoped_f=}")

print(f"{sched=}")
print(f"assign f to 5: {sched.assign_parameters({f: 5}, inplace=False)}")
print(f"assign scoped_f to 5: {sched.assign_parameters({scoped_f: 5}, inplace=False)}")

which prints

f=Parameter(f)
scoped_f=Parameter(root::f)
sched=ScheduleBlock(SetFrequency(f, DriveChannel(0)), name="block0", transform=AlignLeft())
assign f to 5: ScheduleBlock(SetFrequency(5, DriveChannel(0)), name="block0", transform=AlignLeft())
assign scoped_f to 5: ScheduleBlock(SetFrequency(f, DriveChannel(0)), name="block0", transform=AlignLeft())

Because the parameter assignment code uses dicts and sets to manage the parameters, the fact that f and scoped_f do not have the same hash causes the assignment with scoped_f to fail because the test of whether scoped_f is in sched's parameters fails.

I'd also lean in favour of 2, since it's the documented behaviour and imo the less surprising version of equality.

To be fair to the pulse code, the documentation I referenced above was only added in 0.45.0 in #10875. I am not sure if there is anything else that says that a Parameter compares with UUID and name rather than just UUID? The behavior of hash using the name and UUID only started with that same change as well. Prior to that, only the UUID was used for the hash. The motivation for that change was to make Parameter and ParameterExpression instances that compare equal also hash to the same value. That motivation pushes towards treating Parameters as elements of expressions rather than simple values to be substituted (going back to the decision I mentioned before).

The problem with the pulse code is that its tests all create a schedule with parameters and then check the scoped parameters for equality with the original parameters. None of them tests try to assign a value to a schedule using the scoped parameter. Otherwise, this issue would have been caught when reviewing #10875.

Given the need to align Parameter with ParameterExpression, I think we need to add the name to the equality test of Parameter. That means that 0.45 broke the intended use case for ScheduleBlock.scoped_parameters. Maybe the best option is what you suggested just above of just living with this and having pulse code check parameters by UUID if they want to match them up. I think we could transition to a different format for the scoped parameters. Instead of returning tuple[Parameter(scoped_name), ...], it could return dict[scoped_name, Parameter(original_name)]. We could use a keyword argument to select the output and change the default format to the new one in a later release if we wanted to keep backwards compatibility. I think the scoped parameters is a niche enough use case and the current behavior is difficult to work with enough that I wouldn't rule out just switching the return format now and treating that as a bug fix, since I am not sure there is a good fix without breaking compatibility.

wshanks · 2024-01-27T15:01:00Z

I thought of one way to fix the pulse code with equality testing name and UUID. We could add a new ScopedParameter subclass of Parameter that keeps the _name and _uuid from Parameter for __eq__ and __hash__ but also stores a _scope attribute which it uses to override the name property. I think this is overkill (I think just returning a dict fits the need for scoped_parameters without adding a new class) and is still vulnerable to code assuming that name matches the symbol in symbolic expressions, but it would fit the intended current API behavior.

jakelishman · 2024-01-29T10:09:33Z

I'm always a bit nervous around introducing subclasses on objects that weren't intended to have them, because it introduces an extra layer that can get out of sync. For example here, it'd be easy for us to have the hash computer (accidentally or otherwise) from within Parameter using the _name attribute, but ofc that would be out of sync with a subclass - the point being that subclassing and overriding name breaks the assumption that Parameter is allowed to make within its own code for performance that _name directly backs name. That said, the size of the change is small, and Parameter is immutable, so the chances of a subclass breaking are fairly slim (and also, I won't stand in the way of how pulse wants to work internally).

For #10875 - yeah, I wasn't blaming Pulse, just trying to find the most consistent path. I hadn't realised the name/UUID equality doc got added in that PR, but I/we should have spotted the __eq__ problem at that point.

nkanazawa1989 · 2024-01-30T14:51:59Z

Indeed I was trying to introduce another parameter class (not subclass) designed for the pulse module when we investigated the design of the calibration module in Qiskit Experiments. Pulse, or calibration template, doesn't allow name conflict within the same context, and the name must be a unique property in this domain while circuit does allow the name conflict. This made me think the machinery with UUID was not necessary in the pulse domain. Unfortunately, we didn't adopt the idea, because of an issue in the parameter linking between a circuit and pulse program via the pulse gate framework.

I think the nature of pulse parameter comes from the conventional design of the calibration software -- we start from a configuration object where a plan text stores all system information. Since we have several conventional names, e.g. "amp" representing the hight of the waveform, we usually append the scope information to the parameter name, and the scope usually means the pulse kind and in which channel it is played.

We preferred the convenience of parameter link, and decided to push the complexity arising from the representation difference to the ScheduleBlock itself. I feel the current implementation is indeed too complicated to maintain, and I'd love to have another (more pulse oriented parameter) class and simplify the implementation of the schedule block. This could break the parameter link in the pulse gate because it might be difficult to automatically create mapping between different classes. On the other hand, recently I think the pulse gate is bit outdated. Considering what we are looking into, namely, writing a circuit with more than 100 qubit width, this representation will quickly become inefficient and we need more efficient mechanism.

With the refactoring in the pulse feature branch, I'd like to support the qe-compiler PulseDialect output in which they don't have the pulse gate representation. This means we introduce clear separation between circuit and pulse. Something like glueing the circuit transpiler and pulse compiler together with a IR-conversion pass which is effectively a scheduler. In this world, no pulse gate exist, and representation is much robust and scalable.

Anyways, I'm also fine with dropping the scope feature from the ScheduleBlock in 1.0 if this is really problematic (IIRC Qiskit Experiments doesn't use this feature and implements own parameter management mechanism).

wshanks · 2024-01-30T17:43:42Z

Thanks, @nkanazawa1989. Yes, I agree with the direction you outline. To summarize, I think scoped_parameters is little used and is somewhat broken in the current context of Qiskit (it worked before 0.45). It is also not aligned with where we want to go strategically with pulse programming. What I want to resolve here is what to do about it, ideally for Qiskit 1.0.

Let's assume that we accept the code currently in #11652 (Parameter __eq__ and __hash__ use both name and UUID). Then some options are:

Remove scoped_parameters. We could deprecate in 0.46 and remove in 1.0, right?
Leave scoped_parameters unchanged. Change the tests to compare UUID instead of Parameter equality as @jakelishman suggested. This is easiest but scoped_parameters is almost useless because the parameters can not be used for assignment. The documentation should cover this.
Change scoped_parameters to return a dictionary mapping scope to the original parameters. This would make it possible to look up parameters by scope but would be a breaking API change.
Add a new subclass like I suggested. I share @jakelishman's concern that other code could use _name and name interchangeably and run into confusing bugs.

@nkanazawa1989 Do you prefer 1?

nkanazawa1989 · 2024-01-31T03:57:41Z

Yes. I can quickly approve a deprecation PR to 0.46 branch if you can prepare before the deadline.

As noted in Qiskit#11654, parameter scoping does not work properly beginning with Qiskit 0.45.0. The problem is that the `Parameter` name is now used for the hash value and so the parameter produced with the scoped name is not equivalent to the original parameter and does not substitute for it when using parameter assignment. Since the scoped parameter mechanism was a convenience feature and not widely used, it is simply deprecated rather than made to work with parameter assignment.

wshanks · 2024-01-31T17:51:59Z

I opened #11654 to deprecate the scoping code. Next I will open a PR to main to remove it along with its tests which are blocking #11652. Then #11652 can close this issue. I know there is a deadline of Feb. 1 for some things. Please let me know if that is a problem for this plan, where the main goal is getting the fix for the issue here into 1.0 (I think it's okay to miss the rc as long as it makes the final release?).

jakelishman · 2024-01-31T17:56:26Z

Anything that misses rc1 should really trigger us to create an rc2 (though in practice I don't think it will unless it's critical), so it really should be a strategy of last resort.

I can briefly check a pulse PR before Naoki comes back online, but it'll be best if he does final review once he's back. After that, #11652 should be trivial and I'll throw it in the merge queue.

nkanazawa1989 · 2024-01-31T18:16:03Z

hehe I'm still online. Just approved but #11691 still needs lint fix. @jakelishman can you take over and approve it? The PR looks fine with me.

jakelishman · 2024-01-31T18:49:05Z

np, thanks Naoki - yeah, I'll do it in a bit.

The `scoped_parameters` and `search_parameters` methods have been removed from the `ScheduleBlock` class. These methods returned `Parameter` objects that partially linked to the parameters in the `ScheduleBlock` instance but assigning values using these objects did not work correctly. Users should use `ScheduleBlock.parameters` instead and iterate through `ScheduleBlock.references` and compare to the `Schedule.parameters` attributes of the subreferences when needing to distinguish which subroutine a parameter is used in. See Qiskit#11654 for more information.

* Remove pulse scoped_parameters and search_parameters The `scoped_parameters` and `search_parameters` methods have been removed from the `ScheduleBlock` class. These methods returned `Parameter` objects that partially linked to the parameters in the `ScheduleBlock` instance but assigning values using these objects did not work correctly. Users should use `ScheduleBlock.parameters` instead and iterate through `ScheduleBlock.references` and compare to the `Schedule.parameters` attributes of the subreferences when needing to distinguish which subroutine a parameter is used in. See #11654 for more information. * Lint and sphinx clean up

* Deprecate pulse reference parameter scoping As noted in #11654, parameter scoping does not work properly beginning with Qiskit 0.45.0. The problem is that the `Parameter` name is now used for the hash value and so the parameter produced with the scoped name is not equivalent to the original parameter and does not substitute for it when using parameter assignment. Since the scoped parameter mechanism was a convenience feature and not widely used, it is simply deprecated rather than made to work with parameter assignment. * black * Update releasenotes/notes/deprecate-pulse-parameter-scoping-6b6b50a394b57937.yaml Co-authored-by: Jake Lishman <jake@binhbar.com> * Move warning to documentation --------- Co-authored-by: Jake Lishman <jake@binhbar.com> Co-authored-by: Jake Lishman <jake.lishman@ibm.com>

wshanks added the bug Something isn't working label Jan 26, 2024

wshanks mentioned this issue Jan 26, 2024

Ensure that Parameters compare equal when they have the same hash #11652

Merged

jakelishman mentioned this issue Jan 26, 2024

Add uuid property to Parameter #11647

Merged

wshanks mentioned this issue Jan 31, 2024

Deprecate pulse parameter scoping #11691

Merged

wshanks mentioned this issue Jan 31, 2024

Remove pulse scoped_parameters and search_parameters #11692

Merged

jakelishman closed this as completed in #11652 Feb 23, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Parameter instances can compare equal while having different hash values #11654

Parameter instances can compare equal while having different hash values #11654

wshanks commented Jan 26, 2024 •

edited

Loading

ihincks commented Jan 26, 2024 •

edited

Loading

wshanks commented Jan 26, 2024

ihincks commented Jan 26, 2024

jakelishman commented Jan 26, 2024

wshanks commented Jan 27, 2024 •

edited

Loading

wshanks commented Jan 27, 2024

jakelishman commented Jan 29, 2024

nkanazawa1989 commented Jan 30, 2024

wshanks commented Jan 30, 2024

nkanazawa1989 commented Jan 31, 2024

wshanks commented Jan 31, 2024

jakelishman commented Jan 31, 2024

nkanazawa1989 commented Jan 31, 2024

jakelishman commented Jan 31, 2024

Parameter instances can compare equal while having different hash values #11654

Parameter instances can compare equal while having different hash values #11654

Comments

wshanks commented Jan 26, 2024 • edited Loading

Environment

What is happening?

How can we reproduce the issue?

What should happen?

Any suggestions?

ihincks commented Jan 26, 2024 • edited Loading

wshanks commented Jan 26, 2024

ihincks commented Jan 26, 2024

jakelishman commented Jan 26, 2024

wshanks commented Jan 27, 2024 • edited Loading

wshanks commented Jan 27, 2024

jakelishman commented Jan 29, 2024

nkanazawa1989 commented Jan 30, 2024

wshanks commented Jan 30, 2024

nkanazawa1989 commented Jan 31, 2024

wshanks commented Jan 31, 2024

jakelishman commented Jan 31, 2024

nkanazawa1989 commented Jan 31, 2024

jakelishman commented Jan 31, 2024

wshanks commented Jan 26, 2024 •

edited

Loading

ihincks commented Jan 26, 2024 •

edited

Loading

wshanks commented Jan 27, 2024 •

edited

Loading