Fix Optimize1qGatesDecomposition length heuristic #6553
looks good, two small comments
...tes/fixed-bug-in-Optimize1qGatesDecomposition-skipping-short-sequences-044a64740bf414a7.yaml
qiskit/transpiler/passes/optimization/optimize_1q_decomposition.py
I wonder if we can't completely remove the check if the new run is shorter. With the changes in #5827 I'm hoping there should be no remaining cases where the resynthesized sequence is ever non-optimal (and if any do remain, I consider it a bug). Maybe for now get the check to raise an exception if the new sequence is longer?
I added a branch for a warning, rather than an error. It would be obnoxious for a synthesis flaw to generate suboptimal programs, but probably worse to then prevent a user from running at all. I'll check out the test failures tomorrow.
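The trade-off in the two comments above (warn rather than raise when resynthesis lengthens an already-native run) can be sketched as follows. This is a minimal illustration and not the code in the PR: `pick_sequence`, the use of plain gate-name lists, and the warning text are all hypothetical stand-ins.

```python
import warnings


def pick_sequence(run, new_circ, basis):
    """Return the sequence to keep, warning if resynthesis made a native run longer.

    Hypothetical stand-in: `run` and `new_circ` are lists of gate names and
    `basis` is a set of native gate names.
    """
    if all(name in basis for name in run) and len(run) < len(new_circ):
        # A longer resynthesis of an already-native run points at a synthesis
        # flaw. Warn instead of raising so the user can still run the circuit.
        warnings.warn(
            "Resynthesized 1q sequence is longer than the original; "
            "keeping the original.",
            stacklevel=2,
        )
        return run
    return new_circ
```

Raising here would turn a suboptimal-but-correct result into a hard failure, which is the outcome the second comment argues against.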
Overall this LGTM, thanks for fixing this. I agree all that logic is out of date with #5827 now, since the issues that the single-gate case was there for have been fixed. This will also probably improve performance a bit since we're not going to do an identity check on each standalone 1q gate. Just a few nits inline, but then I think this is good to go.
Also can you update the release note to say you've also fixed #6473 because I believe this will fix that issue too. (basically just add a new bullet point to the release note yaml for the second fix).
qiskit/transpiler/passes/optimization/optimize_1q_decomposition.py
...tes/fixed-bug-in-Optimize1qGatesDecomposition-skipping-short-sequences-044a64740bf414a7.yaml
@@ -102,7 +84,13 @@ def run(self, dag):
                     new_circs.append(decomposer._decompose(operator))
             if new_circs:
                 new_circ = min(new_circs, key=len)
-                if len(run) > len(new_circ) or (single_u3 and new_circ.data[0][0].name != "u3"):
+                if all(g.name in self.basis for g in run) and len(run) < len(new_circ):
Do you have an idea on what the runtime cost of this additional check is? If it's noticeable I'd rather not add this to just raise a warning because this will be called quite a lot during a transpile call.
I don't have data, but I imagine it's negligible relative to the nonnegotiable cost of constructing `new_circ`: `self.basis` tends not to be super large, and `decomposer` has to read all of `run` anyhow. I'll do a small amount of with/without benchmarking.
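A rough way to isolate the cost of the membership check on its own is a `timeit` micro-benchmark like the one below. Everything here is an assumption for illustration: `Gate` is a stand-in for a DAG node, and the basis and run contents are made up. A meaningful comparison would still time the whole pass with and without the check.

```python
import timeit
from collections import namedtuple

# Hypothetical stand-in for a DAG node; only the `name` attribute matters here.
Gate = namedtuple("Gate", ["name"])

basis = ["rz", "sx", "x"]                       # assumed small native basis
run = [Gate("rz"), Gate("sx"), Gate("rz")] * 2  # assumed short 1q run


def in_basis(run, basis):
    """The extra membership check under discussion."""
    return all(g.name in basis for g in run)


def measure(number=100_000):
    """Return the average seconds per call of `in_basis` over `number` calls."""
    total = timeit.timeit(lambda: in_basis(run, basis), number=number)
    return total / number


if __name__ == "__main__":
    print(f"membership check: {measure():.2e} s per run")
```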
Oh I didn't see the test failures before reviewing and just assumed they all passed. We probably should get to the bottom of those. Most I think are fine and we just need to update the tests like But
Yeah this is tricky. The Euler Z(phi)X(theta)Z(lambda) decomposition should find an optimal expansion for 0<=theta<=pi. The input circuit has theta=-pi/2, which is outside that range, so the Euler decomposer includes the extra Z(). It wouldn't be crazy to change the Euler decomposer to check whether the negated angle theta := -theta allows such a simplification, and to emit that if it's fewer gates. On the other hand, both of these solutions are optimal in terms of X() gates, which is what actually costs something for many hardware implementations.
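The sign flip mentioned above rests on the identity Z(pi) X(theta) Z(-pi) = X(-theta), so that Z(phi) X(-theta) Z(lambda) = Z(phi + pi) X(theta) Z(lambda - pi). The sketch below checks this numerically with plain 2x2 matrices. It assumes the standard rotation conventions RZ(phi) = diag(e^{-i phi/2}, e^{i phi/2}) and RX(theta) = exp(-i theta X / 2); the helper names are hypothetical and not part of the Euler decomposer.

```python
import cmath
import math


def rz(phi):
    """2x2 matrix of a Z rotation by phi."""
    return [[cmath.exp(-1j * phi / 2), 0], [0, cmath.exp(1j * phi / 2)]]


def rx(theta):
    """2x2 matrix of an X rotation by theta."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -1j * s], [-1j * s, c]]


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


def zxz(phi, theta, lam):
    """Matrix product Z(phi) . X(theta) . Z(lam)."""
    return matmul(rz(phi), matmul(rx(theta), rz(lam)))


# Arbitrary test angles (assumed values, chosen only for the check).
phi, theta, lam = 0.3, math.pi / 2, -1.1
sign_flip_holds = close(zxz(phi, -theta, lam),
                        zxz(phi + math.pi, theta, lam - math.pi))

# A bare X(-pi/2) needs two extra Z rotations once theta is forced positive.
bare_x_matches = close(rx(-math.pi / 2), zxz(math.pi, math.pi / 2, -math.pi))
```

The second check illustrates the gate-count point: with theta restricted to be non-negative, a single X(-pi/2) is expressed with two additional Z rotations, whereas allowing the negative angle needs none.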
Some of the test problems were legitimate: the slot Fixed that, updated some of the reference circuits in the test suite, but this still isn't quite good to go:
Ah good catch, yeah I missed that in my earlier review,
qiskit/transpiler/passes/optimization/optimize_1q_decomposition.py
I ran the quantum volume benchmarks from the asv benchmark suite on this just now and it causes a roughly 30% run time performance regression:
My assumption is that this is caused by all the times we're iterating over the gates in each run to check for calibrations now. I haven't profiled it yet but I can do that on Monday. But until we get to the bottom of this I've removed the automerge label to prevent mergify from automerging this.
It would probably be a good idea to add a couple of extra lines to
Co-authored-by: Lev Bishop <18673315+levbishop@users.noreply.github.com>
Good idea. Done.
…/qiskit-terra into bugfix/nonnative-1q-heuristic
I reran benchmarks just now, it looks like the performance regression from earlier has now been resolved and the performance is equivalent to the current
I'm a bit concerned about that variance, especially as I ran this on my local benchmark system, which doesn't have anything else running on it. But it doesn't look outside the norm for the benchmark (my guess is that the seeding for the qv circuit generation isn't correct or something like that).
I also see really strong variance across benchmarking runs, and it sounds worth looking into. I don't think the variance is something I introduced, but who knows what it might be covering up.
LGTM, one nit inline about docs formatting but it's not worth blocking over as it doesn't get published. I definitely appreciate all the comments added, it makes it easier to trace through things now.
    """
    Installs the angles phi, theta, and lam into a KAK-type decomposition of the form
    K(phi) . A(theta) . K(lam) , where K and A are an orthogonal pair drawn from RZGate, RYGate,
    and RXGate.

    Behavior flags:
        `simplify` indicates whether gates should be elided / coalesced where possible.
        `allow_non_canonical` indicates whether we are permitted to reverse the sign of the
            middle parameter, theta, in the output. When this and `simplify` are both enabled,
            we take the opportunity to commute half-rotations in the outer gates past the middle
            gate, which permits us to coalesce them at the cost of reversing the sign of theta.

    NOTE: The input value of `theta` is expected to lie in [0, pi).
    """
Since this is a private method and neither the docs builds nor the linters care about docstring formatting on it, this isn't a big issue. But I wonder if it would be better to restructure this in the normal docstring format for the napoleon plugin just to be consistent with the rest of the docs. If this were a built and published doc it wouldn't actually pass CI.
a by-hand attempt at reformatting the docstring
This was a marathon... thanks for sticking with it
Summary
The decision in `Optimize1qGatesDecomposition` whether to use the re-synthesized gate string or the original gate string is made purely by length: we prefer whichever of the new string and the old string is shorter. This is a rough approximation to the expected infidelity cost of the sequence when both strings are of native gates. An input string with non-native gates might be unnaturally short, and hence persist into the output, which is a bug.
Details and comments
Before:
After:
I also removed some special casing on `U3` and single gates that seemed to overlap with this oversight. Happy to reverse that.
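The bug described in the summary can be reproduced with a toy model of the substitution rule. This is a sketch under stated assumptions, not the pass itself: gates are represented by their names only, `substitute_before` and `substitute_after` are hypothetical helpers, and the `U3` special casing and calibration handling are left out.

```python
def substitute_before(run, new_circ):
    """Length-only rule: substitute only when the resynthesis is shorter."""
    return len(run) > len(new_circ)


def substitute_after(run, new_circ, basis):
    """Basis-aware rule: always substitute when the run has non-native gates."""
    has_non_native = any(name not in basis for name in run)
    return has_non_native or len(run) > len(new_circ)


basis = {"rz", "sx"}           # assumed native basis
run = ["h"]                    # one non-native gate: "unnaturally short"
new_circ = ["rz", "sx", "rz"]  # assumed native resynthesis of the same operator

kept_non_native = not substitute_before(run, new_circ)  # the bug: "h" survives
fixed = substitute_after(run, new_circ, basis)          # "h" is replaced
```

Under the length-only rule the one-gate run wins the comparison and the non-native gate persists into the output. The basis-aware rule uses length only to choose between sequences that are already native.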