Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

'Peephole' optimization - or: collecting and optimizing two-qubit blocks - before routing #12727

Merged
merged 51 commits into from
Aug 1, 2024

Conversation

sbrandhsn
Copy link
Contributor

Summary

Fixes #12562

Details and comments

This PR collects and consolidates two-qubit gates into blocks before routing. Each block is then subsequently analysed to determine whether the two-qubit unitary representing the block equals the identity or is a product of single-qubit gates. In the former case, the gates in the block are removed and in the latter case the block is replaced by single-qubit gates correspondingly.

Points up for debate:

  • Do we want to run the new pass Split2QUnitaries as part of ConsolidatedBlocks?
  • Do the passes for gate direction checks and fixes always play nicely with having two-qubit unitaries before routing? There were no test failures but I'm still wondering.
  • Is the current path of checking for the two-qubit unitaries for being a product of single-qubit unitaries the fastest or should we rely on Weyl-decomposition? If the former is true, we should probably port the corresponding functions to rust as they will touch a lot of two-qubit unitaries. :-)

@sbrandhsn sbrandhsn requested a review from a team as a code owner July 5, 2024 18:53
@qiskit-bot
Copy link
Collaborator

One or more of the following people are relevant to this code:

  • @Qiskit/terra-core

@coveralls
Copy link

coveralls commented Jul 5, 2024

Pull Request Test Coverage Report for Build 9812549561

Details

  • 45 of 45 (100.0%) changed or added relevant lines in 3 files are covered.
  • 7 unchanged lines in 2 files lost coverage.
  • Overall coverage increased (+0.02%) to 89.843%

Files with Coverage Reduction New Missed Lines %
crates/qasm2/src/expr.rs 1 94.02%
crates/qasm2/src/lex.rs 6 91.6%
Totals Coverage Status
Change from base Build 9807934619: 0.02%
Covered Lines: 65234
Relevant Lines: 72609

💛 - Coveralls

@coveralls
Copy link

coveralls commented Jul 8, 2024

Pull Request Test Coverage Report for Build 10198580745

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 59 of 62 (95.16%) changed or added relevant lines in 4 files are covered.
  • 299 unchanged lines in 10 files lost coverage.
  • Overall coverage increased (+0.06%) to 89.764%

Changes Missing Coverage Covered Lines Changed/Added Lines %
qiskit/transpiler/preset_passmanagers/builtin_plugins.py 28 29 96.55%
qiskit/transpiler/passes/optimization/split_2q_unitaries.py 29 31 93.55%
Files with Coverage Reduction New Missed Lines %
crates/qasm2/src/lex.rs 3 93.23%
crates/circuit/src/imports.rs 3 77.78%
qiskit/circuit/library/standard_gates/u.py 3 93.07%
qiskit/qasm3/exporter.py 4 95.46%
qiskit/circuit/commutation_checker.py 5 96.99%
qiskit/circuit/_utils.py 6 92.0%
crates/circuit/src/parameter_table.rs 12 94.55%
crates/circuit/src/circuit_data.rs 35 95.14%
qiskit/circuit/quantumcircuit.py 87 93.54%
crates/circuit/src/operations.rs 141 88.68%
Totals Coverage Status
Change from base Build 10186441336: 0.06%
Covered Lines: 67356
Relevant Lines: 75037

💛 - Coveralls

mtreinish added a commit to mtreinish/qiskit-core that referenced this pull request Jul 8, 2024
This commit migrates the two functions in the private module
`qiskit.synthesis.two_qubit.local_invariance` to primarily be
implemented in rust. Since the two_qubit_local_invariants() function is
being used in Qiskit#12727 premptively porting these functions to rust will
both potentially speed up the new transpiler pass in that PR and also
facilitate a future porting of that pass to rust. The two python space
functions continue to exist as a small wrapper to do input type
checking/conversion and rounding of the result (since the python API for
rounding is simpler). There is no release note included for these
functions as they are internal utilities in Qiskit and not exposed as a
public interface.
mtreinish added a commit to mtreinish/qiskit-core that referenced this pull request Jul 8, 2024
This commit migrates the two functions in the private module
`qiskit.synthesis.two_qubit.local_invariance` to primarily be
implemented in rust. Since the two_qubit_local_invariants() function is
being used in Qiskit#12727 premptively porting these functions to rust will
both potentially speed up the new transpiler pass in that PR and also
facilitate a future porting of that pass to rust. The two python space
functions continue to exist as a small wrapper to do input type
checking/conversion and rounding of the result (since the python API for
rounding is simpler). There is no release note included for these
functions as they are internal utilities in Qiskit and not exposed as a
public interface.
@mtreinish
Copy link
Member

Is the current path of checking for the two-qubit unitaries for being a product of single-qubit unitaries the fastest or should we rely on Weyl-decomposition? If the former is true, we should probably port the corresponding functions to rust as they will touch a lot of two-qubit unitaries. :-)

I took care of this in: #12739 since it was very simple. From a complexity PoV I would expect the local_invariants() function to be much faster since it's far less complex the the TwoQubitWeylDecomposition class.

Copy link
Member

@mtreinish mtreinish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this looks great to me, I really like this idea of collecting 2q unitary blocks before routing and deferring synthesis until the translation phase. It avoids the potential risk of early synthesis making routing worse and will potentially make routing faster because we have reduced the total gate count.

I left some inline comments, on the new pass itself most of the suggestions are adjusting the access patterns to get attributes from the DAGOpNode directly instead of using the python op. Although some of these access patterns will shift after #12730 merges so we should keep an eye on potential conflicts there. The other thing is after #12739 merges and decompose_two_qubit_product_gate() is ported to rust we actually can move the majority of this pass to rust, where we pass a list of DAGOpNodes to rust do the filtering and resynthesis in rust, and return a list of 1q unitary pairs. This is basically what's done in Optimize1qGatesDecomposition now and I expect that it will be faster than doing it all python side. But that can wait until we have the necessary features in rust.

@@ -109,6 +111,16 @@ def _define(self):
self._definition = QuantumCircuit(2)
self._definition.cx(0, 1)

def to_matrix(self) -> np.ndarray:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed? Should this skip the gate if the matrix is empty?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking into this - a couple of tests were failing...

Comment on lines 44 to 47
dag_node.apply_operation_back(UnitaryGate(ur), qargs=(node.qargs[0],))
dag_node.apply_operation_back(UnitaryGate(ul), qargs=(node.qargs[1],))
dag_node.global_phase += phase
dag.substitute_node_with_dag(node, dag_node)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While we can use substitute node with dag here, it'll probably be a bit more performant to do something like:

https://github.com/Qiskit/qiskit/blob/main/qiskit/transpiler/passes/optimization/optimize_1q_decomposition.py#L244-L248

Where we just insert the decomposition gate before the 2q gate and then just remove the node. This should be a bit faster because it avoids all the extra complexity in the dag substitution code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, thanks. I will modify this.

qiskit/transpiler/preset_passmanagers/builtin_plugins.py Outdated Show resolved Hide resolved
Comment on lines 4 to 16
Added a new pass :class:`.Split2QUnitaries` that iterates over all two-qubit gates or unitaries in a
circuit and replaces them with two single-qubit unitaries, if possible without introducing errors, i.e.
the two-qubit gate/unitary is actually a (kronecker) product of single-qubit unitaries. In addition,
the passes :class:`.Collect2qBlocks`, :class:`.ConsolidateBlocks` and :class:`.Split2QUnitaries` are
added to the `init` stage of the preset pass managers with optimization level 2 and optimization level 3
if the pass managers target a :class:`.Target` that has a discrete basis gate set, i.e. all basis gates
have are not parameterized. The modification of the `init` stage should allow for a more efficient routing
for quantum circuits that (a) contain two-qubit unitaries/gates that are actually a product of single-qubit gates
or (b) contain multiple two-qubit gates in a continuous block of two-qubit gates. In the former case,
the routing of the two-qubit gate can simply be skipped as no real interaction between a pair of qubits
occurs. In the latter case, the lookahead space of routing algorithms is not 'polluted' by superfluous
two-qubit gates, i.e. for routing it is sufficient to only consider one single two-qubit gate per
continuous block of two-qubit gates.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would split this into two release notes (with separate bullet points under features). One documenting the new pass, the other documenting the modifications to the preset pass manager.

Co-authored-by: Matthew Treinish <mtreinish@kortar.org>
Copy link
Member

@mtreinish mtreinish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is basically ready. I just left some small inline comments on the tests and docs but we can do these in follow ups pretty easily as long as we do it before the 1.2.0 final release.

test/python/compiler/test_transpiler.py Show resolved Hide resolved
test/python/transpiler/test_split_2q_unitaries.py Outdated Show resolved Hide resolved
@sbrandhsn
Copy link
Contributor Author

The code looks fine, module plumbing in the approximation. I haven't tried checking the compile speed and/or quality of output. Would intuitively expect modest improvement in both most of the time, but there are probably pathological cases where it is worse. Looks like you are pushing the swap case for a follow up

Thanks, I decided to not include the swap gate case here because we will need to build a new dag for every input circuit. I assume most input circuits won't have swap-like circuits, so we would end up wasting that runtime while not yielding an improvement in circuit size for most cases. Also, when a user inputs a prerouted quantum circuit he would be very surprised to find that his routing solution has been replaced by something qiskit came up with which may or may not be better. I still think this is an interesting approach and should be reconsidered in the future.

Copy link
Member

@mtreinish mtreinish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the quick updates.

@mtreinish mtreinish dismissed levbishop’s stale review August 1, 2024 13:39

Comments addressed

@mtreinish mtreinish added this pull request to the merge queue Aug 1, 2024
Procatv pushed a commit to Procatv/qiskit-terra-catherines that referenced this pull request Aug 1, 2024
* Oxidize two qubit local invariance functions

This commit migrates the two functions in the private module
`qiskit.synthesis.two_qubit.local_invariance` to primarily be
implemented in rust. Since the two_qubit_local_invariants() function is
being used in Qiskit#12727 premptively porting these functions to rust will
both potentially speed up the new transpiler pass in that PR and also
facilitate a future porting of that pass to rust. The two python space
functions continue to exist as a small wrapper to do input type
checking/conversion and rounding of the result (since the python API for
rounding is simpler). There is no release note included for these
functions as they are internal utilities in Qiskit and not exposed as a
public interface.

* Add docstring to rust function with citation

* Store adjoint magic array statically

* Use arraview1 instead of slice for local_equivalence()

* Fix rustfmt
Merged via the queue into Qiskit:main with commit 1214d51 Aug 1, 2024
15 checks passed
@mtreinish mtreinish added the stable backport potential The bug might be minimal and/or import enough to be port to stable label Aug 1, 2024
mergify bot pushed a commit that referenced this pull request Aug 1, 2024
…cks - before routing (#12727)

* init

* up

* up

* Update builtin_plugins.py

* Update builtin_plugins.py

* reno

* Update builtin_plugins.py

* Update builtin_plugins.py

* Update peephole-before-routing-c3d184b740bb7a8b.yaml

* neko check

* check neko

* Update builtin_plugins.py

* test neko

* Update builtin_plugins.py

* Update builtin_plugins.py

* Update builtin_plugins.py

* lint

* tests and format

* remove FakeTorino test

* Update peephole-before-routing-c3d184b740bb7a8b.yaml

* Apply suggestions from code review

Co-authored-by: Matthew Treinish <mtreinish@kortar.org>

* comments from code review

* fix precision

* up

* up

* update

* up

* .

* cyclic import

* cycl import

* cyl import

* .

* circular import

* .

* lint

* Include new pass in docs

* Fix Split2QUnitaries dag manipulation

This commit fixes the dag handling to do the 1q unitary insertion.
Previously the dag manipulation was being done manually using the
insert_node_on_in_edges() rustworkx method. However as the original node
had 2 incoming edges for each qubit this caused the dag after running
the pass to become corrupted. Each of the new 1q unitary nodes would end
up with 2 incident edges and they would be in a sequence. This would result
in later passes not being able to correctly understand the state of the
circuit correctly. This was causing the unit tests to fail. This commit
fixes this by just using `substitute_node_with_dag()` to handle the
node substition, while doing it manually to avoid the overhead of
checking is probably possible, the case where a unitary is the product
of two 1q gates is not very common so optimizing it isn't super
critical.

* Update releasenotes/notes/peephole-before-routing-c3d184b740bb7a8b.yaml

* stricter check for doing split2q

* Update qiskit/transpiler/preset_passmanagers/builtin_plugins.py

Co-authored-by: Matthew Treinish <mtreinish@kortar.org>

* code review

* Update qiskit/transpiler/passes/optimization/split_2q_unitaries.py

Co-authored-by: Matthew Treinish <mtreinish@kortar.org>

* new tests

* typo

* lint

* lint

---------

Co-authored-by: Matthew Treinish <mtreinish@kortar.org>
(cherry picked from commit 1214d51)
github-merge-queue bot pushed a commit that referenced this pull request Aug 1, 2024
…cks - before routing (#12727) (#12881)

* init

* up

* up

* Update builtin_plugins.py

* Update builtin_plugins.py

* reno

* Update builtin_plugins.py

* Update builtin_plugins.py

* Update peephole-before-routing-c3d184b740bb7a8b.yaml

* neko check

* check neko

* Update builtin_plugins.py

* test neko

* Update builtin_plugins.py

* Update builtin_plugins.py

* Update builtin_plugins.py

* lint

* tests and format

* remove FakeTorino test

* Update peephole-before-routing-c3d184b740bb7a8b.yaml

* Apply suggestions from code review

Co-authored-by: Matthew Treinish <mtreinish@kortar.org>

* comments from code review

* fix precision

* up

* up

* update

* up

* .

* cyclic import

* cycl import

* cyl import

* .

* circular import

* .

* lint

* Include new pass in docs

* Fix Split2QUnitaries dag manipulation

This commit fixes the dag handling to do the 1q unitary insertion.
Previously the dag manipulation was being done manually using the
insert_node_on_in_edges() rustworkx method. However as the original node
had 2 incoming edges for each qubit this caused the dag after running
the pass to become corrupted. Each of the new 1q unitary nodes would end
up with 2 incident edges and they would be in a sequence. This would result
in later passes not being able to correctly understand the state of the
circuit correctly. This was causing the unit tests to fail. This commit
fixes this by just using `substitute_node_with_dag()` to handle the
node substition, while doing it manually to avoid the overhead of
checking is probably possible, the case where a unitary is the product
of two 1q gates is not very common so optimizing it isn't super
critical.

* Update releasenotes/notes/peephole-before-routing-c3d184b740bb7a8b.yaml

* stricter check for doing split2q

* Update qiskit/transpiler/preset_passmanagers/builtin_plugins.py

Co-authored-by: Matthew Treinish <mtreinish@kortar.org>

* code review

* Update qiskit/transpiler/passes/optimization/split_2q_unitaries.py

Co-authored-by: Matthew Treinish <mtreinish@kortar.org>

* new tests

* typo

* lint

* lint

---------

Co-authored-by: Matthew Treinish <mtreinish@kortar.org>
(cherry picked from commit 1214d51)

Co-authored-by: Sebastian Brandhofer <148463728+sbrandhsn@users.noreply.github.com>
mtreinish added a commit to mtreinish/qiskit-core that referenced this pull request Aug 3, 2024
In Qiskit#12727 a check was added to the default init stage's construction to
avoid running 2q gate consolidation in the presence of targets with
only discrete gates. However the way the target was being used in this
check was incorrect. The name for an instruction in the target should be
used as its identifier and then if we need the object representation
that should query the target for that object based on the name. However
the check was doing this backwards getting a list of operation objects
and then using the name contained in the object. This will cause issues
for instructions that use custom names such as when there are tuned
variants or a custom gate instance with a unique name.

While there is some question over whether we need this check as we will
run the consolidate 2q blocks pass as part of the optimization stage
which will have the same effect, this opts to just fix the target usage
for it to minimize the diff. Also while not the explicit goal of this
check it is protecting against some bugs in the consolidate blocks pass
that occur when custom gates are used. So for the short term this check
is retained, but in the future when these bugs in consolidate blocks are
fixed we can revisit whether we want to remove the conditional logic.
github-merge-queue bot pushed a commit that referenced this pull request Aug 8, 2024
* Fix target handling in discrete basis check

In #12727 a check was added to the default init stage's construction to
avoid running 2q gate consolidation in the presence of targets with
only discrete gates. However the way the target was being used in this
check was incorrect. The name for an instruction in the target should be
used as its identifier and then if we need the object representation
that should query the target for that object based on the name. However
the check was doing this backwards getting a list of operation objects
and then using the name contained in the object. This will cause issues
for instructions that use custom names such as when there are tuned
variants or a custom gate instance with a unique name.

While there is some question over whether we need this check as we will
run the consolidate 2q blocks pass as part of the optimization stage
which will have the same effect, this opts to just fix the target usage
for it to minimize the diff. Also while not the explicit goal of this
check it is protecting against some bugs in the consolidate blocks pass
that occur when custom gates are used. So for the short term this check
is retained, but in the future when these bugs in consolidate blocks are
fixed we can revisit whether we want to remove the conditional logic.

* Remove check and fix ConsolidateBlocks bug

This commit pivots this PR branch to just remove the additional logic
around skipping the optimization passes for discrete basis sets. The
value the check was actually providing was not around a discrete basis
set target and instead was to workaround a bug in the consolidate blocks
pass. If a discrete basis set target was used this would still fail
because we will unconditionally call `ConsolidateBlocks` during the
optimization stage. This commit opts to just remove the extra complexity
of the conditional execution of the peephole optimization passes and
instead just fix the underlying bug in `ConsolidateBlocks` and remove
the check.
mergify bot pushed a commit that referenced this pull request Aug 8, 2024
* Fix target handling in discrete basis check

In #12727 a check was added to the default init stage's construction to
avoid running 2q gate consolidation in the presence of targets with
only discrete gates. However the way the target was being used in this
check was incorrect. The name for an instruction in the target should be
used as its identifier and then if we need the object representation
that should query the target for that object based on the name. However
the check was doing this backwards getting a list of operation objects
and then using the name contained in the object. This will cause issues
for instructions that use custom names such as when there are tuned
variants or a custom gate instance with a unique name.

While there is some question over whether we need this check as we will
run the consolidate 2q blocks pass as part of the optimization stage
which will have the same effect, this opts to just fix the target usage
for it to minimize the diff. Also while not the explicit goal of this
check it is protecting against some bugs in the consolidate blocks pass
that occur when custom gates are used. So for the short term this check
is retained, but in the future when these bugs in consolidate blocks are
fixed we can revisit whether we want to remove the conditional logic.

* Remove check and fix ConsolidateBlocks bug

This commit pivots this PR branch to just remove the additional logic
around skipping the optimization passes for discrete basis sets. The
value the check was actually providing was not around a discrete basis
set target and instead was to workaround a bug in the consolidate blocks
pass. If a discrete basis set target was used this would still fail
because we will unconditionally call `ConsolidateBlocks` during the
optimization stage. This commit opts to just remove the extra complexity
of the conditional execution of the peephole optimization passes and
instead just fix the underlying bug in `ConsolidateBlocks` and remove
the check.

(cherry picked from commit 70c2f78)
github-merge-queue bot pushed a commit that referenced this pull request Aug 8, 2024
* Fix target handling in discrete basis check

In #12727 a check was added to the default init stage's construction to
avoid running 2q gate consolidation in the presence of targets with
only discrete gates. However the way the target was being used in this
check was incorrect. The name for an instruction in the target should be
used as its identifier and then if we need the object representation
that should query the target for that object based on the name. However
the check was doing this backwards getting a list of operation objects
and then using the name contained in the object. This will cause issues
for instructions that use custom names such as when there are tuned
variants or a custom gate instance with a unique name.

While there is some question over whether we need this check as we will
run the consolidate 2q blocks pass as part of the optimization stage
which will have the same effect, this opts to just fix the target usage
for it to minimize the diff. Also while not the explicit goal of this
check it is protecting against some bugs in the consolidate blocks pass
that occur when custom gates are used. So for the short term this check
is retained, but in the future when these bugs in consolidate blocks are
fixed we can revisit whether we want to remove the conditional logic.

* Remove check and fix ConsolidateBlocks bug

This commit pivots this PR branch to just remove the additional logic
around skipping the optimization passes for discrete basis sets. The
value the check was actually providing was not around a discrete basis
set target and instead was to workaround a bug in the consolidate blocks
pass. If a discrete basis set target was used this would still fail
because we will unconditionally call `ConsolidateBlocks` during the
optimization stage. This commit opts to just remove the extra complexity
of the conditional execution of the peephole optimization passes and
instead just fix the underlying bug in `ConsolidateBlocks` and remove
the check.

(cherry picked from commit 70c2f78)

Co-authored-by: Matthew Treinish <mtreinish@kortar.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Changelog: New Feature Include in the "Added" section of the changelog mod: transpiler Issues and PRs related to Transpiler stable backport potential The bug might be minimal and/or import enough to be port to stable
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Apply Peephole Optimization Before Routing
7 participants