Add two-qubit lambda in qubitvector #133
Static indexing with std::array can show similar performance. This latest version adds static indexing on top of #124.
EDIT: I just realised I was running the stabilizer simulator, not statevector. Doh, now I see a big difference. I think the PR where we removed static indexing earlier was flawed.

Test case

Using a single-shot test circuit of N CNOTs for N qubits (N=10,15,20,25):

    from qiskit import *
    sim = Aer.get_backend('qasm_simulator')
    opts = {'method': 'statevector'}
    def make_qobj(nq):
        q = QuantumRegister(nq)
        qc = QuantumCircuit(q)
        for j in range(nq):
            qc.cx(q[j], q[(j + 1) % nq])
        return compile(qc, Aer.get_backend('qasm_simulator'), shots=1)
    nqs = [10, 15, 20, 25]
    qobjs = [make_qobj(nq) for nq in nqs]

and simple timing using Jupyter %%timeit:

    result = sim.run(qobj, validate=False, backend_options=opts).result()

EDIT: UPDATED

Using dynamic indexes:
Using static indexes:
After some further testing, I think we can remove the single-qubit apply matrix functions and use the old static indexing. Changing the above example to apply and
* add two qubit lambda for cnot
* use two qubit lambda for swap and cz
* use lambda with indexes for single-qubit operation
Summary
This PR optimizes two-qubit gate operations in qubitvector
Details and comments
Currently, two-qubit operations use an index, and that index is generated on every iteration of the loop.
This PR introduces a new apply_lambda for two-qubit operations, and cnot is implemented with it. In that implementation, i01 = i00 | GAP_trgt and i11 = i10 | GAP_trgt. This PR may increase the number of arithmetic operations, but they are cheap compared with index generation.

In my local environment (MacBook Pro (Retina, 13-inch, Early 2015)), cnot x 25 for a 25-qubit circuit took 3.19 sec with this optimization and 9.55 sec without this optimization.
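
For readers who want to see the index arithmetic concretely, here is a minimal Python sketch of the idea. It is not the C++ code from this PR. The names insert_zero_bit, apply_cnot, gap_ctrl and gap_trgt are hypothetical and chosen only to mirror the i00/i10/i11 and GAP_trgt naming in the description, and it assumes qubit 0 is the least significant bit of the amplitude index. Instead of building an index array per iteration, each loop step derives a base index i00 (control and target bits both 0) and reaches the other entries with bitwise ORs.

```python
def insert_zero_bit(k, pos):
    """Return k with a 0 bit inserted at bit position pos."""
    low = k & ((1 << pos) - 1)       # bits below pos stay where they are
    high = (k >> pos) << (pos + 1)   # bits at or above pos move up by one
    return high | low


def apply_cnot(state, ctrl, trgt):
    """Apply CNOT in place to a list of 2**n amplitudes.

    Assumption: qubit 0 is the least significant bit of the index.
    """
    gap_ctrl = 1 << ctrl   # hypothetical name, analogous to GAP_trgt
    gap_trgt = 1 << trgt
    lo, hi = sorted((ctrl, trgt))
    # One iteration per assignment of the other n-2 qubits.
    for k in range(len(state) >> 2):
        # Base index: control bit and target bit are both 0.
        i00 = insert_zero_bit(insert_zero_bit(k, lo), hi)
        i10 = i00 | gap_ctrl   # control = 1, target = 0
        i11 = i10 | gap_trgt   # control = 1, target = 1
        # CNOT swaps the two amplitudes whose control bit is set.
        state[i10], state[i11] = state[i11], state[i10]
    return state


state = [0, 1, 2, 3]
apply_cnot(state, ctrl=0, trgt=1)
print(state)  # [0, 3, 2, 1]
```

The per-iteration cost is a few shifts and ORs, which is the trade-off the description refers to: slightly more arithmetic in exchange for not materializing an index array each time.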