
Add qsim, qsim-gpu and qsim-cuquantum #14

Merged: 11 commits merged into libraries on Dec 21, 2021

Conversation

@mlazzarin (Contributor) commented on Nov 15, 2021

In this PR I added qsim (CPU), qsim-gpu and qsim-cuquantum.
For qsim (CPU) I set the number of threads to multiprocessing.cpu_count().
For all of them, I set max_fused_gate_size to zero.
EDIT: For `qibojit`, I disabled the compilation during import.
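
For reference, a minimal sketch of how these settings can be passed to qsim through the qsimcirq wrapper. The exact option names (cpu_threads, max_fused_gate_size, use_gpu, gpu_mode) follow qsimcirq's QSimOptions interface and may differ between versions, so treat them as assumptions:

import multiprocessing

import qsimcirq

# Assumed qsimcirq interface: cpu_threads sets the number of simulation
# threads, max_fused_gate_size bounds gate fusion (0 as an attempt to
# disable it), use_gpu/gpu_mode switch between the CPU, CUDA and
# cuQuantum (cuStateVec) backends.
options = qsimcirq.QSimOptions(
    cpu_threads=multiprocessing.cpu_count(),
    max_fused_gate_size=0,
    use_gpu=False,  # True for qsim-gpu / qsim-cuquantum
    gpu_mode=0,     # 0: plain CUDA kernels, 1: cuStateVec (cuQuantum)
)
simulator = qsimcirq.QSimSimulator(qsim_options=options)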

I also performed some benchmarks (cupy 9.6.0, CUDA toolkit 11.5) on GPU.

  • total_dry_time: import + creation + dry run
  • total_simulation_time: import + creation + simulation time
[Plots: GPU scaling of dry run time, mean simulation time, total dry time and total simulation time vs. number of qubits for the qft, variational, supremacy, bv and qv circuits.]

Some comments:

  • It doesn't seem that cuQuantum provides a speed-up w.r.t. qsim's C++/CUDA implementation.
  • Apart from the compilation overhead, we are competitive with the C++/CUDA implementation, in particular on the qft circuit (maybe it's due to our approach to controlled gates?).
  • In these benchmarks (cupy 9.6.0, CUDA toolkit 11.5) our dry run overhead is ~3.2 s. This is much higher than in other benchmarks I performed. I will open a new issue to discuss it (EDIT: Dry run overhead is inconsistent between different environments, qibojit#44).
  • Qibojit crashes with 32 qubits, as you can see in the plots (EDIT: see Fix CupyBackend crash with 32 qubits, qibojit#43).

EDIT: I will also prepare some benchmarks on CPU.

@scarrazza (Member)

@mlazzarin thanks for these tests. The cuQuantum run is using a single GPU device, correct?

@mlazzarin (Contributor, Author)

The cuQuantum run is using a single GPU device, correct?

Yes, I'm using the machine with a single NVIDIA RTX A6000. By the way, I'm not sure if qsim supports multi-GPU.

@scarrazza (Member)

Ok, thanks, anyway quite good to see that we are strong XD.

@mlazzarin (Contributor, Author)

Here are the results for CPU. For qsim I'm using a number of threads equal to the number of logical cores, while for qibo I kept the default value, which is half of the logical cores. (I also tried with all logical cores and it's actually slower for small circuits.)
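
As a concrete sketch of the thread settings (qibo.set_threads is part of qibo's public API; the qsim thread count goes through the options shown in the first comment):

import multiprocessing

import qibo

n_logical = multiprocessing.cpu_count()

# qibo defaults to roughly half the logical cores (as noted above);
# this overrides it to use all of them, matching the qsim configuration.
qibo.set_threads(n_logical)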

[Plots: CPU scaling of dry run time, mean simulation time, total dry time and total simulation time vs. number of qubits for the qft, variational, supremacy, bv and qv circuits.]

Two comments:

  • qsim is usually faster than qibo with large circuits, except for the qft, while qibo seems competitive with smaller circuits.
  • I'm not 100% sure that I was able to deactivate gate fusion in qsim. I simply set the max_fused_gate_size parameter to 0, because I didn't find a flag to disable fusion completely.

@scarrazza (Member)

This really sounds like there is circuit fusion; maybe we should try to activate it from qibojit and see what happens.

@mlazzarin (Contributor, Author)

Ok, I'm on it.

@mlazzarin (Contributor, Author)

Here are the results for CPU with gate fusion up to two-qubit gates and using all threads.
Indeed, the situation is now different.
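
On the qibo side, "fusion up to two-qubit gates" corresponds to something like the sketch below (the max_qubits keyword of Circuit.fuse is an assumption here; on the qsim side the same limit is max_fused_gate_size=2):

from qibo import models

# Fuse gates into blocks acting on at most two qubits before execution;
# max_qubits=2 mirrors qsim's default max_fused_gate_size=2 (keyword
# name assumed).
circuit = models.QFT(26)
fused = circuit.fuse(max_qubits=2)
result = fused()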

[Plots (CPU, fusion up to two-qubit gates): dry run time, mean simulation time, total dry time and total simulation time vs. number of qubits for the qft, variational, supremacy, bv and qv circuits.]

I re-ran the GPU benchmarks with gate fusion up to two-qubit gates, and now qibojit seems a bit faster.

[Plots (GPU, fusion up to two-qubit gates): dry run time, mean simulation time, total dry time and total simulation time vs. number of qubits for the qft, variational, supremacy, bv and qv circuits.]

@scarrazza (Member)

Cool, however it would be great to understand if/how they are doing the gate fusion.

@mlazzarin (Contributor, Author)

Cool, however it would be great to understand if/how they are doing the gate fusion.

With qsim there is an option to set the maximum size of fused gates. In the last benchmarks that I posted I set that value to 2 (which is the default value). I've not found a specific flag to disable gate fusion, so in the other benchmarks I simply set that value to 0, but I don't know whether that actually disables fusion or not.
Concerning how they do fusion, their approach is described here: https://arxiv.org/abs/2111.02396

@scarrazza (Member)

Ok, so these last plots are comparing like with like, good.

@mlazzarin (Contributor, Author)

I double-checked and I believe that this implementation is the optimal one, so we may proceed with the review and then merge it into the library branch. I have only two comments left:

  • We still need to understand how to properly disable gate fusion, but we can worry about it in PR Add fusion max_qubits option in compare.py #17.
  • Among the possible options of qsim, I found this one (see the sketch after this list):
        denormals_are_zeros: if true, set flush-to-zero and denormals-are-zeros
             MXCSR control flags. This prevents rare cases of performance
             slowdown potentially at the cost of a tiny precision loss.
    
    I'm not sure if we should use it in the benchmarks or not.
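
For reference, enabling it would presumably look like the following sketch; I'm assuming the option is exposed through qsimcirq's QSimOptions under the same name.

import qsimcirq

# denormals_are_zeros sets the flush-to-zero / denormals-are-zeros MXCSR
# flags, trading a tiny precision loss for fewer performance slowdowns
# (option name assumed from the qsim docstring quoted above).
options = qsimcirq.QSimOptions(denormals_are_zeros=True)
simulator = qsimcirq.QSimSimulator(qsim_options=options)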

@mlazzarin mlazzarin requested review from andrea-pasquale, stavros11 and scarrazza and removed request for scarrazza and stavros11 December 19, 2021 06:41
@mlazzarin (Contributor, Author)

I fixed some gates in Cirq, and now the CI works fine. Once we fix the tests for the gates, we should review each library to ensure that everything is properly implemented.

@stavros11 (Member) left a comment

Thanks for adding this and fixing the QAOA issue, the tests are now working for me. My only comment would be that we could consider removing tfq completely to simplify the code and CI/tests. Given that its backend is equivalent to qsim, it would be redundant to include it in any benchmarks we do. As long as it is not causing any issues we could keep it, but if there is something in the tests I wouldn't spend much time on it.

If you don't plan any other changes here, we can merge this to reduce the number of active branches.

@andrea-pasquale (Contributor) left a comment

Thanks for this implementation. It looks good to me.
I've left below a few comments regarding some missing factors of pi and also the overall phase of the CU3 gate.
Let me know what you think.


    def CU1(self, theta):
        # TODO: Check if this is the right gate
        return self.cirq.CZPowGate(exponent=theta)
@andrea-pasquale (Contributor):

I believe that here we are missing a factor of pi.

Suggested change:
-        return self.cirq.CZPowGate(exponent=theta)
+        return self.cirq.CZPowGate(exponent=theta/np.pi)
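
As a quick sanity check of the rescaling (just a sketch, not part of the PR): with the exponent divided by pi, cirq's CZPowGate reproduces the usual CU1 matrix diag(1, 1, 1, exp(i*theta)).

import numpy as np
import cirq

theta = 0.7
# CZPowGate(exponent=t) is diag(1, 1, 1, exp(i*pi*t)), so dividing the
# angle by pi yields the CU1(theta) unitary diag(1, 1, 1, exp(i*theta)).
gate = cirq.CZPowGate(exponent=theta / np.pi)
expected = np.diag([1, 1, 1, np.exp(1j * theta)])
assert np.allclose(cirq.unitary(gate), expected)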

@stavros11 (Member), Dec 20, 2021:

I am currently working on adding some random rotations before every circuit during tests, so that the initial state is non-trivial and such problems are caught by the tests. I will open a new PR based on this once all tests are ready. For now I can confirm that this fix works, thanks!

@andrea-pasquale (Contributor):

I see, in fact I had to check manually to find these small errors. If we can detect all of them from tests, that would be great.

benchmarks/libraries/cirq.py (resolved discussion)
        return self.cirq.CZPowGate(exponent=theta)

    def CU3(self, theta, phi, lam):
        # TODO: Check if this is the right gate
        gate = self.cirq.circuits.qasm_output.QasmUGate(theta, phi, lam)
@andrea-pasquale (Contributor):

Again, a missing factor of pi.

Suggested change:
-        gate = self.cirq.circuits.qasm_output.QasmUGate(theta, phi, lam)
+        gate = self.cirq.circuits.qasm_output.QasmUGate(theta/np.pi, phi/np.pi, lam/np.pi)

@mlazzarin (Contributor, Author):

Yes, thanks. However, I'm not sure yet about this gate, as I couldn't find the exact matrix representation in the docs.
Anyway, we will find out after we fix the single-qubit gate tests.
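
One quick way to check (a sketch; the comparison target depends on which CU3 convention qibo uses, including the overall phase) is to print the unitary that cirq assigns to QasmUGate with the rescaled angles:

import numpy as np
import cirq

theta, phi, lam = 0.1, 0.2, 0.3
# QasmUGate takes its three angles in units of pi, hence the division
# suggested above; compare the printed matrix (up to a global phase)
# with the CU3 definition used by qibo.
gate = cirq.circuits.qasm_output.QasmUGate(theta / np.pi, phi / np.pi, lam / np.pi)
print(np.round(cirq.unitary(gate), 5))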

mlazzarin and others added 3 commits on December 20, 2021 (co-authored by Andrea Pasquale <andreapasquale97@gmail.com>).
@mlazzarin (Contributor, Author)

Shall we merge this?

@stavros11 (Member)

Yes, please go ahead and merge this, and I will update randomtests to use the latest libraries so that we can find any issues with the gates.

@mlazzarin merged commit e269329 into libraries on Dec 21, 2021.