-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add qsim, qsim-gpu and qsim-cuquantum #14
Conversation
@mlazzarin thanks for these tests. The cuQuantum is using a single GPU device, correct? |
Yes, I'm using the machine with a single NVIDIA RTX A6000. By the way, I'm not sure if qsim supports multi-GPU. |
Ok, thanks, anyway quite good to see that we are strong XD. |
Here's the results for CPU. For Two comments:
|
This sounds really like there is circuit fusion, maybe we should try to activate from qibojit and see what happens. |
Ok, I'm on it. |
Here's the results for CPU with gate fusion up to two-qubit gates and using all threads. I re-run the GPU benchmarks with gate fusion up to two-qubit gates, and now qibojit seems a bit faster. |
Cool, however would be great to understand if/how they are doing the gate fusion. |
With |
Ok, so these last plots are comparing like with like, good. |
I double-checked and I believe that this implementation is the optimal one, so we may proceed with the review and then merge it to the
|
I fixed some gates in Cirq, now the CI works fine. Once we fix the tests for the gates, we should review each library to ensure that everything is properly implemented. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding this and fixing the QAOA issue, the tests are now working for me. My only comment would be that we could consider removing tfq completely to simplify code and CI/tests. Given that its backend is equivalent to qsim it would be redundant to include it in any benchmarks we do. As long it is not causing any issues we could keep it but if there is something in tests I wouldn't spend much time about it.
If you don't plan any other changes here, we can merge this to reduce the number of active branches.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this implementation. It looks good to me.
I've left below a few comments regarding some missing factors pi and also the overall phase of the CU3 gate.
Let me know what you think.
benchmarks/libraries/cirq.py
Outdated
|
||
def CU1(self, theta): | ||
# TODO: Check if this is the right gate | ||
return self.cirq.CZPowGate(exponent=theta) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that here we are missing a factor pi.
return self.cirq.CZPowGate(exponent=theta) | |
return self.cirq.CZPowGate(exponent=theta/np.pi) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am currently working on adding some random rotations before every circuit during tests so that the initial state is non-trivial and such problems are captured from tests. I will open a new PR based on this once all tests are ready. For now I can confirm that this fix works, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, in fact I had to check manually to find these small errors. If we can detect all of them from tests it would be great.
benchmarks/libraries/cirq.py
Outdated
return self.cirq.CZPowGate(exponent=theta) | ||
|
||
def CU3(self, theta, phi, lam): | ||
# TODO: Check if this is the right gate | ||
gate = self.cirq.circuits.qasm_output.QasmUGate(theta, phi, lam) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again missing pi factor
gate = self.cirq.circuits.qasm_output.QasmUGate(theta, phi, lam) | |
gate = self.cirq.circuits.qasm_output.QasmUGate(theta/np.pi, phi/np.pi, lam/np.pi) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, thanks. However, I'm not sure yet about this gate, as I couldn't find the exact matrix representation on the docs.
Anyway, we will find out after we fix the single-qubit gate tests.
Co-authored-by: Andrea Pasquale <andreapasquale97@gmail.com>
Co-authored-by: Andrea Pasquale <andreapasquale97@gmail.com>
Shall we merge this? |
Yes, please go ahead and merge this and I will update |
In this PR I added
qsim
(cpu),qsim-gpu
andqsim-cuquantum
.For
qsim
(cpu) I set the number of threads tomultiprocessing.cpu_count()
.For all, I set the
max_fused_gate_size
to zero.EDIT: For ``qibojit```, I disabled to compilation during import.
I also performed some benchmarks (cupy 9.6.0, cuda toolkit 11.5) for gpu.
total_dry_time
: import + creation + dry runtotal_simulation_time
: import + creation + simulation timeqft
variational
supremacy
bv
qv
Some comments:
CupyBackend
crash with 32 qubits qibojit#43).EDIT: I will also prepare some benchmarks with CPU.