Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spin_op performance enhancement #115

Merged
merged 21 commits into from
May 12, 2023
Merged

Conversation

amccaskey
Copy link
Collaborator

@amccaskey amccaskey commented Apr 26, 2023

Update spin_op to use a unordered_map<vector<bool>, complex<double>> as the primary container for pauli terms.

Below are results for a large (15 qubit) random spin_op addition + multiplication operations over a range of term numbers.

perfEnhancement

@amccaskey amccaskey force-pushed the spinOpRefactor branch 2 times, most recently from fb0e33c to 158b813 Compare May 1, 2023 20:15
@amccaskey amccaskey marked this pull request as ready for review May 1, 2023 20:15
Copy link
Collaborator

@boschmitt boschmitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a few preliminary comments.

Could you provide a comparison between this enhancement and the current implementation?
Do you know why changing using std::unordered_map provides better performance ?

spin_op's public API currently relies (or leaks) the fact that its current underlying implementation behaves like a vector, e.g., operator[] (size_t index) and spin_op::slice. If we use a std::unordered_map as the underlying data structure to store terms, these methods don't make much sense and will likely cause various bugs.

python/runtime/cudaq/spin/py_spin_op.cpp Show resolved Hide resolved
runtime/cudaq/spin/spin_op.cpp Outdated Show resolved Hide resolved
runtime/cudaq/spin/spin_op.cpp Outdated Show resolved Hide resolved
runtime/cudaq/spin/spin_op.cpp Outdated Show resolved Hide resolved
runtime/cudaq/spin/spin_op.cpp Outdated Show resolved Hide resolved
runtime/cudaq/spin/spin_op.cpp Outdated Show resolved Hide resolved
runtime/cudaq/spin/spin_op.cpp Outdated Show resolved Hide resolved
runtime/cudaq/spin/spin_op.cpp Outdated Show resolved Hide resolved
runtime/cudaq/spin/spin_op.cpp Outdated Show resolved Hide resolved
runtime/cudaq/spin/spin_op.cpp Outdated Show resolved Hide resolved
Copy link
Collaborator

@anthony-santana anthony-santana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code and performance improvements look great! Most comments are related to either nits in the tests or replacing certain member functions with properties.

python/runtime/cudaq/spin/py_spin_op.cpp Show resolved Hide resolved
python/runtime/cudaq/spin/py_spin_op.cpp Show resolved Hide resolved
python/tests/unittests/test_SpinOperator.py Outdated Show resolved Hide resolved
python/tests/unittests/test_SpinOperator.py Show resolved Hide resolved
python/runtime/cudaq/spin/py_spin_op.cpp Show resolved Hide resolved
python/tests/unittests/test_SpinOperator.py Outdated Show resolved Hide resolved
python/tests/unittests/test_SpinOperator.py Outdated Show resolved Hide resolved
python/tests/unittests/test_SpinOperator.py Show resolved Hide resolved
python/tests/unittests/test_observe.py Show resolved Hide resolved
python/tests/unittests/test_sample.py Show resolved Hide resolved
@amccaskey
Copy link
Collaborator Author

I added a few preliminary comments.

Could you provide a comparison between this enhancement and the current implementation? Do you know why changing using std::unordered_map provides better performance ?

spin_op's public API currently relies (or leaks) the fact that its current underlying implementation behaves like a vector, e.g., operator[] (size_t index) and spin_op::slice. If we use a std::unordered_map as the underlying data structure to store terms, these methods don't make much sense and will likely cause various bugs.

@boschmitt

Great points to bring up.

The current implementation was never tested for performance / speed as the number of terms / qubits grew (everything was fine for the small applications we'd run up to this point). My early tests show that the same benchmark as above is a factor of 100 slower than this unordered_map implementation (hence the need for this PR 😄 ).

The benefits of the unordered_map come in with regards to storing unique Pauli terms, where the uniqueness is enforced via the map key / hash, and like terms are updated by just updating the value (the term coefficient). This becomes more important as we start multiplying spin_op instances with multiple terms (P0 + P1 + ...) * (Q0 + Q1 + ...), which results in many multiplication and addition operations to update the underlying spin_op data structure (vector of vectors in the current implementation vs an unordered_map in this PR). To perform this operation, one has to find like terms (vector<bool>) and compare against others newly created by the multiplication / addition operations. std::vector search complexity is O(N) while unordered_map should be O(1) (granted I think this ignores constant factors, but I think the proof is in the benchmark above). Moreover, I found it easier to reason about multi-threaded multiplications with the unordered_map at the cost of extra memory. Hence the addition of OpenMP parallelization for operator*=().

As for this public API, you are correct that this changes how downstream programmers can / should reason about the underlying terms. I should remove the operator[] method and replace with begin() / end() methods returning iterators that allow range-based for loops with structured bindings. The documentation on the class should also be updated to tell the programmer that any underlying order of the terms should not be relied on as the order is no non-deterministic. I think that slice or something like it can still exist, since there are N key/values in the map and the use case for slice is primarily to extract equally sized chunks for parallel processing. Maybe the name needs to change?

Open to more comments / feedback here in an effort to make this even better. Maybe I've missed something?

@amccaskey amccaskey force-pushed the spinOpRefactor branch 2 times, most recently from 799158d to 0c3b4f1 Compare May 8, 2023 12:51
@amccaskey
Copy link
Collaborator Author

@boschmitt Latest commit removes slice and replaces with distribute_terms. All of your comments should be addressed now.

@amccaskey amccaskey force-pushed the spinOpRefactor branch 5 times, most recently from 687d2de to 0b015bb Compare May 12, 2023 12:08
Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>
Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>
Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>
Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>
Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>
Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>
Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>
Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>
Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>
Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>
Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>
Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>
Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>
Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>
Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>
Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>
Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>
Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>
Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>
Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>
Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>
Copy link
Collaborator

@schweitzpgi schweitzpgi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@amccaskey amccaskey merged commit dcbd363 into NVIDIA:main May 12, 2023
@github-actions github-actions bot locked and limited conversation to collaborators May 12, 2023
@bettinaheim bettinaheim added the enhancement New feature or request label Jun 28, 2023
@amccaskey amccaskey deleted the spinOpRefactor branch September 13, 2023 23:23
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants