spin_op performance enhancement #115

amccaskey · 2023-04-26T20:08:25Z

Update spin_op to use an unordered_map<vector<bool>, complex<double>> as the primary container for Pauli terms.

Below are results for large (15-qubit) random spin_op addition and multiplication operations over a range of term counts.

boschmitt

I added a few preliminary comments.

Could you provide a comparison between this enhancement and the current implementation?
Do you know why switching to std::unordered_map provides better performance?

spin_op's public API currently relies on (or leaks) the fact that its underlying implementation behaves like a vector, e.g., operator[](size_t index) and spin_op::slice. If we use a std::unordered_map as the underlying data structure to store terms, these methods don't make much sense and will likely cause various bugs.

python/runtime/cudaq/spin/py_spin_op.cpp

runtime/cudaq/spin/spin_op.cpp

anthony-santana

The code and performance improvements look great! Most comments are related to either nits in the tests or replacing certain member functions with properties.

python/runtime/cudaq/spin/py_spin_op.cpp

python/tests/unittests/test_SpinOperator.py

python/runtime/cudaq/spin/py_spin_op.cpp

python/tests/unittests/test_SpinOperator.py

python/tests/unittests/test_observe.py

python/tests/unittests/test_sample.py

amccaskey · 2023-05-07T19:08:36Z

I added a few preliminary comments.

Could you provide a comparison between this enhancement and the current implementation? Do you know why switching to std::unordered_map provides better performance?

spin_op's public API currently relies on (or leaks) the fact that its underlying implementation behaves like a vector, e.g., operator[](size_t index) and spin_op::slice. If we use a std::unordered_map as the underlying data structure to store terms, these methods don't make much sense and will likely cause various bugs.

@boschmitt

Great points to bring up.

The current implementation was never tested for performance / speed as the number of terms / qubits grew (everything was fine for the small applications we'd run up to this point). My early tests show that the same benchmark as above is a factor of 100 slower than this unordered_map implementation (hence the need for this PR 😄 ).

The benefits of the unordered_map come from storing unique Pauli terms: uniqueness is enforced via the map key / hash, and like terms are merged by just updating the value (the term coefficient). This becomes more important as we start multiplying spin_op instances with multiple terms, (P0 + P1 + ...) * (Q0 + Q1 + ...), which results in many multiplication and addition operations to update the underlying spin_op data structure (a vector of vectors in the current implementation vs. an unordered_map in this PR). To perform this operation, one has to find like terms (vector<bool>) and compare them against others newly created by the multiplication / addition operations. Searching a std::vector for a matching term is O(N), while an unordered_map lookup is O(1) on average (granted, this ignores constant factors such as hashing the key, but I think the proof is in the benchmark above). Moreover, I found it easier to reason about multi-threaded multiplications with the unordered_map, at the cost of extra memory; hence the addition of OpenMP parallelization for operator*=().
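To make the like-term merging concrete, here is a minimal sketch of the idea, not the code in this PR. The names pauli_bits, term_map, add_term, multiply_terms and multiply are hypothetical, and the bit layout (first n entries are X bits, last n are Z bits) is an assumption made for illustration. Serial only; the OpenMP part is left out.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical names for illustration; not the actual spin_op internals.
// Assumed layout: [x_0 .. x_{n-1}, z_0 .. z_{n-1}], so I=(0,0), X=(1,0), Z=(0,1), Y=(1,1).
using pauli_bits = std::vector<bool>;
using term_map = std::unordered_map<pauli_bits, std::complex<double>>;

// Add coeff * term. Like terms merge through the key, with no linear search.
inline void add_term(term_map &terms, const pauli_bits &term,
                     std::complex<double> coeff) {
  terms[term] += coeff; // operator[] value-initializes a missing entry to 0
}

// Product of two Pauli strings: bits are XORed, the phase is a power of i.
inline std::pair<pauli_bits, std::complex<double>>
multiply_terms(const pauli_bits &a, const pauli_bits &b) {
  assert(a.size() == b.size() && a.size() % 2 == 0);
  // Exponent of i for sigma_a * sigma_b, indexed by x + 2*z (I=0, X=1, Z=2, Y=3).
  static constexpr int phase_exp[4][4] = {{0, 0, 0, 0},  // I * {I, X, Z, Y}
                                          {0, 0, 3, 1},  // X * {I, X, Z, Y}
                                          {0, 1, 0, 3},  // Z * {I, X, Z, Y}
                                          {0, 3, 1, 0}}; // Y * {I, X, Z, Y}
  static const std::complex<double> powers_of_i[4] = {
      {1.0, 0.0}, {0.0, 1.0}, {-1.0, 0.0}, {0.0, -1.0}};
  const std::size_t n = a.size() / 2;
  pauli_bits out(a.size());
  int exponent = 0;
  for (std::size_t q = 0; q < n; ++q) {
    const int ia = int(a[q]) + 2 * int(a[q + n]);
    const int ib = int(b[q]) + 2 * int(b[q + n]);
    exponent = (exponent + phase_exp[ia][ib]) % 4;
    out[q] = a[q] != b[q];             // XOR of the X bits
    out[q + n] = a[q + n] != b[q + n]; // XOR of the Z bits
  }
  return {out, powers_of_i[exponent]};
}

// (P0 + P1 + ...) * (Q0 + Q1 + ...): every pairwise product is merged
// into the result map, with average O(1) work per insertion.
inline term_map multiply(const term_map &lhs, const term_map &rhs) {
  term_map result;
  for (const auto &[lbits, lcoeff] : lhs)
    for (const auto &[rbits, rcoeff] : rhs) {
      auto [bits, phase] = multiply_terms(lbits, rbits);
      add_term(result, bits, phase * lcoeff * rcoeff);
    }
  return result;
}
```

For example, squaring X0 + Z0 on one qubit produces four pairwise products but only two keys: the identity accumulates a coefficient of 2, and the two Y contributions (-iY from X*Z and +iY from Z*X) cancel in place. In this sketch the cancelled entry stays in the map with a zero coefficient; whether to prune such entries is a separate design choice.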

As for the public API, you are correct that this changes how downstream programmers can / should reason about the underlying terms. I should remove the operator[] method and replace it with begin() / end() methods returning iterators that allow range-based for loops with structured bindings. The documentation on the class should also be updated to tell the programmer that the order of the terms should not be relied on, as it is now non-deterministic. I think that slice or something like it can still exist, since there are N key/value pairs in the map and the use case for slice is primarily to extract equally sized chunks for parallel processing. Maybe the name needs to change?
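A rough sketch of what the begin() / end() surface could look like, assuming the iterators simply forward to the map's own. term_store and sum_coefficients are hypothetical stand-ins, not the spin_op class or its final API.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for spin_op; only the iteration surface is sketched.
class term_store {
public:
  using key_type = std::vector<bool>;
  using map_type = std::unordered_map<key_type, std::complex<double>>;
  using const_iterator = map_type::const_iterator;

  void add(const key_type &bits, std::complex<double> coeff) {
    assert(bits.size() % 2 == 0); // assumed layout: X bits then Z bits
    terms_[bits] += coeff;        // like terms merge through the key
  }
  std::size_t num_terms() const { return terms_.size(); }

  // Forward the map's iterators so range-based for loops work.
  const_iterator begin() const { return terms_.begin(); }
  const_iterator end() const { return terms_.end(); }

private:
  map_type terms_;
};

// An order-independent reduction: safe even though iteration order is unspecified.
inline std::complex<double> sum_coefficients(const term_store &op) {
  std::complex<double> total{0.0, 0.0};
  for (const auto &[bits, coeff] : op) {
    (void)bits; // the key is not needed for this reduction
    total += coeff;
  }
  return total;
}
```

Callers that previously indexed with operator[] would instead write loops like the one in sum_coefficients, and should only compute results that do not depend on the order in which terms are visited.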

Open to more comments / feedback here in an effort to make this even better. Maybe I've missed something?

amccaskey · 2023-05-08T12:53:22Z

@boschmitt Latest commit removes slice and replaces it with distribute_terms. All of your comments should be addressed now.
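For readers following along, the chunking idea can be sketched as below. This is an illustration under assumptions, not the signature or implementation of spin_op::distribute_terms: distribute_terms_sketch and term_map are hypothetical names, and the helper works on a bare map rather than on spin_op.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <unordered_map>
#include <vector>

using term_map = std::unordered_map<std::vector<bool>, std::complex<double>>;

// Hypothetical helper: split N terms into num_chunks maps whose sizes differ
// by at most one. Which term lands in which chunk depends on the map's
// iteration order, which is unspecified.
inline std::vector<term_map> distribute_terms_sketch(const term_map &terms,
                                                     std::size_t num_chunks) {
  assert(num_chunks > 0);
  std::vector<term_map> chunks(num_chunks);
  const std::size_t base = terms.size() / num_chunks;
  const std::size_t extra = terms.size() % num_chunks;
  auto it = terms.begin();
  for (std::size_t c = 0; c < num_chunks; ++c) {
    // The first `extra` chunks take one additional term each.
    const std::size_t count = base + (c < extra ? 1 : 0);
    for (std::size_t i = 0; i < count; ++i, ++it)
      chunks[c].insert(*it);
  }
  return chunks;
}
```

Unlike an index-based slice, this makes no promise about which terms share a chunk, only that every term appears in exactly one chunk and the chunks are nearly equal in size, which is what the parallel-processing use case needs.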

Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>

schweitzpgi

LGTM

amccaskey force-pushed the spinOpRefactor branch 2 times, most recently from fb0e33c to 158b813 Compare May 1, 2023 20:15

amccaskey marked this pull request as ready for review May 1, 2023 20:15

amccaskey requested review from boschmitt, schweitzpgi, bettinaheim and anthony-santana May 1, 2023 20:15

boschmitt reviewed May 3, 2023

anthony-santana reviewed May 3, 2023

amccaskey force-pushed the spinOpRefactor branch 2 times, most recently from 799158d to 0c3b4f1 Compare May 8, 2023 12:51

amccaskey force-pushed the spinOpRefactor branch 5 times, most recently from 687d2de to 0b015bb Compare May 12, 2023 12:08

amccaskey added 12 commits May 12, 2023 17:14

wip

ceed763

Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>

wip

b7844cf

Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>

wip

16b836d

Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>

remove old files

0c8a77a

Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>

clean up

5c5a875

Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>

Clean up, implement final methods

155ccee

Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>

missed a couple test updates

23eac8c

Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>

address PR comments

e38e4b5

Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>

start on spin_op iterator

6f69c5c

Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>

add spin_op iterators and bind to python

615f43e

Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>

remove dead code

0ba3642

Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>

fix test errors, fix docs errors, add run_clang_format back

05e40f4

Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>

amccaskey added 4 commits May 12, 2023 17:14

remove spin_op::slice, replace with spin_op::distribute_terms()

2ef4868

Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>

fix docs gen issue

ba7ae63

Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>

work on spell check

b964682

Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>

clang format

e09df65

Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>

amccaskey force-pushed the spinOpRefactor branch from 0b015bb to e09df65 Compare May 12, 2023 17:14

amccaskey added 5 commits May 12, 2023 17:18

more spell check

0298644

Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>

work on python spell check

6514959

Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>

more

e5cf8bb

Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>

fix sort order

e11e164

Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>

fix sort order

bb70a30

Signed-off-by: Alex McCaskey <amccaskey@nvidia.com>

schweitzpgi approved these changes May 12, 2023

amccaskey merged commit dcbd363 into NVIDIA:main May 12, 2023

github-actions bot locked and limited conversation to collaborators May 12, 2023

bettinaheim added the enhancement New feature or request label Jun 28, 2023

amccaskey deleted the spinOpRefactor branch September 13, 2023 23:23
