-
Notifications
You must be signed in to change notification settings - Fork 368
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding circuit executor classes and shot-branching #1766
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to leave comments on aer_controller.hpp
first.
src/simulators/simulators.hpp
Outdated
#include "simulators/density_matrix/densitymatrix_state.hpp" | ||
#include "simulators/extended_stabilizer/extended_stabilizer_state.hpp" | ||
#include "simulators/matrix_product_state/matrix_product_state.hpp" | ||
#include "simulators/stabilizer/stabilizer_state.hpp" | ||
#include "simulators/statevector/qubitvector.hpp" | ||
#include "simulators/statevector/statevector_state.hpp" | ||
#include "simulators/superoperator/superoperator_state.hpp" | ||
#include "simulators/tensor_network/tensor_net_state.hpp" | ||
#include "simulators/unitary/unitary_state.hpp" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The above include
are not necessary if we do not refactor Controller::execute
.
@@ -917,18 +565,7 @@ Result Controller::execute(std::vector<Circuit> &circuits, | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Controller::execute(std::vector<Circuit> &circuits, ..)
method can be
- Determine methods (calling
simulation_methods()
) - Construct
std::vector<std::shared_ptr<Executor>>
- Determine experiment parallelization (
set_parallelization_experiments()
) by usingExecutor.required_memory_mb()
- Call
Executor. run_circuit()
in parallel or serial
Construction of Executor
can be via a static method in simulators.hpp
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I create Controller::make_circuit_executor
function and removed required_memory_mb
and validate_state
from Controller
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments for circuit_executor.hpp
is here:
uint_t distributed_procs_; // number of processes in communicator group | ||
uint_t distributed_group_; // group id of distribution | ||
int_t distributed_proc_bits_; // distributed_procs_=2^distributed_proc_bits_ | ||
// (if nprocs != power of 2, set -1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uint_t myrank_; // process ID
uint_t nprocs_; // number of processes
uint_t distributed_rank_; // process ID in communicator group
uint_t distributed_procs_; // number of processes in communicator group
uint_t distributed_group_; // group id of distribution
int_t distributed_proc_bits_; // distributed_procs_=2^distributed_proc_bits_
// (if nprocs != power of 2, set -1)
Theses values are only for MPI, I think. Can we put them in #ifdef AER_MPI
. Maybe we set some codes in aer_controller.hpp
in it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are many depending codes to these params, so I would like to keep them available for non MPI environment. But I added #ifdef AER_MPI
to the codes we can separate.
#endif | ||
|
||
// settings for cuStateVec | ||
bool cuStateVec_enable_ = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also can be in #ifndef AER_THRUST_CPU
I guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added #ifdef AER_CUSTATEVEC
to this param.
|
||
// Rng engine (this one is used to add noise on circuit) | ||
RngEngine rng; | ||
rng.set_seed(circ.seed); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rng.set_seed(circ.seed);
is called in later (run_circuit_with_sampling
or run_circuit_shots
). It is better to use different seeds for them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now initial rng used for noise sampling is reused in run_circuit_xxx
functions.
I think this is the strategy used in the older code.
src/simulators/circuit_executor.hpp
Outdated
rng.set_seed(circ.seed); | ||
run_with_sampling(circ, state, result, rng, circ.shots); | ||
} else { | ||
// Vector to store parallel thread output data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused that run_circuit_with_sampling
needs to consider about parallel_shots_
. If we use sampling
, state
will be calculated once and then get bitstrings from the state
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed unnecessary functions and codes
Utils::apply_omp_parallel_for((par_shots > 1), 0, par_shots, | ||
run_circuit_lambda); | ||
|
||
// gather cregs on MPI processes and save to result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The below can be in #AER_MPI
(gather_creg_memory
is also), I guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#ifdef AER_MPI
inserted
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed Executors. I'm wondering MultiStateExecutor
is effective if shot-branching improves any methods.
src/controllers/aer_controller.hpp
Outdated
@@ -948,12 +535,14 @@ Result Controller::execute(std::vector<Circuit> &circuits, | |||
result.metadata.add(max_memory_mb_, "max_memory_mb"); | |||
result.metadata.add(max_gpu_memory_mb_, "max_gpu_memory_mb"); | |||
|
|||
#ifdef AER_MPI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very minor comments. We can consolidate multiple _OPENMP
and AER_MPI
ifdef blocks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
merged ifdef blocks
for (int_t i = 0; i < nshots; i++) | ||
data.add_single(datum, key); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why we add the same datum
multiple times here. Even if a caller wants to add the same data multiple times, it is better to add it outside of save_data_pershot
as follows:
for (size_t i = 0; i < root.num_shots(); ++i) {
result.save_data_pershot(Base::states_[root.state_index()].creg(),
op.string_params[0], amps, op.type, op.save_type);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed nshots and loop from save_dat_pershot
src/noise/noise_model.hpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can reuse qerror_loc
to sampling in runtime. But it is fine to use sample_noise
explicitly to specify points to inject noise again in C++ (also, we may be able to cover noise injection without qerror_loc
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In principle, BatchShotsExecutor
should not inherit ParallelStateExecutor
because batch execution and chunk execution are independent. However, currently, batch execution is supported only by statevector and densitymatrix and the both supports batch execution. I agree with this class structure because we should avoid complexity from multiple inheritance. We will be able to consider another class structure when batch execution is supported in other methods (that do not support chunk execution).
protected: | ||
void set_config(const Config &config) override; | ||
|
||
void set_distribution(uint_t num_states); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add comments to explain this method? Originally, I found the following comment instate_chunk.hpp
to the method of the same name excepting its argument name uint_t nprocs
.
// set number of processes to be distributed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added comment
return state.required_memory_mb(circ.num_qubits, circ.ops); | ||
} | ||
return std::make_shared< | ||
CircuitExecutor::Executor<MatrixProductState::State>>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not MultiStateExecutor
is used for MPS and others?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only a simulation method who supports shot-branching inherits MultiStateExecutor
at this time.
Since 0.13.0, Aer does not support Python 3.7. This commit removes github actions for CI. * Removing python 3.7 from test workflow * Removing python 3.7 from build workflow * Removing python 3.7 from deploy workflow * Removing python 3.7 from tox * revert * Remove python 3.7 from pyproject.toml * Remove python 3.7 from pyproject.toml - tool --------- Co-authored-by: Hiroshi Horii <hhorii@users.noreply.github.com>
Qiskit#1877) Co-authored-by: Hiroshi Horii <hhorii@users.noreply.github.com>
* Fix OpenMP nested parallel * add comment in release note * fix true and false * fix format --------- Co-authored-by: Hiroshi Horii <hhorii@users.noreply.github.com>
* Support u3 gate application * Apply clang-format * Revert clang-format for aer_runtime_api.h * Add release note --------- Co-authored-by: Hiroshi Horii <hhorii@users.noreply.github.com> Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
* Fix required_memory_mb * add release note --------- Co-authored-by: Hiroshi Horii <hhorii@users.noreply.github.com>
Since 0.12, Qiskit-Aer notices deprecation warnings to use of PulseSimulato. Because 0.13 will be released after +3 months since 0.12 was released, Qiskit-Aer will stop supports of pulse simulation. * first pass at removing pulse simulator * autoformat with black * remove ref to aer pulse in docs * fix lint issues * remove pulse rst * remove pulse tests * add release note * remove open pulse from CMakeLists.txt * remove pulse tests * remove remaining pulse codes --------- Co-authored-by: AngeloDanducci <angelo.danducci.ii@ibm.com>
Correct C API `aer_state_initialize` to take an argument of `handler`. * update aer_state_initialize API * add reno
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though I was not able to review the detail of implementation due to the size of changes, I believe that we can merge this PR and have time to refine implementation by the release of 0.13.0.
Summary
This PR restructures parallel simulation classes that were implemented in
StateChunk
class.Instead of
StateChunk
this PR introducesCircuitExecutor
classes outside ofState
classes.This PR also implements shot-branching revised from PR #1606
Now the implementation is simplified in
CircuitExecutor::MultiStateExecutor
classDetails and comments