A program for quantifying how different selections of N out of n degrees of freedom (mappings) affect the amount of information retained about a full data set.
Three quantities are calculated for each low-resolution representation, namely the mapping entropy:
the resolution:
and the relevance:
where K is the set of unique frequencies observed in the sample.
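As a sketch in the standard notation of the resolution/relevance framework (the reference paper fixes pymap's exact conventions), the three quantities read: the mapping entropy

$$ S_{\mathrm{map}} = \sum_{x} p(x)\,\ln\frac{p(x)}{\bar p(x)}, \qquad \bar p(x) = \frac{P(\phi(x))}{\Omega_1(\phi(x))}, $$

where φ(x) is the coarse-grained state of configuration x and Ω₁(φ) is the number of microstates mapped onto φ; the resolution

$$ \hat H[s] = -\sum_{s} \frac{k_s}{M} \ln \frac{k_s}{M}, $$

where k_s is the number of occurrences of coarse-grained state s in a sample of size M; and the relevance

$$ \hat H[k] = -\sum_{k \in K} \frac{k\,m_k}{M} \ln \frac{k\,m_k}{M}, $$

where m_k is the number of distinct states observed exactly k times.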
If you use pymap, please cite this paper.
A minimal conda environment (see here) to run the calculations can be generated from the .yml file pymap.yml using the following command:
conda env create --file pymap.yml
Once the environment is correctly created, it must be activated via:
conda activate pymap
Pytest is employed to verify that pymap is correctly installed. To run the tests, execute the following command from the main directory:
python -m pytest tests
Alternatively, run pytest directly inside the tests folder:
cd tests
pytest
If you would like to contribute to this open-source project, follow these steps:
1. create an issue with a brief explanation of the contribution;
2. add a reasonable label to the issue, or create a new one;
3. create a new branch entirely dedicated to this contribution, either on this repo or on your fork;
4. develop the code;
5. use tox to test the code. In particular, you should run the following commands:
tox -e py310
tox -e lint
The first command tests the code in a standard Python 3.10 environment, while the second checks the code style.
6. open a new Pull Request on this page, correctly linking the issue, and ask for a review from any of the contributors. Note that the Pull Request must pass the continuous integration tests to be accepted.
Enjoy!
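The tox commands in step 5 assume an environment configuration along these lines (a hypothetical sketch of a tox.ini; the repository's actual file, including the linter it uses, may differ):

```ini
[tox]
envlist = py310, lint

[testenv:py310]
; run the test suite in a Python 3.10 environment
deps = pytest
commands = pytest tests

[testenv:lint]
; code-style check (flake8 is an assumption here)
deps = flake8
commands = flake8 src tests
```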
The program must be provided with two command line arguments: a task (-t), either measure or optimize, and a relative path (-p) to a parameter file containing the parameters to be employed. The accepted parameters are listed here:
Parameter | Description | Type | Mandatory |
---|---|---|---|
input_filename | relative path to the input data | str | yes |
output_filename | relative path to the desired output file | str | yes |
max_binom | max number of mappings that must be generated for each degree of coarse-graining | int | no |
nsteps | number of simulated annealing steps in the optimisation | int | no |
ncg | number of degrees of freedom to be used in the optimisation | int | no |
For task measure, the default choice is to generate all the coarse-grained mappings for each N, a prescription that becomes prohibitive when n > 15.
Verbosity can be turned on with the -v (--verbose) flag.
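For illustration, a parameter file for the measure task might look like the following (a hypothetical sketch with made-up paths; the exact syntax is defined by the example files in the parameters folder, e.g. parameters_spins.dat):

```text
input_filename data/spins.csv
output_filename output/spins_measure.csv
max_binom 1000
```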
In general, running
python src/pymap.py -h
shows the available command line arguments.
The first data set described in this article contains 20 non-interacting spins. The variables of interest can be calculated with the following command:
python3 src/pymap.py -p parameters/parameters_spins.dat -t measure
In this context the mapping space is quite big (there are binomial(20, N) mappings for each N, up to 184756 at N = 10), and max_binom allows one to explore just a portion of it in a few minutes:
python3 src/pymap.py -p parameters/parameters_spins_test.dat -t measure
To obtain the full results for the simple model of the Nasdaq stock market reported here, one can use the following commands:
python3 src/pymap.py -p parameters/parameters_m1.dat -t measure
and
python3 src/pymap.py -p parameters/parameters_m2.dat -t measure
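For intuition about the quantities pymap computes, the empirical resolution and relevance of a discrete sample can be sketched in plain Python as follows (a standalone illustration using the standard estimators from the resolution/relevance literature; the helper below is not part of the pymap API):

```python
from collections import Counter
from math import log

def resolution_and_relevance(sample):
    """Empirical resolution H[s] and relevance H[k] of a discrete sample."""
    M = len(sample)
    # k_s: number of times each distinct state s appears in the sample
    counts = Counter(sample)
    # resolution: entropy of the empirical state distribution
    h_s = -sum((k / M) * log(k / M) for k in counts.values())
    # m_k: number of distinct states observed exactly k times
    multiplicities = Counter(counts.values())
    # relevance: entropy of the frequency distribution,
    # summed over K, the set of unique observed frequencies
    h_k = -sum((k * m_k / M) * log(k * m_k / M)
               for k, m_k in multiplicities.items())
    return h_s, h_k

# toy sample: three distinct states observed with frequencies 2, 2, 1
sample = ["a", "a", "b", "b", "c"]
h_s, h_k = resolution_and_relevance(sample)
```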