Nexus: use data hashing in heuristic equilibration detection algorithm #4557
There are still some ways this might not reproduce the value.
These are probably sufficiently unlikely scenarios that having a consistent output via hashing is the more useful solution, but I did want to get them listed and considered.
The data are numeric (numpy float64). If a deterministic solution other than hashing is desired, please state it and I will implement that.
This would work for me. Have you thought about printing out the equilibration length? Giving an indication that equilibration length was taken into account was a good suggestion. (Definitely could be done in another PR.)
Test this please
Proposed changes
The heuristic equilibration detection algorithm used by `qmca` when `-e` is unspecified uses a random number to avoid statistical bias in the selection of the equilibration length. This approach generates an estimate of the mean that varies each time `qmca` is run. While this is statistically correct, users expect deterministic behavior. This PR seeds the RNG based on the hash of the data being analyzed. This preserves use of the RNG for unbiasedness while presenting deterministic behavior to the user.

Addresses #4556. The reported means are now consistent when multiple files are used (or when the same file is used repeatedly).
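
For readers unfamiliar with the idea, here is a minimal sketch of seeding an RNG from a hash of the data. It is not the code in this PR: the function names `seed_from_data` and `pick_equilibration_index`, the choice of SHA-256, and the index range are all assumptions made for illustration only.

```python
import hashlib

import numpy as np


def seed_from_data(data):
    """Derive a reproducible integer seed from the raw bytes of the data.

    Hypothetical helper: SHA-256 is an assumption, not necessarily the
    hash used by qmca.
    """
    # Normalize dtype and memory layout so equal values give equal bytes.
    buf = np.ascontiguousarray(data, dtype=np.float64).tobytes()
    digest = hashlib.sha256(buf).digest()
    # Use the first 8 bytes of the digest as a 64-bit unsigned seed.
    return int.from_bytes(digest[:8], "little")


def pick_equilibration_index(data, lo, hi):
    """Draw a random index in [lo, hi) from an RNG seeded by the data.

    Hypothetical stand-in for the random choice made by the heuristic.
    """
    rng = np.random.default_rng(seed_from_data(data))
    return int(rng.integers(lo, hi))


if __name__ == "__main__":
    trace = np.linspace(-1.0, 1.0, 200)  # placeholder data, not QMC output
    first = pick_equilibration_index(trace, 10, 100)
    second = pick_equilibration_index(trace.copy(), 10, 100)
    print(first == second)
```

Calling the function twice on the same values yields the same index, while the draw still comes from an RNG. Any byte-level difference in the input (for example, a last-bit change in one float) produces a different seed, which is one way a repeated analysis could fail to reproduce the earlier value.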
What type(s) of changes does this code introduce?
Does this introduce a breaking change?
What systems has this change been tested on?
Workstation
Checklist