Chevychase #48

kmdalton · 2021-02-02T11:38:34Z

This pull request will:

- improve quadrature integration for French Wilson scaling
- add updated reference data from pymc3
- add improved tests for posterior parameter estimation

The major point is that we now have a much larger gold-standard test data set, generated by Markov chain Monte Carlo integration in pymc3. I've verified that the output of our algorithm stays very close to the MCMC output. Switching to Gauss-Chebyshev quadrature with degree 100 improves numerical stability and reduces the memory requirement. We may be able to lower the degree if there are performance concerns. Currently, all posterior parameters are within 2.5% of the MCMC values, compared to 6% for the cctbx implementation of the classical algorithm.
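For readers unfamiliar with the technique, here is a minimal sketch of Gauss-Chebyshev quadrature applied to a French-Wilson-style posterior mean. It is not the code in this PR: the function names, the `window` parameter, and the choice of the standard acentric Wilson prior (an exponential with mean `Sigma`) are all assumptions made for illustration. Only `numpy.polynomial.chebyshev.chebgauss` is a real library call. For this prior the posterior is a normal truncated at zero, so a closed-form mean is available to check against.

```python
import math
import numpy as np


def chebgauss_integrate(g, a, b, deg=100):
    """Approximate the integral of g over [a, b] with Gauss-Chebyshev nodes."""
    # chebgauss returns nodes x and weights w for integrals of
    # f(x) / sqrt(1 - x**2) over [-1, 1].
    x, w = np.polynomial.chebyshev.chebgauss(deg)
    # Map the nodes from [-1, 1] onto [a, b].
    t = 0.5 * (b - a) * x + 0.5 * (a + b)
    # Multiply by sqrt(1 - x**2) to cancel the Chebyshev weight function.
    return 0.5 * (b - a) * np.sum(w * np.sqrt(1.0 - x**2) * g(t))


def acentric_posterior_mean(I, sigI, Sigma, deg=100, window=10.0):
    """Posterior mean of the true intensity J (hypothetical helper).

    Assumes an exponential prior exp(-J / Sigma) on J >= 0 and a normal
    likelihood for the observed intensity I with standard deviation sigI.
    """
    # Completing the square shows the posterior is a normal centred at mu,
    # truncated at zero. Use mu to place the integration window.
    mu = I - sigI**2 / Sigma
    a = max(0.0, mu - window * sigI)
    b = max(0.0, mu) + window * sigI

    def unnormalized(J):
        return np.exp(-J / Sigma - 0.5 * ((I - J) / sigI) ** 2)

    norm = chebgauss_integrate(unnormalized, a, b, deg)
    first = chebgauss_integrate(lambda J: J * unnormalized(J), a, b, deg)
    return first / norm


def truncated_normal_mean(mu, sigma):
    """Closed-form mean of a normal(mu, sigma) truncated to J >= 0."""
    alpha = -mu / sigma
    pdf = math.exp(-0.5 * alpha**2) / math.sqrt(2.0 * math.pi)
    upper_tail = 0.5 * (1.0 - math.erf(alpha / math.sqrt(2.0)))
    return mu + sigma * pdf / upper_tail
```

Calling `acentric_posterior_mean(1.0, 1.0, 2.0)` and comparing it with `truncated_normal_mean(0.5, 1.0)` gives a quick sanity check of the quadrature. The same pattern extends to higher moments by changing the integrand.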

I think this implementation constitutes a new state of the art. Two things we could do to improve it down the line:

- optimize the integration window size to maximize consistency with MCMC
- optimize the degree of the approximation

Both of these are currently set quite arbitrarily.
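One way to explore the degree setting is to sweep it and track the error against a reference value. The sketch below is an illustration only, not part of this PR. It assumes the standard acentric Wilson prior, for which the posterior is a truncated normal with a closed-form mean, and uses that closed form in place of the MCMC reference. The names `posterior_mean`, `exact_mean`, `degree_sweep`, and the `window` parameter are hypothetical.

```python
import math
import numpy as np


def posterior_mean(I, sigI, Sigma, deg, window=10.0):
    """Gauss-Chebyshev estimate of the posterior mean (hypothetical helper)."""
    mu = I - sigI**2 / Sigma  # centre of the untruncated posterior
    a = max(0.0, mu - window * sigI)
    b = max(0.0, mu) + window * sigI
    x, w = np.polynomial.chebyshev.chebgauss(deg)
    J = 0.5 * (b - a) * x + 0.5 * (a + b)
    # Quadrature weights times the unnormalized posterior density. The
    # interval scale factor cancels in the ratio below.
    q = w * np.sqrt(1.0 - x**2) * np.exp(-0.5 * ((J - mu) / sigI) ** 2)
    return np.sum(q * J) / np.sum(q)


def exact_mean(I, sigI, Sigma):
    """Closed-form mean of the normal posterior truncated to J >= 0."""
    mu = I - sigI**2 / Sigma
    alpha = -mu / sigI
    pdf = math.exp(-0.5 * alpha**2) / math.sqrt(2.0 * math.pi)
    upper_tail = 0.5 * (1.0 - math.erf(alpha / math.sqrt(2.0)))
    return mu + sigI * pdf / upper_tail


def degree_sweep(I, sigI, Sigma, degrees=(10, 25, 50, 100, 200)):
    """Relative error of the quadrature estimate at each degree."""
    exact = exact_mean(I, sigI, Sigma)
    return {
        d: abs(posterior_mean(I, sigI, Sigma, d) - exact) / exact
        for d in degrees
    }


if __name__ == "__main__":
    for deg, err in degree_sweep(1.0, 1.0, 2.0).items():
        print(f"degree {deg:4d}: relative error {err:.2e}")
```

The same loop could vary `window` instead of `deg`, or swap the closed-form reference for the pymc3 values, to pick settings on something other than an arbitrary basis.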

Here are some histograms of percent errors between rs and MCMC for the four posterior parameter estimates:

codecov-io · 2021-02-02T11:48:17Z

Codecov Report

Merging #48 (71f10b4) into master (e7b8c8f) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master      #48   +/-   ##
=======================================
  Coverage   99.17%   99.17%           
=======================================
  Files          34       34           
  Lines        1330     1341   +11     
=======================================
+ Hits         1319     1330   +11     
  Misses         11       11

Flag	Coverage Δ
unittests	`99.17% <100.00%> (+<0.01%)`	⬆️

Impacted Files	Coverage Δ
...alspaceship/algorithms/scale_merged_intensities.py	`100.00% <100.00%> (ø)`
reciprocalspaceship/dataset.py	`99.02% <0.00%> (+0.02%)`	⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

JBGreisman

Looks good to me. I'm not crazy about including the standalone plotting script diagnostics.py, but if you find it useful I think it's fine. It's at least in the testing > data subdirectory specific to french-wilson, so its context is fairly clear.

kmdalton added 12 commits January 29, 2021 12:41

added fw test data from cctbx

73c2957

add fixture for french wilson cctbx test data

e59fb0f

don't polute the csv with rangeindex

2616101

move fw fixture into algorithms

f5a949a

first version testing fw against cctbx

423e10f

made new fw reference data from pymc3

fd2065f

chebyshev quadrature & mcmc reference data

e9fb663

oops accidentally mangled this file

ee9f78f

unequivocally better than linear interpolation

10929ba

i hope this isn't clutter, but it has been useful

42a82df

updated labels

7a1c6bd

added a missing histogram

71f10b4

JBGreisman added the enhancement Improvement to existing feature label Feb 2, 2021

JBGreisman approved these changes Feb 2, 2021

kmdalton merged commit 22d0cd9 into master Feb 2, 2021

kmdalton deleted the chevychase branch February 2, 2021 12:31
