# ---
# jupyter:
#   jupytext:
#     formats: ipynb,py:percent
#     text_representation:
#       extension: .py
#       format_name: percent
#       format_version: '1.3'
#       jupytext_version: 1.14.4
#   kernelspec:
#     display_name: .env
#     language: python
#     name: python3
# ---
# %% [markdown]
# # Accessing correlation matrices and model derivatives
#
# The results of a fit consist of best-fit parameter values and uncertainties, and residuals; these are conventional data products from pulsar timing. Additional information can be useful though: we can describe the correlations between model parameters in a matrix, and we can compute the derivatives of the residuals with respect to the model parameters. Both of these additional pieces of information can be obtained from a Fitter object in PINT; this notebook will demonstrate how to do this efficiently.
# %%
import contextlib
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np
import scipy.linalg
import scipy.stats
import astropy.units as u
from astropy.visualization import quantity_support

import pint.config
import pint.fitter
import pint.logging
import pint.models
import pint.toa

pint.logging.setup(level="INFO")
quantity_support()
# %%
parfile = Path(pint.config.examplefile("NGC6440E.par"))
timfile = Path(pint.config.examplefile("NGC6440E.tim"))
assert parfile.exists()
assert timfile.exists()
# %%
m, t = pint.models.get_model_and_toas(parfile, timfile)
# %% [markdown]
# ## Extracting the parameter covariance matrix
#
# Unfortunately, parameter correlation matrices are not stored when `.par` files are recorded; only the individual parameter uncertainties are. In PINT, the machinery for computing these matrices resides in `Fitter` objects. We will therefore construct one and carry out a fit, but with zero steps, so that the fit doesn't change the solution (and runs quickly).
#
# Normally, if the model isn't converged, you should do something about it: the covariance matrix isn't very meaningful unless you are at a best-fit set of parameters. Be aware that with `maxiter=0` the fitter always raises an exception claiming it hasn't converged, which we suppress below.
# %%
fitter = pint.fitter.Fitter.auto(t, m)
with contextlib.suppress(pint.fitter.MaxiterReached):
    fitter.fit_toas(maxiter=0)
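# %% [markdown]
# As a quick sanity check (a sketch added here, not part of the original workflow), the reduced chi-squared should be near one if the fit has converged and the model describes the data well:
# %%
# `fitter.resids` holds the residuals object; chi2/dof is the reduced chi-squared
print(fitter.resids.chi2 / fitter.resids.dof)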
# %% [markdown]
# You can get a human-readable version of the parameter correlation matrix:
# %%
fitter.parameter_correlation_matrix
# %% [markdown]
# If you want a machine-readable version:
# %%
fitter.parameter_correlation_matrix.labels
# %%
fitter.parameter_correlation_matrix.matrix
# %% [markdown]
# Be warned: the labels here are not necessarily in the same order as `fitter.model.free_params`. Also, if the model includes red noise parameters, the parameter covariance matrix may have more rows and columns than labels; these unlabelled rows and columns are always at the end. Let's check that no such surprises await for this pulsar.
#
# Note also that correlation matrices are unitless, so the units recorded in the `.labels` attribute should be ignored.
#
# Even if there are no noise-component entries, the correlation matrix includes a row and column for the non-parameter `Offset`. This arises because PINT fits internally allow for a constant offset in phase, while the custom in pulsar timing is to report mean-subtracted residuals and ignore the absolute phase.
# %%
print(f"Model free parameters: {len(fitter.model.free_params)}")
print(
    f"Correlation matrix labels: {len(fitter.parameter_correlation_matrix.labels[0])}"
)
print(f"Correlation matrix shape: {fitter.parameter_correlation_matrix.shape}")
# %% [markdown]
# Let's extract the correlation matrix in a more convenient form. This requires some fancy indexing to rearrange the rows and columns as needed.
# %%
pint_correlations = fitter.parameter_correlation_matrix
params = fitter.model.free_params
corr_labels = [label for label, _ in pint_correlations.labels[0]]
ix = [corr_labels.index(p) for p in params]
raw_correlation = pint_correlations.matrix
# the matrix should already be symmetric up to rounding; enforce exact symmetry
assert np.allclose(raw_correlation, raw_correlation.T)
raw_correlation = (raw_correlation + raw_correlation.T) / 2
# extract rows in the right order then columns in the right order
correlation = (raw_correlation[ix, :])[:, ix]
assert correlation.shape == (len(params), len(params))
for i, p1 in enumerate(params):
    assert p1 in corr_labels
    for j, p2 in enumerate(params[: i + 1]):
        assert (
            correlation[i, j]
            == raw_correlation[corr_labels.index(p1), corr_labels.index(p2)]
        )
        assert correlation[i, j] == correlation[j, i]
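# %% [markdown]
# If you want the covariance matrix itself rather than the correlation matrix, rescale by the parameter uncertainties, since $\mathrm{Cov}_{ij} = \sigma_i \sigma_j \rho_{ij}$. A minimal sketch; `sigmas_by_param` and `covariance` are names introduced here for illustration:
# %%
sigmas_by_param = np.array([fitter.get_fitparams_uncertainty()[p] for p in params])
# rescale the (reordered) correlation matrix into a covariance matrix
covariance = correlation * np.outer(sigmas_by_param, sigmas_by_param)
assert covariance.shape == (len(params), len(params))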
# %% [markdown]
# Let's summarize the most strongly correlated parameter pairs.
# %%
correlation_list = [
    (p1, p2, correlation[i, j])
    for i, p1 in enumerate(params)
    for j, p2 in enumerate(params[:i])
]
correlation_list.sort(key=lambda t: -abs(t[-1]))
for p1, p2, c in correlation_list:
    if abs(c) < 0.5:
        break
    print(f"{p1:10s} {p2:10s} {c:+.15f}")
# %% [markdown]
# ## Error ellipses
#
# In the frequentist least-squares fitting we do in PINT, the model is assumed to be linear over the range of plausible values, so the estimated distribution of plausible parameters is a multivariate normal (with correlations as computed above), and the confidence regions are ellipsoids. An n-dimensional ellipsoid is cumbersome to visualize, but two-dimensional projections are easy to plot and are analogous to Bayesian posterior distributions and credible regions.
#
# Let's plot the confidence regions for the pair of parameters `F0` and `F1`.
# %%
p1 = "F0"
p2 = "F1"
i = params.index(p1)
j = params.index(p2)
cor = np.array([[1, correlation[i, j]], [correlation[i, j], 1]])
sigmas = np.array([fitter.get_fitparams_uncertainty()[p] for p in [p1, p2]])
vals, vecs = scipy.linalg.eigh(cor)
def plot_ellipses():
    for n_sigma in [1, 2, 3]:
        # radius of the 2D contour that encloses the same probability
        # as the two-sided n-sigma interval of a 1D Gaussian
        thresh = np.sqrt(scipy.stats.chi2(2).isf(2 * scipy.stats.norm.cdf(-n_sigma)))
        angles = np.linspace(0, 2 * np.pi, 200)
        # trace the ellipse using the eigendecomposition of the correlation matrix
        points = thresh * (
            np.sqrt(vals[0]) * np.cos(angles)[:, None] * vecs[None, :, 0]
            + np.sqrt(vals[1]) * np.sin(angles)[:, None] * vecs[None, :, 1]
        )
        plt.plot(
            points[:, 0] * sigmas[0], points[:, 1] * sigmas[1], label=f"{n_sigma} sigma"
        )
    plt.axvspan(-sigmas[0], sigmas[0], alpha=0.5, label="one-sigma single-variable")
    plt.axhspan(-sigmas[1], sigmas[1], alpha=0.5)
    plt.xlabel(r"$\Delta$" + f"{p1}")
    plt.ylabel(r"$\Delta$" + f"{p2}")
plot_ellipses()
plt.legend()
# %% [markdown]
# You can generate something like a posterior sample fairly easily:
# %%
all_sigmas = np.array([fitter.get_fitparams_uncertainty()[p] for p in params])
sample = (
    scipy.stats.multivariate_normal(cov=correlation).rvs(size=1000)
    * all_sigmas[None, :]
)
plot_ellipses()
plt.plot(sample[:, i], sample[:, j], ".", label="sample points")
plt.legend()
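# %% [markdown]
# As a consistency check (a sketch, not part of the original analysis), the empirical correlation of the sampled points should be close to the model correlation:
# %%
# compare the sample correlation coefficient with the matrix entry used above
print(np.corrcoef(sample[:, i], sample[:, j])[0, 1], correlation[i, j])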
# %% [markdown]
# ## Model derivatives
#
# PINT's fitters rely on having analytical derivatives of the timing model with respect to each parameter. These can be obtained by querying the appropriate methods on the `TimingModel` object, but they are more conveniently packaged as the "design matrix" for the fit. Here too the order of the parameters may well not match `fitter.model.free_params`.
# %%
design, names, units = fitter.get_designmatrix()
print(names)
print(units)
print(fitter.model.free_params)
print(design.shape)
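# %% [markdown]
# The design matrix is what connects these derivatives to the parameter covariances: for weighted least squares with uncorrelated errors, the covariance is $(M^T C^{-1} M)^{-1}$ with $C = \mathrm{diag}(\sigma_{\rm TOA}^2)$. Below is a rough sketch of that reconstruction. It assumes plain white TOA uncertainties from `t.get_errors()` and ignores any noise model or scaling the fitter applies, so expect only approximate agreement with `fitter.parameter_correlation_matrix`:
# %%
errs = t.get_errors().to_value(u.s)  # per-TOA uncertainties, in seconds
M = design / errs[:, None]  # whiten each row by its TOA uncertainty
M = M / np.sqrt((M**2).mean(axis=0))  # rescale columns for numerical conditioning
cov_design = np.linalg.inv(M.T @ M)
# normalize to a correlation matrix (rows and columns ordered as in `names`);
# the column rescaling above cancels out in this normalization
scale = np.sqrt(np.diag(cov_design))
print(cov_design / np.outer(scale, scale))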
# %% [markdown]
# Let's look at the derivatives (normalized so their scales are comparable) as a function of time. This may give us some hints about why the covariances are what they are.
# %%
mjds = t.get_mjds()
ix = np.argsort(mjds)
assert np.all(np.diff(mjds[ix]) > 0)
for deriv, param in zip(design.T, names):
    deriv_normalized = deriv / np.sqrt((deriv**2).mean())
    plt.plot(mjds[ix], deriv_normalized[ix], label=param)
plt.legend()
plt.xlabel("MJD")
plt.ylabel("Derivative (normalized)")
# %% [markdown]
# We didn't actually include `Offset` in our covariance summaries, but its covariance with `DM` is quite strong, and this plot shows why: we just don't have much frequency coverage, so the primary effect of changing `DM` is to change the absolute delay. Happily, this covariance need not trouble us unless we want absolute phases.
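# %% [markdown]
# We can quantify this from the design matrix (a quick sketch, assuming `DM` appears in `names`, as it does for this model): if the `DM` derivative is nearly constant across the data set, the ratio of its scatter to its mean will be small, and that near-constant component is exactly what makes it covariant with the constant `Offset` column.
# %%
dm_col = design[:, names.index("DM")]
# a small ratio means the DM derivative is close to a constant offset
print(dm_col.std() / abs(dm_col.mean()))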