Utilities for working with matrices of squared Euclidean distances.
D̃,S = complete_distmat(D, W)
: Fills in missing entries in an incomplete and noisy squared distance matrix. W is a binary mask indicating available values. (Algorithm 5 from the reference below.)
D̃,E = rankcomplete_distmat(D, W, dim)
: Same as above, but works on larger matrices and is less accurate. (Algorithm 2 from the reference below.)
P = reconstruct_pointset(D, dim)
: Takes a squared distance matrix, or the SVD of one, and reconstructs the set of points embedded in dimension dim that generated D, up to a translation and rotation/reflection. See procrustes for help with aligning the result to a collection of anchors.
R,t = procrustes(X, Y)
: Finds a rotation matrix R and a translation vector t such that R*X .+ t ≈ Y.
denoise_distmat(D, dim, p=2)
: Takes a noisy squared distance matrix and returns a de-noised version. p denotes the "norm" used in measuring the error. p=2 assumes that the error is Gaussian, whereas p=1 assumes that the error is large but sparse. The robust factorization comes from TotalLeastSquares.jl.
posterior
: Estimates the posterior distribution of locations given both noisy location measurements and distance measurements (not squared); see more details below.
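To make the idea behind point-set reconstruction concrete, here is a minimal sketch of classical multidimensional scaling using only the Julia standard library. It is not the implementation used by reconstruct_pointset; the helper names squared_distmat and classical_mds are hypothetical and exist only for this illustration. The sketch assumes points are stored as columns of a dim × N matrix.

```julia
using LinearAlgebra, Random

# Squared Euclidean distance matrix of the columns of P (dim × N).
# Hypothetical helper, not part of the package.
function squared_distmat(P::AbstractMatrix)
    G = P' * P                      # Gram matrix
    g = diag(G)                     # squared norm of each point
    return g .+ g' .- 2 .* G        # D[i,j] = |p_i|^2 + |p_j|^2 - 2 p_i'p_j
end

# Classical MDS: recover a dim × N point set from a squared distance matrix.
# Hypothetical helper, not the package's reconstruct_pointset.
function classical_mds(D::AbstractMatrix, dim::Integer)
    N = size(D, 1)
    J = I - fill(1 / N, N, N)       # centering matrix
    G = -0.5 .* (J * D * J)         # Gram matrix of the centered points
    F = eigen(Symmetric(G))         # eigenvalues in ascending order
    idx = N:-1:N-dim+1              # indices of the dim largest eigenvalues
    λ = max.(F.values[idx], 0)      # clip tiny negative values caused by round-off or noise
    return Diagonal(sqrt.(λ)) * F.vectors[:, idx]'
end

Random.seed!(1)
P = randn(2, 6)                     # true points
D = squared_distmat(P)
P_rec = classical_mds(D, 2)         # equal to P only up to translation and rotation/reflection
```

The reconstructed points generally differ from P by a rigid transformation, but they reproduce the same distance matrix, which is why an alignment step such as procrustes is needed when absolute coordinates matter.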
using Pkg
Pkg.add([
PackageSpec(url="https://github.com/baggepinnen/Turing2MonteCarloMeasurements.jl")
PackageSpec(url="https://github.com/baggepinnen/EuclideanDistanceMatrices.jl")
])
If both noisy position estimates and noisy distance measurements are available, we can estimate the full Bayesian posterior over positions. To this end, the function posterior is available. We demonstrate how it's used with an example, and start by generating some synthetic data:
using EuclideanDistanceMatrices, Turing, MonteCarloMeasurements, Test
using LinearAlgebra, Distances, Plots # norm and diagind, pairwise and Euclidean, scatter
N = 10 # Number of points
σL = 0.1 # Location noise std
σD = 0.01 # Distance noise std (measured in the same unit as positions)
P = randn(2,N) # These are the true locations
Pn = P + σL*randn(size(P)) # Noisy locations
D = pairwise(Euclidean(), P, dims=2) # True distance matrix (this function expects distances, not squared distances).
Dn = D + σD*randn(size(D)) # Noisy distance matrix
Dn[diagind(Dn)] .= 0 # The diagonal is always 0
# We select a small number of distances to feed the algorithm, this corresponds to only some distances between points being measured
distances = []
p = 0.5 # probability of including a distance
for i = 1:N
for j = i+1:N
rand() < p || continue
push!(distances, (i,j,Dn[i,j]))
end
end
@show length(distances)
@show expected_number_of_entries = p*((N^2-N)÷2)
Given the noisy locations Pn and distances (a vector of tuples with indices and distances), we can now estimate the posterior:
part, chain = posterior(
Pn,
distances;
nsamples = 2000,
sampler = NUTS(),
σL = σL, # This can also be a vector of std:s for each location, see ?MvNormal for alternatives
σD = σD # This can also be a vector of std:s for each location, see ?MvNormal for alternatives
)
The returned object part is a named tuple containing all the internal variables that were sampled. The fields are of type Particles from MonteCarloMeasurements.jl, representing the full posterior distribution of each quantity. The interesting fields are part.P, which contains the posterior positions, and part.d, which contains the estimated distances. The object chain contains the same information as part, but in the form of a Turing.Chain object.
Note that the number of samples in the posterior will generally not be the same as the number requested by nsamples, since Turing automatically drops bad samples.
We can verify that the estimated locations are closer to the true locations than the ones provided by the measurements alone, and plot the results:
@test norm(pmean.(part.P) - P) < norm(Pn - P)
scatter(part.P[1,:], part.P[2,:], markersize=6)
scatter!(P[1,:], P[2,:], lab="True positions")
scatter!(Pn[1,:], Pn[2,:], lab="Measured positions")
Under the hood, Turing.jl is used to sample from the posterior. If you have a lot of points, this function will take a while to run. If sampling takes too long, you can compute a MAP estimate instead. To do this, run using Optim and then pass sampler = MAP(). More documentation on MAP estimation is found here.
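To illustrate what a MAP estimate means in this setting, the following sketch minimizes a negative log-posterior by gradient descent with backtracking, using only the standard library. It assumes independent Gaussian noise on the location measurements and on the distance measurements and a flat prior on the positions; this model, and the helper names neglogpost, gradient and map_estimate, are assumptions made for illustration and are not taken from the package, which delegates the optimization to Turing and Optim.

```julia
using LinearAlgebra, Random

# Negative log-posterior (up to a constant) under the assumed Gaussian model
function neglogpost(P, Pn, distances, σL, σD)
    f = sum(abs2, P - Pn) / (2σL^2)                       # location measurements
    for (i, j, d) in distances
        f += (norm(P[:, i] - P[:, j]) - d)^2 / (2σD^2)    # distance measurements
    end
    return f
end

# Gradient of neglogpost with respect to the positions P
function gradient(P, Pn, distances, σL, σD)
    G = (P - Pn) ./ σL^2
    for (i, j, d) in distances
        Δ = P[:, i] - P[:, j]
        r = norm(Δ)
        g = (r - d) / (σD^2 * r) .* Δ
        G[:, i] .+= g
        G[:, j] .-= g
    end
    return G
end

# Gradient descent with step halving; each accepted step lowers the objective
function map_estimate(Pn, distances, σL, σD; iters = 200)
    P = copy(Pn)
    f = neglogpost(P, Pn, distances, σL, σD)
    for _ in 1:iters
        G = gradient(P, Pn, distances, σL, σD)
        step = 1.0
        while step > 1e-12
            Pnew = P - step * G
            fnew = neglogpost(Pnew, Pn, distances, σL, σD)
            if fnew < f
                P, f = Pnew, fnew
                break
            end
            step /= 2
        end
    end
    return P
end

Random.seed!(0)
N, σL, σD = 10, 0.1, 0.01
P = randn(2, N)                                           # true locations
Pn = P + σL * randn(2, N)                                 # noisy locations
distances = [(i, j, norm(P[:, i] - P[:, j]) + σD * randn())
             for i in 1:N for j in i+1:N if rand() < 0.5]
P_map = map_estimate(Pn, distances, σL, σD)
```

A point estimate like this is much cheaper than sampling, but it carries no uncertainty information, which is the trade-off to keep in mind when choosing between the two.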
The function posterior can also use time-difference-of-arrival (TDOA) measurements in place of distance measurements. In this setting, we add one location to the matrix of locations, corresponding to the location of the source that generated the ping. We then set the keyword tdoa=true when calling posterior, and let the vector of (i, j, dist) tuples instead contain (i, j, tdoa). Below is a similar example to the one above, but adapted to this setting.
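As a small illustration of the measurement format, the sketch below builds (i, j, tdoa) tuples for every sensor pair using only the standard library. The helper name tdoa is hypothetical, and the TDOA is expressed in distance units (the difference between the two source-to-sensor distances), matching the example in this section.

```julia
using LinearAlgebra, Random

# TDOA between sensors i and j for a given source, in distance units.
# Hypothetical helper for illustration only.
tdoa(P, source, i, j) = norm(P[:, i] - source) - norm(P[:, j] - source)

Random.seed!(2)
P = 3 .* randn(2, 4)            # four sensor locations, one per column
source = randn(2)               # source location
locations = [P source]          # source appended as the last column

# One (i, j, tdoa) tuple per sensor pair
measurements = [(i, j, tdoa(P, source, i, j)) for i in 1:4 for j in i+1:4]
```

Two properties follow directly from the definition: swapping i and j flips the sign of the TDOA, and its magnitude can never exceed the distance between the two sensors.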
N = 10 # Number of points
# The standard deviations below can also be supplied as vectors with one element per location
σL = 0.1 # Location noise std
σD = 0.01 # TDOA noise std (measured in the same unit as positions)
P = 3randn(2, N) # These are the true locations
source = randn(2) # The true source location
Pn = P + σL * randn(size(P)) # Noisy locations
tdoas = []
noisy_tdoas = []
p = 0.5 # probability of including a TDOA
for i = 1:N
for j = i+1:N
if rand() < p
di = norm(P[:, i] - source) # Distance from source to i
dj = norm(P[:, j] - source) # Distance from source to j
tdoa = di - dj # True TDOA, expressed as a difference in distance
push!(tdoas, (i, j, tdoa))
push!(noisy_tdoas, (i, j, tdoa + σD * randn()))
end
end
end
@show length(tdoas)
@show expected = p * ((N^2 - N) ÷ 2)
part, chain = posterior(
[Pn source], # We add the source location to the end of this matrix
noisy_tdoas;
nsamples = 2000,
sampler = NUTS(),
σL = σL,
σD = σD, # This can also be a vector of std:s for each location, see ?MvNormal for alternatives
tdoa = true, # Indicating that we are providing TDOA measurements
)
@test norm(pmean.(part.P[:, 1:end-1]) - P) < norm(Pn - P)
Once again, we visualize the resulting estimate:
scatter(part.P[1, 1:end-1], part.P[2, 1:end-1], markersize = 6)
scatter!(P[1, :], P[2, :], lab = "True positions")
scatter!(Pn[1, :], Pn[2, :], lab = "Measured positions")
scatter!(
[part.P[1, end]],
[part.P[2, end]],
m = (:x, 8),
lab = "Est. Source",
)
scatter!([source[1]], [source[2]], m = (:x, 8), lab = "True Source") |> display
The function posterior estimates the absolute positions of the sensors in the coordinate system used to provide the location measurements. Oftentimes, the relative positions between the sensors are sufficient, and are also easier to estimate. Estimates of the relative positions are available in the resulting samples from the posterior distribution, but hidden within the samples. If we draw 2000 samples from the posterior, the absolute coordinates of each sample can be aligned to the mean of all samples (using procrustes), after which 2000 samples of the relative positions are available. This relative estimate will have lower variance than the absolute estimate. To facilitate this alignment, we have the function align_to_mean:
P_relative = align_to_mean(part.P)
tr(cov(vec(part.P))) > tr(cov(vec(P_relative))) # Test that the covariance matrix is "smaller"
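The effect of aligning samples to their mean can be sketched with the standard library alone. The code below is not the package's implementation: procrustes_sketch, align_to_mean_sketch and total_variance are hypothetical names, samples are represented as a plain vector of dim × N matrices rather than Particles, and the alignment solves the orthogonal Procrustes problem, so the returned matrix may be a reflection as well as a rotation.

```julia
using LinearAlgebra, Statistics, Random

# Orthogonal R and translation t such that R*X .+ t ≈ Y (least squares)
function procrustes_sketch(X, Y)
    μx = mean(X, dims = 2)
    μy = mean(Y, dims = 2)
    F = svd((Y .- μy) * (X .- μx)')
    R = F.U * F.Vt
    t = μy - R * μx
    return R, t
end

# Align every sample (a dim × N matrix) to the mean of all samples
function align_to_mean_sketch(samples)
    M = mean(samples)
    return map(samples) do X
        R, t = procrustes_sketch(X, M)
        R * X .+ t
    end
end

# Sum of the per-coordinate sample variances
function total_variance(samples)
    M = mean(samples)
    return sum(sum(abs2, X - M) for X in samples) / (length(samples) - 1)
end

Random.seed!(3)
P = randn(2, 8)
rot(θ) = [cos(θ) -sin(θ); sin(θ) cos(θ)]
# Synthetic "posterior samples": the same shape with a random rigid motion plus small noise
samples = [rot(0.2 * randn()) * P .+ 0.2 .* randn(2) .+ 0.01 .* randn(2, 8) for _ in 1:200]
aligned = align_to_mean_sketch(samples)
```

After alignment, most of the spread caused by the shared rotation and translation is removed, so total_variance(aligned) comes out smaller than total_variance(samples), mirroring the covariance comparison above.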
using Pkg
pkg"add https://github.com/baggepinnen/EuclideanDistanceMatrices.jl"
Most of the algorithms implemented in this package are described in the excellent paper
"Euclidean Distance Matrices: Essential Theory, Algorithms and Applications"
Ivan Dokmanic, Reza Parhizkar, Juri Ranieri and Martin Vetterli
https://arxiv.org/pdf/1502.07541.pdf