tex labels in MultiIndex columns #215

Merged: 72 commits, Aug 21, 2022
Changes from 61 commits
Commits
1f3d6c0
Made the default index be a set of 1s
williamjameshandley Aug 7, 2022
4d203ca
Tested out the new weighted pandas functionality
williamjameshandley Aug 7, 2022
26a0aae
Correction for 3.6. concat is preferred over append (which is depreca…
williamjameshandley Aug 7, 2022
082f554
Merge branch 'master' into default_index
lukashergt Aug 9, 2022
8256244
Added tests to check status of weighted series
williamjameshandley Aug 10, 2022
4f20ab7
Stuck with transposes
williamjameshandley Aug 10, 2022
04c73e4
Merge branch 'master' into default_index
williamjameshandley Aug 10, 2022
1d05949
Reverted anesthetic samples to master
williamjameshandley Aug 11, 2022
2b4efcf
First draft
williamjameshandley Aug 11, 2022
db1a183
All tests passing
williamjameshandley Aug 11, 2022
54cee67
Merge branch 'master' into default_index
williamjameshandley Aug 11, 2022
8b2a3e8
Tidied up docstrings and hidden functions
williamjameshandley Aug 11, 2022
96fc8d5
Added formatting code for fixed with column labels
williamjameshandley Aug 11, 2022
63cea7b
Removed annotations for 3.6 compatibility
williamjameshandley Aug 11, 2022
7044548
Fix for python 3.6
williamjameshandley Aug 11, 2022
c8fd855
Merge branch 'master' into tex
lukashergt Aug 11, 2022
6b81c3b
started merging
williamjameshandley Aug 12, 2022
7780cc8
Majority of tests now passing
williamjameshandley Aug 12, 2022
cae0657
Fixed apart from tex deep copying
williamjameshandley Aug 12, 2022
64fbcd0
Merge branch 'master' into default_index
williamjameshandley Aug 12, 2022
2caa19d
Removed A.tex is B.tex tests, as these now fail in pandas
williamjameshandley Aug 12, 2022
a1338aa
Reorganised to improve diff, and allow set_weights to unweight with None
williamjameshandley Aug 12, 2022
fa95dfd
Merge branch 'default_index' into tex
williamjameshandley Aug 12, 2022
988faca
Fixed bug in getdist test
williamjameshandley Aug 14, 2022
ba0eff5
Updated to test for unweighted correlation behaviour
williamjameshandley Aug 14, 2022
0b69992
Increased coverage
williamjameshandley Aug 14, 2022
4640519
Bringing up to coverage
williamjameshandley Aug 15, 2022
469b6db
Upgrades to corrwith
williamjameshandley Aug 15, 2022
634a06f
lint fix
williamjameshandley Aug 15, 2022
a18776f
Increasing coverage
williamjameshandley Aug 15, 2022
9dd3756
First idea
williamjameshandley Aug 15, 2022
c152bc8
Merge branch 'default_index' into tex
williamjameshandley Aug 15, 2022
41f3dff
Attempting a copy with a weighted labelled frame
williamjameshandley Aug 15, 2022
56c853d
Merge branch 'tex' of github.com:williamjameshandley/anesthetic into tex
williamjameshandley Aug 15, 2022
15cfe4f
Trying to squash importance and merging bug
williamjameshandley Aug 15, 2022
897ae49
Merge branch 'master' into default_index
williamjameshandley Aug 16, 2022
c4faabf
Removed reordering of index
williamjameshandley Aug 16, 2022
a8830da
Added tests of reordering of indices
williamjameshandley Aug 16, 2022
bbe259c
Added some tests to check ordering
williamjameshandley Aug 16, 2022
2625ce8
Issues with assignment
williamjameshandley Aug 16, 2022
3998192
Merge branch 'master' into tex
williamjameshandley Aug 16, 2022
8c7c0ae
Removed incomplete merge
williamjameshandley Aug 16, 2022
8ffaedd
Merge branch 'default_index' into tex
williamjameshandley Aug 16, 2022
82b29fb
Removed label structures
williamjameshandley Aug 16, 2022
44936f0
Merge branch 'master' into tex
williamjameshandley Aug 16, 2022
618a56c
LabelledSeries now tested
williamjameshandley Aug 17, 2022
26da126
Trying a different strategy
williamjameshandley Aug 17, 2022
65ca252
More robust setup.
williamjameshandley Aug 17, 2022
5ecc6f9
Removed merge file
williamjameshandley Aug 17, 2022
4e0eee9
First draft of new setup
williamjameshandley Aug 18, 2022
4d80e76
Added column tests for multiindex labelled frame
williamjameshandley Aug 18, 2022
0068918
Most tests passing
williamjameshandley Aug 18, 2022
db3ba6a
Minor corrections
williamjameshandley Aug 19, 2022
f90d270
increasing coverage
williamjameshandley Aug 19, 2022
d05a43a
lint corrections
williamjameshandley Aug 19, 2022
95b65a6
Added axis functionality
williamjameshandley Aug 19, 2022
54c9239
Added better labels keyword
williamjameshandley Aug 19, 2022
3519fc2
set_label functionality now available
williamjameshandley Aug 20, 2022
847a503
Merge branch 'master' into tex
williamjameshandley Aug 20, 2022
1b463c3
Updated for when getdist is not installed
williamjameshandley Aug 20, 2022
2318671
Fixes for python 3.6
williamjameshandley Aug 20, 2022
d5d19f6
Merge branch 'master' into tex
williamjameshandley Aug 20, 2022
3158dd3
Updated test post merge
williamjameshandley Aug 20, 2022
3dabdb8
Added tests for uncovered code
williamjameshandley Aug 20, 2022
05c75f5
Made a new WeightedLabelled class
williamjameshandley Aug 20, 2022
ae15fe7
Improved passing on of labels to slices
williamjameshandley Aug 20, 2022
3fe2472
Merge branch 'master' into tex
williamjameshandley Aug 20, 2022
0124c98
Merge branch 'master' into tex
lukashergt Aug 21, 2022
3273cae
Merge branch 'master' into tex
williamjameshandley Aug 21, 2022
442e3fd
Merge branch 'tex' of github.com:williamjameshandley/anesthetic into tex
williamjameshandley Aug 21, 2022
50ed374
Changed axis=0 to axis=1 for labelled portion of LabelledDataFrame
williamjameshandley Aug 21, 2022
0bb2275
Removed all superfluous axis=1 specifications
williamjameshandley Aug 21, 2022
3 changes: 3 additions & 0 deletions anesthetic/__init__.py
@@ -15,6 +15,7 @@
 import pandas
 import pandas.plotting._core
 import pandas.plotting._misc
+from anesthetic._format import _DataFrameFormatter


 def _anesthetic_override(_get_plot_backend):
@@ -40,6 +41,8 @@ def wrapper(backend=None):
 # Set anesthetic.plotting._matplotlib as the actual backend
 pandas.options.plotting.backend = 'anesthetic.plotting._matplotlib'

+pandas.io.formats.format.DataFrameFormatter = _DataFrameFormatter
+pandas.options.display.max_colwidth = 14

 Samples = anesthetic.samples.Samples
 MCMCSamples = anesthetic.samples.MCMCSamples
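The second hunk caps `display.max_colwidth` at 14 characters for the whole session. As a rough illustration of what that option does in stock pandas (without the `_DataFrameFormatter` override from this PR), the sketch below uses a made-up frame and `pandas.option_context`, so the setting only applies inside the `with` block:

```python
import pandas as pd

# Hypothetical frame with one long string cell, purely for illustration.
df = pd.DataFrame({"text": ["a" * 30], "value": [1.5]})

# option_context restores the previous setting when the block exits.
with pd.option_context("display.max_colwidth", 14):
    shown = repr(df)

print(shown)
```

In stock pandas the option shortens long cell values with a `...` placeholder; the override in `anesthetic/_format.py` appears intended to apply the same fixed-width treatment to the column labels.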
55 changes: 55 additions & 0 deletions anesthetic/_format.py
@@ -0,0 +1,55 @@
# flake8: noqa
from pandas.io.formats.format import (
    DataFrameFormatter as DataFrameFormatter,
    _make_fixed_width, is_numeric_dtype
)
from pandas import MultiIndex


class _DataFrameFormatter(DataFrameFormatter):

    def _get_formatted_column_labels(self, frame):
        try:
            from pandas.core.indexes.multi import sparsify_labels
        except ImportError:
            sparsify_labels = lambda x, *args: x

        columns = frame.columns

        if isinstance(columns, MultiIndex):
            fmt_columns = columns.format(sparsify=False, adjoin=False)
            fmt_columns = list(zip(*fmt_columns))
            dtypes = self.frame.dtypes._values

            # if we have a Float level, they don't use leading space at all
            restrict_formatting = any(level.is_floating for level in columns.levels)
            need_leadsp = dict(zip(fmt_columns, map(is_numeric_dtype, dtypes)))

            def space_format(x, y):
                if (
                    y not in self.formatters
                    and need_leadsp[x]
                    and not restrict_formatting
                ):
                    return " " + y
                return y

            str_columns = list(
                zip(*([space_format(x, y) for y in x] for x in fmt_columns))
            )
            if self.sparsify and len(str_columns):
                str_columns = sparsify_labels(str_columns)

            str_columns = [list(x) for x in zip(*str_columns)]
            str_columns = [_make_fixed_width(x) for x in str_columns]
        else:
            fmt_columns = columns.format()
            dtypes = self.frame.dtypes
            need_leadsp = dict(zip(fmt_columns, map(is_numeric_dtype, dtypes)))
            str_columns = [
                [" " + x if not self._get_formatter(i) and need_leadsp[x] else x]
                for i, x in enumerate(fmt_columns)
            ]
            str_columns = [_make_fixed_width(x) for x in str_columns]
        # self.str_columns = str_columns
        return str_columns
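The formatter above leans on two small ideas: `zip(*...)` to turn per-level header lists into per-column tuples, and a fixed-width pass over each column's header cells. A minimal pure-Python sketch of both follows. `pad_to_common_width` is a hypothetical stand-in for pandas' private `_make_fixed_width` (which also handles truncation and other justification modes), and the header strings are invented for illustration.

```python
# Hypothetical per-level header strings for a two-level column index.
levels = [["x", "y0"], ["$x$", "$y_0$"]]

# zip(*...) turns per-level lists into per-column tuples.
per_column = list(zip(*levels))


def pad_to_common_width(strings):
    """Right-justify every string to the width of the longest one."""
    width = max(len(s) for s in strings)
    return [s.rjust(width) for s in strings]


# One padded list of header cells per column.
str_columns = [pad_to_common_width(list(col)) for col in per_column]
print(str_columns)
```

Right-justification is an assumption made here for simplicity; the point is only that every header cell in a column ends up with the same width.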
1 change: 1 addition & 0 deletions anesthetic/convert.py
@@ -15,6 +15,7 @@ def to_getdist(nested_samples):
         getdist equivalent samples
     """
     import getdist
+    nested_samples = nested_samples.drop_labels(1)
     samples = nested_samples.to_numpy()
     weights = nested_samples.get_weights()
     loglikes = -nested_samples.logL.to_numpy()
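The added line strips the labels level from the columns before converting. In plain pandas terms this could be sketched with `DataFrame.droplevel`, as below. The frame, the level names `params` and `labels`, and the label strings are assumptions for illustration; `drop_labels` itself is the method defined in `anesthetic/labelled_pandas.py`.

```python
import numpy as np
import pandas as pd

# Hypothetical two-level columns: parameter name plus a TeX label.
columns = pd.MultiIndex.from_arrays(
    [["x", "logL"], ["$x$", r"$\ln\mathcal{L}$"]],
    names=["params", "labels"])
df = pd.DataFrame([[0.1, -1.0], [0.2, -2.0]], columns=columns)

# Dropping the labels level leaves the values untouched and the
# columns addressable by bare parameter name.
flat = df.droplevel("labels", axis=1)
loglikes = -flat["logL"].to_numpy()
```

Only the column index changes; the underlying array is the same before and after the drop.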
5 changes: 3 additions & 2 deletions anesthetic/gui/plot.py
@@ -159,7 +159,7 @@ def __init__(self, samples, params=None):
         if params:
             self.params = np.array(params)
         else:
-            self.params = np.array(self.samples.columns[:10])
+            self.params = np.array(self.samples.drop_labels(1).columns[:10])

         self.fig = plt.figure()
         self._set_up()
@@ -213,7 +213,8 @@ def _set_up(self):

     def redraw(self, _):
         """Redraw the triangle plot upon parameter updating."""
-        self.triangle.draw(self.param_choice(), self.samples.tex)
+        self.triangle.draw(self.param_choice(),
+                           self.samples.get_labels_map(axis=1))
         self.update(None)
         self.reset_range(None)
         self.fig.tight_layout()
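`redraw` now passes a name-to-label mapping instead of the old `tex` attribute. The expression used by `get_labels_map` in `anesthetic/labelled_pandas.py` can be tried on a bare `MultiIndex`; the sketch below assumes level names `params` and `labels` and invented label strings.

```python
import pandas as pd

# Hypothetical labelled column index.
columns = pd.MultiIndex.from_arrays(
    [["x", "y"], ["$x$", "$y$"]], names=["params", "labels"])

# to_frame() gives one column per level, indexed by the MultiIndex itself;
# dropping the 'labels' level from that index leaves the bare names as keys.
labels_map = columns.to_frame().droplevel("labels")["labels"]

print(labels_map["x"])
```

The result is a pandas `Series` keyed by parameter name, so it can be indexed like a dictionary.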
10 changes: 5 additions & 5 deletions anesthetic/gui/widgets.py
@@ -226,13 +226,13 @@ def __init__(self, fig, gridspec):
         self.fig.delaxes(self.ax)
         _, self.ax = make_2d_axes([], fig=self.fig, subplot_spec=self.gridspec)

-    def draw(self, labels, tex={}):
-        """Draw a new triangular grid for list of parameters labels.
+    def draw(self, params, labels={}):
+        """Draw a new triangular grid for list of parameters.

         Parameters
         ----------
-        labels: list(str)
-            labels for the triangular grid.
+        params: list(str)
+            params for the triangular grid.

         """
         # Remove any existing axes
@@ -244,7 +244,7 @@ def draw(self, labels, tex={}):
                         self.fig.delaxes(ax)

         # Set up the axes
-        _, self.ax = make_2d_axes(labels, upper=False, tex=tex,
+        _, self.ax = make_2d_axes(params, upper=False, labels=labels,
                                   fig=self.fig, subplot_spec=self.gridspec)

         # Plot no points points.
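One thing a reviewer might flag in the new signature is the mutable default `labels={}`. In the hunk shown, `labels` is only passed on, so this is harmless as long as nothing downstream mutates it (an assumption, since `make_2d_axes` is not part of this diff). The generic pitfall and the usual `None` idiom can be sketched with two hypothetical functions:

```python
def draw_unsafe(params, labels={}):
    # Mutating the default dict leaks state into later calls.
    for p in params:
        labels.setdefault(p, p)
    return labels


def draw_safe(params, labels=None):
    # A fresh dict is created on every call when none is supplied.
    labels = {} if labels is None else dict(labels)
    for p in params:
        labels.setdefault(p, p)
    return labels


first = draw_unsafe(["x"])
second = draw_unsafe(["y"])   # same dict object as `first`
safe_first = draw_safe(["x"])
safe_second = draw_safe(["y"])
```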
216 changes: 216 additions & 0 deletions anesthetic/labelled_pandas.py
@@ -0,0 +1,216 @@
"""Pandas DataFrame and Series with labelled columns."""
from pandas import Series, DataFrame, MultiIndex
from pandas.core.indexing import (_LocIndexer as _LocIndexer_,
                                  _AtIndexer as _AtIndexer_)
import numpy as np
from functools import cmp_to_key


def ac(funcs, *args):
    """Accessor function helper.

    Given a list of callables `funcs`, and their arguments `*args`, evaluate
    each of these, catching exceptions, and then sort results by their
    dimensionality, smallest first. Return the non-exceptional result with the
    smallest dimensionality.
    """
    results = []
    errors = []
    for f in funcs:
        try:
            results.append(f(*args))
        except Exception as e:
            errors.append(e)

    def cmp(x, y):
        if x.ndim > y.ndim:
            return 1
        elif x.ndim < y.ndim:
            return -1
        else:
            x_levels = 0
            y_levels = 0
            if x.ndim > 0:
                x_levels += x.index.nlevels
                y_levels += y.index.nlevels
            if x.ndim > 1:
                x_levels += x.columns.nlevels
                y_levels += y.columns.nlevels

            if x_levels < y_levels:
                return 1
            elif x_levels > y_levels:
                return -1
            else:
                return 0

    results.sort(key=cmp_to_key(cmp))

    for s in results:
        if s is not None:
            return s
    raise errors[-1]


class _LocIndexer(_LocIndexer_):
    def __getitem__(self, key):
        return ac([_LocIndexer_("loc", self.obj.drop_labels(i)).__getitem__
                   for i in self.obj._all_axes()] + [super().__getitem__], key)


class _AtIndexer(_AtIndexer_):
    def __getitem__(self, key):
        return ac([_AtIndexer_("at", self.obj.drop_labels(i)).__getitem__
                   for i in self.obj._all_axes()] + [super().__getitem__], key)


class _LabelledObject(object):
    """Common methods for LabelledSeries and LabelledDataFrame."""

    def __init__(self, *args, **kwargs):
        self._labels = ("labels", "labels")
        labels = kwargs.pop(self._labels[0], None)
        super().__init__(*args, **kwargs)
        if labels is not None:
            self.set_labels(labels, inplace=True)

    def islabelled(self, axis=0):
        """Determine if labels are actually present."""
        return (self._labels[axis] is not None
                and self._labels[axis] in self._get_axis(axis).names)

    def get_labels(self, axis=0):
        """Retrieve labels from an axis."""
        if self.islabelled(axis):
            return self._get_axis(axis).get_level_values(
                    self._labels[axis]).to_numpy()
        else:
            return None

    def get_labels_map(self, axis=0):
        """Retrieve mapping from paramnames to labels from an axis."""
        index = self._get_axis(axis)
        if self.islabelled(axis):
            return index.to_frame().droplevel('labels')['labels']
        else:
            return Series('', index=index)
Collaborator

get_labels_map or get_labels_dict?

Collaborator Author

In the latest upgrade, I've made the 'map' be a pandas dataframe (which is easy to generate). Still not happy with the nomenclature though.


    def get_label(self, param, axis=0):
        """Retrieve the label of a single parameter from an axis."""
        return self.get_labels_map(axis)[param]
lukashergt marked this conversation as resolved.

    def set_label(self, param, value, axis=0, inplace=False):
        labels = self.get_labels_map(axis)
        labels[param] = value
        return self.set_labels(labels, axis=axis, inplace=inplace)

    def drop_labels(self, axis=0):
        axes = np.atleast_1d(axis)
        result = self
        for axis in axes:
            if self.islabelled(axis):
                result = result.droplevel(self._labels[axis], axis)
        return result

    def _all_axes(self):
        if isinstance(self, LabelledSeries):
            return [0]
        else:
            return [0, 1, [0, 1]]

    @property
    def loc(self):
        return _LocIndexer("loc", self)

    @property
    def at(self):
        return _AtIndexer("at", self)

    def xs(self, key, axis=0, level=None, drop_level=True):
        return ac([super(_LabelledObject, self.drop_labels(i)).xs
                   for i in self._all_axes()] + [super().xs],
                  key, axis, level, drop_level)

    def __getitem__(self, key):
        return ac([super(_LabelledObject, self.drop_labels(i)).__getitem__
                   for i in self._all_axes()] + [super().__getitem__], key)

    def __setitem__(self, key, val):
        super().__setitem__(key, val)

    def set_labels(self, labels, axis=0, inplace=False, level=None):
        """Set labels along an axis."""
        if inplace:
            result = self
        else:
            result = self.copy()

        if labels is None:
            if result.islabelled(axis=axis):
                result = result.drop_labels(axis)
        else:
            names = [n for n in result._get_axis(axis).names
                     if n != self._labels[axis]]
            index = [result._get_axis(axis).get_level_values(n) for n in names]
            if level is None:
                if result.islabelled(axis):
                    level = result._get_axis(axis
                                             ).names.index(self._labels[axis])
                else:
                    level = len(index)
            index.insert(level, labels)
            names.insert(level, self._labels[axis])

            index = MultiIndex.from_arrays(index, names=names)
            result.set_axis(index, axis=axis, inplace=True)

        if inplace:
            self._update_inplace(result)
        else:
            return result.__finalize__(self, "set_labels")

    def reset_index(self, level=None, drop=False, inplace=False,
                    *args, **kwargs):
        """Reset the index, retaining labels."""
        labels = self.get_labels()
        answer = super().reset_index(level=level, drop=drop,
                                     inplace=False, *args, **kwargs)
        answer.set_labels(labels, inplace=True)
        if inplace:
            self._update_inplace(answer)
        else:
            return answer.__finalize__(self, "reset_index")
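The core of `set_labels` is rebuilding the axis as a `MultiIndex` from the existing levels plus a labels level. A simplified sketch on a plain `DataFrame` is given below. `with_labels` is a hypothetical helper, not part of the PR: it only handles columns, always appends the labels as the last level (the PR's version can insert at a chosen `level`), and always returns a copy.

```python
import pandas as pd


def with_labels(frame, labels, name="labels"):
    """Return a copy of `frame` whose columns carry a `name` level."""
    cols = frame.columns
    # Keep every existing level except a previous labels level.
    keep = [i for i in range(cols.nlevels) if cols.names[i] != name]
    arrays = [cols.get_level_values(i) for i in keep]
    names = [cols.names[i] for i in keep]
    arrays.append(list(labels))
    names.append(name)
    out = frame.copy()
    out.columns = pd.MultiIndex.from_arrays(arrays, names=names)
    return out


df = pd.DataFrame([[1.0, 2.0]], columns=["x", "y"])
labelled = with_labels(df, ["$x$", "$y$"])
relabelled = with_labels(labelled, ["$X$", "$Y$"])
```

Calling the helper on an already labelled frame replaces the labels rather than stacking a second labels level, which mirrors the filtering of `self._labels[axis]` out of `names` in the method above.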


class LabelledSeries(_LabelledObject, Series):
    """Labelled version of pandas.Series."""

    _metadata = Series._metadata + ['_labels']

    @property
    def _constructor(self):
        return LabelledSeries

    @property
    def _constructor_expanddim(self):
        return LabelledDataFrame


class LabelledDataFrame(_LabelledObject, DataFrame):
    """Labelled version of pandas.DataFrame."""

    _metadata = DataFrame._metadata + ['_labels']

    @property
    def _constructor_sliced(self):
        return LabelledSeries

    @property
    def _constructor(self):
        return LabelledDataFrame

    def transpose(self, copy=False):
        """Transpose."""
        result = super().transpose(copy=copy)
        result._labels = (result._labels[1], result._labels[0])
        return result
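The accessor pattern used throughout this file (`ac` plus `drop_labels`) tries the same lookup with and without the labels level and keeps the lowest-dimensional result that did not raise. A stripped-down sketch on a plain `DataFrame` is shown below. `first_lowest_dim` is a hypothetical simplification of `ac`: it omits the tie-break on the number of index levels, and the frame is invented for illustration.

```python
import numpy as np
import pandas as pd


def first_lowest_dim(funcs, *args):
    """Call each accessor, skip failures, return the lowest-ndim result."""
    results, errors = [], []
    for f in funcs:
        try:
            results.append(f(*args))
        except Exception as e:  # broad on purpose: failed lookups are skipped
            errors.append(e)
    if not results:
        raise errors[-1]
    return min(results, key=np.ndim)


columns = pd.MultiIndex.from_arrays(
    [["x", "y"], ["$x$", "$y$"]], names=[None, "labels"])
df = pd.DataFrame([[1.0, 2.0], [3.0, 4.0]], columns=columns)

accessors = [
    df.droplevel("labels", axis=1).__getitem__,  # lookup by bare name
    df.__getitem__,                              # lookup on the full MultiIndex
]
by_name = first_lowest_dim(accessors, "x")
by_tuple = first_lowest_dim(accessors, ("x", "$x$"))
```

Both `"x"` and `("x", "$x$")` resolve to the same one-dimensional column, which is the behaviour the labelled classes aim for: the label level is carried along for display but does not have to be spelled out when indexing.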