add documentation for precision recall
gwaybio committed Feb 16, 2021
1 parent 0d0c42e commit aa633b6
Showing 2 changed files with 35 additions and 10 deletions.
24 changes: 16 additions & 8 deletions cytominer_eval/operations/precision_recall.py
@@ -18,14 +18,22 @@ def precision_recall(
     """Determine the precision and recall at k for all unique replicate groups
     based on a predefined similarity metric (see cytominer_eval.transform.metric_melt)
-    Arguments:
-    similarity_melted_df - a long pandas dataframe output from transform.metric_melt
-    replicate_groups - a list of metadata column names in the original profile dataframe
-                       to use as replicate columns
-    k - an integer indicating how many pairwise comparisons to threshold
-    Output:
-    pandas DataFrame of precision and recall metrics for all replicate groups
+    Parameters
+    ----------
+    similarity_melted_df : pandas.DataFrame
+        An elongated symmetrical matrix indicating pairwise correlations between
+        samples. Importantly, it must follow the exact structure as output from
+        :py:func:`cytominer_eval.transform.transform.metric_melt`.
+    replicate_groups : List
+        a list of metadata column names in the original profile dataframe to use as
+        replicate columns.
+    k : int
+        an integer indicating how many pairwise comparisons to threshold.
+    Returns
+    -------
+    pandas.DataFrame
+        precision and recall metrics for all replicate groups given k
     """
     # Determine pairwise replicates and make sure to sort based on the metric!
     similarity_melted_df = assign_replicates(
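The code comment in the hunk above stresses that pairs must be flagged as replicates and sorted by the metric before any top-k threshold is applied. A minimal sketch of that preparation step follows. It does not call `assign_replicates`, and the column names `compound_a`, `compound_b`, and `similarity_metric` are illustrative assumptions rather than the documented `metric_melt` schema; only `group_replicate` appears in the commit itself.

```python
import pandas as pd

# Hypothetical long-format pairwise similarity table. The column names are
# assumptions for illustration, not the documented metric_melt output.
pairs = pd.DataFrame(
    {
        "compound_a": ["A", "A", "A", "B", "B", "B"],
        "compound_b": ["A", "B", "B", "B", "A", "A"],
        "similarity_metric": [0.9, 0.2, 0.4, 0.8, 0.1, 0.3],
    }
)

# A pair counts as a replicate when both profiles share the grouping value.
pairs["group_replicate"] = pairs["compound_a"] == pairs["compound_b"]

# Sort so the most similar pairs come first. Taking the "top k" rows is only
# meaningful after this step.
ranked = pairs.sort_values("similarity_metric", ascending=False).reset_index(drop=True)
print(ranked)
```

Without the sort, slicing the first k rows would pick arbitrary pairs rather than the k most similar ones.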
21 changes: 19 additions & 2 deletions cytominer_eval/operations/util.py
@@ -77,8 +77,25 @@ def assign_replicates(


 def calculate_precision_recall(replicate_group_df: pd.DataFrame, k: int) -> pd.Series:
-    """
-    Usage: Designed to be called within a pandas.DataFrame().groupby().apply()
+    """Given an elongated pairwise correlation dataframe of replicate groups, calculate
+    precision and recall.
+    Usage: Designed to be called within a pandas.DataFrame().groupby().apply(). See
+    :py:func:`cytominer_eval.operations.precision_recall.precision_recall`.
+    Parameters
+    ----------
+    replicate_group_df : pandas.DataFrame
+        An elongated dataframe storing pairwise correlations of all profiles to a single
+        replicate group.
+    k : int
+        an integer indicating how many pairwise comparisons to threshold.
+    Returns
+    -------
+    dict
+        A return bundle of identifiers (k) and results (precision and recall at k).
+        The dictionary has keys ("k", "precision", "recall").
     """
     assert (
         "group_replicate" in replicate_group_df.columns
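The docstring above describes a per-group function meant to be used inside `groupby().apply()`, returning `k`, precision, and recall. The body of `calculate_precision_recall` is not visible in this diff, so the following is only a sketch of precision and recall at k under stated assumptions: `precision_recall_at_k` is a hypothetical helper, and the `query` and `similarity_metric` columns are invented for illustration. Note that the signature in the diff is annotated `-> pd.Series` while the new docstring says `dict`; the sketch returns a `pd.Series`, which is what lets `apply` assemble one row per group.

```python
import pandas as pd


def precision_recall_at_k(group_df: pd.DataFrame, k: int) -> pd.Series:
    """Hypothetical helper: precision and recall at k for one group."""
    # Rank pairs from most to least similar and keep the top k.
    top_k = group_df.sort_values("similarity_metric", ascending=False).head(k)
    hits = int(top_k["group_replicate"].sum())
    total_replicates = int(group_df["group_replicate"].sum())

    # Precision: fraction of the top-k pairs that are true replicates.
    precision = hits / k
    # Recall: fraction of all true replicate pairs found within the top k.
    recall = hits / total_replicates if total_replicates else float("nan")
    return pd.Series({"k": k, "precision": precision, "recall": recall})


# Toy data: each row is a pairwise comparison belonging to a query group.
pairs = pd.DataFrame(
    {
        "query": ["A"] * 4 + ["B"] * 4,
        "similarity_metric": [0.9, 0.7, 0.4, 0.1, 0.8, 0.6, 0.5, 0.2],
        "group_replicate": [True, False, True, False, False, True, False, False],
    }
)

# Select the value columns explicitly so the grouping column is not passed
# into the helper, then apply it once per group.
results = pairs.groupby("query")[["similarity_metric", "group_replicate"]].apply(
    precision_recall_at_k, k=2
)
print(results)
```

With `k=2`, group A finds one of its two replicate pairs in the top two rows, and group B finds its single replicate pair, so recall differs between the groups even though precision is the same.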
