can't perform sequential mu.pp.filter_obs #43

Closed
crichgriffin opened this issue Nov 23, 2021 · 2 comments
Labels
bug Something isn't working

Comments

@crichgriffin

I want to perform multiple filtering steps on my object e.g.:

mu.pp.filter_obs(mdata, 'n_genes_by_counts', lambda x: (x >= 100))
mu.pp.filter_obs(mdata, 'total_counts', lambda x: (x >= 500) & (x <= 50000))

Error message:

muon/_core/preproc.py in filter_obs(data, var, func)
    733         # filter_obs() for each modality
    734         for m, mod in data.mod.items():
--> 735             obsmap = data.obsmap[m][obs_subset]
    736             obsmap = obsmap[obsmap != 0] - 1
    737             filter_obs(mod, mod.obs_names[obsmap])

anndata/_core/aligned_mapping.py in __getitem__(self, key)
    146 
    147     def __getitem__(self, key: str) -> V:
--> 148         return self._data[key]
    149 
    150     def __setitem__(self, key: str, value: V):

KeyError: 'x'

Minimal working example:

import numpy as np
import muon as mu
x = mu.AnnData(X=np.random.normal(size=1000).reshape(-1, 100))
y = mu.AnnData(X=np.random.normal(size=2000).reshape(-1, 100))

md = mu.MuData({"x": x, "y": y})

md['x'].obs['total_count'] = md['x'].X.sum(axis=1)
md['x'].obs['min_count'] = md['x'].X.min(axis=1)
md.update()

# filter one of the modalities.
mu.pp.filter_obs(md, 'x:min_count', lambda x: (x < -2))
mu.pp.filter_obs(md, 'x:total_count', lambda x: (x > 0))

If you put md.update() between the two filter statements, it does work. But to me this is not ideal if you are filtering on many criteria sequentially, since every filter then needs its own update call.
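For illustration only, one way around the intermediate update might be to evaluate every criterion on the unfiltered data and combine them into a single boolean mask, so that only one filtering step is needed. The sketch below uses plain NumPy on a random matrix shaped like the example above and does not call muon at all; the names `keep` and `X_filtered` are made up for illustration. Whether such a mask can be handed straight to mu.pp.filter_obs depends on the muon version and is an assumption, not something confirmed here.

```python
import numpy as np

# Random data shaped like modality "x" in the example above:
# 10 observations, 100 features. The seed is arbitrary.
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 100))

# Per-observation summaries, mirroring the two .obs columns above.
total_count = X.sum(axis=1)
min_count = X.min(axis=1)

# Evaluate both criteria on the unfiltered data and combine them,
# so one filtering step replaces the two sequential ones.
keep = (min_count < -2) & (total_count > 0)

# Apply the combined mask once.
X_filtered = X[keep]
```

The trade-off is that every criterion has to be computable before any filtering happens, which holds for fixed thresholds like the ones above but not for filters that depend on the result of an earlier filter.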

Is this the desired behaviour, or could it be changed so that a single update call after a series of sequential filters is enough?

  • OS: CentOS Linux release 7.8.2003 (Core)
  • Python 3.9.5
  • Versions of libraries involved
    • numpy 1.20.3
    • muon 0.1.1

Thanks!

crichgriffin added the bug label Nov 23, 2021
@gtca
Collaborator

gtca commented Nov 25, 2021

Hey @crichgriffin, thanks for reporting, and thanks @ilia-kats for addressing it!
Should work now, and we've also added this scenario as a test.

@crichgriffin
Author

Great, thanks!
