Remove empty features from the visualization? #171

fedarko · 2019-07-02T02:24:12Z

This will probably come up eventually?

I imagine this will happen infrequently, but it may be worth addressing. As with removing empty samples, this would be done with the understanding that an error is thrown if all features are empty (or, in this case, if fewer than two remain, I guess?).

I think that once #58 is taken care of (and _df_utils.remove_empty_samples() operates on BIOM tables), it should be possible to do this by just removing the axis="sample" argument from the call to biom.Table.remove_empty(), which then falls back to its default of filtering both axes. We'd need to add checks to make sure at least two features remain in the table post-filtering (an extension of the current logic, but on the observation axis), add logic to notify the user about dropped features, and document this behavior somewhere.

- Implement this in remove_empty_samples() (and rename that function to just remove_empty() or something).
- Notify the user about the number of dropped features (analogous to the current notifications re: dropped samples).
- Throw an error when all features are empty.
- Add unit tests verifying that this works.
- Document this.
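
For illustration, the checklist above could look roughly like the following on a pandas DataFrame. This is only a sketch, not Qurro's actual code: the function name `remove_empty`, the features-as-rows layout, the printed messages, and the exact thresholds (at least two features, at least one sample) are assumptions made for the example.

```python
import pandas as pd


def remove_empty(table: pd.DataFrame) -> pd.DataFrame:
    """Drop all-zero features (rows) and samples (columns) from a count table.

    Hypothetical helper: assumes features are rows and samples are columns.
    """
    # A feature or sample is "empty" if every one of its counts is zero.
    nonempty_features = (table != 0).any(axis=1)
    nonempty_samples = (table != 0).any(axis=0)
    filtered = table.loc[nonempty_features, nonempty_samples]

    # Assumed thresholds: at least two features and at least one sample.
    if filtered.shape[0] < 2:
        raise ValueError("Fewer than two non-empty features remain.")
    if filtered.shape[1] < 1:
        raise ValueError("No non-empty samples remain.")

    # Tell the user what was dropped, if anything.
    n_dropped_features = table.shape[0] - filtered.shape[0]
    n_dropped_samples = table.shape[1] - filtered.shape[1]
    if n_dropped_features > 0:
        print(f"Removed {n_dropped_features} empty feature(s).")
    if n_dropped_samples > 0:
        print(f"Removed {n_dropped_samples} empty sample(s).")
    return filtered
```

Dropping empty samples never changes which features are empty (and vice versa), so both masks can be computed from the original table in one pass.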


This is gonna be kind of inconvenient to apply to the feature ranks, too. I think I'm going to go back and try an alternative approach to all of this (#172, #171, #58) on another branch.

Turns out this is a ton more efficient, with the added bonus of now relying on pandas' implementation instead of ours. Transposing huge DataFrames is a pretty significant endeavor, so calling .T on the feature table for something like the EMP dataset was taking a super long time. Fortunately, we can get around this by instead transposing the sample metadata and then aligning on the columns. I'm glad we reached a solution that preserves all of the matching-up-front niceness re: testing. Next up are #171 and then #58, but we can merge this back into master now.
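
A minimal sketch of the "transpose the metadata, then align on the columns" idea, assuming a features-by-samples table and a samples-by-fields metadata DataFrame. The function name `match_samples` and the variable names are made up for the example, and the real implementation may differ in its details.

```python
import pandas as pd


def match_samples(table: pd.DataFrame, sample_metadata: pd.DataFrame):
    """Keep only the samples present in both the table and the metadata.

    Hypothetical helper: `table` is features x samples, and
    `sample_metadata` is samples x metadata fields.
    """
    # Transpose the (comparatively small) metadata so that samples become
    # its columns, instead of calling .T on the (possibly huge) table.
    metadata_t = sample_metadata.T

    # Align on the column axis only; an inner join keeps shared sample IDs.
    matched_table, matched_metadata_t = table.align(
        metadata_t, join="inner", axis="columns"
    )

    # Transpose the small metadata back to samples x fields.
    return matched_table, matched_metadata_t.T
```

Because both outputs come from a single `align()` call, the table's columns and the metadata's rows end up holding the same sample IDs in the same order.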

This has apparently thrown off the Byrd test, since it looks like there were a bunch of empty features in that dataset. Need to update the testing utilities to allow for empty features to be removed. It also looks like the JSONs for the q2-moving-pictures and sleep apnea integration tests are different; need to figure out why, and whether that's due to a bug or to something else. Once that's done, this issue will be done.

fedarko · 2019-07-06T00:05:20Z

Just need to reconcile this with the test system (Byrd example is causing it to fail right now), then this will be done.


fedarko added the optimization label (Making code faster or cleaner) Jul 2, 2019

fedarko self-assigned this Jul 2, 2019

fedarko mentioned this issue Jul 2, 2019

Store sparse count data JSON in visualization #58

Closed

fedarko added this to the v0.2.0 milestone Jul 2, 2019

fedarko added a commit that referenced this issue Jul 3, 2019

MAINT: Remove empty features #171

5e9977a


fedarko added a commit that referenced this issue Jul 6, 2019

REL/DOC: Update changelog + -t desc re #171

bbe0a3c

fedarko closed this as completed in 9c1ea69 Jul 6, 2019

fedarko added a commit that referenced this issue Jul 7, 2019

REL: Update changelog re: #58 and #171 changes

c175cc8

fedarko added a commit that referenced this issue Jul 7, 2019

ENH: Speed up empty sample/feature removing

1942df0

Relates to #58 and #171. It's now pretty quick to run the EMP dataset with -x 2000 through Qurro. Going to do some more basic benchmarking with this.

fedarko added a commit that referenced this issue Jul 7, 2019

DOC: Add note re: #171 to -x help text

25e1fa3
