An attempt at rewriting simple imputation methods as iterators #60

Closed
wants to merge 1 commit into from

rofinn commented Mar 6, 2020

I'll be immediately closing this PR, but I figured I'd document why rewriting even the simple imputation methods as iterators didn't seem reasonable in the end.

  1. Many methods need to look ahead or behind (e.g., locf, nocb, interpolation), which isn't guaranteed to work consistently for all iterators.
  2. Generalizing univariate iterators to multivariate datasets is challenging to do in a performant way (e.g., iterating at different rates for each variable may result in multiple copies of an observation).
  3. Preserving type information may be challenging when splitting and combining observations.
  4. In some cases we need to pass significant information around in each iteration state, particularly if we're nesting many iterators.
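
To make point 1 concrete, here is a rough sketch of the asymmetry between looking behind and looking ahead on a single-pass iterator. It is written in Python generators purely for illustration; the `locf` and `nocb` functions below are hypothetical stand-ins, not the package's implementation, and `None` stands in for a missing value.

```python
from typing import Iterable, Iterator, Optional


def locf(values: Iterable[Optional[float]]) -> Iterator[Optional[float]]:
    """Last observation carried forward: only needs the last seen value."""
    last = None
    for v in values:
        if v is not None:
            last = v
        # Leading missing values stay missing because nothing precedes them.
        yield last


def nocb(values: Iterable[Optional[float]]) -> Iterator[Optional[float]]:
    """Next observation carried backward: must defer output until a value arrives."""
    pending = 0  # number of missing values seen since the last observation
    for v in values:
        if v is None:
            pending += 1
        else:
            # Emit the observation once for itself and once per deferred gap.
            for _ in range(pending + 1):
                yield v
            pending = 0
    # Trailing missing values have no next observation to borrow from.
    for _ in range(pending):
        yield None


data = [None, 1.0, None, None, 4.0, None]
forward = list(locf(iter(data)))
backward = list(nocb(iter(data)))
```

`locf` emits one output per input as it goes, whereas `nocb` cannot emit anything for a run of missing values until the next observation shows up, so its output can lag the input by an unbounded amount. That deferred state is also the kind of information that would have to be threaded through each iteration state (point 4).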

Overall, I think a better approach moving forward will be to provide a handful of methods/utilities on a small selection of types that can be applied out of the box at the cost of making multiple passes. We can also focus on making the Imputor API more extensible to help get folks up and running if they think they can impute all of their data in a single pass.
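
As a minimal sketch of what the multi-pass alternative could look like on a concrete, indexable type, the hypothetical `interpolate` function below (again Python for illustration only, with `None` as the missing marker) first locates the observed values and then fills the gaps between them. It assumes the whole sequence is available in memory and is not the package's API.

```python
from typing import List, Optional, Sequence


def interpolate(values: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Linear interpolation between observed values using two passes."""
    out = list(values)  # work on a copy so the input is left untouched

    # Pass 1: record the positions of all observed (non-missing) values.
    known = [i for i, v in enumerate(out) if v is not None]

    # Pass 2: fill each gap between consecutive observed positions.
    for lo, hi in zip(known, known[1:]):
        step = (out[hi] - out[lo]) / (hi - lo)
        for i in range(lo + 1, hi):
            out[i] = out[lo] + step * (i - lo)

    # Leading and trailing gaps have only one neighbour and stay missing.
    return out


filled = interpolate([None, 1.0, None, None, 4.0, None])
```

The extra pass costs some time, but random access means neither direction of lookup needs special handling.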

rofinn closed this Mar 6, 2020