Estimation Data types

Data types in early processing (e.g. TICA)

Some data processing steps are currently inefficient - in memory usage, CPU usage, or both.

The question is how to make these steps efficient while still keeping the data processing pipeline general.

Build specialized low-level estimators for specific data types, e.g. covariance estimators for integer and sparse boolean data. (A simple one-pass algorithm is robust for integer data, and a C implementation can deal efficiently with ones and zeros.)
The high-level estimator (e.g. TICA) encapsulates the handling of multiple data types, e.g. float/int.
A fallback implementation is used when no specialized low-level algorithm is available. For example, a boolean array can be cast to a float array containing 0.0 and 1.0, and a sparse data chunk can be copied into a dense data chunk.
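
The three points above could be sketched roughly as follows, using NumPy. This is only an illustration of the dispatch-with-fallback idea, not an existing implementation: the names covariance, _cov_int, _cov_float and _SPECIALIZED are hypothetical, the covariance is normalized by the number of frames, and sparse input is assumed to be any object exposing a toarray() method.

```python
import numpy as np


def _cov_float(X):
    """Generic fallback: two-pass covariance in float64."""
    X = np.asarray(X, dtype=np.float64)  # e.g. bool -> 0.0 / 1.0
    Xc = X - X.mean(axis=0)
    return Xc.T @ Xc / X.shape[0]


def _cov_int(X):
    """Specialized one-pass covariance for integer or boolean data.

    The sums are accumulated in int64, so they are exact as long as
    they do not overflow; rounding happens only in the final division.
    """
    Xi = X.astype(np.int64)
    n = Xi.shape[0]
    s = Xi.sum(axis=0)   # sum over frames of x_t
    S = Xi.T @ Xi        # sum over frames of x_t x_t^T
    # cov = S/n - (s/n)(s/n)^T = (n*S - s s^T) / n^2
    return (n * S - np.outer(s, s)) / float(n * n)


# Hypothetical registry: NumPy dtype kind -> specialized low-level estimator.
_SPECIALIZED = {"b": _cov_int, "i": _cov_int, "u": _cov_int}


def covariance(X):
    """High-level entry point that hides the data type from the caller."""
    if hasattr(X, "toarray"):
        # Assumed sparse container: fall back to a dense copy.
        X = X.toarray()
    X = np.asarray(X)
    # Use a specialized kernel if one is registered, else the float fallback.
    kernel = _SPECIALIZED.get(X.dtype.kind, _cov_float)
    return kernel(X)
```

A caller would simply write covariance(chunk) for boolean, integer or float chunks alike. Adding a new specialized estimator then only means registering it in the lookup table, while unsupported types keep working through the float fallback.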