### Describe the bug
According to the `TableVectorizer` documentation, the `numerical_transformer` parameter is described as:
> Transformer used on numerical features. Can either be a transformer object instance (e.g. StandardScaler), a Pipeline containing the preprocessing steps, ‘drop’ for dropping the columns, ‘remainder’ for applying remainder, or ‘passthrough’ to return the unencoded columns (default).

So I would assume that I can pass a Pipeline.
### Steps/Code to Reproduce
```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from skrub import TableVectorizer

# Get data
X, y = load_breast_cancer(return_X_y=True, as_frame=True)

# Numerical transformer. There are no NaNs in the data, but it could be any pipeline
num_prep = make_pipeline(SimpleImputer(add_indicator=True), StandardScaler())

# TableVectorizer
encoder = TableVectorizer(numerical_transformer=num_prep)

# Model
clf = make_pipeline(encoder, LogisticRegression())
clf.fit(X, y)
```
### Expected Results
The pipeline should fit the data without raising an error.
### Actual Results
ValueError: 'transformer' must be an instance of sklearn.base.TransformerMixin, 'remainder' or 'passthrough'. Got transformer=Pipeline(steps=[('simpleimputer', SimpleImputer(add_indicator=True)),
('standardscaler', StandardScaler())]).
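
The error message suggests the validation is an `isinstance` check against `sklearn.base.TransformerMixin`. Assuming that is the case, the sketch below shows why a `Pipeline` would be rejected even though it exposes `fit_transform`: `Pipeline` does not inherit from `TransformerMixin`. The `PipelineTransformer` wrapper is a hypothetical name introduced here for illustration only; it is not part of skrub or scikit-learn, and I have not verified that it works as a `numerical_transformer` in skrub 0.1.0.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin, clone
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

num_prep = make_pipeline(SimpleImputer(add_indicator=True), StandardScaler())

# Pipeline offers the transformer API, but it is not a TransformerMixin
# subclass, so an isinstance-based validation would reject it.
is_mixin = isinstance(num_prep, TransformerMixin)
has_api = hasattr(num_prep, "fit_transform")
print(is_mixin, has_api)


class PipelineTransformer(TransformerMixin, BaseEstimator):
    """Hypothetical wrapper (an assumption, not a skrub API) that makes a
    Pipeline pass an isinstance(..., TransformerMixin) check."""

    def __init__(self, pipeline):
        self.pipeline = pipeline

    def fit(self, X, y=None):
        # Fit a clone so the pipeline passed by the user is left untouched.
        self.pipeline_ = clone(self.pipeline).fit(X, y)
        return self

    def transform(self, X):
        return self.pipeline_.transform(X)

    def get_feature_names_out(self, input_features=None):
        return self.pipeline_.get_feature_names_out(input_features)


wrapped = PipelineTransformer(num_prep)

# Small demo array with one missing value, so the imputer adds an indicator column.
X_demo = np.array([[1.0, 2.0], [np.nan, 4.0], [5.0, 6.0]])
X_out = wrapped.fit_transform(X_demo)
print(X_out.shape)
```

If the check is indeed a plain `isinstance`, either the documentation or the validation would need to change so that objects following the transformer API (such as `Pipeline`) are accepted.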
### Versions
```shell
System:
    python: 3.12.1 | packaged by conda-forge | (main, Dec 23 2023, 08:01:35) [Clang 16.0.6 ]
    executable: /opt/homebrew/Caskroom/miniforge/base/envs/test_skrub/bin/python
    machine: macOS-14.3-arm64-arm-64bit

Python dependencies:
    sklearn: 1.4.0
    pip: 23.3.2
    setuptools: 69.0.3
    numpy: 1.26.3
    scipy: 1.12.0
    Cython: None
    pandas: 2.2.0
    matplotlib: None
    joblib: 1.3.2
    threadpoolctl: 3.2.0

Built with OpenMP: True

threadpoolctl info:
    user_api: blas
    internal_api: openblas
    num_threads: 8
    prefix: libopenblas
    filepath: /opt/homebrew/Caskroom/miniforge/base/envs/test_skrub/lib/libopenblas.0.dylib
    version: 0.3.26
    threading_layer: openmp
    architecture: VORTEX

    user_api: openmp
    internal_api: openmp
    num_threads: 8
    prefix: libomp
    filepath: /opt/homebrew/Caskroom/miniforge/base/envs/test_skrub/lib/libomp.dylib
    version: None

0.1.0
```