Transforming auxiliary tables #1026

Open
jeromedockes opened this issue Aug 1, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

@jeromedockes
Member

Problem Description

The Joiners join the main table (X) to an auxiliary table. The main table is pushed through a pipeline, so we can transform it however we want before and after the join.
However, we may also need to perform transformations on the auxiliary table.
As a concrete example, consider applying minhash to the auxiliary table before joining it with the AggJoiner and the min aggregation function (a use case from @Vincent-Maladiere). Here the vectorization cannot be done in the "main" pipeline because it needs to happen before aggregation.

At the moment this has to be done separately, before creating the Joiner. It would be nice if those transformations could be packaged with the rest of the pipeline somehow. Moreover, if the auxiliary table transformations have to be done "manually" outside of the main pipeline, we cannot run a hyperparameter search over them.
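
To make the current situation concrete, here is a rough sketch of the manual workflow using only pandas. The `toy_minhash` function is a hypothetical stand-in for a real minhash vectorizer (it is not skrub's MinHashEncoder), and the groupby plus merge stands in for what the AggJoiner would do with the min aggregation. The table and column names are made up for illustration.

```python
import zlib

import pandas as pd


def toy_minhash(text, n_components=2, ngram=3):
    """Toy min-hash over character n-grams (illustrative stand-in only)."""
    grams = {text[i:i + ngram] for i in range(max(len(text) - ngram + 1, 1))}
    # One "hash function" per component, obtained by salting crc32 with a seed.
    return [
        min(zlib.crc32(f"{seed}:{gram}".encode()) for gram in grams)
        for seed in range(n_components)
    ]


# Hypothetical main and auxiliary tables.
main = pd.DataFrame({"user_id": [1, 2]})
aux = pd.DataFrame(
    {
        "user_id": [1, 1, 2],
        "comment": ["good product", "bad product", "fast delivery"],
    }
)

# Step 1 (outside any pipeline): vectorize the auxiliary table by hand.
hashes = pd.DataFrame(
    [toy_minhash(text) for text in aux["comment"]],
    columns=["comment_h0", "comment_h1"],
    index=aux.index,
)
aux_vectorized = pd.concat([aux[["user_id"]], hashes], axis=1)

# Step 2: aggregate with min per key, then join onto the main table.
aggregated = aux_vectorized.groupby("user_id", as_index=False).min()
joined = main.merge(aggregated, on="user_id", how="left")
```

Because step 1 happens before any estimator is created, a setting such as `n_components` is invisible to a grid search over the main pipeline.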

One way would be to have an aux_preprocessor parameter (passthrough by default) for the joiners.
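
A minimal sketch of how such a parameter might behave, written against scikit-learn and pandas only. `ToyAggJoiner` is a hypothetical class invented for this illustration, not skrub's AggJoiner, and `StandardScaler` merely stands in for whatever transformation (for example minhash) the auxiliary table needs. The generic `aux_<i>` output column names are also an assumption.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin, clone
from sklearn.preprocessing import StandardScaler


class ToyAggJoiner(TransformerMixin, BaseEstimator):
    """Hypothetical joiner illustrating an ``aux_preprocessor`` parameter."""

    def __init__(self, aux_table, key, operation="min",
                 aux_preprocessor="passthrough"):
        self.aux_table = aux_table
        self.key = key
        self.operation = operation
        self.aux_preprocessor = aux_preprocessor

    def fit(self, X, y=None):
        features = self.aux_table.drop(columns=self.key)
        if not isinstance(self.aux_preprocessor, str):
            # Fit a fresh copy of the preprocessor on the auxiliary table.
            self.aux_preprocessor_ = clone(self.aux_preprocessor)
            values = np.asarray(self.aux_preprocessor_.fit_transform(features))
            features = pd.DataFrame(
                values,
                index=features.index,
                columns=[f"aux_{i}" for i in range(values.shape[1])],
            )
        keyed = pd.concat([self.aux_table[[self.key]], features], axis=1)
        # Aggregation happens after preprocessing, as the use case requires.
        self.aggregated_ = keyed.groupby(self.key, as_index=False).agg(
            self.operation
        )
        return self

    def transform(self, X):
        return X.merge(self.aggregated_, on=self.key, how="left")


# Hypothetical data.
main = pd.DataFrame({"user_id": [1, 2]})
aux = pd.DataFrame({"user_id": [1, 1, 2], "amount": [10.0, 30.0, 20.0]})

joiner = ToyAggJoiner(aux, key="user_id", aux_preprocessor=StandardScaler())
joined = joiner.fit_transform(main)

# The preprocessor's settings show up as nested parameters of the joiner.
tunable = [p for p in joiner.get_params() if p.startswith("aux_preprocessor__")]
```

Since the preprocessor is an ordinary constructor parameter, names such as `aux_preprocessor__with_std` could in principle be put in a parameter grid alongside the rest of the pipeline, which is the hyperparameter search the manual approach rules out.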

Feature Description

_

Alternative Solutions

No response

Additional Context

No response

@jeromedockes jeromedockes added the enhancement New feature or request label Aug 1, 2024