feat: Join component #5852

Closed
wants to merge 15 commits into from

ZanSara
Contributor

@ZanSara ZanSara commented Sep 20, 2023

Related Issues

Proposed Changes:

  • Adds a simple Join component that joins N lists (or any type supporting the + operator) into a single output.
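A minimal sketch of the behavior described above, written as plain Python rather than as a Haystack component. The class name `JoinSketch` and the `run` method are hypothetical; only the `inputs_count` and `inputs_type` parameters come from the PR's docstring, and the validation shown here is an assumption, not the PR's actual code.

```python
import operator
from functools import reduce
from typing import Any


class JoinSketch:
    """Hypothetical stand-in for the proposed Join component."""

    def __init__(self, inputs_count: int, inputs_type: type = list):
        if inputs_count < 1:
            raise ValueError("inputs_count must be at least 1")
        self.inputs_count = inputs_count
        self.inputs_type = inputs_type

    def run(self, *inputs: Any) -> Any:
        # Assumption: the caller must supply exactly inputs_count values.
        if len(inputs) != self.inputs_count:
            raise ValueError(
                f"expected {self.inputs_count} inputs, got {len(inputs)}"
            )
        for value in inputs:
            if not isinstance(value, self.inputs_type):
                raise TypeError(
                    f"expected {self.inputs_type.__name__}, "
                    f"got {type(value).__name__}"
                )
        # Fold the inputs left to right with +, so any type that
        # supports the + operator works (lists, strings, tuples, ...).
        return reduce(operator.add, inputs)


joiner = JoinSketch(inputs_count=3, inputs_type=list)
print(joiner.run([1], [2, 3], [4]))
```

With `inputs_count=1` the sketch returns its only input unchanged, which is the degenerate no-op case raised in the review.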

How did you test it?

  • Unit test

Notes for the reviewer

Checklist

@github-actions github-actions bot added topic:tests type:documentation Improvements on the docs labels Sep 20, 2023
@ZanSara ZanSara marked this pull request as ready for review September 20, 2023 14:36
@ZanSara ZanSara requested a review from a team as a code owner September 20, 2023 14:36
@ZanSara ZanSara requested review from silvanocerza and removed request for a team September 20, 2023 14:36
@ZanSara ZanSara requested a review from a team as a code owner September 20, 2023 14:48
@ZanSara ZanSara requested review from dfokina and removed request for a team September 20, 2023 14:48
:param inputs_count: the number of inputs to expect.
:param inputs_type: the type of the inputs. Every type that supports the + operator works.
"""
self.inputs_count = inputs_count
Member

Why not set it to 2 by default?

Contributor

Why have inputs_count at all? The degenerate case would be a single input, which would make the component a no-op.

Contributor Author

@ZanSara ZanSara Sep 21, 2023

Sorry @masci, I don't get your point. I can enforce a minimum of two, but in principle being able to set how many inputs to expect seems far more useful than fixing it at two. There are many cases where you want to aggregate several values (just look at this pipeline).

Contributor

@masci masci Sep 21, 2023

Why having inputs_count at all?

I'm not saying we should hardcode it to two. I'm talking about removing it: why do we need an upper limit?

Contributor Author

This is not an upper limit: you need to specify in advance how many inputs you expect. I could set it to a huge number and make them all optional, but that would add a lot of noise to the error messages: if you fail to connect it, the error message will list all the possible input connections and become seriously unreadable 😅 I'd rather let the user specify how many they want and stick to that. It makes debugging easier.

If this is a big deal let's discuss it offline.
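The readability concern above can be illustrated with a small sketch. Everything in it is hypothetical: the `input_N` names, both helper functions, and the message format are assumptions made for illustration, not Haystack's or canals' actual connection logic.

```python
from typing import List


def declared_inputs(inputs_count: int) -> List[str]:
    # Hypothetical naming scheme: input_0, input_1, ...
    return [f"input_{i}" for i in range(inputs_count)]


def connection_error(requested: str, declared: List[str]) -> str:
    # Assumption: a failed connection reports every declared input.
    return (
        f"Input '{requested}' not found. "
        f"Available inputs: {', '.join(declared)}"
    )


# Three declared inputs give a short, readable message.
print(connection_error("inptu_0", declared_inputs(3)))

# A huge number of optional inputs makes the same message very long.
print(len(connection_error("inptu_0", declared_inputs(1000))))
```

Under these assumptions the message length grows linearly with the number of declared inputs, which is the trade-off being argued for here.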

Member

Yeah, it would be great to remove the inputs_count parameter if possible.

@ZanSara ZanSara added the 2.x Related to Haystack v2.0 label Sep 20, 2023
@coveralls
Collaborator

coveralls commented Sep 20, 2023

Pull Request Test Coverage Report for Build 6301669713

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.1%) to 49.957%

Totals Coverage Status
Change from base Build 6297667234: 0.1%
Covered Lines: 12258
Relevant Lines: 24537

💛 - Coveralls

haystack/preview/components/joiners/join.py (outdated review thread, resolved)
@ZanSara ZanSara mentioned this pull request Sep 21, 2023
@ZanSara
Contributor Author

ZanSara commented Oct 4, 2023

Closing for now in favor of deepset-ai/canals#116

@ZanSara ZanSara closed this Oct 4, 2023
@silvanocerza silvanocerza deleted the join-lists branch October 5, 2023 17:01