Antijoin panics using left_on/right_on #2475

Closed
Vince7778 opened this issue Jul 3, 2024 · 0 comments · Fixed by #2477

Running an antijoin on two dataframes panics when the join keys are given through the left_on and right_on arguments. The panic does not occur when the plain on argument is used, or when the join is not an antijoin.

Example:

>>> df1 = daft.from_pydict({"a": [1, 2, 3, 4]})
>>> df2 = daft.from_pydict({"b": [3, 2]})
>>> df = df1.join(df2, left_on="a", right_on="b", how="anti")
>>> df.show()
thread '<unnamed>' panicked at src/daft-plan/src/physical_planner/translate.rs:509:67:
called `Result::unwrap()` on an `Err` value: FieldNotFound("Column \"b\" not found in schema: [\"a\"]")
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/conor/Documents/Programming/daft/Daft/daft/api_annotations.py", line 26, in _wrap
    return timed_method(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/conor/Documents/Programming/daft/Daft/daft/analytics.py", line 185, in tracked_method
    return method(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/conor/Documents/Programming/daft/Daft/daft/dataframe/dataframe.py", line 1874, in show
    dataframe_display = self._construct_show_display(n)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/conor/Documents/Programming/daft/Daft/daft/dataframe/dataframe.py", line 1831, in _construct_show_display
    for table in get_context().runner().run_iter_tables(builder, results_buffer_size=1):
  File "/Users/conor/Documents/Programming/daft/Daft/daft/runners/pyrunner.py", line 198, in run_iter_tables
    for result in self.run_iter(builder, results_buffer_size=results_buffer_size):
  File "/Users/conor/Documents/Programming/daft/Daft/daft/runners/pyrunner.py", line 186, in run_iter
    plan_scheduler = builder.to_physical_plan_scheduler(daft_execution_config)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/conor/Documents/Programming/daft/Daft/daft/logical/builder.py", line 47, in to_physical_plan_scheduler
    return PhysicalPlanScheduler.from_logical_plan_builder(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/conor/Documents/Programming/daft/Daft/daft/plan_scheduler/physical_plan_scheduler.py", line 35, in from_logical_plan_builder
    scheduler = _PhysicalPlanScheduler.from_logical_plan_builder(builder._builder, daft_execution_config)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pyo3_runtime.PanicException: called `Result::unwrap()` on an `Err` value: FieldNotFound("Column \"b\" not found in schema: [\"a\"]")

Expected output:

>>> df.show()
╭───────╮
│ a     │
│ ---   │
│ Int64 │
╞═══════╡
│ 1     │
├╌╌╌╌╌╌╌┤
│ 4     │
╰───────╯
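
The expected output follows from antijoin semantics: keep every row of the left dataframe whose key has no match in the right dataframe, and return only the left columns. The snippet below is a minimal pure-Python sketch of that behavior, not Daft's implementation. The helper name anti_join and the list-of-dicts row representation are made up for illustration, and null handling is ignored.

```python
def anti_join(left_rows, right_rows, left_on, right_on):
    """Return the rows of left_rows whose key has no match in right_rows.

    Hypothetical helper for illustration only; rows are plain dicts.
    """
    # Collect the right-side keys once so each lookup is O(1).
    right_keys = {row[right_on] for row in right_rows}
    # Keep only left rows whose key is absent on the right.
    # Only left columns appear in the output.
    return [row for row in left_rows if row[left_on] not in right_keys]


# Same data as in the example above.
df1_rows = [{"a": v} for v in [1, 2, 3, 4]]
df2_rows = [{"b": v} for v in [3, 2]]

result = anti_join(df1_rows, df2_rows, left_on="a", right_on="b")
print(result)  # [{'a': 1}, {'a': 4}]
```

Note that the output contains only column a, so column b never needs to exist in the result schema.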

Vince7778 added the bug label on Jul 3, 2024
Vince7778 self-assigned this on Jul 3, 2024