Fix null handling hash join #195

Closed
alamb opened this issue Apr 26, 2021 · 0 comments · Fixed by #24
Labels
datafusion Changes in the datafusion crate

Comments


alamb commented Apr 26, 2021

Note: migrated from original JIRA: https://issues.apache.org/jira/browse/ARROW-12266

Improve null handling in the hash join. For example, the query

SELECT id1, id2 FROM (SELECT null AS id1) t1
INNER JOIN (SELECT 0 AS id2) t2 ON id1 = id2

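currently returns one row:

NULL, NULL

It should return an empty result set: under SQL semantics `NULL = 0` evaluates to NULL, not true, so no row satisfies the join condition.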
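As a cross-check on the expected semantics, the same query can be run against another SQL engine. The sketch below uses SQLite through Python's standard `sqlite3` module; this is a stand-in chosen for illustration, not DataFusion, and the variable names are hypothetical.

```python
import sqlite3

# In-memory SQLite database, used only as a reference for SQL NULL semantics.
# This is NOT DataFusion; it just shows what a conforming engine returns.
conn = sqlite3.connect(":memory:")

query = """
SELECT id1, id2 FROM (SELECT null AS id1) t1
INNER JOIN (SELECT 0 AS id2) t2 ON id1 = id2
"""

rows = conn.execute(query).fetchall()

# NULL = 0 is NULL (not true), so the join condition never holds.
print(rows)  # []
```
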

We should filter out rows with null join keys before the join so that this result is correct. This can also be more efficient, because the non-null filter can be pushed down: the data set gets smaller, the join no longer has to deal with nullable data, and entire files could even be skipped when they contain only nulls.
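The idea of filtering null keys before the join can be sketched as follows. This is a minimal illustration in plain Python, assuming rows are tuples and `None` stands for SQL NULL; `drop_null_keys` and `inner_hash_join` are hypothetical names, not DataFusion functions, and this is not how the actual fix is implemented.

```python
from collections import defaultdict


def drop_null_keys(rows, key_index):
    """Keep only rows whose join key is not None (standing in for SQL NULL)."""
    return [row for row in rows if row[key_index] is not None]


def inner_hash_join(left, right, left_key, right_key):
    """Inner equi-join of two lists of tuples on one key column each."""
    # Filter both inputs first: a NULL key can never satisfy `a = b`,
    # so these rows cannot contribute to an inner join.
    left = drop_null_keys(left, left_key)
    right = drop_null_keys(right, right_key)

    # Build phase: hash the left side on its join key.
    table = defaultdict(list)
    for row in left:
        table[row[left_key]].append(row)

    # Probe phase: look up each right row and emit the concatenated match.
    out = []
    for row in right:
        for match in table.get(row[right_key], []):
            out.append(match + row)
    return out


# The query from this issue: t1 has a single NULL key, t2 a single 0.
t1 = [(None,)]
t2 = [(0,)]
result = inner_hash_join(t1, t2, 0, 0)
print(result)  # []
```

Because the filter runs before the build phase, the hash table never has to represent a null key at all, which is the "not having to deal with nullable data" benefit mentioned above. In a real query engine the same filter would be expressed as an `IS NOT NULL` predicate that the optimizer can push toward the scan.
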
