Count distinct floats #252
Looks good, thanks @pjmore!
Codecov Report
@@           Coverage Diff           @@
##           master     #252   +/-   ##
=======================================
  Coverage   76.80%   76.81%
=======================================
  Files         133      133
  Lines       23284    23294   +10
=======================================
+ Hits        17884    17894   +10
  Misses       5400     5400
I tried it out locally and it works great. Thank you @pjmore!
alamb@ip-10-0-0-124:~/Software/arrow-datafusion$ echo "foo,1.23" > /tmp/foo.csv
alamb@ip-10-0-0-124:~/Software/arrow-datafusion$ cargo run -p datafusion-cli
Finished dev [unoptimized + debuginfo] target(s) in 0.13s
Running `target/debug/datafusion-cli`
> CREATE EXTERNAL TABLE t (a varchar, b float) STORED AS CSV LOCATION '/tmp/foo.csv';
0 rows in set. Query took 0 seconds.
> select count(distinct a) from t;
+-------------------+
| COUNT(DISTINCT a) |
+-------------------+
| 1                 |
+-------------------+
1 rows in set. Query took 0 seconds.
> select count(distinct b) from t;
+-------------------+
| COUNT(DISTINCT b) |
+-------------------+
| 1                 |
+-------------------+
1 rows in set. Query took 0 seconds.
Thanks again @pjmore!
🎉
Which issue does this PR close?
Closes #199.
What changes are included in this PR?
Modified the try_from_array method on ScalarValue and added tests to physical_plan/distinct_expressions.rs.
The tests cover the floating-point edge cases I could think of, namely NaN, the infinities, and subnormal numbers. If I missed any, I'm happy to add tests for them.
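For readers unfamiliar with why these values need dedicated tests: f64 implements neither Eq nor Hash, so a distinct count has to pick some hashable key for each float. The sketch below is a standalone illustration of one way to do that, and is not the code in this PR. The helper name count_distinct_f64 is hypothetical, and the choice to collapse all NaNs into one value and to treat -0.0 as equal to 0.0 is an assumption of the sketch that may differ from what DataFusion does.

```rust
use std::collections::HashSet;

/// Hypothetical helper (not part of DataFusion): counts the distinct
/// non-null values in a slice of optional f64s.
fn count_distinct_f64(values: &[Option<f64>]) -> usize {
    let mut seen: HashSet<u64> = HashSet::new();
    // `flatten` skips the `None` entries, mirroring how COUNT ignores NULLs.
    for v in values.iter().flatten() {
        // Assumption of this sketch: every NaN maps to one key and
        // -0.0 maps to the same key as 0.0. Everything else, including
        // the infinities and subnormals, is keyed by its raw bit pattern.
        let key = if v.is_nan() {
            f64::NAN.to_bits()
        } else if *v == 0.0 {
            0.0f64.to_bits()
        } else {
            v.to_bits()
        };
        seen.insert(key);
    }
    seen.len()
}
```

Under these assumptions, a slice holding NaN, a second NaN with the sign bit set, and the two infinities counts as three distinct values, and a subnormal such as f64::MIN_POSITIVE / 2.0 stays distinct from zero. Dropping the two canonicalization branches would instead count values by raw bit pattern, which separates 0.0 from -0.0 and NaNs with different payloads.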
Are there any user-facing changes?
No