Add fuzz support for Timestamp, Binary and Float #13279

Closed
Tracked by #12114
alamb opened this issue Nov 6, 2024 · 3 comments · Fixed by #13280

@alamb
Contributor

alamb commented Nov 6, 2024

Is your feature request related to a problem or challenge?

Part of #7065

We have a great aggregation fuzz tester that is run like this:

cargo test --test fuzz -- aggregate

This fuzz tester is important because it covers the various combinations of types, columns, and aggregates, and DataFusion has many different specialized code paths based on types.

The fuzz tester currently covers many different types, but not all of them. This ticket covers adding support for a few more types.

Describe the solution you'd like

Add support for the following data types:

  1. Float32/Float64
  2. Timestamp (Timestamp(Second, ..), Timestamp(Millisecond, ..), ...)
  3. Binary / LargeBinary / BinaryView
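
As a rough illustration of what supporting these types involves, a fuzz generator needs a way to produce random values for each of them. The sketch below uses only the Rust standard library. The `Lcg` generator and the `random_*` helpers are hypothetical names made up for this example; they are not DataFusion's actual test utilities, and the real generators would build Arrow arrays rather than plain Rust values.

```rust
/// Tiny deterministic pseudo-random generator (hypothetical helper, std only).
struct Lcg(u64);

impl Lcg {
    fn next_u64(&mut self) -> u64 {
        // Constants from Knuth's MMIX linear congruential generator.
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        self.0
    }
}

/// Random f64 built from raw bits, so NaN, infinities, and -0.0 can appear.
fn random_f64(rng: &mut Lcg) -> f64 {
    f64::from_bits(rng.next_u64())
}

/// Random timestamp value: an i64 count of time units since the epoch.
/// Arrow stores every Timestamp unit as an i64, so one helper covers them all.
fn random_timestamp(rng: &mut Lcg) -> i64 {
    rng.next_u64() as i64
}

/// Random byte string with a length between 0 and `max_len` inclusive.
/// The same bytes could back Binary, LargeBinary, or BinaryView columns.
fn random_binary(rng: &mut Lcg, max_len: usize) -> Vec<u8> {
    let bound = (max_len as u64).saturating_add(1);
    let len = (rng.next_u64() % bound) as usize;
    // Use the high byte of each draw; the low bits of an LCG are weaker.
    (0..len).map(|_| (rng.next_u64() >> 56) as u8).collect()
}
```

Seeding the generator (for example `Lcg(42)`) makes a failing case reproducible, which is usually the property that matters most in a fuzz test.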

Describe alternatives you've considered

I think we can follow the model of the PR from @LeslieKid, #13226, which added time/interval/decimal/utf8view.

Additional context

No response

@alamb alamb added the enhancement New feature or request label Nov 6, 2024
@jonathanc-n
Contributor

take

@jonathanc-n
Contributor

jonathanc-n commented Nov 6, 2024

@alamb Is there a certain range within which floating-point numbers are compared as equal to each other? Any best practices?

@alamb
Contributor Author

alamb commented Nov 6, 2024

> @alamb Is there a certain range within which floating-point numbers are compared as equal to each other? Any best practices?

For determining groups, the comparison is exact equality, even for floats. (This is different for aggregates like SUM / AVG, where floating-point roundoff needs to be accounted for.)
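
To make the distinction concrete, here is a minimal sketch in plain Rust, not DataFusion's implementation. `sum_by_group` and `approx_eq` are hypothetical helpers, and both the bit-pattern key and the relative tolerance are assumptions chosen for illustration. Note that keying on bits treats `0.0` and `-0.0` as different keys and identical NaN bit patterns as equal, which is not the same as `==` on floats.

```rust
use std::collections::HashMap;

/// Group rows of (key, value) by exact float key and sum each group.
/// Keys are compared by bit pattern (an assumption of this sketch), so
/// two floats that differ only by roundoff land in different groups.
fn sum_by_group(rows: &[(f64, f64)]) -> HashMap<u64, f64> {
    let mut groups = HashMap::new();
    for &(key, value) in rows {
        *groups.entry(key.to_bits()).or_insert(0.0) += value;
    }
    groups
}

/// Compare two aggregate results while allowing for floating-point roundoff.
/// The relative tolerance is a hypothetical choice, not a DataFusion setting.
fn approx_eq(a: f64, b: f64, rel_tol: f64) -> bool {
    let scale = a.abs().max(b.abs()).max(1.0);
    (a - b).abs() <= rel_tol * scale
}
```

With these helpers, the keys `0.1 + 0.2` and `0.3` form two separate groups because their bit patterns differ, while summing `0.1, 0.2, 0.3` forward and backward gives results that are not bitwise equal but do pass `approx_eq` with a small tolerance such as `1e-9`.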
