Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use atomics for SQLMetric implementation, remove unused name field #25

Merged
merged 1 commit into from
Apr 22, 2021

Conversation

returnString
Copy link
Contributor

As mentioned in apache/arrow#10049 as a potential followup, this PR simplifies metrics management in physical ops by removing the need for a mutex around SQLMetric instances, instead using atomics to track metric values.

I've also removed the name field as this info is duplicated by ExecutionPlan::metrics returning a string-keyed map. Happy to revert that portion of the change if people feel strongly about it though!

It might be possible to futher simplify this by removing the Arc allocs and sharing SQLMetric instances as Sync references if the existing lifetime setup plays nicely with it, but I didn't want to spend too much time digging into this; I think this PR as-is gives us the immediate win of not needing to deal with mutexes :)

@returnString
Copy link
Contributor Author

Probably one for @andygrove, if you have some time :)

Copy link
Member

@andygrove andygrove left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️

Copy link
Contributor

@Dandandan Dandandan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking quite a bit cleaner! 😎

@codecov-commenter
Copy link

Codecov Report

Merging #25 (613caaf) into master (c365a4f) will decrease coverage by 0.01%.
The diff coverage is 100.00%.

❗ Current head 613caaf differs from pull request most recent head 0a4d560. Consider uploading reports for the commit 0a4d560 to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master      #25      +/-   ##
==========================================
- Coverage   70.41%   70.39%   -0.02%     
==========================================
  Files         123      123              
  Lines       21261    21255       -6     
==========================================
- Hits        14970    14963       -7     
- Misses       6291     6292       +1     
Impacted Files Coverage Δ
datafusion/src/physical_plan/hash_aggregate.rs 84.54% <100.00%> (-0.09%) ⬇️
datafusion/src/physical_plan/mod.rs 87.69% <100.00%> (+0.59%) ⬆️
datafusion/src/physical_plan/sort.rs 91.45% <100.00%> (-0.74%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c365a4f...0a4d560. Read the comment docs.

@andygrove
Copy link
Member

Closes #30

@andygrove andygrove linked an issue Apr 22, 2021 that may be closed by this pull request
@andygrove andygrove merged commit 434fbf7 into apache:master Apr 22, 2021
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @returnString

@returnString returnString deleted the atomic_metrics branch April 22, 2021 14:21
@houqp houqp added api change Changes the API exposed to users of the crate ballista datafusion Changes in the datafusion crate labels Jul 29, 2021
andygrove added a commit that referenced this pull request Jan 12, 2023
* Initial commit

* initial commit

* failing test

* table scan projection

* closer

* test passes, with some hacks

* use DataFrame (#2)

* update README

* update dependency

* code cleanup (#3)

* Add support for Filter operator and BinaryOp expressions (#4)

* GitHub action (#5)

* Split code into producer and consumer modules (#6)

* Support more functions and scalar types (#7)

* Use substrait 0.1 and datafusion 8.0 (#8)

* use substrait 0.1

* use datafusion 8.0

* update datafusion to 10.0 and substrait to 0.2 (#11)

* Add basic join support (#12)

* Added fetch support (#23)

Added fetch to consumer

Added limit to producer

Added unit tests for limit

Added roundtrip_fill_none() for testing when None input can be converted to 0

Update src/consumer.rs

Co-authored-by: Andy Grove <andygrove73@gmail.com>

Co-authored-by: Andy Grove <andygrove73@gmail.com>

* Upgrade to DataFusion 13.0.0 (#25)

* Add sort consumer and producer (#24)

Add consumer

Add producer and test

Modified error string

* Add serializer/deserializer (#26)

* Add plan and function extension support (#27)

* Add plan and function extension support

* Removed unwraps

* Implement GROUP BY (#28)

* Add consumer, producer and tests for aggregate relation

Change function extension registration from absolute to relative anchor
(reference)

Remove operator to/from reference

* Fixed function registration bug

* Add test

* Addressed PR comments

* Changed field reference from mask to direct reference (#29)

* Changed field reference from masked reference to direct reference

* Handle unsupported case (struct with child)

* Handle SubqueryAlias (#30)

Fixed aggregate function register bug

* Add support for SELECT DISTINCT (#31)

Add test case

* Implement BETWEEN (#32)

* Add case (#33)

* Implement CASE WHEN

* Add more case to test

* Addressed comments

* feat: support explicit catalog/schema names in ReadRel (#34)

* feat: support explicit catalog/schema names in ReadRel

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix: use re-exported expr crate

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* move files to subfolder

* RAT

* remove rust.yaml

* revert .gitignore changes

* tomlfmt

* tomlfmt

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Daniël Heres <danielheres@gmail.com>
Co-authored-by: JanKaul <jankaul@mailbox.org>
Co-authored-by: nseekhao <37189615+nseekhao@users.noreply.github.com>
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
alamb pushed a commit that referenced this pull request Aug 8, 2024
…calculations, limit/order/distinct (#11756)

* Fix unparser derived table with columns include calculations, limit/order/distinct (#24)

* compare format output to make sure the two level of projects match

* add method to find inner projection that could be nested under limit/order/distinct

* use format! for matching in unparser sort optimization too

* refactor

* use to_string and also put comments in

* clippy

* fix unparser derived table contains cast (#25)

* fix unparser derived table contains cast

* remove dbg
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api change Changes the API exposed to users of the crate datafusion Changes in the datafusion crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update SQLMetric to use atomics rather than a Mutex
6 participants