Disable spill to disk globally #3264

Closed
thinkharderdev opened this issue Aug 25, 2022 · 2 comments
Labels
enhancement New feature or request

Comments

@thinkharderdev
Contributor

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

Certain operators (Sort, HashJoin, etc.) support spilling to disk when they would otherwise have to buffer too much data in memory. For some use cases this is not desirable, and it would be better to have the query fail.

It would be great if this were configurable in the SessionConfig: with the flag set to false, any operator that uses a MemoryConsumer would fail instead of trying to spill to disk.

Describe the solution you'd like

Add an enable_disk_spill flag to the SessionConfig (it can default to true for backwards compatibility). Current implementations of MemoryConsumer should respect this flag and fail when spill is called while disk spill is disabled.
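
To make the proposal concrete, here is a rough, self-contained sketch of the behaviour described above. It does not use the real DataFusion types: SessionConfig is reduced to the single proposed flag, and BufferingConsumer and SpillError are hypothetical names invented for illustration, standing in for an operator that implements MemoryConsumer.

```rust
use std::fmt;

/// Hypothetical stand-in for the session configuration; only the proposed
/// flag is modeled here.
#[derive(Debug, Clone, Copy)]
pub struct SessionConfig {
    pub enable_disk_spill: bool,
}

impl Default for SessionConfig {
    fn default() -> Self {
        // Defaults to true for backwards compatibility, as proposed.
        Self { enable_disk_spill: true }
    }
}

/// Hypothetical error returned when a spill is requested but not allowed.
#[derive(Debug, PartialEq, Eq)]
pub enum SpillError {
    SpillDisabled { consumer: String, buffered_bytes: usize },
}

impl fmt::Display for SpillError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SpillError::SpillDisabled { consumer, buffered_bytes } => write!(
                f,
                "{} holds {} bytes in memory but disk spill is disabled",
                consumer, buffered_bytes
            ),
        }
    }
}

impl std::error::Error for SpillError {}

/// Hypothetical, simplified memory consumer (think of a sort that buffers
/// its input). It only tracks byte counts; no real I/O happens.
pub struct BufferingConsumer {
    name: String,
    buffered_bytes: usize,
    spilled_bytes: usize,
    config: SessionConfig,
}

impl BufferingConsumer {
    pub fn new(name: &str, config: SessionConfig) -> Self {
        Self {
            name: name.to_string(),
            buffered_bytes: 0,
            spilled_bytes: 0,
            config,
        }
    }

    /// Record that `bytes` more bytes are being held in memory.
    pub fn buffer(&mut self, bytes: usize) {
        self.buffered_bytes += bytes;
    }

    pub fn buffered_bytes(&self) -> usize {
        self.buffered_bytes
    }

    pub fn spilled_bytes(&self) -> usize {
        self.spilled_bytes
    }

    /// Free memory by "spilling" the buffered bytes. Returns the number of
    /// bytes freed, or an error if disk spill is disabled in the config.
    pub fn spill(&mut self) -> Result<usize, SpillError> {
        if !self.config.enable_disk_spill {
            return Err(SpillError::SpillDisabled {
                consumer: self.name.clone(),
                buffered_bytes: self.buffered_bytes,
            });
        }
        let freed = self.buffered_bytes;
        self.spilled_bytes += freed;
        self.buffered_bytes = 0;
        Ok(freed)
    }
}
```

With the default config, spill() frees the buffered bytes as before; with enable_disk_spill set to false, the same call returns an error that the operator can propagate so the query fails rather than writing to disk.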

Describe alternatives you've considered

This could be configured on a per-operator basis, but I think in the vast majority of use cases it needs to be either enabled or disabled globally.


@thinkharderdev thinkharderdev added the enhancement New feature or request label Aug 25, 2022
@andygrove
Member

It would be great if we could use the key-value based configuration framework introduced recently in DataFusion. It allows us to generate documentation for the user guide.

https://arrow.apache.org/datafusion/user-guide/configs.html
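
As a loose illustration of why a key-value approach makes documentation generation easy, here is a minimal sketch. It is not the actual DataFusion configuration API: ConfigDefinition, definitions, render_docs, and the key name example.execution.enable_disk_spill are all hypothetical and only show the general shape of the idea.

```rust
/// Hypothetical description of one configuration option: a string key,
/// a default value, and a human-readable description.
pub struct ConfigDefinition {
    pub key: &'static str,
    pub default_value: &'static str,
    pub description: &'static str,
}

/// Hypothetical registry of known options. The key name is made up for
/// this sketch and is not a real DataFusion setting.
pub fn definitions() -> Vec<ConfigDefinition> {
    vec![ConfigDefinition {
        key: "example.execution.enable_disk_spill",
        default_value: "true",
        description: "When false, operators fail instead of spilling to disk.",
    }]
}

/// Render the registry as a Markdown table, one row per option. Because
/// every option carries its own description, the user-guide page can be
/// generated from the same definitions the engine reads.
pub fn render_docs(defs: &[ConfigDefinition]) -> String {
    let mut out = String::from("| key | default | description |\n|---|---|---|\n");
    for def in defs {
        out.push_str(&format!(
            "| {} | {} | {} |\n",
            def.key, def.default_value, def.description
        ));
    }
    out
}
```

Calling render_docs(&definitions()) yields a header, a separator row, and one table row per registered option.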

@alamb
Contributor

alamb commented Nov 28, 2022

I think @tustvold implemented this in #4330

@alamb alamb closed this as completed Nov 28, 2022