Introduce the DynamicFileCatalog in datafusion-catalog #11035
@@ -63,5 +63,14 @@ async fn main() -> Result<()> {
     // print the results
     df.show().await?;
+
+    // dynamic query by the file path
+    ctx.enable_url_table();
+    let df = ctx
+        .sql(format!(r#"SELECT * FROM '{}' LIMIT 10"#, &path).as_str())
+        .await?;
+
+    // print the results
+    df.show().await?;

     Ok(())
 }

Comment on lines +66 to +73

Thanks, @alamb. I added a simple S3 example here. I hope it is what you want, or that it could inspire you for a new example.

No, actually this is perfect -- thank you @goldmedal
I am trying to understand why datafusion-cli can't simply use DynamicFileCatalog -- why does it need another layer of wrapping? Is it the need to dynamically create ObjectStores?

I have found dynamic creation to be one of the more complicated things about datafusion-cli (and something users of DataFusion would have to figure out in order to recreate it). Maybe we can figure out a simpler API for that too (as another PR).
Maybe we can provide something like installable extensions for datafusion-cli? There are similar use cases in DuckDB's httpfs: before scanning a remote file, the user should load the corresponding extension (BTW, they also autoload specific extensions, e.g. httpfs).

I think it may be related to the purpose of dfdb (#11979). It's a nice way to provide more features to the user.
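As a rough illustration of the autoloading idea, the sketch below keeps a registry of loaders keyed by URL scheme and runs a loader the first time a URL with that scheme is seen. It is std-only and every name in it (Store, Loader, ExtensionRegistry, store_for) is hypothetical; it does not describe how DuckDB or datafusion-cli actually implement extensions.

```rust
use std::collections::HashMap;

/// Hypothetical handle for a configured store; a real one would wrap a client.
#[derive(Debug, Clone, PartialEq)]
struct Store {
    scheme: String,
}

/// A loader builds the store for one scheme (assumed to need no arguments here).
type Loader = fn() -> Store;

/// Maps URL schemes to loaders and caches the stores that were already loaded.
struct ExtensionRegistry {
    loaders: HashMap<&'static str, Loader>,
    loaded: HashMap<String, Store>,
}

impl ExtensionRegistry {
    fn new() -> Self {
        let mut loaders: HashMap<&'static str, Loader> = HashMap::new();
        loaders.insert("s3", || Store { scheme: "s3".to_string() });
        loaders.insert("https", || Store { scheme: "https".to_string() });
        Self { loaders, loaded: HashMap::new() }
    }

    /// Returns the store for `url`, running the scheme's loader on first use.
    fn store_for(&mut self, url: &str) -> Option<&Store> {
        let (scheme, _) = url.split_once("://")?;
        if !self.loaded.contains_key(scheme) {
            // Unknown schemes have no loader, so the lookup fails here.
            let loader = *self.loaders.get(scheme)?;
            self.loaded.insert(scheme.to_string(), loader());
        }
        self.loaded.get(scheme)
    }
}

fn main() {
    let mut registry = ExtensionRegistry::new();
    // Nothing is loaded until a URL asks for it.
    assert!(registry.loaded.is_empty());
    assert_eq!(registry.store_for("s3://bucket/file.parquet").unwrap().scheme, "s3");
    assert_eq!(registry.loaded.len(), 1);
    // A scheme without a registered loader is rejected.
    assert!(registry.store_for("ftp://host/file.csv").is_none());
}
```

An explicit "load this extension" command would amount to calling the loader eagerly instead of waiting for the first matching URL.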