-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Port tests in csv_files.rs
to sqllogictest
#8251
Conversation
# Conflicts: # datafusion/physical-plan/src/joins/hash_join_utils.rs
# Conflicts: # datafusion/physical-plan/src/joins/hash_join_utils.rs
@@ -73,6 +73,8 @@ impl TableProviderFactory for ListingTableFactory { | |||
CsvFormat::default() | |||
.with_has_header(cmd.has_header) | |||
.with_delimiter(cmd.delimiter as u8) | |||
.with_quote(cmd.options.get("quote").map_or(b'"', |x| x.as_bytes()[0])) | |||
.with_escape(cmd.options.get("escape").map_or(None, |x| Some(x.as_bytes()[0]))) | |||
.with_file_compression_type(file_compression_type), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if this is the best way, maybe we can open another issue to improve it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH I'd prefer to file a separate issue for integrating QUOTE
& ESCAPE
into CreateExternalTable
pipeline. My concern here is that it probably will be confusing for end-users to pass read options into clause reserved for write options (and without any effect on writing CSV).
WITH HEADER ROW | ||
DELIMITER ',' | ||
OPTIONS ('quote' '~') | ||
LOCATION '../../testing/data/csv/csv_custom_quote.csv'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file is needed here, but I don't know how to create it.
The contents of the file(csv_custom_quote.csv
) are as follows:
c1,c2
~id0~,~value0~
~id1~,~value1~
~id2~,~value2~
~id3~,~value3~
~id4~,~value4~
~id5~,~value5~
~id6~,~value6~
~id7~,~value7~
~id8~,~value8~
~id9~,~value9~
WITH HEADER ROW | ||
DELIMITER ',' | ||
OPTIONS ('escape' '\') | ||
LOCATION '../../testing/data/csv/csv_custom_escape.csv'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the same
The contents of the file(csv_custom_escape.csv
):
c1,c2
"id0","value\"0"
"id1","value\"1"
"id2","value\"2"
"id3","value\"3"
"id4","value\"4"
"id5","value\"5"
"id6","value\"6"
"id7","value\"7"
"id8","value\"8"
"id9","value\"9"
Thank you @Asura7969 . This PR appears to be failing CI, so marking it as draft until that is fixed. |
The reason for this failure is that the test file is missing. Can you help me check test? |
I believe another way is to generate custom-delimited (or escaped) .csv while configuring test context -- reference example might be csv generation for |
Thanks for your tip, I'll improve it |
# Conflicts: # datafusion/core/tests/sql/mod.rs
This PR also involves two attribute( CREATE EXTERNAL TABLE custom_escape (
c1 VARCHAR DEFAULT NULL,
c2 VARCHAR DEFAULT NULL
)
STORED AS CSV
WITH HEADER ROW
DELIMITER ','
OPTIONS ('escape' '\') -- this
LOCATION '../core/tests/data/custom_escape.csv'; Using the API to create similar tables supports Thanks again for the tips. |
I created the test file under the path |
I suppose, it's more neat to use enable_testdir method of |
I used enable_testdir method,thanks for your advice |
cc @alamb |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Asura7969 -- this code looks great to me. Also thank you very much for filing #8310
In order to avoid losing test coverage, I think we should fix #8310 before merging this one.
Is that something you are able to try @Asura7969 ? Conveniently, you already have written the test for the code in this PR :)
@@ -104,6 +104,10 @@ impl TestContext { | |||
info!("Registering metadata table tables"); | |||
register_metadata_tables(test_ctx.session_ctx()).await; | |||
} | |||
"csv_files.slt" => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
I'll try it this week |
For anyone following along this is done in #8351 |
Thanks for implementing #8351 @Asura7969! Since all tests for #8197 have been ported & with #8885 being merged this draft PR here can be closed? |
I agree -- thank you @simicd |
Which issue does this PR close?
Closes #8197 .
Rationale for this change
What changes are included in this PR?
2 parts:
quote
&escape
forcreate external table
in sql(The api method is already supported)Are these changes tested?
Yes
Are there any user-facing changes?