Enabled CLI to read predefined and customized CSV files #480
docs/user/CLI.md
Outdated
```
read_file('customized.csv', {'type':'customized', 'delimiter':',', 'header':true, \
'ignore_empty_line':true, 'ignore_surrounding_space':true, 'trim':true, \
'line_breaker: \n', 'escape':'\', 'quote':'"'})
```
May want to note what the arguments do, or link to CSVParser's reference on these.
It would also be helpful to note that the `delimiter`, `escape`, and `quote` arguments can each be only a single character.
Applied in the next commit.
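The single-character constraint raised in the comment above could also be checked before the value reaches the parser. This is a minimal sketch, assuming option values arrive as strings; `singleCharOption` is a hypothetical helper and is not taken from the PR:

```kotlin
// Hypothetical helper (not from the PR): checks that a CSV option value is
// exactly one character long and returns it as a Char.
fun singleCharOption(name: String, value: String): Char {
    require(value.length == 1) {
        "'$name' must be exactly one character, but got \"$value\""
    }
    return value[0]
}
```

Under these assumptions, `singleCharOption("delimiter", ",")` returns `','`, while a two-character value throws an `IllegalArgumentException` that names the offending option.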
docs/user/CLI.md
Outdated
```
read_file('customized.csv', {'type':'customized', 'delimiter':' ', 'header':true})
```
The following command explicitly shows all the available options for a standard CSV file:
nit: what is meant by "standard" here? Is there a standard CSV file?
It just means the default CSV format in the CSVParser library.
Co-authored-by: Alan Cai <caialan@amazon.com>
Codecov Report
@@             Coverage Diff              @@
##               main     #480      +/-   ##
============================================
+ Coverage     82.43%   82.44%   +0.01%
+ Complexity     1330     1329       -1
============================================
  Files           171      171
  Lines         10904    10923      +19
  Branches       1785     1795      +10
============================================
+ Hits           8989     9006      +17
  Misses         1368     1368
- Partials        547      549       +2
Nice work! Just have some minor comments and questions. Otherwise, looks good to me.
.let { it.withIgnoreSurroundingSpaces(ignoreSurroundingSpace) }
.let { it.withTrim(trim) }
.let { if (hasHeader) it.withFirstRecordAsHeader() else it }
(Sorry, missed this in the initial review.) nit: the `.let` calls on these lines are redundant.
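To illustrate the nit: each `withX` call already returns a format object, so the calls can be chained directly, and `.let` is only needed for the conditional header step. The sketch below uses a hypothetical `Format` data class as a stand-in for the CSV format type. It is not the library's API, only a self-contained way to show that both styles produce the same result.

```kotlin
// Hypothetical stand-in for an immutable, fluent CSV format type.
data class Format(
    val ignoreSurroundingSpaces: Boolean = false,
    val trim: Boolean = false,
    val firstRecordAsHeader: Boolean = false
) {
    fun withIgnoreSurroundingSpaces(value: Boolean) = copy(ignoreSurroundingSpaces = value)
    fun withTrim(value: Boolean) = copy(trim = value)
    fun withFirstRecordAsHeader() = copy(firstRecordAsHeader = true)
}

// Style from the diff: every step is wrapped in .let.
fun buildWithLet(spaces: Boolean, trim: Boolean, hasHeader: Boolean): Format =
    Format()
        .let { it.withIgnoreSurroundingSpaces(spaces) }
        .let { it.withTrim(trim) }
        .let { if (hasHeader) it.withFirstRecordAsHeader() else it }

// Equivalent direct chaining; .let is kept only where the call is conditional.
fun buildChained(spaces: Boolean, trim: Boolean, hasHeader: Boolean): Format =
    Format()
        .withIgnoreSurroundingSpaces(spaces)
        .withTrim(trim)
        .let { if (hasHeader) it.withFirstRecordAsHeader() else it }
```

For any combination of arguments, `buildWithLet` and `buildChained` return equal `Format` values, which is why the unconditional `.let` wrappers can be dropped.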
fun readPostgreCsvFile() {
    writeFile("simple_postgre.csv", "id,name,balance\n1,Bob,10000.00")

    val args = listOf("\"${dirPath("simple_postgre.csv")}\"", "{type:\"postgresql_csv\", header:true}").map { it.exprValue() }
Not all the typos were addressed here. 'simple_postgre.csv' -> 'simple_postgresql.csv'
docs/user/CLI.md
Outdated
All the available options for customized CSV files are shown as following:
1. Ignore empty lines: `'ignore_empty_line':true`
2. Ignore spaces surrounding comma: `'ignore_surrounding_space':true`
3. Trim leading and trailing blanks: `'trim':true`
4. Set line breaker (only working with '\r', '\n' and '\r\n'): `'line_breaker: \n'`
5. Set escape sign (single character only): `'escape':'\'`
6. Set quote sign (single character only): `'quote':'"'`
7. Set delimiter sign (single character only): `'delimiter':','`
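The constraints in the list above (single-character `escape`, `quote`, and `delimiter`; a line breaker limited to `\r`, `\n`, or `\r\n`) can be sketched as a small validation step. Everything here is hypothetical: `CustomizedCsvOptions`, `parseCustomizedOptions`, and the default values are assumptions for illustration and do not reflect how the PR actually implements or defaults these options.

```kotlin
// Hypothetical container for the customized-CSV options plus 'header'.
// The default values are assumptions, not the PR's actual defaults.
data class CustomizedCsvOptions(
    val header: Boolean = false,
    val ignoreEmptyLine: Boolean = false,
    val ignoreSurroundingSpace: Boolean = false,
    val trim: Boolean = false,
    val lineBreaker: String = "\n",
    val escape: Char? = null,
    val quote: Char? = '"',
    val delimiter: Char = ','
)

// Only these record separators are accepted, per the option list.
val allowedLineBreakers = setOf("\r", "\n", "\r\n")

// Converts an option value to a Char, rejecting anything but one character.
fun oneChar(name: String, value: Any): Char {
    val text = value.toString()
    require(text.length == 1) { "'$name' must be a single character, got \"$text\"" }
    return text[0]
}

// Builds the options from a raw key/value map. Missing keys (and boolean keys
// whose value is not a Boolean) fall back to the assumed defaults.
fun parseCustomizedOptions(raw: Map<String, Any>): CustomizedCsvOptions {
    val defaults = CustomizedCsvOptions()
    val lineBreaker = raw["line_breaker"]?.toString() ?: defaults.lineBreaker
    require(lineBreaker in allowedLineBreakers) {
        "'line_breaker' must be one of \\r, \\n or \\r\\n"
    }
    return CustomizedCsvOptions(
        header = raw["header"] as? Boolean ?: defaults.header,
        ignoreEmptyLine = raw["ignore_empty_line"] as? Boolean ?: defaults.ignoreEmptyLine,
        ignoreSurroundingSpace = raw["ignore_surrounding_space"] as? Boolean
            ?: defaults.ignoreSurroundingSpace,
        trim = raw["trim"] as? Boolean ?: defaults.trim,
        lineBreaker = lineBreaker,
        escape = raw["escape"]?.let { oneChar("escape", it) } ?: defaults.escape,
        quote = raw["quote"]?.let { oneChar("quote", it) } ?: defaults.quote,
        delimiter = raw["delimiter"]?.let { oneChar("delimiter", it) } ?: defaults.delimiter
    )
}
```

Under these assumptions, `parseCustomizedOptions(mapOf("delimiter" to " ", "header" to true))` yields a space delimiter with a header row, while a tab as `line_breaker` or a two-character `quote` is rejected with an `IllegalArgumentException`.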
Out of curiosity, is there a reason for this set of customized CSV parsing options? I saw there were some other options CSVFormat supports.
Which other options? I think these are all the options to customize the format of a CSV file.
From the `CSVFormat` constructor and `CSVFormat.Builder` (documentation here), there are about 25 total configuration options like `nullString`, `recordSeparator`, `quoteMode`, `skipHeaderRecord`, etc.
I think the other options are less important for configuring a CSV format. If we find that we need any of them in the future, we can create a ticket and make the corresponding enhancement.
Will create a new issue to add tests to read CSV files in other formats. Other comments were already applied.
Issue #366
Description of changes: