Add support for trailing text after the closing quote, and EOF without a final closing quote, for Excel compatibility. Fix a unit test and add a RAT exclude for the sample CSV file. #303
if quoted, has to have a closing quote before the file ends.
Codecov Report
@@ Coverage Diff @@
## master #303 +/- ##
=========================================
Coverage 97.91% 97.91%
Complexity 553 553
=========================================
Files 11 11
Lines 1200 1200
Branches 206 206
=========================================
Hits 1175 1175
Misses 13 13
Partials 12 12
"The test was failing because the test was wrong" This is why I kept asking (twice) for file-based tests... please add file-based tests that cover use cases for these two new options beyond the one you changed. Let's make sure there are actual files that cover what Excel allows.
TY!
Bumps [org.apache.commons:commons-csv](https://github.com/apache/commons-csv) from 1.10.0 to 1.11.0. <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/apache/commons-csv/blob/master/RELEASE-NOTES.txt">org.apache.commons:commons-csv's changelog</a>.</em></p> <blockquote> <p>Apache Commons CSV Version 1.11.0 Release Notes</p> <p>This document contains the release notes for the 1.11.0 version of Apache Commons CSV. Commons CSV reads and writes files in variations of the Comma Separated Value (CSV) format.</p> <p>Commons CSV requires at least Java 8.</p> <p>The Apache Commons CSV library provides a simple interface for reading and writing CSV files of various types.</p> <p>Feature and bug fix release (Java 8 or above)</p> <p>Changes in this version include:</p> <h2>New Features</h2> <ul> <li>CSV-308: [Javadoc] Add example to CSVFormat#setHeaderComments() <a href="https://redirect.github.com/apache/commons-csv/issues/344">#344</a>. Thanks to Buddhi De Silva, Gary Gregory.</li> <li> <pre><code> Add and use CSVFormat#setTrailingData(boolean) in CSVFormat.EXCEL for Excel compatibility [#303](apache/commons-csv#303). Thanks to DamjanJovanovic, Gary Gregory. </code></pre> </li> <li> <pre><code> Add and use CSVFormat#setLenientEof(boolean) in CSVFormat.EXCEL for Excel compatibility [#303](apache/commons-csv#303). Thanks to DamjanJovanovic, Gary Gregory. </code></pre> </li> </ul> <h2>Fixed Bugs</h2> <ul> <li>CSV-306: Replace deprecated method in user guide, update external link <a href="https://redirect.github.com/apache/commons-csv/issues/324">#324</a>, <a href="https://redirect.github.com/apache/commons-csv/issues/325">#325</a>. Thanks to Sam Ng, Bruno P. Kinoshita.</li> <li> <pre><code> Document duplicate header behavior [#309](apache/commons-csv#309). Thanks to Seth Falco, Bruno P. Kinoshita. </code></pre> </li> <li> <pre><code> Add missing docs [#328](apache/commons-csv#328). Thanks to jkbkupczyk. 
</code></pre> </li> <li> <pre><code> [StepSecurity] CI: Harden GitHub Actions [#329](apache/commons-csv#329), [#330](apache/commons-csv#330). Thanks to step-security-bot. </code></pre> </li> <li>CSV-147: Better error message during faulty CSV record read <a href="https://redirect.github.com/apache/commons-csv/issues/347">#347</a>. Thanks to Steven Peterson, Benedikt Ritter, Gary Gregory, Joerg Schaible, Buddhi De Silva, Elliotte Rusty Harold.</li> <li>CSV-310: Misleading error message when QuoteMode set to None <a href="https://redirect.github.com/apache/commons-csv/issues/352">#352</a>. Thanks to Buddhi De Silva.</li> <li>CSV-311: OutOfMemory for very long rows despite using column value of type Reader. Thanks to Christian Feuersaenger, Gary Gregory.</li> <li> <pre><code> Use try-with-resources to manage JDBC Clob in CSVPrinter.printRecords(ResultSet). Thanks to Gary Gregory. </code></pre> </li> <li> <pre><code> JDBC Blob columns are now output as Base64 instead of Object#toString(), which usually is InputStream#toString(). Thanks to Gary Gregory. </code></pre> </li> <li> <pre><code> Support unusual Excel use cases: Add support for trailing data after the closing quote, and EOF without a final closing quote [#303](apache/commons-csv#303). Thanks to DamjanJovanovic, Gary Gregory. </code></pre> </li> <li> <pre><code> MongoDB CSV empty first column parsing fix [#412](apache/commons-csv#412). Thanks to Igor Kamyshnikov, Gary Gregory. </code></pre> </li> </ul> <h2>Changes</h2> <ul> <li> <pre><code> Bump commons-io:commons-io: from 2.11.0 to 2.16.1 [#408](apache/commons-csv#408), [#413](apache/commons-csv#413). Thanks to Gary Gregory. </code></pre> </li> <li> <pre><code> Bump commons-parent from 57 to 69 [#410](apache/commons-csv#410). Thanks to Gary Gregory, Dependabot. </code></pre> </li> <li> <pre><code> Bump h2 from 2.1.214 to 2.2.224 [#333](apache/commons-csv#333), [#349](apache/commons-csv#349), [#359](apache/commons-csv#359). Thanks to Dependabot. 
</code></pre> </li> <li> <pre><code> Bump commons-lang3 from 3.12.0 to 3.14.0. Thanks to Gary Gregory. </code></pre> </li> <li> <pre><code> Update exception message in CSVRecord#getNextRecord() [#348](apache/commons-csv#348). Thanks to Buddhi De Silva, Michael Osipov, Gary Gregory. </code></pre> </li> <li> <pre><code> Bump tests using com.opencsv:opencsv from 5.8 to 5.9 [#373](apache/commons-csv#373). Thanks to Dependabot. </code></pre> </li> </ul> <p>Historical list of changes: <a href="https://commons.apache.org/proper/commons-csv/changes-report.html">https://commons.apache.org/proper/commons-csv/changes-report.html</a></p> <p>For complete information on Apache Commons CSV, including instructions on how to submit bug reports,</p> </blockquote> <p>... (truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/apache/commons-csv/commit/74e12741b24e724bb2e60109daa0c834fd75a68a"><code>74e1274</code></a> Prepare for the next release candidate</li> <li><a href="https://github.com/apache/commons-csv/commit/89cbc7bb3f7f840045ee1fa17863830110e8aebe"><code>89cbc7b</code></a> Prepare for the next release candidate</li> <li><a href="https://github.com/apache/commons-csv/commit/447682ec4a4bba7ea3c4edf89a87c63ff5bf718e"><code>447682e</code></a> Match version to POM</li> <li><a href="https://github.com/apache/commons-csv/commit/4c186f27f7b340aa7d78dc68d380200bcb49bb46"><code>4c186f2</code></a> Merge pull request <a href="https://redirect.github.com/apache/commons-csv/issues/420">#420</a> from apache/dependabot/github_actions/actions/checkou...</li> <li><a href="https://github.com/apache/commons-csv/commit/8af37f7992e3e7fb37e0f2a5a9b02f27b9cb5e84"><code>8af37f7</code></a> Merge pull request <a href="https://redirect.github.com/apache/commons-csv/issues/418">#418</a> from apache/dependabot/github_actions/github/codeql-a...</li> <li><a 
href="https://github.com/apache/commons-csv/commit/2238314ef83214142a4b6304c3cc36a20749b953"><code>2238314</code></a> Merge pull request <a href="https://redirect.github.com/apache/commons-csv/issues/419">#419</a> from apache/dependabot/github_actions/actions/upload-...</li> <li><a href="https://github.com/apache/commons-csv/commit/2ccf6686364c9183a03ab52c944f63695abc2843"><code>2ccf668</code></a> Bump actions/checkout from 4.1.2 to 4.1.4</li> <li><a href="https://github.com/apache/commons-csv/commit/26cf90ecbffaf0243dd01cdf941d0c13fb875a88"><code>26cf90e</code></a> Bump actions/upload-artifact from 4.3.2 to 4.3.3</li> <li><a href="https://github.com/apache/commons-csv/commit/586310afbc7f93c356ede7602706b3a2a5a6b916"><code>586310a</code></a> Bump github/codeql-action from 3.25.1 to 3.25.3</li> <li><a href="https://github.com/apache/commons-csv/commit/bea505a55b6ab3c4eca27f395b9d6fa6787d496a"><code>bea505a</code></a> Merge pull request <a href="https://redirect.github.com/apache/commons-csv/issues/416">#416</a> from apache/dependabot/github_actions/actions/upload-...</li> <li>Additional commits viewable in <a href="https://github.com/apache/commons-csv/compare/rel/commons-csv-1.10.0...rel/commons-csv-1.11.0">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=org.apache.commons:commons-csv&package-manager=maven&previous-version=1.10.0&new-version=1.11.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `dependabot rebase`. 
[//]: # (dependabot-automerge-start) [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `dependabot rebase` will rebase this PR - `dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `dependabot merge` will merge this PR after your CI passes on it - `dependabot squash and merge` will squash and merge this PR after your CI passes on it - `dependabot cancel merge` will cancel a previously requested merge and block automerging - `dependabot reopen` will reopen this PR if it is closed - `dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) </details> Closes #1923 from dependabot[bot]/dependabot/maven/java/org.apache.commons-commons-csv-1.11.0. Authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Signed-off-by: William Hyun <william@apache.org>
Continued from PR 295.
The test was failing because the test was wrong, not because my patches were wrong.
The test should match Excel's interpretation of the CSV file. Excel fuses lines 3 and 4 together: the quoted last field on line 3 has no closing quote on that line, so the field continues into the next line. On line 4, the quoted portion ends at the first quote; Excel unquotes that portion and then appends everything up to the next comma, and all of this becomes field 2 of line 3. The remaining fields on line 4 are read as successive fields of line 3. Because the last field has no terminating quote and the file ends in a newline, that last field also ends in a newline.
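The reading described above can be sketched as a small stand-alone parser. This is only a toy model under stated assumptions: `LenientCsvSketch`, its `parse` method, and the sample input are hypothetical names and data made up for illustration, and the code is not the Commons CSV lexer or the patch in this PR. It assumes comma delimiters, `\n` record separators, and `""` as an escaped quote.

```java
import java.util.ArrayList;
import java.util.List;

public class LenientCsvSketch {

    /**
     * Toy lenient parser (hypothetical, not Commons CSV):
     * text after a closing quote is appended to the field, and
     * end of input is allowed to terminate an open quoted field.
     */
    public static List<List<String>> parse(String text) {
        List<List<String>> records = new ArrayList<>();
        List<String> record = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;     // currently inside an open quoted section
        boolean fieldStarted = false; // at least one character consumed for this field
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (inQuotes) {
                if (c != '"') {
                    field.append(c); // commas and newlines are literal inside quotes
                } else if (i + 1 < text.length() && text.charAt(i + 1) == '"') {
                    field.append('"'); // doubled quote is an escaped quote
                    i++;
                } else {
                    inQuotes = false; // closing quote; anything that follows is trailing text
                }
                fieldStarted = true;
            } else if (c == '"' && !fieldStarted) {
                inQuotes = true; // a quote only opens a quoted section at field start
                fieldStarted = true;
            } else if (c == ',' || c == '\n') {
                record.add(field.toString());
                field.setLength(0);
                fieldStarted = false;
                if (c == '\n') { // newline outside quotes ends the record
                    records.add(record);
                    record = new ArrayList<>();
                }
            } else {
                field.append(c); // plain text, including trailing text after a closing quote
                fieldStarted = true;
            }
        }
        // Lenient EOF: flush whatever is pending, even if a quote is still open.
        if (fieldStarted || !record.isEmpty()) {
            record.add(field.toString());
            records.add(record);
        }
        return records;
    }

    public static void main(String[] args) {
        // Hypothetical four-line input shaped like the case described in this PR.
        String sample = "a,b,c\n1,2,3\nx,\"y\nz\"w,p,\"q\n";
        List<List<String>> records = parse(sample);
        System.out.println(records.size());        // prints 3
        System.out.println(records.get(2).size()); // prints 4
    }
}
```

With this input, physical lines 3 and 4 fuse into one record whose second field is `y`, a newline, then `zw`, and whose last field is `q` followed by a newline, which is the shape the corrected test expects.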
Once these corrections are made, the test passes.
Also add a RAT exclude for the sample file; it was missed in commit 1269c13, and its absence breaks the build.