Quoted fields at end of row are silently dropped #19

Closed
nathankleyn opened this issue Jul 13, 2018 · 1 comment

@nathankleyn
Contributor

Many thanks for such a great piece of software! We chose your library for some big-data processing because we found it had vastly better performance than anything else. Huge thanks!

We ran into a slightly obscure bug while parsing some huge CSVs: a quoted but empty field at the end of a row gets silently dropped. Here's an example:

"foo",""

If you run this test you'll see it fails:

@Test
public void handlesEmptyQuotedFieldsAtEndOfRow() throws IOException {
  // Expected value first, then the actual field read from the parsed row.
  assertEquals("", readCsvRow("foo,\"\"").getField(1));
}

We ran into this because we receive CSVs that have all fields quoted, even empty ones, and couldn't work out why accessing the final field would sometimes lead to an ArrayIndexOutOfBoundsException.
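Until a fix is available, a caller-side guard can avoid the exception. The following is only a sketch: `SafeFieldAccess` and `fieldOrEmpty` are hypothetical names, and the row is modelled as a plain `List<String>` rather than any FastCSV type. Note that it masks the dropped field rather than fixing it, so it is a stopgap at best.

```java
import java.util.Arrays;
import java.util.List;

public class SafeFieldAccess {
    // Hypothetical helper: returns the field at the given index, or an empty
    // string when the row has fewer fields than expected.
    public static String fieldOrEmpty(List<String> fields, int index) {
        return (index >= 0 && index < fields.size()) ? fields.get(index) : "";
    }

    public static void main(String[] args) {
        // Simulates a row where the trailing empty quoted field was dropped.
        List<String> truncatedRow = Arrays.asList("foo");
        System.out.println("[" + fieldOrEmpty(truncatedRow, 1) + "]");
    }
}
```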

I've had an attempt at a fix for this, which I'll raise a PR for momentarily. I've done my best to stick to the performance-sensitive methods you are using, but I'm eager for feedback if I've done anything not to your liking!

Please let me know if we can help in any other way!

nathankleyn added a commit to nathankleyn/FastCSV that referenced this issue Jul 13, 2018
If a field at the end of a row was quoted but empty, it was silently
dropped. For example, take this CSV:

```
"foo",""
```

This would only end up having 1 field in the resulting row instead of
the expected 2. This PR fixes this bug.
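
To make the expected field count concrete, here is a minimal single-row parser sketch. It is not FastCSV's implementation and not the fix from the PR; `TinyCsvRow` and `parseRow` are hypothetical names used for illustration only. The point it shows is that the final field is emitted unconditionally at end of input, so a trailing `""` still counts as a field. A parser that emitted the last field only when its buffer was non-empty would show the symptom described above, though whether that was the actual cause is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

public class TinyCsvRow {
    // Illustrative parser for one comma-separated row with double-quote quoting.
    public static List<String> parseRow(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // doubled quote is an escaped quote
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        // Emit the last field even when it is empty, so "foo","" yields 2 fields.
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(parseRow("\"foo\",\"\"").size());
    }
}
```

One consequence of this design choice is that an entirely empty line parses as a row with one empty field; a real parser would need to decide separately how to treat blank lines.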
@nathankleyn
Contributor Author

Have raised a PR with (hopefully) an okay fix for this in #20. 🤞

@osiegmar osiegmar self-assigned this Jul 14, 2018
@osiegmar osiegmar added the bug label Jul 14, 2018
osiegmar added a commit that referenced this issue Jul 19, 2018
…w-silently-dropped

Quoted fields at end of a row are silently dropped (fixes #19).
dhoard pushed a commit to dhoard/FastCSV that referenced this issue Nov 16, 2018
If a field at the end of a row was quoted but empty, it was silently
dropped. For example, take this CSV:

```
"foo",""
```

This would only end up having 1 field in the resulting row instead of
the expected 2. This PR fixes this bug.
dhoard pushed a commit to dhoard/FastCSV that referenced this issue Nov 16, 2018
…nd-of-row-silently-dropped

Quoted fields at end of a row are silently dropped (fixes osiegmar#19).