ORC-422 - Fix issue with Predicate push down when lower/upper bounds are set #348

Closed

Conversation

moresandeep
Contributor

This PR fixes a bug where predicate pushdown returns incorrect results when upper and lower bounds are set in StringStatisticsImpl.
The patch also removes some unused imports as cleanup.

@moresandeep
Contributor Author

Hello @omalley, I am updating this PR with a work-in-progress patch for you to look at. It still has a number of commented-out lines, which I'll remove when I clean up the code for the final patch. Apologies for the rough code; I don't want to delay this patch further.

The crux of the patch is in RecordReaderImpl, at lines 325 and 335. For the corner cases where the predicate literal equals the upper bound or the lower bound, I pick the location based on whether that bound is set.
For example:

    /* The min value is truncated when the lower bound is set, so compare == 0
       means the predicate string is BEFORE the min value. */
    else if (minCompare == 0 && isLowerBoundSet) {
      return Location.BEFORE;
    }

and

    /* If the upper bound is set, the location here is AFTER. */
    else if (maxCompare == 0 && isUpperBoundSet) {
      return Location.AFTER;
    }
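
To show how the two branches fit together, here is a minimal, self-contained sketch of the comparison. It is not the actual RecordReaderImpl code: the class name, the `compareToRange` signature, and the `Location` enum values other than BEFORE and AFTER are assumptions made for illustration, and the bounds are plain strings.

```java
// Illustrative sketch only; names and signature are assumptions, not the ORC source.
final class TruncatedBoundsSketch {

  // Where the predicate literal falls relative to the column's [min, max] range.
  enum Location { BEFORE, MIN, MIDDLE, MAX, AFTER }

  // isLowerBoundSet / isUpperBoundSet indicate that min / max are truncated
  // bounds rather than exact values taken from the data.
  static Location compareToRange(String point, String min, String max,
                                 boolean isLowerBoundSet, boolean isUpperBoundSet) {
    int minCompare = point.compareTo(min);
    if (minCompare < 0) {
      return Location.BEFORE;
    } else if (minCompare == 0 && isLowerBoundSet) {
      // Equal to a truncated lower bound: treated as BEFORE the real minimum.
      return Location.BEFORE;
    } else if (minCompare == 0) {
      return Location.MIN;
    }
    int maxCompare = point.compareTo(max);
    if (maxCompare > 0) {
      return Location.AFTER;
    } else if (maxCompare == 0 && isUpperBoundSet) {
      // Equal to a truncated upper bound: treated as AFTER the real maximum.
      return Location.AFTER;
    } else if (maxCompare == 0) {
      return Location.MAX;
    }
    return Location.MIDDLE;
  }

  public static void main(String[] args) {
    System.out.println(compareToRange("abc", "abc", "xyz", true, false));
    System.out.println(compareToRange("xyz", "abc", "xyz", false, true));
  }
}
```

With exact statistics an equal literal lands on MIN or MAX; once a bound is flagged as set, the same comparison result is reported as BEFORE or AFTER.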

I also tried to write unit tests covering all the corner cases I could think of. I could not find a way to test greater-than, since there is no GREATER_THAN operator; the code suggests using the NOT operator. Is there an example I can look at for something like NOT LESS_THAN?
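
As a rough illustration of the NOT approach: "greater than" is the negation of "less than or equals", while NOT LESS_THAN gives "greater than or equals". The sketch below models this against a min/max range with a simplified three-valued result. The `Truth` enum and both method names are hypothetical, and nulls and truncated bounds are ignored. In the SearchArgument builder the corresponding shape would, as far as I know, be a `lessThanEquals(...)` leaf wrapped between `startNot()` and `end()`.

```java
// Illustrative sketch only; Truth and the method names are hypothetical.
final class NotOperatorSketch {

  // Simplified outcome of checking a predicate against column statistics.
  enum Truth {
    YES, NO, MAYBE;

    // Negation swaps YES and NO; an undecided result stays undecided.
    Truth not() {
      switch (this) {
        case YES: return NO;
        case NO: return YES;
        default: return MAYBE;
      }
    }
  }

  // Evaluates "x <= literal" for a column whose values lie in [min, max].
  static Truth lessThanEquals(String min, String max, String literal) {
    if (max.compareTo(literal) <= 0) {
      return Truth.YES;   // every value is <= literal
    }
    if (min.compareTo(literal) > 0) {
      return Truth.NO;    // every value is > literal
    }
    return Truth.MAYBE;   // the range straddles the literal
  }

  // "x > literal" expressed as NOT (x <= literal).
  static Truth greaterThan(String min, String max, String literal) {
    return lessThanEquals(min, max, literal).not();
  }

  public static void main(String[] args) {
    System.out.println(greaterThan("b", "d", "a"));
    System.out.println(greaterThan("b", "d", "c"));
  }
}
```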

@moresandeep
Contributor Author

Hello @omalley, can you take a look at the new changes, which include the fixes?

@omalley omalley closed this in 6e825ee Apr 4, 2019
@omalley
Contributor

omalley commented Apr 4, 2019

I just committed this. Thanks, Sandeep!

williamhyun pushed a commit that referenced this pull request May 8, 2024
Bumps [org.apache.commons:commons-csv](https://github.com/apache/commons-csv) from 1.10.0 to 1.11.0.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a href="https://github.com/apache/commons-csv/blob/master/RELEASE-NOTES.txt">org.apache.commons:commons-csv's changelog</a>.</em></p>
<blockquote>
<p>Apache Commons CSV Version 1.11.0
Release Notes</p>
<p>This document contains the release notes for the 1.11.0 version of Apache Commons CSV.
Commons CSV reads and writes files in variations of the Comma Separated Value (CSV) format.</p>
<p>Commons CSV requires at least Java 8.</p>
<p>The Apache Commons CSV library provides a simple interface for reading and writing CSV files of various types.</p>
<p>Feature and bug fix release (Java 8 or above)</p>
<p>Changes in this version include:</p>
<h2>New Features</h2>
<ul>
<li>CSV-308:  [Javadoc] Add example to CSVFormat#setHeaderComments() <a href="https://redirect.github.com/apache/commons-csv/issues/344">#344</a>. Thanks to Buddhi De Silva, Gary Gregory.</li>
<li>
<pre><code>      Add and use CSVFormat#setTrailingData(boolean) in CSVFormat.EXCEL for Excel compatibility [#303](apache/commons-csv#303). Thanks to DamjanJovanovic, Gary Gregory.
</code></pre>
</li>
<li>
<pre><code>      Add and use CSVFormat#setLenientEof(boolean) in CSVFormat.EXCEL for Excel compatibility [#303](apache/commons-csv#303). Thanks to DamjanJovanovic, Gary Gregory.
</code></pre>
</li>
</ul>
<h2>Fixed Bugs</h2>
<ul>
<li>CSV-306:  Replace deprecated method in user guide, update external link <a href="https://redirect.github.com/apache/commons-csv/issues/324">#324</a>, <a href="https://redirect.github.com/apache/commons-csv/issues/325">#325</a>. Thanks to Sam Ng, Bruno P. Kinoshita.</li>
<li>
<pre><code>      Document duplicate header behavior [#309](apache/commons-csv#309). Thanks to Seth Falco, Bruno P. Kinoshita.
</code></pre>
</li>
<li>
<pre><code>      Add missing docs [#328](apache/commons-csv#328). Thanks to jkbkupczyk.
</code></pre>
</li>
<li>
<pre><code>      [StepSecurity] CI: Harden GitHub Actions [#329](apache/commons-csv#329), [#330](apache/commons-csv#330). Thanks to step-security-bot.
</code></pre>
</li>
<li>CSV-147:  Better error message during faulty CSV record read <a href="https://redirect.github.com/apache/commons-csv/issues/347">#347</a>. Thanks to Steven Peterson, Benedikt Ritter, Gary Gregory, Joerg Schaible, Buddhi De Silva, Elliotte Rusty Harold.</li>
<li>CSV-310:  Misleading error message when QuoteMode set to None <a href="https://redirect.github.com/apache/commons-csv/issues/352">#352</a>. Thanks to Buddhi De Silva.</li>
<li>CSV-311:  OutOfMemory for very long rows despite using column value of type Reader. Thanks to Christian Feuersaenger, Gary Gregory.</li>
<li>
<pre><code>      Use try-with-resources to manage JDBC Clob in CSVPrinter.printRecords(ResultSet). Thanks to Gary Gregory.
</code></pre>
</li>
<li>
<pre><code>      JDBC Blob columns are now output as Base64 instead of Object#toString(), which usually is InputStream#toString(). Thanks to Gary Gregory.
</code></pre>
</li>
<li>
<pre><code>      Support unusual Excel use cases: Add support for trailing data after the closing quote, and EOF without a final closing quote [#303](apache/commons-csv#303). Thanks to DamjanJovanovic, Gary Gregory.
</code></pre>
</li>
<li>
<pre><code>      MongoDB CSV empty first column parsing fix [#412](apache/commons-csv#412). Thanks to Igor Kamyshnikov, Gary Gregory.
</code></pre>
</li>
</ul>
<h2>Changes</h2>
<ul>
<li>
<pre><code>      Bump commons-io:commons-io: from 2.11.0 to 2.16.1 [#408](apache/commons-csv#408), [#413](apache/commons-csv#413). Thanks to Gary Gregory.
</code></pre>
</li>
<li>
<pre><code>      Bump commons-parent from 57 to 69 [#410](apache/commons-csv#410). Thanks to Gary Gregory, Dependabot.
</code></pre>
</li>
<li>
<pre><code>      Bump h2 from 2.1.214 to 2.2.224 [#333](apache/commons-csv#333), [#349](apache/commons-csv#349), [#359](apache/commons-csv#359). Thanks to Dependabot.
</code></pre>
</li>
<li>
<pre><code>      Bump commons-lang3 from 3.12.0 to 3.14.0. Thanks to Gary Gregory.
</code></pre>
</li>
<li>
<pre><code>      Update exception message in CSVRecord#getNextRecord() [#348](apache/commons-csv#348). Thanks to Buddhi De Silva, Michael Osipov, Gary Gregory.
</code></pre>
</li>
<li>
<pre><code>      Bump tests using com.opencsv:opencsv from 5.8 to 5.9 [#373](apache/commons-csv#373). Thanks to Dependabot.
</code></pre>
</li>
</ul>
<p>Historical list of changes: <a href="https://commons.apache.org/proper/commons-csv/changes-report.html">https://commons.apache.org/proper/commons-csv/changes-report.html</a></p>
<p>For complete information on Apache Commons CSV, including instructions on how to submit bug reports,</p>
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="https://github.com/apache/commons-csv/commit/74e12741b24e724bb2e60109daa0c834fd75a68a"><code>74e1274</code></a> Prepare for the next release candidate</li>
<li><a href="https://github.com/apache/commons-csv/commit/89cbc7bb3f7f840045ee1fa17863830110e8aebe"><code>89cbc7b</code></a> Prepare for the next release candidate</li>
<li><a href="https://github.com/apache/commons-csv/commit/447682ec4a4bba7ea3c4edf89a87c63ff5bf718e"><code>447682e</code></a> Match version to POM</li>
<li><a href="https://github.com/apache/commons-csv/commit/4c186f27f7b340aa7d78dc68d380200bcb49bb46"><code>4c186f2</code></a> Merge pull request <a href="https://redirect.github.com/apache/commons-csv/issues/420">#420</a> from apache/dependabot/github_actions/actions/checkou...</li>
<li><a href="https://github.com/apache/commons-csv/commit/8af37f7992e3e7fb37e0f2a5a9b02f27b9cb5e84"><code>8af37f7</code></a> Merge pull request <a href="https://redirect.github.com/apache/commons-csv/issues/418">#418</a> from apache/dependabot/github_actions/github/codeql-a...</li>
<li><a href="https://github.com/apache/commons-csv/commit/2238314ef83214142a4b6304c3cc36a20749b953"><code>2238314</code></a> Merge pull request <a href="https://redirect.github.com/apache/commons-csv/issues/419">#419</a> from apache/dependabot/github_actions/actions/upload-...</li>
<li><a href="https://github.com/apache/commons-csv/commit/2ccf6686364c9183a03ab52c944f63695abc2843"><code>2ccf668</code></a> Bump actions/checkout from 4.1.2 to 4.1.4</li>
<li><a href="https://github.com/apache/commons-csv/commit/26cf90ecbffaf0243dd01cdf941d0c13fb875a88"><code>26cf90e</code></a> Bump actions/upload-artifact from 4.3.2 to 4.3.3</li>
<li><a href="https://github.com/apache/commons-csv/commit/586310afbc7f93c356ede7602706b3a2a5a6b916"><code>586310a</code></a> Bump github/codeql-action from 3.25.1 to 3.25.3</li>
<li><a href="https://github.com/apache/commons-csv/commit/bea505a55b6ab3c4eca27f395b9d6fa6787d496a"><code>bea505a</code></a> Merge pull request <a href="https://redirect.github.com/apache/commons-csv/issues/416">#416</a> from apache/dependabot/github_actions/actions/upload-...</li>
<li>Additional commits viewable in <a href="https://github.com/apache/commons-csv/compare/rel/commons-csv-1.10.0...rel/commons-csv-1.11.0">compare view</a></li>
</ul>
</details>
<br />

[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=org.apache.commons:commons-csv&package-manager=maven&previous-version=1.10.0&new-version=1.11.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)


Closes #1923 from dependabot[bot]/dependabot/maven/java/org.apache.commons-commons-csv-1.11.0.

Authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: William Hyun <william@apache.org>
dongjoon-hyun pushed a commit that referenced this pull request May 8, 2024
zratkai pushed a commit to zratkai/orc that referenced this pull request Jun 20, 2024
…per bounds are set

Fixes apache#348

Signed-off-by: Owen O'Malley <omalley@apache.org>
(cherry picked from commit 6e825ee)
Change-Id: I672962e3f5c652795a9f4df62591349578fd39ad