Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Inverse_binomial_cdf Presto function #5382

Closed
wants to merge 3 commits into from

Conversation

rrando901
Copy link
Contributor

Add Presto function Inverse_binomial_cdf following the Java implementation(source)

Resolves: #5251

@netlify
Copy link

netlify bot commented Jun 24, 2023

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 867a530
🔍 Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/649ca4f76f132b00085e9970

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jun 24, 2023
Formatting Corrections
@rrando901
Copy link
Contributor Author

Fuzzer test is hanging. I’ve isolated the cause of the hang to the line of code below. I'm not sure if anyone else has experienced anything like this but I will continue to debug.

Line of code (Probability.h)

    VELOX_USER_CHECK(
        (p >= 0) && (p <= 1) && (p != kInf),
        "p must be in the interval [0, 1]");
    VELOX_USER_CHECK(
        (successProb >= 0) && (successProb <= 1) && (successProb != kInf),
        "successProbability must be in the interval [0, 1]");
    VELOX_USER_CHECK(
        (numOfTrials > 0) && (numOfTrials != kInf),
        "numberOfTrials must be greater than 0");

    boost::math::binomial_distribution<> dist(numOfTrials, successProb);
    result = boost::math::quantile(dist, p);  <<<<----- This line is causing the hang.
  }

Fuzzer output snippet

I20230623 18:41:04.466511 485261 ExpressionVerifier.cpp:40]     At 97: {3979553641363177119, null, null}
I20230623 18:41:04.466526 485261 ExpressionVerifier.cpp:40]     At 98: {2211836998435488860, 0.957719502504915, [98->33] 0.9349962219130248}
I20230623 18:41:04.466545 485261 ExpressionVerifier.cpp:40]     At 99: {6491116015083892777, 0.17358783772215247, [99->53] 0.03657975886017084}
I20230623 18:41:04.466571 485261 ExpressionVerifier.cpp:133] Starting common eval execution.  << hanging here

@rrando901
Copy link
Contributor Author

I've tracked down the cause of the hang in the fuzzer test to boost::math::detail::round_to_floor

When fuzzer passes my function the inputs numberOfTrials = 9079765771874083830, successProb = 0.561815, p = 0.0365346, we are getting stuck in the while loop in round_to_floor() below.

boost/math/distributions/detail/inv_discrete_quantile.hpp

inline typename Dist::value_type round_to_floor(const Dist& d, typename Dist::value_type result, typename Dist::value_type p, bool c)
{
   ...
   ...
   while(result != 0)
   {
      cc = result - 1;
      if(cc < support(d).first)
         break;
      pp = c ? cdf(complement(d, cc)) : cdf(d, cc);
      if(pp == p)
         result = cc;
      else if(c ? pp > p : pp < p)
         break;
      result -= 1;
   }
   return result;
}

@rrando901
Copy link
Contributor Author

The failure in the fuzzer test seems to be caused by a bug in boost::math::detail::round_to_floor.
It is possible for us to get stuck in the while loop below on certain inputs. This results in the fuzzer test hanging and eventually timing out the linux-build test.

boost/math/distributions/detail/inv_discrete_quantile.hpp

299 inline typename Dist::value_type round_to_floor(const Dist& d, typename Dist::value_type result, typename Dist::value_type p, bool c)
300 {
301    BOOST_MATH_STD_USING
302    typename Dist::value_type cc = ceil(result);
303    typename Dist::value_type pp = cc <= support(d).second ? c ? cdf(complement(d, cc)) : cdf(d, cc) : 1;
304    if(pp == p)
305       result = cc;
306    else
307       result = floor(result);
308    //
309    // Now find the smallest integer <= result for which we get an exact root:
310    //
311    while(result != 0) <<< WE ARE GETTING STUCK IN HERE
312    {
313       cc = result - 1;
314       if(cc < support(d).first)
315          break;
316       pp = c ? cdf(complement(d, cc)) : cdf(d, cc);
317       if(pp == p)
318          result = cc;
319       else if(c ? pp > p : pp < p)
320          break;
321       result -= 1;
322    }
323    return result;
324 }

Reproduction steps

boost::math::binomial_distribution<> dist(9079765771874083830, 0.561815);
boost::math::detail::round_to_floor(dist, 5.10115e+18, 0.0365346, false);

@aditi-pandit
Copy link
Collaborator

@mbasmanova, @Yuhta, @amitkdutta : Any suggestions ?

@Yuhta
Copy link
Contributor

Yuhta commented Jul 26, 2023

@rrando901 @aditi-pandit The bug in boost::math::detail::round_to_floor looks quite severe and I will write an alternative to replace that. Before that please hold on merging any inverse function implementation (e.g. anything calling boost::math::quantile).

@Yuhta
Copy link
Contributor

Yuhta commented Jul 31, 2023

Fixing it in boost: boostorg/math#1007.

In the meantime, we can add this line before any includes of boost/math to work around this:

#define BOOST_MATH_DISCRETE_QUANTILE_POLICY real

@stale
Copy link

stale bot commented Oct 29, 2023

This pull request has been automatically marked as stale because it has not had recent activity. If you'd still like this PR merged, please comment on the PR, make sure you've addressed reviewer comments, and rebase on the latest main. Thank you for your contributions!

@stale stale bot added the stale label Oct 29, 2023
@stale stale bot closed this Nov 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement inverse_binomial_cdf prestosql function
4 participants