Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug: Decimal processing is incorrect when using external. #11823

Closed
Tracked by #7405
liurenjie1024 opened this issue Aug 22, 2023 · 3 comments · Fixed by #11839
Closed
Tracked by #7405

bug: Decimal processing is incorrect when using external. #11823

liurenjie1024 opened this issue Aug 22, 2023 · 3 comments · Fixed by #11839
Assignees
Labels
priority/high type/bug Something isn't working
Milestone

Comments

@liurenjie1024
Copy link
Contributor

Describe the bug

When talking to udf server, risingwave can't handle decimal type correctly.

Error message/log

2023-08-22T11:22:46.066832+08:00 DEBUG local_execute{query_id="4f6c7fb2-0634-45da-b9e3-7ad24ad15c6d" epoch=BatchQueryEpoch { epoch: Some(Current(4943994262061056)) }}: risingwave_frontend::scheduler::local: Local execution mode converts a plan with two stages
thread 'frontend-compute-threads' panicked at 'failed to build record batch: InvalidArgumentError("column types must match schema types, expected Decimal128(28, 0) but found Decimal128(38, 2) at column index 0")', src/expr/src/expr/expr_udf.rs:88:18
stack backtrace:
   0: rust_begin_unwind
             at /rustc/f0411ffcebcd7f75ac02ed45feb53ffd07b75398/library/std/src/panicking.rs:593:5
   1: core::panicking::panic_fmt
             at /rustc/f0411ffcebcd7f75ac02ed45feb53ffd07b75398/library/core/src/panicking.rs:67:14
   2: core::result::unwrap_failed
             at /rustc/f0411ffcebcd7f75ac02ed45feb53ffd07b75398/library/core/src/result.rs:1651:5
   3: core::result::Result<T,E>::expect
             at /rustc/f0411ffcebcd7f75ac02ed45feb53ffd07b75398/library/core/src/result.rs:1033:23
   4: risingwave_expr::expr::expr_udf::UdfExpression::eval_inner::{{closure}}
             at ./src/expr/src/expr/expr_udf.rs:87:13
   5: <risingwave_expr::expr::expr_udf::UdfExpression as risingwave_expr::expr::Expression>::eval::{{closure}}
             at ./src/expr/src/expr/expr_udf.rs:56:39
   6: <core::pin::Pin<P> as core::future::future::Future>::poll
             at /rustc/f0411ffcebcd7f75ac02ed45feb53ffd07b75398/library/core/src/future/future.rs:125:9
   7: <futures_util::future::maybe_done::MaybeDone<Fut> as core::future::future::Future>::poll
             at /Users/renjieliu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.28/src/future/maybe_done.rs:95:38
   8: <futures_util::future::join_all::JoinAll<F> as core::future::future::Future>::poll
             at /Users/renjieliu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.28/src/future/join_all.rs:142:24
   9: risingwave_batch::executor::project::ProjectExecutor::do_execute::{{closure}}::{{closure}}
             at ./src/batch/src/executor/project.rs:64:30
  10: <futures_util::stream::futures_ordered::OrderWrapper<T> as core::future::future::Future>::poll
             at /Users/renjieliu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.28/src/stream/futures_ordered.rs:55:9
  11: <futures_util::stream::futures_unordered::FuturesUnordered<Fut> as futures_core::stream::Stream>::poll_next
             at /Users/renjieliu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.28/src/stream/futures_unordered/mod.rs:518:17
  12: futures_util::stream::stream::StreamExt::poll_next_unpin
             at /Users/renjieliu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.28/src/stream/stream/mod.rs:1632:9
  13: <futures_util::stream::futures_ordered::FuturesOrdered<Fut> as futures_core::stream::Stream>::poll_next
             at /Users/renjieliu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.28/src/stream/futures_ordered.rs:190:26
  14: futures_util::stream::stream::StreamExt::poll_next_unpin
             at /Users/renjieliu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.28/src/stream/stream/mod.rs:1632:9
  15: <futures_util::stream::stream::buffered::Buffered<St> as futures_core::stream::Stream>::poll_next
             at /Users/renjieliu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.28/src/stream/stream/buffered.rs:73:19
  16: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next
             at /Users/renjieliu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.28/src/stream.rs:120:9
  17: futures_util::stream::stream::StreamExt::poll_next_unpin
             at /Users/renjieliu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.28/src/stream/stream/mod.rs:1632:9
  18: <futures_util::stream::stream::next::Next<St> as core::future::future::Future>::poll
             at /Users/renjieliu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.28/src/stream/stream/next.rs:32:9
  19: <tracing::instrument::Instrumented<T> as core::future::future::Future>::poll
             at /Users/renjieliu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tracing-0.1.37/src/instrument.rs:272:9
  20: futures_util::future::future::FutureExt::poll_unpin
             at /Users/renjieliu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.28/src/future/future/mod.rs:562:9
  21: <futures_util::future::select::Select<A,B> as core::future::future::Future>::poll
             at /Users/renjieliu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.28/src/future/select.rs:118:35
  22: <risingwave_batch::executor::managed::ManagedExecutor as risingwave_batch::executor::Executor>::execute::{{closure}}
             at ./src/batch/src/executor/managed.rs:61:82
  23: <futures_async_stream::try_stream::GenTryStream<G> as futures_core::stream::Stream>::poll_next
             at /Users/renjieliu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-async-stream-0.2.7/src/lib.rs:493:33
  24: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next
             at /Users/renjieliu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.28/src/stream.rs:120:9
  25: risingwave_frontend::scheduler::local::LocalQueryExecution::run_inner::{{closure}}
             at ./src/frontend/src/scheduler/local.rs:97:5
  26: <futures_async_stream::try_stream::GenTryStream<G> as futures_core::stream::Stream>::poll_next
             at /Users/renjieliu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-async-stream-0.2.7/src/lib.rs:493:33
  27: <tracing_futures::Instrumented<T> as futures_core::stream::Stream>::poll_next
             at /Users/renjieliu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tracing-futures-0.2.5/src/lib.rs:342:9
  28: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next
             at /Users/renjieliu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.28/src/stream.rs:120:9
  29: <futures_util::stream::stream::map::Map<St,F> as futures_core::stream::Stream>::poll_next
             at /Users/renjieliu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.28/src/stream/stream/map.rs:58:26
  30: futures_util::stream::stream::StreamExt::poll_next_unpin
             at /Users/renjieliu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.28/src/stream/stream/mod.rs:1632:9
  31: <futures_util::stream::stream::next::Next<St> as core::future::future::Future>::poll
             at /Users/renjieliu/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.28/src/stream/stream/next.rs:32:9
  32: risingwave_frontend::scheduler::local::LocalQueryExecution::stream_rows::{{closure}}
             at ./src/frontend/src/scheduler/local.rs:144:56
  33: <tracing::instrument::Instrumented<T> as core::future::future::Future>::poll

To Reproduce

First adds a function in udf server to process decimal:

@udf(input_types=["DECIMAL", "DECIMAL"], result_type="DECIMAL")
def add_decimal(a: Decimal, b: Decimal) -> Decimal:
    return a + b

Then register it using sql:

statement ok
create function add_decimal(decimal, decimal) returns decimal as add_decimal using link 'http://localhost:8815';


query TTTTT rowsort
show functions
----
add_decimal numeric, numeric numeric (empty) http://localhost:8815

query T
select add_decimal(1.11, 2.22);
----
3.33

Expected behavior

It should works as expected.

How did you deploy RisingWave?

No response

The version of RisingWave

No response

Additional context

No response

@liurenjie1024 liurenjie1024 added the type/bug Something isn't working label Aug 22, 2023
@github-actions github-actions bot added this to the release-1.2 milestone Aug 22, 2023
@wangrunji0408
Copy link
Contributor

The problem is that decimal without any precision or scale is unconstrained numeric in RisingWave/PostgreSQL, but there's no equivalent in Arrow. In Arrow, decimal values are stored as 128-bits mantissas in Decimal128Array. Therefore, it requires explicit precision and scale in decimal type in order to interpret values.

A possible solution is to pick a special data type in Arrow to represent unconstrained numeric, such as decimal(0, 0). Then we dynamically infer a decimal type with precision and scale from a list of values.

@liurenjie1024
Copy link
Contributor Author

As far as I know, arrow doesn't have such a special data type. How about choosing a fixed decimal data type such as decimal(38, 10) and throwing error or do conversion implicitly when sending to/converting from arrow?
cc @neverchanje

@xiangjinwu
Copy link
Contributor

As far as I know, arrow doesn't have such a special data type. How about choosing a fixed decimal data type such as decimal(38, 10) and throwing error or do conversion implicitly when sending to/converting from arrow? cc @neverchanje

From earlier discussion #10903 (comment)

Maybe allow sinking into any decimal(p, s), with rounding when not enough scale, and runtime overflow error / skip when not enough precision.
Example: we may have 1e28 and 1e-28 in the same column, which is impossible in iceberg

  • When the iceberg column is decimal(38, <=9), it can hold 1e28 but need to round 1e-28 to 0
  • When the iceberg column is decimal(38, >=28), it can hold 1e-28 losslessly but 1e28 would be overflow error
  • For scale 10..=27, it would overflow for 1e28 and round 1e-28 to 0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority/high type/bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants