Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sum(null) can't work on remote/distributed table #16689

Closed
weiqxu opened this issue Nov 5, 2020 · 5 comments
Closed

sum(null) can't work on remote/distributed table #16689

weiqxu opened this issue Nov 5, 2020 · 5 comments
Labels
st-accepted The issue is in our backlog, ready to take unexpected behaviour

Comments

@weiqxu
Copy link
Contributor

weiqxu commented Nov 5, 2020

Describe the bug
sum(null) can work on a local table but can't work on the remote/distributed table. I will throw a Exception
"Code: 63, e.displayText() = DB::Exception: Unknown aggregate function nothing: while receiving packet from"

How to reproduce
version v20.8.4.11-lts .

on local table , it worl OK:
select sum(null) from system.one; // retrun null

on remote/distributed table, throw exception:
select sum(null) from remote('xxx', 'system', 'one') ; // throw Exception
(by the way, if using remote, it need more at least 2 node in the node list to reproduece)

Expected behavior
A clear and concise description of what you expected to happen.

Error message and/or stacktrace
0. Poco::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const&, int) @ 0x124b504c in /home/clickhouse/software/clickhouse-20.8.4.11-lts

  1. DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const&, int) @ 0x8de29e9 in /home/clickhouse/software/clickhouse-20.8.4.11-lts
  2. DB::AggregateFunctionFactory::getImpl(std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const&, std::__1::vector<std::__1::shared_ptr<DB::IDataType const>, std::__1::allocator<std::__1::shared_ptr<DB::IDataType const> > > const&, DB::Array const&, DB::AggregateFunctionProperties&, bool, int) const (.cold) @ 0x8918bbc in /home/clickhouse/software/clickhouse-20.8.4.11-lts
  3. DB::AggregateFunctionFactory::get(std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const&, std::__1::vector<std::__1::shared_ptr<DB::IDataType const>, std::__1::allocator<std::__1::shared_ptr<DB::IDataType const> > > const&, DB::Array const&, DB::AggregateFunctionProperties&, int) const @ 0xf0a66c1 in /home/clickhouse/software/clickhouse-20.8.4.11-lts
  4. DB::create(std::__1::shared_ptrDB::IAST const&) @ 0xf2842ee in /home/clickhouse/software/clickhouse-20.8.4.11-lts
  5. std::__1::__function::__func<std::__1::shared_ptr<DB::IDataType const> ()(std::__1::shared_ptrDB::IAST const&), std::__1::allocator<std::__1::shared_ptr<DB::IDataType const> ()(std::__1::shared_ptrDB::IAST const&)>, std::__1::shared_ptr<DB::IDataType const> (std::__1::shared_ptrDB::IAST const&)>::operator()(std::__1::shared_ptrDB::IAST const&) @ 0xf2849ff in /home/clickhouse/software/clickhouse-20.8.4.11-lts
  6. DB::DataTypeFactory::get(std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const&, std::__1::shared_ptrDB::IAST const&) const @ 0xf2a85ce in /home/clickhouse/software/clickhouse-20.8.4.11-lts
  7. DB::DataTypeFactory::get(std::__1::shared_ptrDB::IAST const&) const @ 0xf2a8bf1 in /home/clickhouse/software/clickhouse-20.8.4.11-lts
  8. DB::DataTypeFactory::get(std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const&) const @ 0xf2a8dbd in /home/clickhouse/software/clickhouse-20.8.4.11-lts
  9. DB::NativeBlockInputStream::readImpl() @ 0xfcd347a in /home/clickhouse/software/clickhouse-20.8.4.11-lts
  10. DB::IBlockInputStream::read() @ 0xf26396d in /home/clickhouse/software/clickhouse-20.8.4.11-lts
  11. DB::Connection::receiveDataImpl(std::__1::shared_ptrDB::IBlockInputStream&) @ 0xf9f0d0a in /home/clickhouse/software/clickhouse-20.8.4.11-lts
  12. DB::Connection::receivePacket() @ 0xf9f4887 in /home/clickhouse/software/clickhouse-20.8.4.11-lts
  13. DB::MultiplexedConnections::receivePacket() @ 0xfa031a6 in /home/clickhouse/software/clickhouse-20.8.4.11-lts
  14. DB::RemoteQueryExecutor::read() @ 0xf276424 in /home/clickhouse/software/clickhouse-20.8.4.11-lts
  15. DB::RemoteSource::generate() @ 0xfbdd07b in /home/clickhouse/software/clickhouse-20.8.4.11-lts
  16. DB::ISource::work() @ 0xfa61477 in /home/clickhouse/software/clickhouse-20.8.4.11-lts
  17. DB::SourceWithProgress::work() @ 0xfbe315f in /home/clickhouse/software/clickhouse-20.8.4.11-lts
  18. std::__1::__function::__func<DB::PipelineExecutor::addJob(DB::ExecutingGraph::Node*)::'lambda'(), std::__1::allocatorDB::PipelineExecutor::addJob(DB::ExecutingGraph::Node*)::'lambda'(), void ()>::operator()() @ 0xfa845ae in /home/clickhouse/software/clickhouse-20.8.4.11-lts
  19. DB::PipelineExecutor::executeStepImpl(unsigned long, unsigned long, std::__1::atomic*) (.constprop.1) @ 0xfa88608 in /home/clickhouse/software/clickhouse-20.8.4.11-lts
  20. ThreadFromGlobalPool::ThreadFromGlobalPool<DB::PipelineExecutor::executeImpl(unsigned long)::'lambda0'()>(DB::PipelineExecutor::executeImpl(unsigned long)::'lambda0'()&&)::'lambda'()::operator()() const @ 0xfa88e43 in /home/clickhouse/software/clickhouse-20.8.4.11-lts
  21. ThreadPoolImplstd::__1::thread::worker(std::__1::__list_iterator<std::__1::thread, void*>) @ 0x8deaf9d in /home/clickhouse/software/clickhouse-20.8.4.11-lts
  22. void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_deletestd::__1::__thread_struct >, void ThreadPoolImplstd::__1::thread::scheduleImpl(std::__1::function<void ()>, int, std::__1::optional)::'lambda1'()> >(void*) @ 0x8de96af in /home/clickhouse/software/clickhouse-20.8.4.11-lts
  23. start_thread @ 0x7dc5 in /usr/lib64/libpthread-2.17.so
  24. clone @ 0xf776d in /usr/lib64/libc-2.17.so

Additional context
Add any other context about the problem here.

@weiqxu weiqxu added the bug Confirmed user-visible misbehaviour in official release label Nov 5, 2020
@den-crane
Copy link
Contributor

use cast

select sum(cast(null,'Nullable(Float64)')) from remote('127.0.0.1,127.0.0.2', 'system', 'one') ;
┌─sum(cast(NULL, 'Nullable(Float64)'))─┐
│                                 ᴺᵁᴸᴸ │
└──────────────────────────────────────┘

@den-crane den-crane added unexpected behaviour and removed bug Confirmed user-visible misbehaviour in official release labels Nov 5, 2020
@weiqxu
Copy link
Contributor Author

weiqxu commented Nov 5, 2020

use cast is a workroud.
However it should be fixed since the behaviour should be the same when the table is local table or remote/distributed table.

@den-crane
Copy link
Contributor

@weiqxu the problem that your query select sum(null) from remote('xxx', 'system', 'one') ; looks syntactic.
What is the real usecase / query?

@alexey-milovidov alexey-milovidov added the st-accepted The issue is in our backlog, ready to take label Nov 5, 2020
@alexey-milovidov
Copy link
Member

How to reproduce:

select sum(null) from remote('127.0.0.{1,2}', 'system', 'one')

@zhangjmruc
Copy link
Contributor

I tried to investigate this issue, following are my findings:

  1. From the explain plans for sum(null) and sum(cast(null as Nullable(Int32)), we can see the difference in the data type and name of AggregateFunction. In the problematic case, the name is nothing for only NULL case.
    -- sum(null)
    explain plan header=1 select sum(null) from remote('127.0.0.{1,2}', 'system', 'one');

┌─explain───────────────────────────────────────────────────────────────────────────────────────────┐
│ Expression (Projection) │
│ Header: sum(NULL) Nullable(Nothing) Nullable(size = 0, Nothing(size = 0), UInt8(size = 0)) │
│ Expression (Before ORDER BY and SELECT) │
│ Header: sum(NULL) Nullable(Nothing) Nullable(size = 0, Nothing(size = 0), UInt8(size = 0)) │
│ MergingAggregated │
│ Header: sum(NULL) Nullable(Nothing) Nullable(size = 0, Nothing(size = 0), UInt8(size = 0)) │
│ ReadFromStorage (Read from Distributed) │
│ Header: sum(NULL) AggregateFunction(nothing, Nullable(Nothing)) AggregateFunction(size = 0) │
└───────────────────────────────────────────────────────────────────────────────────────────────────┘

-- sum(cast(null as Nullable(Int32))
explain header=1 select sum(cast(null as Nullable(Int32))) from remote('127.0.0.{1,2}','system','one');

┌─explain──────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ Expression (Projection) │
│ Header: sum(CAST(NULL, 'Nullable(Int32)')) Nullable(Int64) Nullable(size = 0, Int64(size = 0), UInt8(size = 0)) │
│ Expression (Before ORDER BY and SELECT) │
│ Header: sum(CAST(NULL, 'Nullable(Int32)')) Nullable(Int64) Nullable(size = 0, Int64(size = 0), UInt8(size = 0)) │
│ MergingAggregated │
│ Header: sum(CAST(NULL, 'Nullable(Int32)')) Nullable(Int64) Nullable(size = 0, Int64(size = 0), UInt8(size = 0)) │
│ ReadFromStorage (Read from Distributed) │
│ Header: sum(CAST(NULL, 'Nullable(Int32)')) AggregateFunction(sum, Nullable(Int32)) AggregateFunction(size = 0) │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

  1. From the bactrace and code walkthrough, when we do planning for sum(null), we do special handling for NULL arguments in AggregateFunctionFactory::getImpl() and assume this case is handled in "get" method. After this, we missing the "sum" name and get "nothing" as AggregateFunction. Later in executing phase, we fail to find the aggregate function "nothing" in aggregate function factory and then expection happens.

AggregateFunctionPtr AggregateFunctionFactory::get(
...
/// If one of the types is Nullable, we apply aggregate function combinator "Null".

if (std::any_of(type_without_low_cardinality.begin(), type_without_low_cardinality.end(),
    [](const auto & type) { return type->isNullable(); }))
{
    AggregateFunctionCombinatorPtr combinator = AggregateFunctionCombinatorFactory::instance().tryFindSuffix("Null");

...
AggregateFunctionPtr nested_function = getImpl(
name, nested_types, nested_parameters, out_properties, has_null_arguments, recursion_level); <<< return nullptr
return combinator->transformAggregateFunction(nested_function, out_properties, type_without_low_cardinality, parameters); <<< AggregateFunctionCombinatorNull::transformAggregateFunction
...
AggregateFunctionPtr AggregateFunctionFactory::getImpl(
...
/// Find by exact match.
if (auto it = aggregate_functions.find(name); it != aggregate_functions.end())
{
found = it->second;
}
...
if (found.creator)
{
out_properties = found.properties;

    /// The case when aggregate function should return NULL on NULL arguments. This case is handled in "get" method.
    if (!out_properties.returns_default_when_only_null && has_null_arguments)  <<< here return
        return nullptr;

    return found.creator(name, argument_types, parameters);
}

...

AggregateFunctionCombinatorNull::transformAggregateFunction(
...
if (has_null_types)
{
/// Currently the only functions that returns not-NULL on all NULL arguments are count and uniq, and they returns UInt64.
if (properties.returns_default_when_only_null)
return std::make_shared(DataTypes{
std::make_shared()}, params);
else
return std::make_shared(DataTypes{
std::make_shared(std::make_shared())}, params); <<< This is where AggregateFunction (nothing, Nullable(nothing)) came from
}
...

  1. we also see failure for count(null) from 2 remote servers.
    explain plan header=1 select count(null) from remote('127.0.0.{1,2}', 'system', 'one');

┌─explain─────────────────────────────────────────────────────────────────────────────────────────────┐
│ Expression (Projection) │
│ Header: count(NULL) UInt64 UInt64(size = 0) │
│ Expression (Before ORDER BY and SELECT) │
│ Header: count(NULL) UInt64 UInt64(size = 0) │
│ MergingAggregated │
│ Header: count(NULL) UInt64 UInt64(size = 0) │
│ ReadFromStorage (Read from Distributed) │
│ Header: count(NULL) AggregateFunction(nothing, Nullable(Nothing)) AggregateFunction(size = 0) │
└─────────────────────────────────────────────────────────────────────────────────────────────────────┘

  1. As there is no creator for aggregate function nothing, I tried to move the code in AggregateFunctionFactory::getImpl() to do the null check outside of if clause for found.creator, then the query failed with error : No idea why different data compressed.
    Code: 271, e.displayText() = DB::Exception: Data compressed with different methods, given method byte 0x83, previous method byte 0x82: while receiving packet from localhost:19000: While executing Remote (version 20.8.8.2) (from [::ffff:127.0.0.1]:43888) (in query: select sum(null) from remote('localhost:19000,localhost','system','one');), Stack trace (when copying this message, always include the lines below):

=== code changes ===

AggregateFunctionPtr AggregateFunctionFactory::getImpl(
...
out_properties = found.properties;

    /// The case when aggregate function should return NULL on NULL arguments. This case is handled in "get" method.
    if (!out_properties.returns_default_when_only_null && has_null_arguments) 
        return nullptr;

if (found.creator)
{
   /// Move the check outside the if clause for NULL case.
    return found.creator(name, argument_types, parameters);
}

...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
st-accepted The issue is in our backlog, ready to take unexpected behaviour
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants