feat(expr):support is_ipv4 and is_ipv6 expr to tiflash #6773

Merged
merged 50 commits into from
Feb 15, 2023
Changes from 45 commits
772c064
feat(expr):support is_ipv4 and is_ipv6 expr to
AntiTopQuark Feb 7, 2023
fae3bbf
style: format code
AntiTopQuark Feb 7, 2023
9748c3e
Update is_ip_addr.test
AntiTopQuark Feb 8, 2023
359f882
Merge branch 'master' into master
AntiTopQuark Feb 8, 2023
9720869
style:Revise according to review comments for IsIPAddr func
AntiTopQuark Feb 8, 2023
b587f19
style:Revise according to review comments for IsIPAddr func
AntiTopQuark Feb 8, 2023
47e68f7
style:Revise according to review comments for IsIPAddr
AntiTopQuark Feb 8, 2023
d2287a5
style:Revise according to review comments for IsIPAddr
AntiTopQuark Feb 8, 2023
3d7a92b
style:Revise according to review comments for IsIPAddr func
AntiTopQuark Feb 8, 2023
e686c68
Merge branch 'master' into master
AntiTopQuark Feb 8, 2023
8c70e8e
Revise according to review comments:
AntiTopQuark Feb 8, 2023
380fe4d
Modify the style of the code
AntiTopQuark Feb 8, 2023
a67c8ee
Update dbms/src/Functions/FunctionsIsIPAddr.h
AntiTopQuark Feb 8, 2023
b9a499c
Merge branch 'master' into master
AntiTopQuark Feb 8, 2023
85ed858
fix some bugs for is_ipv4 and is_ipv6
AntiTopQuark Feb 8, 2023
4a7c64d
Merge branch 'master' of https://github.com/AntiTopQuark/support_ipv4…
AntiTopQuark Feb 8, 2023
f14e4b7
Merge branch 'master' into master
AntiTopQuark Feb 8, 2023
6eb6907
fix bug of IsIPAddr and modify the code style
AntiTopQuark Feb 8, 2023
cb8d9c1
Merge branch 'master' of https://github.com/AntiTopQuark/support_ipv4…
AntiTopQuark Feb 8, 2023
b1824d4
Update FunctionsIsIPAddr.h
AntiTopQuark Feb 8, 2023
7ce9952
test:update test for is_ipv4 and is_ipv6 func
AntiTopQuark Feb 8, 2023
33e25f6
Merge branch 'master' into master
AntiTopQuark Feb 8, 2023
4433ab2
fix bug:process null for is_ipv4 and is_ipv6 func
AntiTopQuark Feb 9, 2023
f7b7954
Merge branch 'master' into master
AntiTopQuark Feb 9, 2023
48b3fa1
process input length for is_ipv4 and is_ipv6 func
AntiTopQuark Feb 9, 2023
340be85
Update FunctionsIsIPAddr.h
AntiTopQuark Feb 9, 2023
b7799ed
Update FunctionsIsIPAddr.h
AntiTopQuark Feb 9, 2023
cea864a
Update FunctionsIsIPAddr.h
AntiTopQuark Feb 9, 2023
127f21f
Update FunctionsIsIPAddr.h
AntiTopQuark Feb 9, 2023
4058f1a
Update FunctionsIsIPAddr.h
AntiTopQuark Feb 9, 2023
a20a6c6
Merge branch 'pingcap:master' into master
AntiTopQuark Feb 10, 2023
c7f2914
Update gtest_is_ip_addr.cpp
AntiTopQuark Feb 10, 2023
f31a94a
Update FunctionsIsIPAddr.cpp
AntiTopQuark Feb 10, 2023
684fef9
Update FunctionsIsIPAddr.h
AntiTopQuark Feb 10, 2023
cc3f674
Update gtest_is_ip_addr.cpp
AntiTopQuark Feb 10, 2023
0767bd8
Update FunctionsIsIPAddr.cpp
AntiTopQuark Feb 10, 2023
c0a8c52
Update FunctionsIsIPAddr.cpp
AntiTopQuark Feb 13, 2023
cfc2cc2
Update FunctionsIsIPAddr.h
AntiTopQuark Feb 13, 2023
d9802f5
Merge branch 'pingcap:master' into master
AntiTopQuark Feb 13, 2023
88de59c
Update gtest_is_ip_addr.cpp
AntiTopQuark Feb 14, 2023
80ab34e
Update registerFunctions.cpp
AntiTopQuark Feb 14, 2023
2bc8de1
Update registerFunctions.cpp
AntiTopQuark Feb 14, 2023
dce17e8
Update FunctionsIsIPAddr.h
AntiTopQuark Feb 14, 2023
fb191ff
Update gtest_is_ip_addr.cpp
AntiTopQuark Feb 14, 2023
09fd7fd
update testcase
AntiTopQuark Feb 14, 2023
ddf1114
Merge branch 'master' into master
AntiTopQuark Feb 14, 2023
c23be4f
Update gtest_is_ip_addr.cpp
AntiTopQuark Feb 14, 2023
b66005f
Merge branch 'pingcap:master' into master
AntiTopQuark Feb 14, 2023
32cef30
Update FunctionsIsIPAddr.h
AntiTopQuark Feb 14, 2023
07e4a10
Merge branch 'master' into master
ti-chi-bot Feb 15, 2023
4 changes: 2 additions & 2 deletions dbms/src/Flash/Coprocessor/DAGUtils.cpp
@@ -424,10 +424,10 @@ const std::unordered_map<tipb::ScalarFuncSig, String> scalar_func_map({
     {tipb::ScalarFuncSig::InetNtoa, "IPv4NumToString"},
     {tipb::ScalarFuncSig::Inet6Aton, "tiDBIPv6StringToNum"},
     {tipb::ScalarFuncSig::Inet6Ntoa, "tiDBIPv6NumToString"},
-    //{tipb::ScalarFuncSig::IsIPv4, "cast"},
+    {tipb::ScalarFuncSig::IsIPv4, "tiDBIsIPv4"},
     //{tipb::ScalarFuncSig::IsIPv4Compat, "cast"},
     //{tipb::ScalarFuncSig::IsIPv4Mapped, "cast"},
-    //{tipb::ScalarFuncSig::IsIPv6, "cast"},
+    {tipb::ScalarFuncSig::IsIPv6, "tiDBIsIPv6"},
     //{tipb::ScalarFuncSig::UUID, "cast"},

     {tipb::ScalarFuncSig::LikeSig, "like3Args"},
27 changes: 27 additions & 0 deletions dbms/src/Functions/FunctionsIsIPAddr.cpp
@@ -0,0 +1,27 @@
// Copyright 2023 PingCAP, Ltd.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#include <Functions/FunctionFactory.h>
#include <Functions/FunctionsIsIPAddr.h>

namespace DB
{

void registerFunctionsIsIPAddr(FunctionFactory & factory)
{
    factory.registerFunction<FunctionIsIPv4OrIsIPv6<CheckIsIPv4Impl>>();
    factory.registerFunction<FunctionIsIPv4OrIsIPv6<CheckIsIPv6Impl>>();
}

} // namespace DB
274 changes: 274 additions & 0 deletions dbms/src/Functions/FunctionsIsIPAddr.h
@@ -0,0 +1,274 @@
// Copyright 2023 PingCAP, Ltd.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

#pragma once

#include <Columns/ColumnNullable.h>
#include <Columns/ColumnString.h>
#include <DataTypes/DataTypeNullable.h>
#include <DataTypes/DataTypeString.h>
#include <DataTypes/DataTypesNumber.h>
#include <Functions/FunctionHelpers.h>
#include <Functions/IFunction.h>

#include <cstring>

#ifndef INADDRSZ
#define INADDRSZ 4
#endif

#ifndef INT16SZ
#define INT16SZ sizeof(short)
#endif

#ifndef IN6ADDRSZ
#define IN6ADDRSZ 16
#endif

namespace DB
{
namespace ErrorCodes
{
extern const int ILLEGAL_COLUMN;
extern const int ILLEGAL_TYPE_OF_ARGUMENT;
extern const int NUMBER_OF_ARGUMENTS_DOESNT_MATCH;
} // namespace ErrorCodes

/** Helper functions
 *
 * tiDBIsIPv4(x) - Returns 1 if the input string is a valid IPv4 address, otherwise 0.
 *
 * tiDBIsIPv6(x) - Returns 1 if the input string is a valid IPv6 address, otherwise 0.
 *
 */


struct CheckIsIPv4Impl
{
    static constexpr auto name = "tiDBIsIPv4";
    /* Description:
     * Determines whether the input string is an IPv4 address.
     * The code is adapted from inet_pton4, the internal helper behind inet_pton() (declared in "arpa/inet.h").
     * References: http://svn.apache.org/repos/asf/apr/apr/trunk/network_io/unix/inet_pton.c
     */
    static inline UInt8 isMatch(const char * src)
    {
        if (nullptr == src)
            return 0;

        static const char digits[] = "0123456789";
        int saw_digit, octets;
        char ch;
        unsigned char tmp[INADDRSZ], *tp;

        saw_digit = 0;
        octets = 0;
        *(tp = tmp) = 0;
        while ((ch = *src++) != '\0')
        {
            const char * pch;

            if ((pch = strchr(digits, ch)) != nullptr)
            {
                // Accumulate the current octet and reject values above 255.
                unsigned int num = *tp * 10 + static_cast<unsigned int>(pch - digits);

                if (num > 255)
                    return 0;
                *tp = num;
                if (!saw_digit)
                {
                    if (++octets > 4)
                        return 0;
                    saw_digit = 1;
                }
            }
            else if (ch == '.' && saw_digit)
            {
                // A '.' is only valid right after a digit, and at most three are allowed.
                if (octets == 4)
                    return 0;
                *++tp = 0;
                saw_digit = 0;
            }
            else
                return 0;
        }
        // Fewer than four octets is not a complete address.
        if (octets < 4)
            return 0;

        return 1;
    }
};

struct CheckIsIPv6Impl
{
    static constexpr auto name = "tiDBIsIPv6";

    /* Description:
     * Determines whether the input string is an IPv6 address.
     * The code is adapted from inet_pton6, the internal helper behind inet_pton() (declared in "arpa/inet.h").
     * References: http://svn.apache.org/repos/asf/apr/apr/trunk/network_io/unix/inet_pton.c
     */
    static inline UInt8 isMatch(const char * src)
    {
        if (nullptr == src)
            return 0;
        static const char xdigits_l[] = "0123456789abcdef",
                          xdigits_u[] = "0123456789ABCDEF";
        unsigned char tmp[16], *tp, *endp, *colonp;
        const char *xdigits, *curtok;
        int ch, saw_xdigit;
        unsigned int val;

        memset((tp = tmp), '\0', IN6ADDRSZ);
        endp = tp + IN6ADDRSZ;
        colonp = nullptr;
        // A leading ':' is only valid as part of "::".
        if (*src == ':')
            if (*++src != ':')
                return 0;
        curtok = src;
        saw_xdigit = 0;
        val = 0;
        while ((ch = *src++) != '\0')
        {
            const char * pch;

            if ((pch = strchr((xdigits = xdigits_l), ch)) == nullptr)
                pch = strchr((xdigits = xdigits_u), ch);
            if (pch != nullptr)
            {
                // Hex digit: accumulate the current 16-bit group.
                val <<= 4;
                val |= (pch - xdigits);
                if (val > 0xffff)
                    return 0;
                saw_xdigit = 1;
                continue;
            }
            if (ch == ':')
            {
                curtok = src;
                if (!saw_xdigit)
                {
                    // Empty group: this is "::", which may appear only once.
                    if (colonp)
                        return 0;
                    colonp = tp;
                    continue;
                }
                if (tp + INT16SZ > endp)
                    return 0;
                *tp++ = static_cast<unsigned char>(val >> 8) & 0xff;
                *tp++ = static_cast<unsigned char>(val) & 0xff;
                saw_xdigit = 0;
                val = 0;
                continue;
            }
            // Trailing dotted-quad IPv4 part, e.g. "::ffff:1.2.3.4".
            if (ch == '.' && ((tp + INADDRSZ) <= endp) && CheckIsIPv4Impl::isMatch(curtok) > 0)
            {
                tp += INADDRSZ;
                saw_xdigit = 0;
                break; /* '\0' was seen by CheckIsIPv4Impl::isMatch(). */
            }
            return 0;
        }
        if (saw_xdigit)
        {
            if (tp + INT16SZ > endp)
                return 0;
            *tp++ = static_cast<unsigned char>(val >> 8) & 0xff;
            *tp++ = static_cast<unsigned char>(val) & 0xff;
        }
        if (colonp != nullptr)
        {
            // "::" was seen: shift the groups written after it to the end of the buffer.
            const size_t n = tp - colonp;
            size_t i;

            for (i = 1; i <= n; ++i)
            {
                endp[-i] = colonp[n - i];
                colonp[n - i] = 0;
            }
            tp = endp;
        }
        if (tp != endp)
            return 0;
        return 1;
    }
};

template <typename Impl>
class FunctionIsIPv4OrIsIPv6 : public IFunction
{
public:
    static constexpr auto name = Impl::name;
    FunctionIsIPv4OrIsIPv6() = default;

    static FunctionPtr create(const Context &) { return std::make_shared<FunctionIsIPv4OrIsIPv6>(); }

    std::string getName() const override { return name; }
    size_t getNumberOfArguments() const override { return 1; }
    bool useDefaultImplementationForConstants() const override { return true; }
    bool useDefaultImplementationForNulls() const override { return false; }

    DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
    {
        if (arguments.size() != 1)
            throw Exception(
                fmt::format("Number of arguments for function {} doesn't match: passed {}, should be 1.", getName(), arguments.size()),
                ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH);
        if (!arguments[0]->onlyNull())
        {
            DataTypePtr data_type = removeNullable(arguments[0]);
            if (!data_type->isString())
                throw Exception(
                    fmt::format("Illegal argument type {} of function {}, should be string", arguments[0]->getName(), getName()),
                    ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
        }
        return std::make_shared<DataTypeUInt8>();
    }
    void executeImpl(Block & block, const ColumnNumbers & arguments, size_t result) const override
    {
        auto [column, nullmap] = removeNullable(block.getByPosition(arguments[0]).column.get());
        if (const auto * col_input = checkAndGetColumn<ColumnString>(column))
Contributor

Does this hold true for the case where the input is only null?
Is there a unit test to guarantee this?

Contributor Author

Thank you very much.

There was already a test case with an all-NULL input, which exercises the ConstColumn path:
https://github.com/AntiTopQuark/support_ipv4_and_ipv6_func/blob/09fd7fd29c83f298ee654c9b9a416601a718b854/dbms/src/Functions/tests/gtest_is_ip_addr.cpp#L80

    ASSERT_COLUMN_EQ(createConstColumn<UInt8>(4, 0), executeFunction("tiDBIsIPv6", {createConstColumn<Nullable<String>>(4, std::nullopt)}));

I have now added another one, for a non-constant column:

    ASSERT_COLUMN_EQ(createColumn<UInt8>({0, 0, 0, 0, 0}), executeFunction("tiDBIsIPv6", {createColumn<Nullable<String>>({std::nullopt, std::nullopt, std::nullopt, std::nullopt, std::nullopt})}));
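
For readers outside the TiFlash codebase, the NULL behaviour these two assertions pin down can be sketched in plain C++. This is only an illustration, not code from the PR: `checkNullableRows` and `looksNonEmpty` are hypothetical names, and `std::optional<std::string>` stands in for a nullable string column. NULL rows yield 0 and never reach the checker.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for Impl::isMatch. It only tests for a non-empty
// string so the example stays focused on NULL handling, not address parsing.
inline std::uint8_t looksNonEmpty(const char * src)
{
    return (src != nullptr && *src != '\0') ? 1 : 0;
}

// NULL rows produce 0; every other row is passed to the checker.
template <typename Check>
std::vector<std::uint8_t> checkNullableRows(const std::vector<std::optional<std::string>> & rows, Check check)
{
    std::vector<std::uint8_t> result(rows.size(), 0);
    for (std::size_t i = 0; i < rows.size(); ++i)
    {
        if (rows[i].has_value())
            result[i] = check(rows[i]->c_str());
    }
    return result;
}

// Usage: an all-NULL input gives an all-zero result of the same length.
inline void exampleAllNull()
{
    std::vector<std::optional<std::string>> all_null(5, std::nullopt);
    assert(checkNullableRows(all_null, looksNonEmpty) == std::vector<std::uint8_t>(5, 0));
}
```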

Contributor

Can you test for DataTypeNullable<DataTypeNothing>?
You can use

ColumnWithTypeAndName createOnlyNullColumnConst(size_t size, const String & name = "");
ColumnWithTypeAndName createOnlyNullColumn(size_t size, const String & name = "");

in FunctionTestUtils.

Contributor

You can refer to gtest_strings_concat_ws.cpp

        {
            size_t size = block.getByPosition(arguments[0]).column->size();
            const typename ColumnString::Chars_t & data = col_input->getChars();
            const typename ColumnString::Offsets & offsets = col_input->getOffsets();

            auto col_res = ColumnUInt8::create();
            ColumnUInt8::Container & vec_res = col_res->getData();
            vec_res.resize(size);

            size_t prev_offset = 0;
            for (size_t i = 0; i < size; ++i)
            {
                if (nullmap && (*nullmap)[i])
                {
                    vec_res[i] = 0;
                }
                else
                {
                    vec_res[i] = Impl::isMatch(reinterpret_cast<const char *>(&data[prev_offset]));
                }
                prev_offset = offsets[i];
            }

            block.getByPosition(result).column = std::move(col_res);
        }
        else
            throw Exception(
                fmt::format("Illegal column {} of argument of function {}", block.getByPosition(arguments[0]).column->getName(), getName()),
                ErrorCodes::ILLEGAL_COLUMN);
    }
};
} // namespace DB
Contributor

The implementations of executeImpl for IPv4 and IPv6 are similar. I think we can define a function to unify the logic, such as:

template <typename Impl>
void executeCheckIP(xxx)
{
...
  vec_res[i] = static_cast<UInt8>(Impl::check(reinterpret_cast<const char *>(&data[prev_offset])));
...
}

returnType executeImpl(xxx)
{
  return executeCheckIP(xxx);
}
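
The suggestion above can be illustrated with a small standalone sketch. It assumes simplified types: a `std::vector<std::string>` replaces the column, and `HasDotImpl` / `HasColonImpl` are hypothetical toy checkers that only look for a separator character, not real address validators. The point is only that a single loop, parameterised on `Impl`, can serve both functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical toy checkers; they are NOT real IP validators.
struct HasDotImpl
{
    static std::uint8_t check(const char * src)
    {
        return std::string(src).find('.') != std::string::npos ? 1 : 0;
    }
};

struct HasColonImpl
{
    static std::uint8_t check(const char * src)
    {
        return std::string(src).find(':') != std::string::npos ? 1 : 0;
    }
};

// One shared loop; the per-row logic comes from the Impl template argument.
template <typename Impl>
std::vector<std::uint8_t> executeCheckIP(const std::vector<std::string> & rows)
{
    std::vector<std::uint8_t> res(rows.size());
    for (std::size_t i = 0; i < rows.size(); ++i)
        res[i] = static_cast<std::uint8_t>(Impl::check(rows[i].c_str()));
    return res;
}

// Usage: the same loop runs with either checker.
inline void exampleExecuteCheckIP()
{
    const std::vector<std::string> rows{"1.2.3.4", "::1"};
    assert((executeCheckIP<HasDotImpl>(rows) == std::vector<std::uint8_t>{1, 0}));
    assert((executeCheckIP<HasColonImpl>(rows) == std::vector<std::uint8_t>{0, 1}));
}
```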

Contributor Author

Thanks a lot, I referenced other functions. The following template is used

template <typename Name, UInt8(Function)(const char *)>

Is this okay?

Contributor

> Thanks a lot, I referenced other functions. The following template is used
>
> template <typename Name, UInt8(Function)(const char *)>
>
> Is this okay?

I think it's OK.
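
As a rough standalone sketch of the template shape discussed in this thread, the checker can be passed as a non-type template parameter next to a name tag. The names `NameHasColon`, `hasColon`, and `CheckFunction` are hypothetical and exist only for this illustration. Note that the diff shown on this page uses the `template <typename Impl>` form instead.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

using UInt8 = std::uint8_t;

// Hypothetical name tag, for illustration only.
struct NameHasColon
{
    static constexpr const char * name = "hasColon";
};

// Hypothetical toy checker: reports whether the string contains ':'.
inline UInt8 hasColon(const char * src)
{
    return (src != nullptr && std::strchr(src, ':') != nullptr) ? 1 : 0;
}

// The function-typed parameter is adjusted to a pointer to function,
// so the checker is fixed at compile time.
template <typename Name, UInt8(Function)(const char *)>
struct CheckFunction
{
    static std::string getName() { return Name::name; }
    static UInt8 apply(const char * src) { return Function(src); }
};

using FunctionHasColon = CheckFunction<NameHasColon, hasColon>;

// Usage: the alias carries both the name and the checker.
inline void exampleCheckFunction()
{
    assert(FunctionHasColon::getName() == "hasColon");
    assert(FunctionHasColon::apply("::1") == 1);
}
```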


#undef INADDRSZ
#undef INT16SZ
#undef IN6ADDRSZ
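
For comparison with the hand-written parsers in FunctionsIsIPAddr.h, here is a minimal sketch that asks the platform's POSIX `inet_pton` the same question. It is not part of the PR, and the wrapper names `isIPv4ViaInetPton` and `isIPv6ViaInetPton` are hypothetical. It assumes a POSIX system, and platform implementations may differ from the PR's parser on edge cases such as leading zeros in an octet.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

#include <cassert>

// inet_pton returns 1 only when src is a valid address for the given family.
inline bool isIPv4ViaInetPton(const char * src)
{
    if (src == nullptr)
        return false;
    in_addr buf{};
    return inet_pton(AF_INET, src, &buf) == 1;
}

inline bool isIPv6ViaInetPton(const char * src)
{
    if (src == nullptr)
        return false;
    in6_addr buf{};
    return inet_pton(AF_INET6, src, &buf) == 1;
}

// Usage: a few unambiguous inputs.
inline void exampleInetPton()
{
    assert(isIPv4ViaInetPton("192.168.1.1"));
    assert(!isIPv4ViaInetPton("256.1.1.1"));
    assert(isIPv6ViaInetPton("::1"));
}
```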
2 changes: 2 additions & 0 deletions dbms/src/Functions/registerFunctions.cpp
@@ -44,6 +44,7 @@ void registerFunctionsStringMath(FunctionFactory &);
 void registerFunctionsDuration(FunctionFactory &);
 void registerFunctionsRegexp(FunctionFactory &);
 void registerFunctionsJson(FunctionFactory &);
+void registerFunctionsIsIPAddr(FunctionFactory &);


 void registerFunctions()
@@ -73,6 +74,7 @@ void registerFunctions()
     registerFunctionsDuration(factory);
     registerFunctionsRegexp(factory);
     registerFunctionsJson(factory);
+    registerFunctionsIsIPAddr(factory);
 }

 } // namespace DB