[feat](skew & kurt) New aggregate function skew & kurt #40945

Merged
merged 8 commits into from
Sep 25, 2024

Contributor

@zhiqiang-hhhh zhiqiang-hhhh commented Sep 18, 2024

skew, skew_pop and skewness are used to calculate the skewness of a data distribution.
kurt, kurt_pop and kurtosis are used to calculate the kurtosis of a data distribution.

The implementation references ClickHouse/ClickHouse#5200, with the result type changed to AlwaysNullable since Doris does not support NaN.

The formula used to calculate skew is 3rd central moment / variance^{1.5}
The formula used to calculate kurt is 4th central moment / variance^{2} - 3

When any result would be NaN, Doris returns NULL.

doc: apache/doris-website#1127
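
The two formulas in the description can be illustrated with a small standalone sketch. It is not the code from this PR: the function names are hypothetical, it assumes the population form of the moments (dividing by the row count), it uses a simple two-pass computation rather than incremental aggregation, and it models the NULL result with `std::optional`.

```cpp
#include <cmath>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical helper: arithmetic mean; yields NaN for an empty input (0.0 / 0.0 under IEEE 754).
inline double mean_sketch(const std::vector<double>& xs) {
    double sum = 0.0;
    for (double x : xs) sum += x;
    return sum / static_cast<double>(xs.size());
}

// Hypothetical helper: k-th central moment, population form (divided by the row count).
inline double central_moment_sketch(const std::vector<double>& xs, double mean, int k) {
    double sum = 0.0;
    for (double x : xs) sum += std::pow(x - mean, k);
    return sum / static_cast<double>(xs.size());
}

// Maps NaN to "no value", mirroring the NULL result described above.
inline std::optional<double> nan_to_null_sketch(double v) {
    if (std::isnan(v)) return std::nullopt;
    return v;
}

// skew = 3rd central moment / variance^{1.5}
inline std::optional<double> skew_pop_sketch(const std::vector<double>& xs) {
    const double mean = mean_sketch(xs);
    const double variance = central_moment_sketch(xs, mean, 2);
    const double m3 = central_moment_sketch(xs, mean, 3);
    return nan_to_null_sketch(m3 / std::pow(variance, 1.5));
}

// kurt = 4th central moment / variance^{2} - 3
inline std::optional<double> kurt_pop_sketch(const std::vector<double>& xs) {
    const double mean = mean_sketch(xs);
    const double variance = central_moment_sketch(xs, mean, 2);
    const double m4 = central_moment_sketch(xs, mean, 4);
    return nan_to_null_sketch(m4 / (variance * variance) - 3.0);
}
```

In this sketch an empty input or a constant column gives 0/0, which is NaN under IEEE 754 arithmetic, so both functions return an empty optional in those cases.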

@zhiqiang-hhhh
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 41559 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 4ae074e0205f37458c4695df952fd20b9a226166, data reload: false

------ Round 1 ----------------------------------
q1	17912	7521	7318	7318
q2	2046	165	164	164
q3	10640	1161	1242	1161
q4	10248	711	766	711
q5	7783	3184	3060	3060
q6	234	152	146	146
q7	1027	626	608	608
q8	9446	2053	2071	2053
q9	6812	6404	6439	6404
q10	7016	2249	2334	2249
q11	431	245	251	245
q12	407	212	208	208
q13	17797	3020	2997	2997
q14	239	210	234	210
q15	575	535	516	516
q16	681	645	631	631
q17	982	843	800	800
q18	7534	6635	6630	6630
q19	1402	1048	1002	1002
q20	586	296	296	296
q21	4089	3150	3221	3150
q22	1102	1003	1000	1000
Total cold run time: 108989 ms
Total hot run time: 41559 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7289	7336	7282	7282
q2	334	233	228	228
q3	3055	2985	2962	2962
q4	2062	1803	1836	1803
q5	5603	5559	5663	5559
q6	227	137	141	137
q7	2227	1795	1780	1780
q8	3264	3428	3357	3357
q9	8746	8770	8836	8770
q10	3586	3480	3501	3480
q11	576	480	461	461
q12	833	579	580	579
q13	10666	3045	3133	3045
q14	312	281	273	273
q15	567	525	522	522
q16	749	655	678	655
q17	1822	1608	1621	1608
q18	8204	7588	7504	7504
q19	1686	1436	1586	1436
q20	2043	1795	1830	1795
q21	5346	5292	5239	5239
q22	1095	1016	996	996
Total cold run time: 70292 ms
Total hot run time: 59471 ms

@doris-robot

TPC-DS: Total hot run time: 193943 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 4ae074e0205f37458c4695df952fd20b9a226166, data reload: false

query1	916	377	383	377
query2	6517	2128	2044	2044
query3	6454	213	221	213
query4	33572	23473	23323	23323
query5	4251	479	469	469
query6	254	161	160	160
query7	4426	287	304	287
query8	262	225	220	220
query9	9319	2653	2608	2608
query10	427	287	300	287
query11	18187	15218	15227	15218
query12	154	97	98	97
query13	1668	404	402	402
query14	10240	7425	7277	7277
query15	254	170	176	170
query16	8131	466	479	466
query17	1515	572	554	554
query18	2149	305	301	301
query19	350	150	147	147
query20	119	111	113	111
query21	211	103	103	103
query22	4601	4349	4167	4167
query23	34772	33916	33808	33808
query24	11196	2960	2892	2892
query25	634	398	395	395
query26	1181	159	163	159
query27	2336	278	282	278
query28	8061	2460	2453	2453
query29	847	434	418	418
query30	289	160	149	149
query31	986	782	791	782
query32	102	64	64	64
query33	776	294	287	287
query34	935	504	494	494
query35	839	750	754	750
query36	1087	965	940	940
query37	164	94	91	91
query38	4063	3951	3862	3862
query39	1456	1391	1420	1391
query40	207	97	98	97
query41	51	47	47	47
query42	124	99	97	97
query43	537	483	483	483
query44	1214	824	813	813
query45	192	171	161	161
query46	1129	749	745	745
query47	1884	1799	1817	1799
query48	451	361	374	361
query49	1023	415	413	413
query50	825	402	402	402
query51	7087	6959	6903	6903
query52	102	87	88	87
query53	252	190	181	181
query54	1159	459	487	459
query55	82	77	82	77
query56	289	281	268	268
query57	1212	1111	1062	1062
query58	244	242	248	242
query59	3281	2943	2890	2890
query60	314	287	282	282
query61	130	147	101	101
query62	847	655	651	651
query63	212	188	184	184
query64	4277	652	669	652
query65	3211	3142	3194	3142
query66	978	304	296	296
query67	15874	15558	15390	15390
query68	3055	582	581	581
query69	437	286	284	284
query70	1146	1130	1059	1059
query71	335	279	271	271
query72	6030	3925	3975	3925
query73	742	328	329	328
query74	9203	8985	9033	8985
query75	3358	2657	2642	2642
query76	2026	945	945	945
query77	442	301	293	293
query78	9810	9801	9376	9376
query79	1335	873	875	873
query80	952	586	576	576
query81	574	251	253	251
query82	802	230	229	229
query83	221	165	166	165
query84	270	110	107	107
query85	759	359	367	359
query86	390	322	310	310
query87	4395	4259	4283	4259
query88	4607	4033	4016	4016
query89	377	355	354	354
query90	1901	307	306	306
query91	163	168	161	161
query92	76	72	71	71
query93	927	889	889	889
query94	824	375	355	355
query95	467	415	403	403
query96	487	480	488	480
query97	3118	3096	3094	3094
query98	229	230	223	223
query99	1399	1274	1277	1274
Total cold run time: 291746 ms
Total hot run time: 193943 ms

@doris-robot

ClickBench: Total hot run time: 31.75 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 4ae074e0205f37458c4695df952fd20b9a226166, data reload: false

query1	0.04	0.04	0.04
query2	0.07	0.03	0.02
query3	0.24	0.06	0.06
query4	1.65	0.10	0.10
query5	0.52	0.52	0.52
query6	1.13	0.73	0.72
query7	0.02	0.02	0.01
query8	0.04	0.03	0.03
query9	0.56	0.50	0.50
query10	0.55	0.56	0.53
query11	0.14	0.11	0.10
query12	0.14	0.11	0.10
query13	0.60	0.60	0.58
query14	2.90	2.93	3.02
query15	0.89	0.80	0.82
query16	0.37	0.37	0.37
query17	1.06	0.98	1.05
query18	0.19	0.19	0.20
query19	1.93	1.87	2.04
query20	0.02	0.01	0.01
query21	15.36	0.59	0.60
query22	2.82	2.12	1.78
query23	17.19	0.80	0.94
query24	2.85	1.59	0.20
query25	0.22	0.21	0.04
query26	0.51	0.15	0.14
query27	0.03	0.05	0.03
query28	11.18	1.11	1.07
query29	12.52	3.22	3.17
query30	0.24	0.06	0.06
query31	2.87	0.39	0.37
query32	3.28	0.45	0.45
query33	2.98	3.04	3.03
query34	17.00	4.41	4.39
query35	4.40	4.39	4.38
query36	0.67	0.51	0.50
query37	0.07	0.05	0.06
query38	0.04	0.03	0.03
query39	0.03	0.03	0.02
query40	0.16	0.12	0.12
query41	0.08	0.02	0.02
query42	0.03	0.02	0.02
query43	0.02	0.03	0.03
Total cold run time: 107.61 s
Total hot run time: 31.75 s

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 37.30% (9585/25694)
Line Coverage: 28.71% (79254/276025)
Region Coverage: 28.16% (41017/145672)
Branch Coverage: 24.79% (20902/84312)
Coverage Report: http://coverage.selectdb-in.cc/coverage/4ae074e0205f37458c4695df952fd20b9a226166_4ae074e0205f37458c4695df952fd20b9a226166/report/index.html

@zhiqiang-hhhh zhiqiang-hhhh marked this pull request as draft September 19, 2024 05:50
@zhiqiang-hhhh
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 41954 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit c5fa11c0855a5362bca5e665b81d05b564c1aee6, data reload: false

------ Round 1 ----------------------------------
q1	17567	7389	7915	7389
q2	2065	170	158	158
q3	10650	1114	1228	1114
q4	10242	729	753	729
q5	7789	3143	3112	3112
q6	237	149	147	147
q7	998	619	600	600
q8	9430	2104	2098	2098
q9	6820	6407	6416	6407
q10	7035	2346	2330	2330
q11	433	245	246	245
q12	402	211	215	211
q13	17792	3003	2971	2971
q14	234	213	217	213
q15	576	531	527	527
q16	662	606	609	606
q17	976	831	817	817
q18	7271	6697	6659	6659
q19	1393	1040	1040	1040
q20	601	279	281	279
q21	4047	3272	3278	3272
q22	1125	1037	1030	1030
Total cold run time: 108345 ms
Total hot run time: 41954 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7238	7223	7327	7223
q2	325	228	226	226
q3	3094	2988	2987	2987
q4	2051	1866	1793	1793
q5	5625	5621	5668	5621
q6	234	146	145	145
q7	2223	1786	1785	1785
q8	3321	3421	3450	3421
q9	8802	8980	8842	8842
q10	3520	3508	3513	3508
q11	569	495	486	486
q12	823	627	660	627
q13	8911	3192	3102	3102
q14	301	276	267	267
q15	582	507	543	507
q16	722	669	671	669
q17	1783	1604	1578	1578
q18	8237	7861	7681	7681
q19	1754	1586	1684	1586
q20	2124	1921	1870	1870
q21	5560	5145	5431	5145
q22	1148	1053	1073	1053
Total cold run time: 68947 ms
Total hot run time: 60122 ms

@doris-robot

TPC-DS: Total hot run time: 194990 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit c5fa11c0855a5362bca5e665b81d05b564c1aee6, data reload: false

query1	1266	886	884	884
query2	6232	2077	2004	2004
query3	10737	4082	4129	4082
query4	63245	27740	23328	23328
query5	5290	473	457	457
query6	411	172	169	169
query7	5623	310	294	294
query8	314	223	233	223
query9	9086	2642	2657	2642
query10	493	279	275	275
query11	18236	15165	15845	15165
query12	160	101	105	101
query13	1569	422	416	416
query14	10880	7033	7182	7033
query15	202	179	169	169
query16	7154	500	454	454
query17	1191	613	579	579
query18	1916	320	308	308
query19	260	154	155	154
query20	127	108	111	108
query21	211	103	100	100
query22	5021	4487	4674	4487
query23	35173	34009	34508	34009
query24	6055	2869	2865	2865
query25	526	390	402	390
query26	650	157	169	157
query27	1731	286	290	286
query28	4139	2453	2411	2411
query29	668	434	433	433
query30	235	163	151	151
query31	992	799	805	799
query32	72	55	53	53
query33	435	304	293	293
query34	908	488	484	484
query35	833	748	722	722
query36	1080	933	929	929
query37	146	90	86	86
query38	4040	3817	3848	3817
query39	1483	1407	1458	1407
query40	207	101	96	96
query41	50	47	48	47
query42	117	94	94	94
query43	527	491	492	491
query44	1130	806	790	790
query45	192	161	164	161
query46	1111	743	761	743
query47	1925	1828	1823	1823
query48	478	358	355	355
query49	706	389	384	384
query50	818	402	390	390
query51	7100	7011	7034	7011
query52	95	86	83	83
query53	246	179	170	170
query54	573	454	455	454
query55	73	70	75	70
query56	267	244	250	244
query57	1215	1070	1077	1070
query58	229	224	246	224
query59	3210	2896	2903	2896
query60	283	257	273	257
query61	100	101	102	101
query62	744	647	663	647
query63	214	192	177	177
query64	1718	634	612	612
query65	3254	3172	3339	3172
query66	609	301	309	301
query67	15891	15360	15707	15360
query68	4461	591	566	566
query69	438	306	291	291
query70	1208	1142	1142	1142
query71	360	267	273	267
query72	6095	4041	4020	4020
query73	775	321	323	321
query74	10041	8906	8997	8906
query75	3342	2694	2692	2692
query76	1784	904	899	899
query77	482	293	292	292
query78	10006	9384	11043	9384
query79	1058	544	541	541
query80	880	431	426	426
query81	521	239	240	239
query82	236	137	133	133
query83	154	136	138	136
query84	290	75	79	75
query85	955	286	277	277
query86	343	278	304	278
query87	4556	4279	4267	4267
query88	2945	2346	2354	2346
query89	371	283	280	280
query90	1990	185	182	182
query91	179	144	142	142
query92	62	45	49	45
query93	1042	523	525	523
query94	751	291	287	287
query95	345	256	250	250
query96	604	274	276	274
query97	3272	3082	3112	3082
query98	204	205	189	189
query99	1562	1291	1297	1291
Total cold run time: 313184 ms
Total hot run time: 194990 ms

@doris-robot

ClickBench: Total hot run time: 32.71 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit c5fa11c0855a5362bca5e665b81d05b564c1aee6, data reload: false

query1	0.05	0.05	0.05
query2	0.06	0.02	0.02
query3	0.23	0.06	0.06
query4	1.67	0.10	0.09
query5	0.53	0.48	0.51
query6	1.13	0.72	0.73
query7	0.02	0.01	0.02
query8	0.04	0.03	0.03
query9	0.55	0.51	0.50
query10	0.57	0.55	0.56
query11	0.15	0.10	0.10
query12	0.14	0.10	0.10
query13	0.60	0.59	0.59
query14	3.01	3.00	3.08
query15	0.88	0.81	0.83
query16	0.37	0.38	0.38
query17	1.07	1.06	1.03
query18	0.19	0.19	0.19
query19	2.00	1.93	1.98
query20	0.01	0.01	0.01
query21	15.35	0.57	0.57
query22	3.00	2.84	1.93
query23	17.39	0.71	0.83
query24	2.86	0.89	2.14
query25	0.25	0.16	0.07
query26	0.48	0.14	0.13
query27	0.05	0.05	0.04
query28	10.00	1.09	1.06
query29	12.60	3.24	3.25
query30	0.26	0.06	0.06
query31	2.90	0.38	0.37
query32	3.28	0.46	0.46
query33	3.06	2.94	3.00
query34	17.12	4.44	4.45
query35	4.38	4.38	4.52
query36	0.67	0.52	0.48
query37	0.08	0.05	0.05
query38	0.04	0.04	0.04
query39	0.03	0.02	0.03
query40	0.16	0.13	0.12
query41	0.07	0.02	0.02
query42	0.03	0.02	0.01
query43	0.03	0.03	0.03
Total cold run time: 107.36 s
Total hot run time: 32.71 s

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 37.30% (9587/25702)
Line Coverage: 28.70% (79238/276072)
Region Coverage: 28.16% (41025/145710)
Branch Coverage: 24.80% (20904/84304)
Coverage Report: http://coverage.selectdb-in.cc/coverage/c5fa11c0855a5362bca5e665b81d05b564c1aee6_c5fa11c0855a5362bca5e665b81d05b564c1aee6/report/index.html

@zhiqiang-hhhh zhiqiang-hhhh marked this pull request as ready for review September 19, 2024 08:00

namespace doris::vectorized {

enum class StatisticsFunctionKind : uint8_t { skewPop, kurtPop };
Contributor

Better to use UPPER CASE.

Contributor Author

done


namespace doris::vectorized {

enum class StatisticsFunctionKind : uint8_t { skewPop, kurtPop };
Contributor

Better to use UPPER CASE.

Contributor Author

renamed to STATISTICS_FUNCTION_KIND


template <typename T, std::size_t _level>
struct StatFuncOneArg {
using Type1 = T;
Contributor

These are the same type; there is no need for two type aliases.

Contributor Author

done

}
}

void reset() { return; }
Contributor

This function is useful; it should reset all of m to its initial values.

Contributor Author

done
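
For readers following this thread, here is a minimal sketch of a moments state whose `reset()` restores every accumulator to its initial value. The struct name and layout are hypothetical (an array `m` of power sums indexed by order) and are only assumed to resemble the `VarMoments` type under review; this is not the code that was merged.

```cpp
#include <array>
#include <cstddef>

// Hypothetical moments state: m[k] holds the sum of x^k, so m[0] is the row count.
template <typename T, std::size_t Level>
struct MomentSumsSketch {
    std::array<T, Level + 1> m{};

    // Accumulate one value into every power sum up to Level.
    void add(T x) {
        T power = 1;
        for (std::size_t k = 0; k <= Level; ++k) {
            m[k] += power;
            power *= x;
        }
    }

    // Combine two partial states, as an aggregate merge step would.
    void merge(const MomentSumsSketch& rhs) {
        for (std::size_t k = 0; k <= Level; ++k) m[k] += rhs.m[k];
    }

    // Restore all accumulators to zero so the state can be reused.
    void reset() { m.fill(T{}); }
};
```

An empty `reset()` would leave stale sums behind whenever the same state object is reused for another group or window, which is why zeroing every element matters here.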

using ResultType = Float64;
using Data = VarMoments<ResultType, _level>;

static constexpr UInt32 num_args = 1;
Contributor

This variable does not seem to be used?

Contributor Author

removed

using ColVecT1 = ColumnVectorOrDecimal<T1>;
using ColVecT2 = ColumnVectorOrDecimal<T2>;
using ResultType = typename StatFunc::ResultType;
using ColVecResult = ColumnVector<ResultType>;
Contributor

This could be written more simply,
since both functions return ColumnFloat64.

Contributor Author

fixed

implements UnaryExpression, ExplicitlyCastableSignature, AlwaysNullable {

public static final List<FunctionSignature> SIGNATURES = ImmutableList.of(
FunctionSignature.ret(DoubleType.INSTANCE).args(FloatType.INSTANCE),
Contributor

Could the FE members check the argument order? See #39352.

Contributor Author

Now the same as #39352.


void add(AggregateDataPtr __restrict place, const IColumn** columns, ssize_t row_num,
Arena*) const override {
if constexpr (NullableInput) {
Contributor

@HappenLee HappenLee Sep 20, 2024

Null values should be skipped.

Contributor Author

This function uses creator_without_type::create_ignore_nullable; aggregate_function_null will not be used, since the return type is always nullable.

Contributor Author

fixed
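
The null handling requested in this thread can be pictured with the sketch below. It assumes a nullable input represented as a values vector plus a null map in which a non-zero byte marks a NULL row. The state struct and function name are hypothetical, and a plain count/sum state stands in for the moments state; none of this is taken from the Doris source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical aggregate state, reduced to a count and a sum for illustration.
struct CountSumSketch {
    std::size_t count = 0;
    double sum = 0.0;
};

// Adds every non-NULL row to the state; null_map[i] != 0 marks row i as NULL.
inline void add_rows_skipping_nulls(CountSumSketch& state,
                                    const std::vector<double>& values,
                                    const std::vector<std::uint8_t>& null_map) {
    assert(values.size() == null_map.size());
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (null_map[i] != 0) {
            continue; // NULL rows contribute nothing to the aggregate
        }
        state.count += 1;
        state.sum += values[i];
    }
}
```

Because the aggregate is created without the generic null wrapper, as the reply above explains, a check of this kind has to live inside the function's own `add` path.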

"'getPopulation' method");
}

T getPopulation() const {
Contributor

Use snake_case: get_population

Contributor Author

fixed

@zhiqiang-hhhh
Contributor Author

run buildall

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 37.25% (9599/25766)
Line Coverage: 28.64% (79300/276902)
Region Coverage: 28.09% (41050/146143)
Branch Coverage: 24.71% (20904/84588)
Coverage Report: http://coverage.selectdb-in.cc/coverage/2392c7df980f8e1f1fa46bc595fe25a67fb5b23b_2392c7df980f8e1f1fa46bc595fe25a67fb5b23b/report/index.html

@doris-robot

TPC-H: Total hot run time: 41903 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 2392c7df980f8e1f1fa46bc595fe25a67fb5b23b, data reload: false

------ Round 1 ----------------------------------
q1	17720	7374	7353	7353
q2	2079	163	163	163
q3	10840	1105	1187	1105
q4	10366	792	776	776
q5	7746	3123	3094	3094
q6	237	148	148	148
q7	1008	635	601	601
q8	9435	2066	2093	2066
q9	6846	6467	6425	6425
q10	7024	2304	2304	2304
q11	439	250	252	250
q12	411	222	219	219
q13	17792	3001	2996	2996
q14	233	211	228	211
q15	586	509	517	509
q16	651	630	603	603
q17	988	839	838	838
q18	7372	6797	6694	6694
q19	1407	1051	1064	1051
q20	590	289	288	288
q21	4015	3222	3292	3222
q22	1131	1015	987	987
Total cold run time: 108916 ms
Total hot run time: 41903 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7293	7237	7305	7237
q2	339	227	228	227
q3	3098	3016	3013	3013
q4	2032	1866	1776	1776
q5	5586	5664	5707	5664
q6	242	148	145	145
q7	2241	1846	1811	1811
q8	3323	3491	3448	3448
q9	8802	8919	8754	8754
q10	3444	3498	3477	3477
q11	588	490	490	490
q12	808	636	609	609
q13	10382	3155	3174	3155
q14	303	271	292	271
q15	576	530	509	509
q16	708	675	672	672
q17	1851	1611	1569	1569
q18	8256	7763	7882	7763
q19	1753	1603	1564	1564
q20	2136	1874	1942	1874
q21	5570	5568	5289	5289
q22	1135	1064	1049	1049
Total cold run time: 70466 ms
Total hot run time: 60366 ms

@doris-robot

TPC-DS: Total hot run time: 195972 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 2392c7df980f8e1f1fa46bc595fe25a67fb5b23b, data reload: false

query1	1277	885	879	879
query2	6343	2050	2051	2050
query3	10782	3902	3842	3842
query4	64668	26397	23448	23448
query5	5242	481	483	481
query6	427	170	186	170
query7	5483	306	299	299
query8	307	220	242	220
query9	9022	2645	2644	2644
query10	483	294	302	294
query11	17866	15369	15756	15369
query12	165	119	99	99
query13	1528	418	423	418
query14	10836	7766	7492	7492
query15	204	191	175	175
query16	6651	494	494	494
query17	1191	613	606	606
query18	1476	321	307	307
query19	230	153	170	153
query20	129	111	109	109
query21	211	106	104	104
query22	4733	4603	4870	4603
query23	34633	34098	33908	33908
query24	6081	2974	2908	2908
query25	490	375	393	375
query26	615	165	159	159
query27	1617	285	288	285
query28	4028	2462	2418	2418
query29	649	410	423	410
query30	236	148	150	148
query31	945	765	833	765
query32	70	55	62	55
query33	443	292	296	292
query34	888	499	479	479
query35	856	715	748	715
query36	1059	932	925	925
query37	144	84	84	84
query38	4052	3912	3943	3912
query39	1470	1425	1424	1424
query40	197	97	97	97
query41	49	47	47	47
query42	117	95	94	94
query43	523	469	482	469
query44	1157	826	818	818
query45	192	158	162	158
query46	1133	769	760	760
query47	1902	1794	1809	1794
query48	478	364	358	358
query49	684	388	399	388
query50	844	407	412	407
query51	7055	6876	6882	6876
query52	98	86	84	84
query53	260	181	174	174
query54	573	470	458	458
query55	76	84	85	84
query56	269	261	277	261
query57	1231	1092	1103	1092
query58	226	254	229	229
query59	3260	2993	2904	2904
query60	293	274	280	274
query61	110	101	103	101
query62	774	660	647	647
query63	216	184	177	177
query64	1358	661	635	635
query65	3258	3203	3201	3201
query66	638	296	312	296
query67	15995	15496	15678	15496
query68	4508	590	579	579
query69	452	302	300	300
query70	1174	1125	1146	1125
query71	334	280	273	273
query72	6343	3997	4058	3997
query73	793	330	331	330
query74	9513	9074	9083	9074
query75	3390	2657	2682	2657
query76	1877	894	911	894
query77	489	296	299	296
query78	9885	9113	9190	9113
query79	1362	550	556	550
query80	880	451	462	451
query81	505	252	241	241
query82	1129	145	139	139
query83	158	137	144	137
query84	284	85	73	73
query85	858	300	295	295
query86	332	309	297	297
query87	4467	4418	4421	4418
query88	3649	2341	2318	2318
query89	411	285	284	284
query90	1936	187	184	184
query91	198	142	141	141
query92	62	51	47	47
query93	1998	535	526	526
query94	725	300	253	253
query95	345	259	249	249
query96	625	281	279	279
query97	3305	3092	3103	3092
query98	220	203	196	196
query99	1613	1267	1289	1267
Total cold run time: 314316 ms
Total hot run time: 195972 ms

HappenLee previously approved these changes Sep 23, 2024
Contributor

@HappenLee HappenLee left a comment

LGTM

@zhiqiang-hhhh
Contributor Author

run buildall

@github-actions github-actions bot removed the approved Indicates a PR has been approved by one committer. label Sep 23, 2024
@doris-robot

TeamCity be ut coverage result:
Function Coverage: 37.25% (9608/25796)
Line Coverage: 28.68% (79549/277390)
Region Coverage: 28.10% (41126/146374)
Branch Coverage: 24.75% (20963/84708)
Coverage Report: http://coverage.selectdb-in.cc/coverage/42228ba95fa800ceaab3214f0af03b22de18f686_42228ba95fa800ceaab3214f0af03b22de18f686/report/index.html

@doris-robot

TPC-H: Total hot run time: 41536 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 42228ba95fa800ceaab3214f0af03b22de18f686, data reload: false

------ Round 1 ----------------------------------
q1	17589	7435	7298	7298
q2	2011	288	285	285
q3	12360	1104	1204	1104
q4	10586	770	703	703
q5	7780	3155	3089	3089
q6	238	155	148	148
q7	1011	606	610	606
q8	9459	2057	2192	2057
q9	6908	6459	6425	6425
q10	7021	2334	2271	2271
q11	442	242	245	242
q12	406	229	225	225
q13	17780	2977	2963	2963
q14	239	218	209	209
q15	591	545	524	524
q16	659	609	605	605
q17	964	790	782	782
q18	7241	6566	6790	6566
q19	1407	1032	1082	1032
q20	590	309	287	287
q21	4033	3406	3127	3127
q22	1124	988	1001	988
Total cold run time: 110439 ms
Total hot run time: 41536 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7312	7320	7296	7296
q2	328	243	248	243
q3	3079	3013	2935	2935
q4	2069	1823	1767	1767
q5	5619	5595	5693	5595
q6	243	148	143	143
q7	2204	1809	1784	1784
q8	3308	3505	3405	3405
q9	8819	8832	8814	8814
q10	3440	3492	3487	3487
q11	569	483	487	483
q12	821	630	611	611
q13	10643	3158	3194	3158
q14	304	282	279	279
q15	574	539	550	539
q16	707	666	671	666
q17	1821	1599	1610	1599
q18	8136	7772	7825	7772
q19	1723	1511	1623	1511
q20	2086	1914	1882	1882
q21	5560	5227	5441	5227
q22	1108	1051	1035	1035
Total cold run time: 70473 ms
Total hot run time: 60231 ms

@doris-robot

TPC-DS: Total hot run time: 191403 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 42228ba95fa800ceaab3214f0af03b22de18f686, data reload: false

query1	928	394	386	386
query2	6260	2009	1948	1948
query3	8685	193	198	193
query4	33854	23496	23477	23477
query5	3402	483	463	463
query6	264	187	158	158
query7	4197	295	291	291
query8	289	225	230	225
query9	9567	2626	2627	2626
query10	487	282	276	276
query11	17923	15171	15147	15147
query12	167	99	99	99
query13	1526	406	392	392
query14	10021	7443	7301	7301
query15	253	178	179	178
query16	8073	507	438	438
query17	1684	613	580	580
query18	2179	315	325	315
query19	377	161	150	150
query20	122	123	126	123
query21	213	103	102	102
query22	4656	4704	4549	4549
query23	35092	34520	34442	34442
query24	11025	2937	2894	2894
query25	628	402	416	402
query26	1256	160	162	160
query27	2592	285	283	283
query28	7989	2444	2411	2411
query29	840	436	421	421
query30	268	164	161	161
query31	1043	776	816	776
query32	93	54	55	54
query33	747	298	299	298
query34	900	497	479	479
query35	859	732	748	732
query36	1067	921	943	921
query37	157	90	90	90
query38	3997	3895	3968	3895
query39	1472	1402	1408	1402
query40	253	94	99	94
query41	49	49	48	48
query42	121	97	94	94
query43	509	472	480	472
query44	1213	809	804	804
query45	197	167	165	165
query46	1135	768	778	768
query47	1896	1824	1856	1824
query48	473	355	374	355
query49	905	432	394	394
query50	809	410	418	410
query51	6980	6948	6908	6908
query52	98	89	86	86
query53	253	177	172	172
query54	1185	461	456	456
query55	81	80	77	77
query56	267	266	251	251
query57	1251	1121	1106	1106
query58	239	233	243	233
query59	3065	2833	2783	2783
query60	300	270	262	262
query61	104	103	101	101
query62	807	666	678	666
query63	211	183	183	183
query64	4021	666	635	635
query65	3222	3161	3194	3161
query66	840	298	306	298
query67	15836	15492	15619	15492
query68	4879	576	577	576
query69	522	300	297	297
query70	1196	1126	1170	1126
query71	375	271	268	268
query72	7375	4040	3996	3996
query73	750	325	332	325
query74	9700	9050	9058	9050
query75	3423	2627	2678	2627
query76	3095	968	953	953
query77	458	287	290	287
query78	10115	9163	9148	9148
query79	2280	530	538	530
query80	1149	441	436	436
query81	584	240	240	240
query82	632	141	140	140
query83	236	137	141	137
query84	243	78	74	74
query85	1377	286	278	278
query86	471	298	294	294
query87	4497	4230	4381	4230
query88	3774	2327	2311	2311
query89	387	287	281	281
query90	1911	188	182	182
query91	183	141	143	141
query92	59	49	48	48
query93	2281	533	541	533
query94	1004	297	266	266
query95	350	253	274	253
query96	627	280	276	276
query97	3231	3112	3086	3086
query98	228	191	197	191
query99	1550	1303	1287	1287
Total cold run time: 300414 ms
Total hot run time: 191403 ms

@doris-robot

ClickBench: Total hot run time: 33.05 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 42228ba95fa800ceaab3214f0af03b22de18f686, data reload: false

query1	0.05	0.04	0.04
query2	0.06	0.03	0.03
query3	0.23	0.05	0.06
query4	1.65	0.10	0.10
query5	0.50	0.52	0.50
query6	1.13	0.72	0.74
query7	0.02	0.01	0.02
query8	0.04	0.03	0.03
query9	0.57	0.50	0.47
query10	0.56	0.57	0.55
query11	0.14	0.10	0.10
query12	0.14	0.11	0.11
query13	0.63	0.59	0.59
query14	3.04	2.94	2.97
query15	0.90	0.82	0.81
query16	0.37	0.38	0.39
query17	1.04	1.06	1.07
query18	0.20	0.19	0.20
query19	1.97	1.89	1.96
query20	0.02	0.01	0.01
query21	15.38	0.56	0.55
query22	2.96	1.94	2.81
query23	17.25	0.89	0.65
query24	2.98	1.35	1.47
query25	0.18	0.14	0.06
query26	0.47	0.14	0.13
query27	0.05	0.04	0.04
query28	9.95	1.10	1.06
query29	12.59	3.27	3.27
query30	0.25	0.06	0.06
query31	2.86	0.39	0.37
query32	3.27	0.47	0.45
query33	3.01	3.00	3.03
query34	16.66	4.38	4.48
query35	4.46	4.39	4.50
query36	0.66	0.49	0.48
query37	0.08	0.05	0.05
query38	0.04	0.03	0.04
query39	0.03	0.02	0.02
query40	0.15	0.12	0.13
query41	0.08	0.02	0.02
query42	0.03	0.03	0.02
query43	0.04	0.03	0.03
Total cold run time: 106.69 s
Total hot run time: 33.05 s

@zhiqiang-hhhh
Contributor Author

run p0

Contributor

@HappenLee HappenLee left a comment

LGTM

Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Sep 24, 2024
@HappenLee HappenLee merged commit 8226221 into apache:master Sep 25, 2024
26 of 30 checks passed
@zhiqiang-hhhh zhiqiang-hhhh deleted the feat-agg branch September 25, 2024 06:58
zhiqiang-hhhh added a commit to zhiqiang-hhhh/doris that referenced this pull request Sep 25, 2024
`skew`, `skew_pop` and `skewness` are used to calculate the
[skewness](https://en.wikipedia.org/wiki/Skewness#Pearson.27s_moment_coefficient_of_skewness)
of a data distribution.
`kurt`, `kurt_pop` and `kurtosis` are used to calculate the
[kurtosis](https://en.wikipedia.org/wiki/Kurtosis) of a data
distribution.

The implementation references
ClickHouse/ClickHouse#5200, with the result
type changed to AlwaysNullable since Doris does not support NaN.

The formula used to calculate skew is `3rd central moment / variance^{1.5}`
The formula used to calculate kurt is `4th central moment / variance^{2} - 3`

When any result would be NaN, Doris returns NULL.

doc: apache/doris-website#1127
zhiqiang-hhhh added a commit to zhiqiang-hhhh/doris that referenced this pull request Sep 25, 2024
morningman pushed a commit to apache/doris-website that referenced this pull request Sep 26, 2024
# Versions 

- [x] dev
- [x] 3.0
- [ ] 2.1
- [ ] 2.0

# Languages

- [x] Chinese
- [x] English


ref apache/doris#40945
zhiqiang-hhhh added a commit to zhiqiang-hhhh/doris that referenced this pull request Sep 26, 2024
zhiqiang-hhhh added a commit to zhiqiang-hhhh/doris that referenced this pull request Sep 27, 2024
dataroaring pushed a commit that referenced this pull request Sep 28, 2024
Labels
approved Indicates a PR has been approved by one committer. dev/3.0.2-merged reviewed
6 participants