Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

executor: tablesample return unodered result with "order by" suffix for partitioned tables #27349

Closed
lichunzhu opened this issue Aug 18, 2021 · 6 comments · Fixed by #27383
Closed
Assignees
Labels
severity/critical sig/sql-infra SIG: SQL Infra type/bug The issue is confirmed as a bug.

Comments

@lichunzhu
Copy link
Contributor

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

dump the following data to TiDB v5.+

/*!40101 SET NAMES binary*/;
CREATE TABLE `test`.`UK_LP16944` (
  `COL1` tinyint(16) DEFAULT '108' COMMENT 'NUMERIC UNIQUE INDEX',
  `COL2` varchar(20) DEFAULT NULL,
  `COL4` datetime DEFAULT NULL,
  `COL3` bigint(20) DEFAULT NULL,
  `COL5` float DEFAULT NULL,
  UNIQUE KEY `UK_COL1` (`COL1`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin;
-- PARTITION BY LIST COLUMNS(col1) (
--   PARTITION `P0` VALUES IN (-97,66,121,88,24,-13,-50,50,78,-77),
--   PARTITION `P1` VALUES IN (118,2,-24,87,110,-128,-40,-16,-98,-75),
--   PARTITION `P2` VALUES IN (-15,89,-90,15,113,-44,49,-31,72,126),
--   PARTITION `P3` VALUES IN (-59,58,-78,10,127,-20,-6,20,96,-23),
--   PARTITION `P4` VALUES IN (-41,-95,-25,6,-52,-53,79,-72,-32,3),
--   PARTITION `P5` VALUES IN (12,-48,60,-82,18,68,-80,-105,92,104),
--   PARTITION `P6` VALUES IN (14,-96,-47,76,29,108,40,101,75,16),
--   PARTITION `P7` VALUES IN (-34,-74,NULL,34,56,85,7,57,91,-126),
--   PARTITION `P8` VALUES IN (-19,80,120,52,114,-103,-36)
-- );
-- PARTITION BY HASH(COL1)
-- PARTITIONS 5;
PARTITION BY RANGE ( COL1 ) (
    PARTITION p0 VALUES LESS THAN ( 0 ),
    PARTITION p1 VALUES LESS THAN ( 30 ),
    PARTITION p2 VALUES LESS THAN ( 60 ),
    PARTITION p3 VALUES LESS THAN ( 120 ),
    PARTITION p9 VALUES LESS THAN ( 1000 )
);


/*!40101 SET NAMES binary*/;
INSERT INTO `test`.`UK_LP16944` VALUES
(114,'咂旱珀妓瘔啪咥頚Ü臹搌蠸嶜疠湊諫坫脊湡嗫','4418-05-06 03:47:14',4388573883086551915,-1.2719e38),
(75,'瀠囯槏媋轳瓡幨蔭蘲滿犕铉溬沀霊藮酊嶱誩传','6615-03-30 05:06:14',4286647304820783992,3.35436e38),
(-23,'膫黎蕑戓擣铒塒錬汯悧燁耼馟筲傳侬甜闈杖唭','0933-03-11 21:58:19',7138874294714425352,2.29273e38),
(-126,'吤缓耄憘茋沋謿瓐滏氜脦蚧園ǰ鶳瞱魧夆蚏睙','4675-10-05 14:37:56',-6977039237587871608,1.80877e38),
(-52,'駠嘏斺鵦緒楃劂烠逄乆諎灭洀稍咐祯÷诧弡涣','7811-04-29 12:10:21',-7249794925316542216,2.44742e38),
(NULL,'潇鑖蒷莲僓蕹弪偔鵷竭鴅喠孉帉鷂鮏慟書堏鵞','0297-01-22 14:16:10',-3477701933821409632,1.44094e38),
(-75,'虧敼鷹鼹桸Ĕ蒏垫嚷戝鎕楨撐癣笃蠡耏哴殸錑','7923-06-01 15:52:46',-851525491414728235,-3.33483e38),
(-82,'堍駪湉愹厀朱虗粴踱啎劇輮囒鮂騊邛ɂ礚卲嫲','1194-03-07 23:01:33',-7545573906563932885,1.91183e37),
(-50,'害紴血鱓瑧騬鬠礖Ͱ腼汣忹嫍唁樨疚巴毉釕΄','5515-09-24 23:20:51',-417579323959628175,3.76727e37),
(72,'弁莐炓巣馨瓝筤闊夨冦艵撑蓁岋洸龲踜仃蛺郲','2048-10-14 19:59:09',-2133993048623888565,9.77258e37),
(91,'怬砢綖賣锒沜毉轎觀腷羂鳜楱街運蒙鱧圴袳霑','2873-09-27 11:13:09',8614492975544129597,-7.61133e36),
(-97,'灱伊艴蓶笍倩厳距鵏悺多哲锴陈顓炋珲谜豭剆','9935-01-21 19:28:38',-8791607952111847492,-2.84065e38),
(-15,'鱠秴魢覙襐炩弅匯賨鉈帵攠Ë镲蜧釽呏棒莙宺','8523-07-20 16:01:06',-6067061440492825582,-6.21843e37),
(-19,'叚筺畴鷡殕梑眘曹紕钥攍灏臚恬茈躃陥縒墠Ô','4579-07-13 11:48:48',-6385318838586614716,3.09991e38),
(87,'沗梨跦珡顤欆猙叩撪乙戙牠碚鋌滟菛饝舺叕劓','4688-03-14 07:46:16',-4506277240200875828,-4.16076e37),
(-34,'锗豭靱疵檽懡谞廯娳攦溎嗙赧昷誒馡傮鳹恪欋','8669-04-09 22:06:51',604544593815345165,2.28527e38),
(6,'瑤緑況鹟傒趤葺萃蔱滞嬸厑醽妧瓆蚸ᚫ瞾蜈蟽','0501-09-24 10:32:48',-2549971593675584321,3.35323e38),
(-20,'恀讲祼謎瓌毬噐鲻螃试嘄犳檖檇鬴梳尻硧疈頔','5732-04-14 10:31:22',-2552389909847586568,-2.66162e38),
(56,'賌汛茔汘嘑锚匤戁籽滶k捨蘯屼竑舳絋娏虖逢','1443-02-09 00:47:33',-4311711762657192245,1.50847e38),
(-105,'挚勖疑鑼蒷镃癄嘯堥鼤揚伥禰᛭獅類箊究鵳薱','5399-08-30 14:20:59',-3284347791103998096,1.34184e38),
(-40,'潚赾忨鵉駎词鱸纻厗頹邧聝誽鸪瀜呎屙朊娺毠','7604-11-29 07:48:45',293507669715135033,-1.18079e38),
(52,'巃鯧径珐缲鞻鋬昞鴽鳺潤蹥箹綬慝嗽弢袎始膳','1398-04-06 19:15:16',-7913854190188624726,4.95036e37),
(-80,'鬴睵宋昮锿镀捜雍箮逳徍盿昇亢綋滍禐逿疤弢','7591-07-15 21:14:39',-1380242359047423350,-1.47866e38),
(-59,'洍驳滈瀨逕鹋螁禷赵屎儵脐暺钮祧畴郸攇燲傺','7325-12-07 22:28:29',-786233280482238884,-3.96212e37),
(85,'軽鑃篌堹様鱁甛導娗埩货苮疣垐毞虦吧懞胺睲','3279-05-21 13:20:49',-1403122789315317029,-1.40708e38),
(12,'鬪黼鰒昜愹攮裸咩匮税崚厝鱇继槥趆馬偑钨纹','8264-03-31 03:33:34',-5920316765742030643,-3.31946e37),
(-24,'禐聆乛茜鶕蚅洔鹟芐垊憁嫎逐蜈囄曠牯隺锪砇','6179-07-16 05:09:54',-5483310622347450395,-2.46541e38),
(-74,'觢Q项鍣觩昛犵锻滒癀秳梼妛暓濜樺玡魅揳旺','3712-04-28 01:02:38',-1774033869202661112,2.21194e38),
(-47,'硚齪籑強胩進抹釢端杧規槷芽蘗轐饮闟嘠膚撢','5118-09-27 05:44:29',-1645896295033261770,-5.11595e37),
(24,'卩戁觓曯蟅裍欽篣雜獧ũ档鈼凪蹕q总鸺烇餞','9184-06-28 02:02:01',-8301729759870498632,-2.00206e38),
(15,'蜹怿逍捹墻硺虙煊欂函後貲×銻豚駀剏蒽琦婫','7279-07-11 09:07:09',-2853518938778262116,3.3469e38),
(66,'慥沲歄躃玠謬阚鷖棙媒輌鬟Ʌ琏襀怹檂邑愣覑','0793-01-18 02:01:12',-4944853509796428475,8.57542e37),
(104,'撘蕠啴氚阏嘦嗅戏輖顋ⱥ謵棠揋椛菨瑰鈓嫰紜','6392-02-27 23:17:51',-5204259958858921305,1.05927e38),
(80,'曐撀翳廹吼鄤驆寎飢厌迚厪衭縇蛺騪謹鑢遘娩','7017-06-26 15:29:42',6373676739544409933,2.43185e38),
(79,'調ᛕ及讱姟桝駶磜鲐嗐狩羝唏嶻騥样傑嬹嬝乞','4707-11-20 21:44:55',141837662832065189,-3.11603e38),
(120,'蔾陊癃鞲獇荓睻鼪巤惇垎恙軷忥窟嘑ė廡鸀鏯','0879-09-18 09:02:44',-6258798708616091710,-9.59025e37),
(-6,'男株鶰唩彅雥碦頏仼抓郘郸瓇醪鱟ĩ貥涽劍歓','9391-07-27 01:53:08',-6304838277502562818,-2.38834e38),
(3,'ǯ弖搠镱箩鵜卟抡澞軹蠯犺蝔鱨矴懆紴鏿絧鴾','7027-03-15 08:17:20',-1689453792701447907,-2.30084e37),
(-96,'蹻祠淴蘷麚铌蚣譞搠忄碷礙煪麩镡陯冹枩釹楓','0620-01-27 06:23:23',7536206060191592283,3.03929e38),
(-98,'渠忥堢愮曜偱隢鐙髅骬礤顳啉嬱穅旻盈紒宅ƨ','5526-03-26 20:06:22',401055101400344422,-2.92676e38),
(-41,'薫趤瓖昴鉩堦綢篾喾鮚楰{眐ᚼ呁稆稕猵樢蛶','9258-07-29 10:01:59',-4278847757774971102,-1.63346e38),
(-48,'丟餆鎣驑夓碥蔋丆詤篔嵷碀譝怠濶贬瀕焓剸祧','7453-09-15 21:47:14',-4437413665237440145,2.90817e38),
(88,'躽刕瞕険醓卞瞿稆解科櫫鷭竪鮆汫箎煊闱骼踚','1676-05-12 04:53:51',3715561988776924151,2.73398e38),
(113,'鎝嵏墌軧卖貃肢鼜嬐执穙垚聜鄫劅夞粦蟜噮ⱡ','9408-04-18 11:33:12',5607790373443908685,-2.10652e38),
(-103,'冓`録极铻物肠緱琳疺拲吙申掔现鍥祩勫垠貪','7892-10-22 17:19:03',1371254108995418347,-2.86953e38),
(-90,'鳦龾烶咣夔禋乥鈼嘢腙噼肘襢熐肅楔詎勇撬毴','7219-12-25 08:23:22',-288013551764252846,9.96541e37),
(-77,'骦插颞硤亘寚升ᚿ靼璃岠釐讘屳僞殦聅诫去応','6191-11-16 05:48:32',3114724708207789539,4.28469e36),
(40,'玵搼鱽阾縺朙橸呻樲檱局甩萛鰙捛銙揲裎詽険','4218-10-01 09:01:22',4814803515726239232,2.04685e38),
(18,'褿萔敻凜峇舟圮摣訍蘘嘐茷墜恠坛營曲湾鸊鵊','8429-05-01 01:11:54',806440059521605460,3.77836e37),
(96,'陬蹈侭蕌鞙耜荗硈鹘絢丅禷蟇疢婨葽覙竝画拁','4286-02-23 09:47:35',-366264572159280157,-1.81412e38),
(-95,'堇韯怟碊邍簟玳缿穙鷩縓鎁魵咡昣瘀模娠幢踋','6331-02-07 02:44:29',8074052320090187837,1.47491e38),
(78,'诀燯蠫回蔅峇珌井眺棛菼跻滺偛悢誉呶呭郲嫼','8611-04-16 19:50:20',3285806306532777489,1.7747e38),
(-53,'淀箚銻蹅題糍惫写肚耙鱤方扝垖澾袨鉃鍴昩蠭','4454-07-04 05:05:52',3637206268782016525,-2.59078e38),
(110,'糘莁跩î痟界舠姗鏔癮纬琲轹弐騒谋楙几资茾','1426-03-30 04:31:55',4581920605106180242,5.36501e37),
(89,'旲摵營性灊鋌衕撇鵺額穟抱起佸譲遻鶭濃廘楑','8268-05-04 14:45:58',2348963725027615362,-1.59969e38),
(-44,'硦鲵鎲缥裴艱践嶂戋藹啸琣攴擨擫悒决髵輗櫪','9350-10-11 19:47:06',5828676998514979008,-1.35711e37),
(10,'泍訤廕袁畯譩暦婀焥燝禋鯯攢蔻鍪緙輝映浓Ý','4664-03-07 06:09:20',-7735899374758025891,5.92546e37),
(29,'衇灱愦澽紫ì琒文鳬耫摀眢垏罞鲴硽晡玵宆恧','5233-05-08 23:01:30',8366261864892891447,-9.16373e37),
(-36,'鄫脂倊咥鬈箦鞂嗴攵溷罗齨密簢咱ŭ褦佼両舑','1617-02-04 18:32:44',2862012277047285514,1.01628e38),
(-72,'罎键涸澜晀叨唟拥熗銫陞灙輠殠羧栌銀銆剴憼','4939-06-06 15:41:49',-7141014477088945817,1.21205e38),
(57,'眴鲶W谅泃劖縆銅窗´笂嘬鸦曤箽攍袡鱜賓儆','6511-10-31 14:43:26',-2234865909175097441,6.38604e37),
(-13,'鵾捱戥遭閛鯁锇鵠摲窒濣萷齏肇搷陚数噸驭毗','0438-09-21 00:32:45',-2628048012633186582,-7.21032e37),
(-32,'羳蓤裭贤柉箙熑韤鋺炲硵馧凲婤厐潇屲饓譁臕','0506-05-26 21:59:32',7683015531014073633,3.01196e38),
(34,'瓻鯲倪揘湓墶湕柭藜巍艭蹲猏朠妎窮綤泣媘歏','5653-05-25 17:46:00',4346318026559438813,3.05353e38),
(20,'洙筤虉鏣懐洆貓咿Ĩ覊駣贻槉隊楞瀕濷Ǎ冃溙','3691-04-26 22:32:00',-7649419873352892559,1.61094e37),
(-31,'咏岚攎罦椞抎殦緲陈圜汑蹉穧烯珨鯢棜眵拆勳','7339-12-28 07:27:03',-4868350756018954679,-2.24181e38),
(50,'獏藤洘尟跍翂诧埯葇恚鼨渥詰痦贠鮇氎瞴压墅','1679-02-13 04:38:12',8482112650549957879,-1.67915e38),
(76,'績笞拤渘碑譺觬帇貀暥摟胂甠豆澭菅駼湭艜齇','1394-02-06 10:07:23',-155916570341410128,-2.35482e38),
(49,'枰畴凄胉箠蹜羇輭絯癢湙凃内葌拳綳瘏澼絃堋','7133-05-19 17:42:52',4109546933647305824,3.18451e38),
(7,'徝農娢弰辀壝瑺壛錤ñ燒辅贳殊顂飲屦滫吴膋','1438-04-04 21:01:33',758858763773376771,-1.11626e38),
(14,'薘丝唸芦懹靏矾薇矧澮瞅雺曎洺杂軁囁踸揤艒','7265-01-30 21:37:23',2577620025699430952,2.69325e38),
(60,'曨苭癔氰匫跉矆魩擐厣鏭叠呆阕鉳絼搧巠譩璎','9836-06-18 19:56:49',-9076030164838075460,5.69782e37),
(58,'蘉龟鷭潗鵇嶞孝胛茯侍媒龣罙枇乣嬖耡暽犐穽','1053-10-01 22:46:37',3817289562199539243,1.08626e38),
(118,'聑ŕ铥沖鑖餾峉嵚鄢彥叽仲羒橤戾躘或鄁鵨闾','7919-11-25 13:44:39',4065655519049896946,2.66455e36),
(121,'鐎蔈浲莢鸬嗢ᛦ面Ʋ骘瀶堗哧飅黆阕鯐爩卜娂','7694-06-16 01:58:02',-1345373308816545488,-3.0134e37),
(2,'農妁圫轂å螶鉵芕呭塕達鮚ñ縠鵐虐妛窏溨Ȕ','6720-09-14 19:43:24',-5047473471689177267,-3.37747e38),
(-16,'杝刀咩鈚櫯膢皷儆蒞靮娸麃截銣卞椝菵襽艉穓','9095-02-17 00:44:23',3727276790115713828,-2.59725e38),
(-78,'孜厾捓纁銁睐壽熀硕砡畲褛偪炟狰凤慓Ɲ遬蛊','3274-08-04 16:35:47',-3723511240424254235,-8.93807e37),
(16,'崜姣彃呝傾跊所褶曅獝Ʈ翆鑥竊跃壘猈錟荗葢','6418-05-23 18:38:21',6452120560596481697,1.09946e38),
(127,'犓¤涺嫆驏篚虁实郡擣謭謏忌鶍羵噵櫡脇毶疻','7108-04-21 19:11:54',7758176480024750767,1.5002e38);

run the following sql:

mysql> select _tidb_rowid from test.UK_LP16944 TABLESAMPLE REGIONS() ORDER BY _tidb_rowid;
+-------------+
| _tidb_rowid |
+-------------+
|           3 |
|          17 |
|          19 |
|           1 |
|          36 |
+-------------+
5 rows in set (0.01 sec)

2. What did you expect to see? (Required)

The result is in order:

mysql> select _tidb_rowid from test.UK_LP16944 TABLESAMPLE REGIONS() ORDER BY _tidb_rowid;
+-------------+
| _tidb_rowid |
+-------------+
|           1 |
|          3 |
|          17 |
|           19 |
|          36 |
+-------------+
5 rows in set (0.01 sec)

3. What did you see instead (Required)

The result is disordered.

4. What is your TiDB version? (Required)

v5.0.0, v5.1.1

@lichunzhu lichunzhu added the type/bug The issue is confirmed as a bug. label Aug 18, 2021
@lichunzhu lichunzhu changed the title executor: tablesample return unodered result with "order by" suffix for partition tables executor: tablesample return unodered result with "order by" suffix for partitioned tables Aug 18, 2021
@qw4990 qw4990 added the sig/sql-infra SIG: SQL Infra label Aug 18, 2021
@bb7133
Copy link
Member

bb7133 commented Aug 18, 2021

I don't think this is 'critical' since TABLESAMPLE is not exported to the end-user, change it to 'major'.

@lichunzhu
Copy link
Contributor Author

I don't think this is 'critical' since TABLESAMPLE is not exported to the end-user, change it to 'major'.

Dumpling heavily relies on TABLESAMPLE. If this is not fixed, dumpling will dump partitioned tables incorrectly.

@tangenta
Copy link
Contributor

tangenta commented Aug 19, 2021

I will fix this ASAP.

Minimal reproducible steps:

drop table if exists t;
create table t (a int, b int, unique key idx(a)) partition by range (a) (
    partition p0 values less than (0),
    partition p1 values less than (10),
    partition p2 values less than (30),
    partition p3 values less than (maxvalue)
);
insert into t values (2, 2), (31, 31), (12, 12);
select _tidb_rowid from t tablesample regions() order by _tidb_rowid;

+-------------+
| _tidb_rowid |
+-------------+
|           1 |
|           3 |
|           2 |
+-------------+
3 rows in set (0.01 sec)

@tangenta
Copy link
Contributor

The problem is the incorrect assumption about the relationship between the ranges and the handles:

The smaller handles always occur in the ranges with a smaller start_key. If we sort the ranges by start_key, the result must be sorted by handles(_tidb_rowid).

tidb/executor/sample.go

Lines 134 to 144 in 7755d25

func (s *tableRegionSampler) initRanges() error {
if s.restKVRanges == nil {
var err error
s.restKVRanges, err = s.splitTableRanges()
if err != nil {
return err
}
sortRanges(s.restKVRanges, s.isDesc)
}
return nil
}

However, this is not true in partition tables. Different partitions use different table IDs, a smaller handle may be encoded with a large table ID, resulting in a larger range start_key.

We need to prevent the physical plan NominalSort from generating a PhysicalTableSample for the DataSource.

@cyliu0
Copy link
Contributor

cyliu0 commented Aug 19, 2021

Since this will block partition table users using dumpling, I'll change the severity to critical

@ti-srebot
Copy link
Contributor

Please edit this comment or add a new comment to complete the following information

Not a bug

  1. Remove the 'type/bug' label
  2. Add notes to indicate why it is not a bug

Duplicate bug

  1. Add the 'type/duplicate' label
  2. Add the link to the original bug

Bug

Note: Make Sure that 'component', and 'severity' labels are added
Example for how to fill out the template: #20100

1. Root Cause Analysis (RCA) (optional)

2. Symptom (optional)

3. All Trigger Conditions (optional)

4. Workaround (optional)

5. Affected versions

6. Fixed versions

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
severity/critical sig/sql-infra SIG: SQL Infra type/bug The issue is confirmed as a bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants