Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix LOGICAL_ERROR on race between DROP and INSERT with materialized views #39477

Merged
merged 2 commits into from
Aug 5, 2022

Conversation

azat
Copy link
Collaborator

@azat azat commented Jul 21, 2022

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in official stable or prestable release)

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Fix LOGICAL_ERROR on race between DROP and INSERT with materialized views

In case of parallel INSERT (max_insert_threads > 1) it is possible for
VIEW to be DROP/DETACH'ed while building pipeline for various paralell
streams, and in this case the header will not match since when you have
VIEW you will got empty header and non-empty header otherwise.

And this leads to LOGICAL_ERROR later, while checking that output
headers are the same (in QueryPipelineBuilder::addChains() -> Pipe::addChains()).

However it also makes the pipeline different for various parallel
streams, and it looks like it is better to fail in this case, so instead
of always returning empty header from buildChainImpl() explicit check
had been added.

Note, that I wasn't able to reproduce the issue with the added test,
but CI may have more "luck" (although I've verified it manually).

Fixes: #35902 (cc @tavplubix)
Cc: @KochetovNicolai

@robot-clickhouse robot-clickhouse added the pr-bugfix Pull request with bugfix, not backported by default label Jul 21, 2022
@azat
Copy link
Collaborator Author

azat commented Jul 22, 2022

Stateful tests (address, actions) — fail: 1, passed: 124, skipped: 1

Stateless tests flaky check (address, actions) — fail: 33, passed: 67

2022-07-22 05:14:07 [dacde7653878] 2022.07.22 05:13:33.487908 [ 896 ] {8b0b44d8-6bc2-417c-91dc-2be28db2d117}  executeQuery: Code: 60. DB::Exception: Table test_uka4go.mv (651c20a9-0766-4bb8-85b8-c8a1c2bf3db8) doesn't exist. (UNKNOWN_TABLE) (version 22.8.1.1) (from [::1]:60456) (comment: 02380_insert_mv_race.sh) (in query: INSERT INTO null SELECT * FROM numbers_mt(1e6)), Stack trace (when copying this message, always include the lines below):

Fixed.

@azat azat force-pushed the fix-mv-drop-race branch 3 times, most recently from bf79422 to ff42fe7 Compare July 23, 2022 04:35
@qoega
Copy link
Member

qoega commented Jul 25, 2022

2022-07-23 03:26:36 Code: 500. Code: 219. DB::Exception: New table appeared in database being dropped or detached. Try again. (DATABASE_NOT_EMPTY) (version 22.8.1.1)

One more race?

@KochetovNicolai
Copy link
Member

@azat thank you for the test and fix. I have managed to make a test which reproduces easily (every run in my env). Also I've made a little bit different fix output headers. Now we will not have an exception at all (but it is funny that pipelines for some threads still could be different :).

@azat
Copy link
Collaborator Author

azat commented Aug 1, 2022

I have managed to make a test which reproduces easily (every run in my env).

By simply using more threads, I see, thanks!

Also I've made a little bit different fix output headers. Now we will not have an exception at all

I thought about this at first, but then decided that different pipelines for different threads maybe misleading, since some threads will push to the MV, others won't, so I preferred more strict variant.

But I'm fine with your fix too, thanks!

@KochetovNicolai
Copy link
Member

I think it is ok to have different pipeline in case if we are going to drop MV anyway. Looks a little bit better than exception.

@azat
Copy link
Collaborator Author

azat commented Aug 5, 2022

Stateless tests flaky check (address) — fail: 3, passed: 97

2022-08-04 13:17:25 [f06991af8370] 2022.08.04 13:17:24.780455 [ 1261 ] {a6effe8a-c88e-4146-ae15-89d00075170b}  executeQuery: Code: 60. DB::Exception: Table test_57btd1.mv (6cbadd1c-dcfd-461f-9f29-45007a99dda5) doesn't exist. (UNKNOWN_TABLE) (version 22.8.1.1) (from [::1]:36484) (comment: 02380_insert_mv_race.sh) (in query: INSERT INTO null SELECT * FROM numbers_mt(1000) settings max_threads=1000, max_insert_threads=1000, max_block_size=1), Stack trace (when copying this message, always include the lines below):

Fixed (and also rebased a little bit).

P.S. Your test did not reproduce the issue for me, but I'm trying on a server with 2 xeons, so maybe your setup less powerful.

azat and others added 2 commits August 5, 2022 13:16
…iews

In case of parallel INSERT (max_insert_threads > 1) it is possible for
VIEW to be DROP/DETACH'ed while building pipeline for various paralell
streams, and in this case the header will not match since when you have
VIEW you will got empty header and non-empty header otherwise.

And this leads to LOGICAL_ERROR later, while checking that output
headers are the same (in QueryPipelineBuilder::addChains() -> Pipe::addChains()).

However it also makes the pipeline different for various parallel
streams, and it looks like it is better to fail in this case, so instead
of always returning empty header from buildChainImpl() explicit check
had been added.

Note, that I wasn't able to reproduce the issue with the added test,
but CI may have more "luck" (although I've verified it manually).

Fixes: ClickHouse#35902
Cc: @KochetovNicolai
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Instead of checking that number of processors different for different
threads, simply always return empty header from buildChainImpl(), by
adding explicit conversion.

v2: ignore UNKNOWN_TABLE errors in test
@azat
Copy link
Collaborator Author

azat commented Aug 5, 2022

Stateless tests (ubsan) — fail: 1, passed: 4317, skipped: 21

2022-08-05 21:08:55 02360_send_logs_level_colors:                                           [ FAIL ] 4.23 sec. - result differs with reference: 
2022-08-05 21:08:55 --- /usr/share/clickhouse-test/queries/0_stateless/02360_send_logs_level_colors.reference	2022-08-05 21:07:27.417735962 +0930
2022-08-05 21:08:55 +++ /tmp/clickhouse-test/0_stateless/02360_send_logs_level_colors.stdout	2022-08-05 21:08:55.514382062 +0930
2022-08-05 21:08:55 @@ -1,3 +1,2 @@
2022-08-05 21:08:55  ASCII text
2022-08-05 21:08:55  ASCII text
2022-08-05 21:08:55 -ASCII text

@KochetovNicolai KochetovNicolai merged commit 0005c37 into ClickHouse:master Aug 5, 2022
@azat azat deleted the fix-mv-drop-race branch August 5, 2022 16:05
@tavplubix
Copy link
Member

New test fails with Ordinary database, see #39985

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr-bugfix Pull request with bugfix, not backported by default
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Block structure mismatch in QueryPipeline stream: different number of columns
5 participants