Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: stability of bucket_id -> replicaset mapping is not guaranteed #235

Open
Totktonada opened this issue Nov 12, 2021 · 0 comments
Open
Labels
code health Improve code readability, simplify maintenance and so on

Comments

@Totktonada
Copy link
Member

Totktonada commented Nov 12, 2021

There are tests (at least mine one from PR #222), which lean on stability of bucket_id -> replicaset mapping: this mapping should be the same from run to run. See:

-- gh-220: bucket_id argument is ignored when it cannot be deduced
-- from provided select/pairs conditions.
pgroup.test_select_no_map_reduce = function(g)
local customers = helpers.insert_objects(g, 'customers', {
{
-- bucket_id is 477, storage is s-2
id = 1, name = 'Elizabeth', last_name = 'Jackson',
age = 12, city = 'New York',
}, {
-- bucket_id is 401, storage is s-2
id = 2, name = 'Mary', last_name = 'Brown',
age = 46, city = 'Los Angeles',
}, {
-- bucket_id is 2804, storage is s-1
id = 3, name = 'David', last_name = 'Smith',
age = 33, city = 'Los Angeles',
}, {
-- bucket_id is 1161, storage is s-2
id = 4, name = 'William', last_name = 'White',
age = 81, city = 'Chicago',
},
})

The test uses it several lines below:

t.assert_equals(storage_stat.diff(stat_b, stat_a), {
['s-1'] = {
select_requests = 1,
},
['s-2'] = {
select_requests = 0,
},
})

It works, when several assumptions are true:

  • We don't add or remove storage replicasets.
  • Replicaset UUIDs are the same.
  • pairs() iteration over replicasets table in vshard is stable (the same from run to run).

In this conditions vshard just traverse over replicaset in the same order and created the same ranges of bucket ids on the same replicasets.

The point about stability of pairs() is a bit vague. It seems, it works on tarantool, however there is no such guarantee. It would not work on recent LuaJIT and may become broken in tarantool if we'll pulldown some LuaJIT patches (I guess, it is relevant to this one; Update: @Buristan corrects me that pairs() non-deterministic behavior is actually introduced in LuaJIT/LuaJIT@ff34b48).

There are two ways to make the code safe:

  1. Write the test without assumption about where particular bucket resides.
    • I don't like that the test (and all such tests) would become less straightforward and will be more abstract, so harder to understand.

      To illustrate: what if I'll need two keys, which gives different bucket ids, which should reside on different replicasets? Sure, it is possible to generate them, but it requires extra code and will make things cumbersome. We'll make the idea of the test less obvious for a reader.

  2. Don't call vshard.router.bootstrap(), disable rebalancer and create buckets manually using vshard.storage.bucket_force_create().

(Thanks @Gerold103 and @olegrok for answering questions and far-beyound-working-time discussions, so finally I ended up with conslusion that the testing code is not well written and filed this issue.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
code health Improve code readability, simplify maintenance and so on
Projects
None yet
Development

No branches or pull requests

3 participants