Support calculating bucket_id based on ddl.sharding_key #166

Closed
Tracked by #2227
sharonovd opened this issue Jun 3, 2021 · 3 comments · Fixed by #181
Labels: customer, feature (A new functionality)

Comments


sharonovd commented Jun 3, 2021

CRUD/DDL/migrations is the way to go for our customers. But project after project I see the following crutches:

local crud = require('crud')
local vshard = require('vshard')

-- Fetch an object by its composite primary key; bucket_id is computed by hand from part2.
local function get_object_by_pk(part1, part2)
    local bucket_id = vshard.router.bucket_id_mpcrc32(part2)
    local settings, err_settings = crud.get('object', {part1, part2}, {bucket_id = bucket_id})
    if err_settings ~= nil then
        return nil, err_settings
    end
    return settings
end

-- Replace an object; bucket_id is again computed by hand from the "id" field.
local function upsert_object(object)
    local id = object["id"]
    local bucket_id = vshard.router.bucket_id_mpcrc32(id)
    local res, err = crud.replace_object("settings_object", object, {bucket_id = bucket_id})
    if err ~= nil then
        return nil, err
    end
    return res
end

But this information is already present in ddl! (ddl.register_sharding_key('object', {"id"}))
This should work out of the box.
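
To make the request concrete, here is a minimal sketch in plain Lua of deriving a bucket id from a registered sharding key instead of computing it by hand at every call site. The `sharding_keys` registry, `toy_hash`, `BUCKET_COUNT` and `bucket_id_for` are all hypothetical names introduced for illustration; `toy_hash` is a stand-in and is not compatible with vshard's real hash functions.

```lua
-- Hypothetical registry: space name -> sharding key field names.
-- It stands in for the sharding key information that DDL already stores.
local sharding_keys = {
    object = {'id'},
}

-- Assumed bucket count, for illustration only.
local BUCKET_COUNT = 30000

-- Toy string hash. A stand-in for vshard's hash, NOT compatible with it.
local function toy_hash(str)
    local h = 0
    for i = 1, #str do
        h = (h * 31 + str:byte(i)) % 4294967296
    end
    return h
end

-- Collect the sharding key values of an object as strings.
local function extract_key(space_name, object)
    local fields = sharding_keys[space_name]
    if fields == nil then
        return nil, 'no sharding key registered for ' .. space_name
    end
    local parts = {}
    for i, field in ipairs(fields) do
        local value = object[field]
        if value == nil then
            return nil, 'sharding key field "' .. field .. '" is missing'
        end
        parts[i] = tostring(value)
    end
    return parts
end

-- Derive a bucket id in the range [1, BUCKET_COUNT] from the sharding key.
local function bucket_id_for(space_name, object)
    local parts, err = extract_key(space_name, object)
    if parts == nil then
        return nil, err
    end
    return toy_hash(table.concat(parts, '\0')) % BUCKET_COUNT + 1
end
```

With something like this behind the API, a caller would pass only the space name and the object, and two objects with the same `id` would always land in the same bucket regardless of their other fields.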

@Totktonada Totktonada added the feature A new functionality label Jun 18, 2021
@ligurio ligurio self-assigned this Jun 21, 2021
ligurio added a commit that referenced this issue Jul 6, 2021
CRUD can calculate `bucket_id` automatically from the primary key,
or the caller can specify `bucket_id` explicitly [1]. However, it is often
necessary to calculate `bucket_id` from the sharding keys created by a DDL schema.

The DDL module exposes the space with sharding keys as part of its public API [2],
so anyone can set and get sharding keys there without adding
the DDL module to dependencies.

This patch calculates the `bucket_id` value automatically when sharding keys
are used in a sharded space.

1. #46
2. https://github.com/tarantool/ddl#api
3. #46 (comment)

Closes #166
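
The commit message above implies three possible sources for `bucket_id`. A minimal sketch of how they might be prioritized follows; the function name `choose_bucket_source` and the exact precedence (explicit value first, then DDL sharding key, then primary key) are assumptions made for illustration, not a description of the actual CRUD code.

```lua
-- Decide where bucket_id should come from.
-- Returns a label ('explicit', 'sharding_key' or 'primary_key') and,
-- for the last two, the list of field names to hash.
local function choose_bucket_source(opts, sharding_key_fields, primary_key_fields)
    -- An explicitly passed bucket_id is assumed to win over everything else.
    if opts ~= nil and opts.bucket_id ~= nil then
        return 'explicit', nil
    end
    -- Otherwise prefer a sharding key, if one is defined for the space.
    if sharding_key_fields ~= nil and #sharding_key_fields > 0 then
        return 'sharding_key', sharding_key_fields
    end
    -- Fall back to the primary key, the behaviour before the patch.
    return 'primary_key', primary_key_fields
end
```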
ligurio added a commit that referenced this issue Jul 8, 2021
CRUD can calculate `bucket_id` automatically from the primary key,
or the caller can specify `bucket_id` explicitly [1]. However, it is often
necessary to calculate `bucket_id` from the sharding keys created by a DDL schema.

The DDL module exposes the space with sharding keys as part of its public API [2],
so anyone can set and get sharding keys there without adding
the DDL module to dependencies.

This patch calculates the `bucket_id` value automatically when sharding keys
are used in a sharded space.

1. #46
2. https://github.com/tarantool/ddl#api
3. #46 (comment)

Closes #166
ligurio added a commit that referenced this issue Jul 9, 2021
CRUD can calculate `bucket_id` automatically from the primary key,
or the caller can specify `bucket_id` explicitly [1]. However, it is often
necessary to calculate `bucket_id` from the sharding keys created by a DDL schema.

The DDL module exposes the space with sharding keys as part of its public API [2],
so anyone can set and get sharding keys there without adding
the DDL module to dependencies.

This patch calculates the `bucket_id` value automatically when sharding keys
are specified using the DDL module or manually in the `_ddl_sharding_key` space.

1. #46
2. https://github.com/tarantool/ddl#api
3. #46 (comment)

Closes #166
ligurio added a commit that referenced this issue Jul 9, 2021
Calculate bucket_id for operations replace, insert and upsert
using DDL sharding key.

Part of #166
ligurio added a commit that referenced this issue Jul 9, 2021
Use sharding keys to calculate bucket id (WIP)

CRUD can calculate `bucket_id` automatically from the primary key,
or the caller can specify `bucket_id` explicitly [1]. However, it is often
necessary to calculate `bucket_id` from the sharding keys created by a DDL schema.

The DDL module exposes the space with sharding keys as part of its public API [2],
so anyone can set and get sharding keys there without adding
the DDL module to dependencies.

This patch calculates the `bucket_id` value automatically when sharding keys
are specified using the DDL module or manually in the `_ddl_sharding_key` space.

1. #46
2. https://github.com/tarantool/ddl#api
3. #46 (comment)

Closes #166
@artur-barsegyan

While reviewing pull request #181, I noticed a commit that changes the default hash function.

I filed an issue with the details of that problem: #185. We should address it within the current scope of work. Without that we can't deliver:

  • CRUD + DDL feature for our paid customers
  • The new Getting started with CRUD on-board

ligurio added a commit that referenced this issue Jul 15, 2021
Calculate bucket_id for operations replace, insert and upsert
using DDL sharding key.

Part of #166
ligurio added a commit that referenced this issue Jul 15, 2021
CRUD can calculate `bucket_id` automatically from the primary key,
or the caller can specify `bucket_id` explicitly [1]. However, it is often
necessary to calculate `bucket_id` from the sharding keys created by a DDL schema.

The DDL module exposes the space with sharding keys as part of its public API [2],
so anyone can set and get sharding keys there without adding
the DDL module to dependencies.

This patch calculates the `bucket_id` value automatically when sharding keys
are specified using the DDL module or manually in the `_ddl_sharding_key` space.

1. #46
2. https://github.com/tarantool/ddl#api
3. #46 (comment)

Closes #166
ligurio added a commit that referenced this issue Nov 26, 2021
The cache of structures used for secondary sharding key support can be
dropped with the command "require('crud.sharding_key').update_sharding_keys_cache()".

Part of #166

Reviewed-by: Oleg Babin <babinoleg@mail.ru>
Reviewed-by: Alexander Turenko <alexander.turenko@tarantool.org>
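
The idea of a droppable cache can be sketched in plain Lua as follows. This is an illustration only and not the code of `crud.sharding_key`; the constructor `new_sharding_key_cache`, its `get` and `drop` methods, and the `fetch_all` callback are hypothetical names.

```lua
-- Build a cache around fetch_all(), a function that returns a table
-- mapping space names to sharding key definitions.
local function new_sharding_key_cache(fetch_all)
    local cached = nil  -- nil means "not fetched yet" or "dropped"
    local cache = {}

    -- Return the definition for a space, fetching everything on first use.
    function cache.get(space_name)
        if cached == nil then
            cached = fetch_all()
        end
        return cached[space_name]
    end

    -- Forget everything; the next get() fetches again.
    function cache.drop()
        cached = nil
    end

    return cache
end

-- Usage: definitions living "on storages" are snapshotted on first access.
local storage_definitions = {customers = {'region'}}
local fetch_count = 0
local cache = new_sharding_key_cache(function()
    fetch_count = fetch_count + 1
    local snapshot = {}
    for name, fields in pairs(storage_definitions) do
        snapshot[name] = fields
    end
    return snapshot
end)
```

The trade-off is the usual one for caches: lookups are cheap, but changes made after the first fetch stay invisible until the cache is dropped explicitly.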
ligurio added a commit that referenced this issue Nov 26, 2021
Describe functionality and current limitations (#212, #213 and #219)
with custom sharding key in CHANGELOG and README.

Closes #166

Reviewed-by: Oleg Babin <babinoleg@mail.ru>
Reviewed-by: Alexander Turenko <alexander.turenko@tarantool.org>
Co-authored-by: Georgy Moiseev <Georgy.moiseev@corp.mail.ru>
ligurio added a commit that referenced this issue Nov 26, 2021
Add a new wrapper for replicaset:call() that will be used in
_fetch_on_router().

Part of #166

Reviewed-by: Alexander Turenko <alexander.turenko@tarantool.org>
ligurio added a commit that referenced this issue Nov 26, 2021
Part of #166

Reviewed-by: Alexander Turenko <alexander.turenko@tarantool.org>
Reviewed-by: Oleg Babin <babinoleg@mail.ru>
ligurio added a commit that referenced this issue Nov 26, 2021
NOTE: Prior to this patch, CRUD assumed that an index is unique. That was
true for the primary key, but it is not guaranteed for a sharding key. The
patch adds tests with select() on a non-unique index that failed due to the
unique-index assumption in crud/select/plan.lua. It seems we
can remove this condition and fix the tests that rely on
total_tuple_count == 1. See also the related discussion in [1].

1. #181 (comment)

Part of #166

Reviewed-by: Oleg Babin <babinoleg@mail.ru>
Reviewed-by: Alexander Turenko <alexander.turenko@tarantool.org>
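
Why the uniqueness assumption matters can be shown with a small plain-Lua sketch. It is not the logic of crud/select/plan.lua; the function `select_eq` and the sample tuples are hypothetical. The point is only that stopping after the first match is safe for a unique index and loses rows for a non-unique one.

```lua
-- Scan tuples for an equality match on one field.
-- When assume_unique is true, stop after the first match, which is
-- only correct if at most one tuple can match.
local function select_eq(tuples, field, value, assume_unique)
    local result = {}
    for _, tuple in ipairs(tuples) do
        if tuple[field] == value then
            table.insert(result, tuple)
            if assume_unique then
                break
            end
        end
    end
    return result
end

-- id is unique (primary key); customer_id is not (a possible sharding key).
local tuples = {
    {id = 1, customer_id = 7},
    {id = 2, customer_id = 7},
    {id = 3, customer_id = 9},
}
```

Here `select_eq(tuples, 'id', 2, true)` finds the single matching tuple, while `select_eq(tuples, 'customer_id', 7, true)` silently drops the second match; passing `false` returns both.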
ligurio added a commit that referenced this issue Nov 26, 2021
Describe functionality and current limitations (#212, #213, #219, #243)
with custom sharding key in CHANGELOG and README.

Closes #166

Reviewed-by: Oleg Babin <babinoleg@mail.ru>
Reviewed-by: Alexander Turenko <alexander.turenko@tarantool.org>
Co-authored-by: Georgy Moiseev <Georgy.moiseev@corp.mail.ru>
ligurio added a commit that referenced this issue Nov 27, 2021
Describe functionality and current limitations (#212, #213, #219, #243)
with custom sharding key in CHANGELOG and README.

Thanks to Oleg Babin (@olegrok) and Alexander Turenko (@Totktonada) for
help with feature implementation.

Closes #166

Reviewed-by: Oleg Babin <babinoleg@mail.ru>
Reviewed-by: Alexander Turenko <alexander.turenko@tarantool.org>
Co-authored-by: Georgy Moiseev <Georgy.moiseev@corp.mail.ru>
ligurio added a commit that referenced this issue Nov 27, 2021
Add a new wrapper for replicaset:call() that will be used in
_fetch_on_router().

Part of #166

Reviewed-by: Alexander Turenko <alexander.turenko@tarantool.org>
ligurio added a commit that referenced this issue Nov 27, 2021
Part of #166

Reviewed-by: Alexander Turenko <alexander.turenko@tarantool.org>
Reviewed-by: Oleg Babin <babinoleg@mail.ru>
ligurio added a commit that referenced this issue Nov 27, 2021
NOTE: Prior to this patch, CRUD assumed that an index is unique. That was
true for the primary key, but it is not guaranteed for a sharding key. The
patch adds tests with select() on a non-unique index that failed due to the
unique-index assumption in crud/select/plan.lua. It seems we
can remove this condition and fix the tests that rely on
total_tuple_count == 1. See also the related discussion in [1].

1. #181 (comment)

Part of #166

Reviewed-by: Oleg Babin <babinoleg@mail.ru>
Reviewed-by: Alexander Turenko <alexander.turenko@tarantool.org>
ligurio added a commit that referenced this issue Nov 27, 2021
The cache of structures used for secondary sharding key support can be
dropped with the command "require('crud.sharding_key').update_sharding_keys_cache()".

Part of #166

Reviewed-by: Oleg Babin <babinoleg@mail.ru>
Reviewed-by: Alexander Turenko <alexander.turenko@tarantool.org>
Totktonada added a commit that referenced this issue Jan 27, 2022
When a sharding key is updated on storages a user should call this
function on routers to re-fetch the new sharding keys. I would highlight
that even if we add a space that was never seen before, we should call
the function on routers. Otherwise crud will assume that the new space
has the sharding key equal to the primary key and will calculate
`bucket_id` incorrectly.

Follows up #166
Related to #212
Related to TNT-262
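
The failure mode described in this commit message can be illustrated with a plain-Lua sketch. The tables `primary_keys`, `router_sharding_keys` and `storage_sharding_keys`, and the functions `key_fields` and `refresh`, are hypothetical and only model the fallback behaviour described above; they are not part of the crud API.

```lua
-- Primary key fields per space, always known to the router.
local primary_keys = {orders = {'id'}}

-- Sharding keys as the router last saw them: the new space is missing.
local router_sharding_keys = {}

-- Sharding keys as actually defined on storages.
local storage_sharding_keys = {orders = {'customer_id'}}

-- Fields used to compute bucket_id: the known sharding key,
-- otherwise the primary key.
local function key_fields(space_name, sharding_keys)
    return sharding_keys[space_name] or primary_keys[space_name]
end

-- Re-fetch: replace the router's view with the storage definitions.
local function refresh(router_view, storage_view)
    for name in pairs(router_view) do
        router_view[name] = nil
    end
    for name, fields in pairs(storage_view) do
        router_view[name] = fields
    end
end
```

Before `refresh(router_sharding_keys, storage_sharding_keys)` is called, `key_fields('orders', router_sharding_keys)` yields `{'id'}`, so a bucket id would be computed from the wrong field; after the refresh it yields `{'customer_id'}`.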
Totktonada added a commit that referenced this issue Jan 27, 2022
When a sharding key is updated on storages a user should call this
function on routers to re-fetch the new sharding keys. I would highlight
that even if we add a space that was never seen before, we should call
the function on routers. Otherwise crud will assume that the new space
has the sharding key equal to the primary key and will calculate
`bucket_id` incorrectly.

Follows up #166
Related to #212
Related to TNT-462
Totktonada added a commit that referenced this issue Jan 31, 2022
When a sharding key is updated on storages a user should call this
function on routers to re-fetch the new sharding keys. I would highlight
that even if we add a space that was never seen before, we should call
the function on routers. Otherwise crud will assume that the new space
has the sharding key equal to the primary key and will calculate
`bucket_id` incorrectly.

Follows up #166
Related to #212
Related to TNT-462