-
Notifications
You must be signed in to change notification settings - Fork 35
router: make full map_callrw with split args #628
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
router: make full map_callrw with split args #628
Conversation
Gerold103
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks 🔥! I sure hope it will help the customers 😁.
vshard/router/init.lua
Outdated
| local grouped_args_added = false | ||
| for id, rs in pairs(replicasets) do | ||
| if grouped_args ~= nil then | ||
| -- It's cheaper to push and then pop, rather then deepcopy | ||
| -- arguments table for every call. | ||
| table.insert(args, grouped_args[id]) | ||
| local rs_grouped_args = grouped_args[id] | ||
| table.insert(args, rs_grouped_args) | ||
| grouped_args_added = rs_grouped_args ~= nil | ||
| end | ||
| local res, err = rs:callrw('vshard.storage._call', func_args, opts_map) | ||
| if grouped_args ~= nil then | ||
| if grouped_args_added then | ||
| table.remove(args) | ||
| grouped_args_added = false | ||
| end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You won't need this flag I think if you would change it to something as:
for ... do
local rs_args = grouped_args and grouped_args[id] or nil
if rs_args then table.insert(args, grouped_args[id]) end
-- do the call
if rs_args then table.remove(args) end
endThe code style is a bit relaxed, but you get the idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
vshard/router/init.lua
Outdated
| local mode = bucket_ids and MAP_CALLRW_PARTIAL or MAP_CALLRW_FULL | ||
| return mode, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need to return trailing nils in Lua. Can be simplified to return bucket_ids and MAP_CALLRW_PARTIAL or MAP_CALLRW_FULL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
vshard/router/init.lua
Outdated
| if opts then | ||
| mode, err = router_set_map_callrw_mode(opts.mode, opts.bucket_ids) | ||
| if err then | ||
| return nil, err, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here. No need for a trailing nil.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
vshard/router/init.lua
Outdated
| local mode = MAP_CALLRW_FULL | ||
| if opts then | ||
| mode, err = router_set_map_callrw_mode(opts.mode, opts.bucket_ids) | ||
| if err then | ||
| return nil, err, nil | ||
| end | ||
| timeout = opts.timeout or consts.CALL_TIMEOUT_MIN | ||
| do_return_raw = opts.return_raw |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here you assign mode and then overwrite it when opts are provided. Lets not do an overwrite. And instead simply declare the empty mode variable, and with given opts do what you do now, and in the else branch assign it to the "full" default. Then there won't be an overwrite, which in turn might save us a tiny bit of cpu.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
vshard/router/init.lua
Outdated
| -- 4) <mode = 'partial', bucket_id = {[bid_1] = {b_arg_1}, ...}> - the same | ||
| -- as the 3rd scenario but buckets' arguments (b_arg_1, ..., b_arg_n) | ||
| -- will be added as additional arguments to user function. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This here and in the commit message might be slightly misleading. It looks like the user should expect the bucket arguments to be passed as positional arguments like func(all_args, bucket1_arg, bucket2_arg, ...). But instead it is more like func(all_args, bucket_args_map). I suggest to clarify this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
|
The commit message doesn't mention the ticket. You need to write |
This patch introduces a new way of `map_callrw` execution by which we can pass some arguments to all storages and split buckets' arguments to those storages that have at least one bucket of `bucket_ids`. To achieve this we introduce a new string option - `mode` to `map_callrw` api. Closes tarantool#559 @TarantoolBot document Title: vshard: `mode` option for `router.map_callrw()` This string option regulates on which storages the user function will be executed via `map_callrw`. Possible values: 1) `full` - the user function will be executed on all storages in cluster. 2) `partial` - the user function will be executed on certain sotrages on which buckets from `bucket_ids` were found. After that changes `map_callrw` works in 4 different scenarios depending on `mode` and `bucket_ids` options: 1) `<mode = 'full', bucket_ids = nil>` - user function will be executed with args on all storages. 2) `<mode = 'full', bucket_ids = bucket_args_map>` - storages that have at least one bucket of bucket_ids will execute user function with args and additional buckets' arguments. Other storages will execute user function only with args. 3) `<mode = 'partial', bucket_ids = numeric_buckets_map>` - user function will be executed with args on storages that have at least one bucket of bucket_ids. 4) `<mode = 'partial', bucket_ids = bucket_args_map>` - the same as the 3rd scenario but buckets' arguments will be added as additional arguments to user function. Annotations: bucket_args_map = `{[bid_1] = {b_arg_1}, ..., [bid_n] = {b_arg_n}}` numeric_buckets_map = `{bucket_id_1, ..., bucket_id_n}` Also now `map_callrw` ends with error in cases of `<mode = 'full', bucket_ids = {1, 2, ...}>` and `<mode = 'partial', bucket_ids = nil>`.
1f4e9b4 to
11e76dc
Compare
Serpentian
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the patch! It looks great, but I'm afraid, it's a little bit broken now, let's fix and test the new mode with broken cache on the router
| res, err = ivshard.router.map_callrw('assert', {'a'}, | ||
| {bucket_ids = {5ULL, 6ULL}}) | ||
| ilt.assert(res) | ||
| ilt.assert_not(err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo in the doc request:
the user function will be executed on certain sotrages on
on certain storages
| -- of bucket_ids. | ||
| -- 4) <mode = 'partial', bucket_ids = bucket_args_map> - the same as the 3rd | ||
| -- scenario but buckets' arguments will be added as additional arguments | ||
| -- to user function. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment and doc request:
Please, let's describe all available modes in the doc request and the comment above, the API is very difficult, it's better to explicitly mention every available behavior. What we're missing now is the description, when mode is not provided. It's default value is a little bit tricky (full, when no bucket_ids and partial otherwise).
| return nil, lerror.make('Router can\'t execute map_callrw with ' .. | ||
| '\'full\' mode and numeric bucket_ids') | ||
| end | ||
| return opts_mode, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nil in the end is not needed
| local rs_args = grouped_args and grouped_args[id] or nil | ||
| -- It's cheaper to push and then pop, rather then deepcopy | ||
| -- arguments table for every call. | ||
| if rs_args then table.insert(args, rs_args) end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: let's preserve the old style with proper indentation of the conditions. It's easier to look at the code and minimizes the diff, which will look as follows:
diff --git a/vshard/router/init.lua b/vshard/router/init.lua
index b6baeb3..a0be400 100644
--- a/vshard/router/init.lua
+++ b/vshard/router/init.lua
@@ -981,13 +981,14 @@ local function replicasets_map_reduce(replicasets, rid, func, args,
--
local func_args = {'storage_map', rid, func, args}
for id, rs in pairs(replicasets) do
- if grouped_args ~= nil then
+ local rs_args = grouped_args and grouped_args[id]
+ if rs_args ~= nil then
-- It's cheaper to push and then pop, rather then deepcopy
-- arguments table for every call.
table.insert(args, grouped_args[id])
end
local res, err = rs:callrw('vshard.storage._call', func_args, opts_map)
- if grouped_args ~= nil then
+ if rs_args ~= nil then
table.remove(args)
end
if res == nil then| -- arguments table for every call. | ||
| table.insert(args, grouped_args[id]) | ||
| end | ||
| local rs_args = grouped_args and grouped_args[id] or nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or nil is not needed in the end
| router_ref_storage_all(router, timeout) | ||
| end | ||
| if timeout then | ||
| if plain_bucket_ids then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This won't work unfortunately. Previously the router_group_map_callrw_args was invoked right after router_ref_storage_by_buckets, now it can be invoked also after router_ref_storage_all. The first one finds the buckets in the cluster, updates the router map cache and only because of that we could use the router_group_map_callrw_args function. But the router_ref_storage_all doesn't do that.
Currently, the newly added mode (mode = 'full' + split args) doesn't work, when the cache on a router is broken, the arguments will be sent to the wrong instances. Let's won't forget to add the test case for that
This patch introduces a new way of
map_callrwexecution by which we can pass some arguments to all storages and split buckets' arguments to those storages that have at least one bucket ofbucket_ids. To achieve this we introduce a new string option -modetomap_callrwapi.@TarantoolBot document
Title: vshard:
modeoption forrouter.map_callrw()This string option regulates on which storages the user function will be executed via
map_callrw. Possible values:full- the user function will be executed on all storages incluster.
partial- the user function will be executed on certain sotrages onwhich buckets from
bucket_idswere found.After that changes
map_callrwworks in 4 different scenarios depending onmodeandbucket_idsoptions:<mode = 'full', bucket_ids = nil>- user function will be executedwith
argson all storages.<mode = 'full', bucket_ids = {[bid_1] = {b_arg_1}, ...}>- storagesthat have at least one bucket of
bucket_idswill execute userfunction with
argsand additional buckets' arguments. Other storageswill execute user function only with
args.<mode = 'partial', bucket_ids = {bid_1, ...}>- user function will beexecuted with
argson storages that have at least one bucket ofbucket_ids.<mode = 'partial', bucket_id = {[bid_1] = {b_arg_1}, ...}>- the sameas the 3rd scenario but buckets' arguments (b_arg_1, ..., b_arg_n) will
be added as additional arguments to user function.
Also now
map_callrwends with error in cases of<mode = 'full', bucket_ids = {1, 2, ...}>and<mode = 'partial', bucket_ids = nil>.Closes #559