Move prepareClientToWrite out of loop for lrange command to reduce the redundant call. #860

lipzhu · 2024-08-01T07:16:42Z

Description

While profiling the cycle distribution of the lrange test (valkey-benchmark -p 9001 -t lrange -d 100 -r 1000000 -n 1000000 -c 50 --threads 4), I found that prepareClientToWrite and clientHasPendingReplies are called on every loop iteration and could be reduced to a single call outside the loop. Ideally this gains about 3% performance. The corresponding LRANGE_100, LRANGE_300, LRANGE_500, and LRANGE_600 tests show a ~2% - 3% performance boost, and the benchmark results confirm that the change helps.

This patch moves prepareClientToWrite and its callee clientHasPendingReplies out of the loop to reduce the function call overhead.
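A minimal sketch of the idea, using a hypothetical `mock_client` and `mock_prepare_to_write` rather than Valkey's real `client` struct or `prepareClientToWrite`. It only illustrates how hoisting a loop-invariant readiness check reduces the number of calls; it is not the code in this patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for a client; NOT Valkey's real struct. */
typedef struct {
    int can_write;        /* whether replies may be queued for this client */
    size_t prepare_calls; /* how many times the readiness check ran */
    size_t queued;        /* how many elements were queued for reply */
} mock_client;

/* Hypothetical stand-in for the readiness check. Returns 1 if writable. */
int mock_prepare_to_write(mock_client *c) {
    c->prepare_calls++;
    return c->can_write;
}

/* Before: the readiness check runs once per element. */
void reply_range_per_element(mock_client *c, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (!mock_prepare_to_write(c)) return;
        c->queued++;
    }
}

/* After: the check runs once, then elements are queued without rechecking. */
void reply_range_hoisted(mock_client *c, size_t n) {
    if (!mock_prepare_to_write(c)) return;
    for (size_t i = 0; i < n; i++) {
        c->queued++;
    }
}
```

Both functions queue the same number of elements for a writable client; the hoisted version calls the check once instead of once per element. This is only safe under the assumption that the client's writability cannot change during the loop.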

Test Environment

OPERATING SYSTEM: Ubuntu 22.04.4 LTS
Kernel: 5.15.0-116-generic
PROCESSOR: Intel Xeon Platinum 8380
Server and client run on the same CPU socket.

Server Configuration

taskset -c 0-3 ~/valkey/src/valkey-server /tmp/valkey.conf

port 9001
bind * -::*
daemonize no
protected-mode no
save ""

Benchmark Results

Test Name	Perf Boost
memtier_benchmark-1key-list-10-elements-lrange-all-elements	2%
memtier_benchmark-1key-list-100-elements-lrange-all-elements	3%
memtier_benchmark-1key-list-1K-elements-lrange-all-elements	2%

loop for lrange command.

Signed-off-by: Lipeng Zhu <lipeng.zhu@intel.com>
Co-authored-by: Wangyang Guo <wangyang.guo@intel.com>

codecov · 2024-08-01T07:56:30Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 70.32%. Comparing base (b728e41) to head (b2f2001).
Report is 55 commits behind head on unstable.

Additional details and impacted files

@@             Coverage Diff              @@
##           unstable     #860      +/-   ##
============================================
- Coverage     70.47%   70.32%   -0.16%     
============================================
  Files           112      113       +1     
  Lines         61467    61744     +277     
============================================
+ Hits          43320    43422     +102     
- Misses        18147    18322     +175

Files	Coverage Δ
src/networking.c	`88.47% <100.00%> (-0.38%)`	⬇️
src/server.h	`100.00% <ø> (ø)`
src/t_list.c	`92.60% <100.00%> (-0.26%)`	⬇️

... and 37 files with indirect coverage changes

lipzhu · 2024-08-05T08:24:51Z

Ping @valkey-io/core-team, could you help to take a look?

src/server.h

src/networking.c

madolson · 2024-08-06T17:32:28Z

@lipzhu I have a proposal to add some more type checking, which would alleviate my concern about someone misusing this API.


lipzhu · 2024-08-09T07:07:57Z

> @lipzhu I have a proposal to add some more type checking, which would alleviate my concern about someone misusing this API.

@madolson I am not sure I got your point. I simply introduced writeReadyClient as an alias of client. Or do you prefer a new writeReadyClient struct that wraps the client?
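The difference between the two options raised here can be sketched as follows. All names (`mock_client`, `alias_ready_client`, `wrapped_ready_client`, `make_ready`) are hypothetical and chosen for illustration; this is not the API in the patch. The sketch assumes a C11 compiler and uses `_Generic` to show that a plain typedef alias is the same type to the compiler, while a wrapper struct is a distinct type.

```c
#include <stddef.h>

/* Hypothetical stand-in for a client; NOT Valkey's real struct. */
typedef struct {
    int can_write;
    size_t queued;
} mock_client;

/* Option A: a plain alias. To the compiler this IS mock_client, so a
 * function taking alias_ready_client * also accepts any mock_client *. */
typedef mock_client alias_ready_client;

/* Option B: a distinct wrapper type that is only filled in by make_ready(). */
typedef struct {
    mock_client *c;
} wrapped_ready_client;

/* Evaluates to 1 if x has type mock_client *, else 0 (C11 _Generic). */
#define IS_PLAIN_CLIENT_PTR(x) _Generic((x), mock_client *: 1, default: 0)

/* Performs the readiness check once and hands back the wrapper on success. */
int make_ready(mock_client *c, wrapped_ready_client *out) {
    if (!c->can_write) return 0;
    out->c = c;
    return 1;
}

/* Accepts only the wrapper; passing a bare mock_client * here would not compile. */
void add_reply_ready(wrapped_ready_client w) {
    w.c->queued++;
}

int alias_is_plain(void) {
    alias_ready_client *a = NULL;
    return IS_PLAIN_CLIENT_PTR(a); /* alias pointer matches mock_client * */
}

int wrapper_is_plain(void) {
    wrapped_ready_client *w = NULL;
    return IS_PLAIN_CLIENT_PTR(w); /* wrapper pointer is a different type */
}
```

With the alias, the type name documents intent but the compiler cannot reject an unprepared client. With the wrapper, a caller has to go through something like `make_ready` first, at the cost of a separate set of reply functions.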

lipzhu · 2024-08-12T09:04:07Z

Or we could introduce a new flag like write_ready in the client structure: set write_ready = 1 before the iteration and reset it after the loop exits?

madolson · 2024-08-13T21:51:59Z

> Or we could introduce a new flag like write_ready in the client structure: set write_ready = 1 before the iteration and reset it after the loop exits?

We would have to reset it at the end of the command invocation. My concern is that someone might miss resetting it. There is no good idiomatic way to execute a defer in C, afaik.

hpatro · 2024-08-13T22:46:32Z

I think the application of this writeReadyClient API will be in plenty of places if we accept it. The part I'm not sure about is how/when devs will pick the writeReady API(s) over the regular API(s). Also, we would need to duplicate most of the addReply API(s).

Overall, I feel introducing the client flag write_ready adds less redundant code and still achieves the goal. Regarding the cleanup, can't we do it once in the afterCommand flow and guarantee the reset (so not everyone needs to remember to do it)?
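A rough sketch of the flag-based alternative, under the assumption that every command goes through one central dispatch function that runs a cleanup hook afterwards. `mock_client`, `mock_call`, and `mock_after_command` are hypothetical stand-ins, not Valkey's real `client`, `call`, or `afterCommand`, and edge cases such as client disconnects are not modelled.

```c
#include <stddef.h>

/* Hypothetical stand-in for a client; NOT Valkey's real struct. */
typedef struct {
    int can_write;
    int write_ready;      /* hypothetical flag from this discussion */
    size_t prepare_calls; /* how many times the full check ran */
    size_t queued;
} mock_client;

/* Readiness check with a fast path when the flag is already set. */
int mock_prepare_to_write(mock_client *c) {
    if (c->write_ready) return 1;
    c->prepare_calls++;
    return c->can_write;
}

/* Regular reply function: unchanged signature, so no duplicated API. */
void mock_add_reply(mock_client *c) {
    if (!mock_prepare_to_write(c)) return;
    c->queued++;
}

/* Command body: sets the flag once and never resets it itself. */
void mock_lrange(mock_client *c, size_t n) {
    if (!mock_prepare_to_write(c)) return;
    c->write_ready = 1;
    for (size_t i = 0; i < n; i++) mock_add_reply(c);
}

/* Central cleanup hook: the single place where the flag is cleared. */
void mock_after_command(mock_client *c) {
    c->write_ready = 0;
}

/* Central dispatch: runs the command, then always runs the cleanup hook. */
void mock_call(mock_client *c, void (*cmd)(mock_client *, size_t), size_t n) {
    cmd(c, n);
    mock_after_command(c);
}
```

Because the reset lives in the dispatch path, individual commands cannot forget it, which stands in for the missing `defer`. Any path that runs a command without going through the dispatch function would leave the flag set, which is the kind of edge case that would need checking.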

madolson · 2024-08-14T03:59:28Z

> Regarding the cleanup, can't we do it once in the afterCommand flow and guarantee the reset (so not everyone needs to remember to do it)?

We could maybe do it in afterCommand. Would need to go look through the various edge cases like client disconnects that might get bypassed. @lipzhu Do you want to prototype that? Otherwise I'm fine merging this as is.

lipzhu · 2024-08-15T02:28:58Z

> > Regarding the cleanup, can't we do it once in the afterCommand flow and guarantee the reset (so not everyone needs to remember to do it)?
>
> We could maybe do it in afterCommand. Would need to go look through the various edge cases like client disconnects that might get bypassed. @lipzhu Do you want to prototype that? Otherwise I'm fine merging this as is.

@madolson Maybe we can merge this first. As for the proposal, I can open a new PR that focuses on the code refactor.