Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize ZUNION[STORE] command by removing unnecessary accumulator dict #829

Merged
merged 2 commits into from
Aug 13, 2024

Conversation

RayaCoo
Copy link
Contributor

@RayaCoo RayaCoo commented Jul 26, 2024

In the past implementation of ZUNION and ZUNIONSTORE commands, we first create a temporary dict called accumulator. After adding all member-score mappings to accumulator, we still need to convert accumulator back to the final dict dstzset->dict. However, we can directly use dstzset->dict to avoid the additional copy operation.

This PR removes the accumulator dict and directly usesdstzset->dictto store the member-score mappings.

  • Test
    First, I added 300 unique elements to two sorted sets called 'zunion_test1' and 'zunion_test2'. Then, I tested zunion and zunionstore on these two sorted sets. The test results shown below indicate that the performance of both zunion and zunionstore improved about 31%.

ZUNION

unstable

./valkey-benchmark -P 10 -n 100000 zunion 2 zunion_test1 zunion_test2

Summary:
  throughput summary: 2713.41 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
      146.252     3.464   153.343   182.015   184.959   192.895

This PR

./valkey-benchmark -P 10 -n 100000 zunion 2 zunion_test1 zunion_test2

Summary:
  throughput summary: 3543.84 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
      108.259     2.984   114.239   141.695   145.151   160.255

ZUNIONSTORE

unstable

./valkey-benchmark -P 10 -n 100000 zunionstore out 2 zunion_test1 zunion_test2

Summary:
  throughput summary: 3168.07 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
      157.511     3.368   183.167   189.311   193.535   231.679

This PR

./valkey-benchmark -P 10 -n 100000 zunionstore out 2 zunion_test1 zunion_test2

Summary:
  throughput summary: 4144.73 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
      120.374     2.648   141.823   149.119   153.855   183.167

Signed-off-by: RayCao <zisong.cw@alibaba-inc.com>
Copy link

codecov bot commented Jul 26, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 70.27%. Comparing base (36e81d9) to head (325979a).
Report is 38 commits behind head on unstable.

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable     #829      +/-   ##
============================================
- Coverage     70.46%   70.27%   -0.20%     
============================================
  Files           112      112              
  Lines         61135    61463     +328     
============================================
+ Hits          43077    43191     +114     
- Misses        18058    18272     +214     
Files Coverage Δ
src/t_zset.c 95.60% <100.00%> (+0.04%) ⬆️

... and 26 files with indirect coverage changes

@madolson madolson requested a review from hpatro August 7, 2024 23:15
src/t_zset.c Show resolved Hide resolved
Signed-off-by: zisong.cw <zisong.cw@alibaba-inc.com>
@soloestoy soloestoy added the release-notes This issue should get a line item in the release notes label Aug 13, 2024
@soloestoy soloestoy merged commit 76f809b into valkey-io:unstable Aug 13, 2024
47 checks passed
mapleFU pushed a commit to mapleFU/valkey that referenced this pull request Aug 21, 2024
…ct (valkey-io#829)

In the past implementation of `ZUNION` and `ZUNIONSTORE` commands, we
first create a temporary dict called `accumulator`. After adding all
member-score mappings to `accumulator`, we still need to convert
`accumulator` back to the final dict `dstzset->dict`. However, we can
directly use `dstzset->dict` to avoid the additional copy operation.

This PR removes the `accumulator` dict and directly uses` dstzset->dict
`to store the member-score mappings.

- **Test**
First, I added 300 unique elements to two sorted sets called
'zunion_test1' and 'zunion_test2'. Then, I tested `zunion` and
`zunionstore` on these two sorted sets. The test results shown below
indicate that the performance of both zunion and zunionstore improved
about 31%.

### ZUNION
#### unstable
```
./valkey-benchmark -P 10 -n 100000 zunion 2 zunion_test1 zunion_test2

Summary:
  throughput summary: 2713.41 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
      146.252     3.464   153.343   182.015   184.959   192.895
```
#### This PR
```
./valkey-benchmark -P 10 -n 100000 zunion 2 zunion_test1 zunion_test2

Summary:
  throughput summary: 3543.84 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
      108.259     2.984   114.239   141.695   145.151   160.255
```
### ZUNIONSTORE
#### unstable
```
./valkey-benchmark -P 10 -n 100000 zunionstore out 2 zunion_test1 zunion_test2

Summary:
  throughput summary: 3168.07 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
      157.511     3.368   183.167   189.311   193.535   231.679
```
#### This PR
```
./valkey-benchmark -P 10 -n 100000 zunionstore out 2 zunion_test1 zunion_test2

Summary:
  throughput summary: 4144.73 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
      120.374     2.648   141.823   149.119   153.855   183.167
```

---------

Signed-off-by: RayCao <zisong.cw@alibaba-inc.com>
Signed-off-by: zisong.cw <zisong.cw@alibaba-inc.com>
Signed-off-by: mwish <maplewish117@gmail.com>
mapleFU pushed a commit to mapleFU/valkey that referenced this pull request Aug 22, 2024
…ct (valkey-io#829)

In the past implementation of `ZUNION` and `ZUNIONSTORE` commands, we
first create a temporary dict called `accumulator`. After adding all
member-score mappings to `accumulator`, we still need to convert
`accumulator` back to the final dict `dstzset->dict`. However, we can
directly use `dstzset->dict` to avoid the additional copy operation.

This PR removes the `accumulator` dict and directly uses` dstzset->dict
`to store the member-score mappings.

- **Test**
First, I added 300 unique elements to two sorted sets called
'zunion_test1' and 'zunion_test2'. Then, I tested `zunion` and
`zunionstore` on these two sorted sets. The test results shown below
indicate that the performance of both zunion and zunionstore improved
about 31%.

### ZUNION
#### unstable
```
./valkey-benchmark -P 10 -n 100000 zunion 2 zunion_test1 zunion_test2

Summary:
  throughput summary: 2713.41 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
      146.252     3.464   153.343   182.015   184.959   192.895
```
#### This PR
```
./valkey-benchmark -P 10 -n 100000 zunion 2 zunion_test1 zunion_test2

Summary:
  throughput summary: 3543.84 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
      108.259     2.984   114.239   141.695   145.151   160.255
```
### ZUNIONSTORE
#### unstable
```
./valkey-benchmark -P 10 -n 100000 zunionstore out 2 zunion_test1 zunion_test2

Summary:
  throughput summary: 3168.07 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
      157.511     3.368   183.167   189.311   193.535   231.679
```
#### This PR
```
./valkey-benchmark -P 10 -n 100000 zunionstore out 2 zunion_test1 zunion_test2

Summary:
  throughput summary: 4144.73 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
      120.374     2.648   141.823   149.119   153.855   183.167
```

---------

Signed-off-by: RayCao <zisong.cw@alibaba-inc.com>
Signed-off-by: zisong.cw <zisong.cw@alibaba-inc.com>
Signed-off-by: mwish <maplewish117@gmail.com>
madolson pushed a commit that referenced this pull request Sep 2, 2024
…ct (#829)

In the past implementation of `ZUNION` and `ZUNIONSTORE` commands, we
first create a temporary dict called `accumulator`. After adding all
member-score mappings to `accumulator`, we still need to convert
`accumulator` back to the final dict `dstzset->dict`. However, we can
directly use `dstzset->dict` to avoid the additional copy operation.

This PR removes the `accumulator` dict and directly uses` dstzset->dict
`to store the member-score mappings.

- **Test**
First, I added 300 unique elements to two sorted sets called
'zunion_test1' and 'zunion_test2'. Then, I tested `zunion` and
`zunionstore` on these two sorted sets. The test results shown below
indicate that the performance of both zunion and zunionstore improved
about 31%.

### ZUNION
#### unstable
```
./valkey-benchmark -P 10 -n 100000 zunion 2 zunion_test1 zunion_test2

Summary:
  throughput summary: 2713.41 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
      146.252     3.464   153.343   182.015   184.959   192.895
```
#### This PR
```
./valkey-benchmark -P 10 -n 100000 zunion 2 zunion_test1 zunion_test2

Summary:
  throughput summary: 3543.84 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
      108.259     2.984   114.239   141.695   145.151   160.255
```
### ZUNIONSTORE
#### unstable
```
./valkey-benchmark -P 10 -n 100000 zunionstore out 2 zunion_test1 zunion_test2

Summary:
  throughput summary: 3168.07 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
      157.511     3.368   183.167   189.311   193.535   231.679
```
#### This PR
```
./valkey-benchmark -P 10 -n 100000 zunionstore out 2 zunion_test1 zunion_test2

Summary:
  throughput summary: 4144.73 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
      120.374     2.648   141.823   149.119   153.855   183.167
```

---------

Signed-off-by: RayCao <zisong.cw@alibaba-inc.com>
Signed-off-by: zisong.cw <zisong.cw@alibaba-inc.com>
madolson pushed a commit that referenced this pull request Sep 3, 2024
…ct (#829)

In the past implementation of `ZUNION` and `ZUNIONSTORE` commands, we
first create a temporary dict called `accumulator`. After adding all
member-score mappings to `accumulator`, we still need to convert
`accumulator` back to the final dict `dstzset->dict`. However, we can
directly use `dstzset->dict` to avoid the additional copy operation.

This PR removes the `accumulator` dict and directly uses` dstzset->dict
`to store the member-score mappings.

- **Test**
First, I added 300 unique elements to two sorted sets called
'zunion_test1' and 'zunion_test2'. Then, I tested `zunion` and
`zunionstore` on these two sorted sets. The test results shown below
indicate that the performance of both zunion and zunionstore improved
about 31%.

### ZUNION
#### unstable
```
./valkey-benchmark -P 10 -n 100000 zunion 2 zunion_test1 zunion_test2

Summary:
  throughput summary: 2713.41 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
      146.252     3.464   153.343   182.015   184.959   192.895
```
#### This PR
```
./valkey-benchmark -P 10 -n 100000 zunion 2 zunion_test1 zunion_test2

Summary:
  throughput summary: 3543.84 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
      108.259     2.984   114.239   141.695   145.151   160.255
```
### ZUNIONSTORE
#### unstable
```
./valkey-benchmark -P 10 -n 100000 zunionstore out 2 zunion_test1 zunion_test2

Summary:
  throughput summary: 3168.07 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
      157.511     3.368   183.167   189.311   193.535   231.679
```
#### This PR
```
./valkey-benchmark -P 10 -n 100000 zunionstore out 2 zunion_test1 zunion_test2

Summary:
  throughput summary: 4144.73 requests per second
  latency summary (msec):
          avg       min       p50       p95       p99       max
      120.374     2.648   141.823   149.119   153.855   183.167
```

---------

Signed-off-by: RayCao <zisong.cw@alibaba-inc.com>
Signed-off-by: zisong.cw <zisong.cw@alibaba-inc.com>
moticless added a commit to redis/redis that referenced this pull request Sep 25, 2024
…3566)

This PR is based on valkey-io/valkey#829

Previously, ZUNION and ZUNIONSTORE commands used a temporary accumulator dict
and at the end copied it as-is to dstzset->dict. This PR removes accumulator and directly
stores into dstzset->dict, eliminating the extra copy.

Co-authored-by: Rayacoo zisong.cw@alibaba-inc.com
@RayaCoo RayaCoo deleted the remove_accumulator_dict branch October 31, 2024 06:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance release-notes This issue should get a line item in the release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants