Release 2.3.0 #818

hycdong · 2021-09-22T12:01:29Z

Since Pegasus 2.2.0 (released on June 2021), there are 170 commits, including several useful features and significant bug fix. We are ready to release Apache Pegasus 2.3.0.

New features

Partition split

Supporting scalability for table. One partition will be divided into two partitions. If the original partition count is 4, after partition split, the new partition count will be 8. More details can be found: partition-split-design-documents.
Related pull request in this release:

More detailed pull requests can be found: [#754]

User-defined compaction strategy

Supporting user specified compaction policy, more details can be found: user-specified-compaction-RFC. Related pull requests:

Cluster load balance

Supporting whole cluster load balance, more details can be found: [#761], related pull requests:

One time backup

Supporting trigger backup once immediately, more details can be found: [#755], related pull requests:

Enhancement

Support uint8 in data input and output
- feat: support uint8_t for data_output and data_input XiaoMi/rdsn#792
Support user specified restore path
- feat(restore): support user specified restore path XiaoMi/rdsn#816
- fix(restore): fix not found error when restore app metadata XiaoMi/rdsn#822
Add rate limiter for backup request
- feat: add rate limiter for backup request XiaoMi/rdsn#855
- feat: add perf counter for backup request limiter #779
Support reject client write requests while disk space insufficient
- feat(disk): reject write if disk space is insufficient XiaoMi/rdsn#833
- fix(disk): fix disk space insufficient bug XiaoMi/rdsn#851
Support broken disk check while initialization and add disk dynamically
Support release all tcmalloc reserved but not used memory
- feat: supporting release all tcmalloc reserved but not used memory XiaoMi/rdsn#864
Support disable block cache of an app
- feat: add an env to enable or disable block cache of an app XiaoMi/rdsn#865
- feat: add an env to enable or disable block cache of an app #792
Support token bucket in fds configurable
- feat: make token bucket in fds configurable XiaoMi/rdsn#874
Add a new thread pool to process range read
- feat: add a new thread pool to process slow query XiaoMi/rdsn#856
- feat: split read request thread pool #782
Support preserving TTL while executing copy_data
- feat: support preserving TTL for copy_data #752
Support nfs server rate limit
- feat: support nfs server rate limit XiaoMi/rdsn#901
Range read count enhancement
- feat: range read count enhancement #811

New perf-counters

Partition split related counters
Backup request limiter counter
- feat: add rate limiter for backup request XiaoMi/rdsn#855
- feat: add perf counter for backup request limiter #779
Dropped timeout rpc count counter
Server session count counter
- feat: update dropped-timeout-rpc-count counter type XiaoMi/rdsn#867
Table-level reject user write request count while bulk load
- feat(bulk_load): add replica-level bulk load reject write counter XiaoMi/rdsn#895
- feat(bulk-load): add table-level bulk load reject count #805
Table-level hotpot partition count counter
- feat(hotspot): add a cluster level perf_counter to display hotspot #732
Backup request size counter
- feat: add perf counter for backup request size #742
RocksDB read/wrire amplification and hit rate
- feat: add more rocksdb perf-counter support #774
- feat(collector): support more table-level rocksdb perfcounter #800
Unmarshall failed request count counter
- fix: catch exception if unmarshall the rpc request encounters error #790
Table-level compaction counters
- feat: support table level rocksdb compaction perfcounter #806

Bug Fix

Duplication related fix

Graceful exit

Thrift unmarshall fix

Asan fix

Others

Fix shell can not find factory type provider: fix: use dsn::tools::native_aio_provider as default value of FLAGS_aio_factory_name XiaoMi/rdsn#774
Fix unit tests can not execute destructor: fix(cmake): define ENABLE_GCOV when build with --enable_gcov XiaoMi/rdsn#793
Fix init_table_level_latency_counter metric name error: fix: fix replica::init_table_level_latency_counter metric name error XiaoMi/rdsn#830
Fix cold backup execution while policy is disabled: fix(backup_policy): ignore current backup when policy is diabled XiaoMi/rdsn#840
Fix C++ client won't retry while receiving splitting or disk_insufficient: fix(client): client not retry while receive splitting or disk insufficient error XiaoMi/rdsn#852, feat(client): add splitting and disk insuffient error #777
Fix tracer mutation name: fix: update mutation tracer name to mutation->name() XiaoMi/rdsn#876
Support single replica for load balance check: feat: support single replica XiaoMi/rdsn#932
Fix mutation_log_test failed when compile with debug mode: fix: mutation_log_test failed when compile with debug type XiaoMi/rdsn#933
Add incr and duplicate qps into total_wrire_qps: fix: add incr_qps and duplicate_qps to total_write_qps #721
Fix dependency while pack tools: fix: add lib while pack tools #758
Fix rolling_update script hint message: feat: update usage hint for rolling update script #759
Fix move hdfs script into script folder: fix: move config_hdfs.sh to scripts directory #775
Use reference to catch exceptions: fix: use reference to catch exception #798
Fix pack script: fix(script): empty pack_template leads to 'sed: no input files' #821
Fix full scan bug: fix: full_scan can't scan data completely in some occassions #825
Fix prometheus counter name bug: fix: coredump when table name contains '_' and prometheus is enabled #828

Refactor

Refactor load balance

Refactor pegasus_value_schema

Others

Remove useless structure admission controller: refactor: remove unused adminssion controller XiaoMi/rdsn#684
Refactor nfs client and nfs server: refactor: simplify nfs client and nfs server XiaoMi/rdsn#771
Remove useless parameters of aio_internal: refactor: remove the unused parameter 'async' for native_linux_aio_provider::aio_internal XiaoMi/rdsn#777, fix: fix core while running simple_kv unit test XiaoMi/rdsn#782
Separate aio from runtime: refactor: make aio decoupled from dsn_runtime XiaoMi/rdsn#778, refactor: remove unused config aio_factory_name XiaoMi/rdsn#794, refactor: delete unused config aio_factory_name #699, revert: do not link dsn_aio to client and test libraries #703
Remove useless profile commands from remote command: refactor: remove profiler related remote commands XiaoMi/rdsn#810
Remove explorer: refactor: remove explorer from rdsn XiaoMi/rdsn#811
Refactor replication_options: refactor: refactor replication_options initialize function XiaoMi/rdsn#831
Remove table_stats structure: refactor: remove the useless class table_stats #720

No code update

Configurations

[apps.replica]
- pools = THREAD_POOL_DEFAULT,THREAD_POOL_REPLICATION_LONG,THREAD_POOL_REPLICATION,THREAD_POOL_FD,THREAD_POOL_LOCAL_APP,THREAD_POOL_BLOCK_SERVICE,THREAD_POOL_COMPACT,THREAD_POOL_INGESTION,THREAD_POOL_SLOG,THREAD_POOL_PLOG
+ pools = THREAD_POOL_DEFAULT,THREAD_POOL_REPLICATION_LONG,THREAD_POOL_REPLICATION,THREAD_POOL_FD,THREAD_POOL_LOCAL_APP,THREAD_POOL_BLOCK_SERVICE,THREAD_POOL_COMPACT,THREAD_POOL_INGESTION,THREAD_POOL_SLOG,THREAD_POOL_PLOG,THREAD_POOL_SCAN

+[threadpool.THREAD_POOL_SCAN]
+  name = scan_query
+  partitioned = false
+  worker_priority = THREAD_xPRIORITY_NORMAL
+  worker_count = 24

[meta_server]
+ balance_cluster=false
+ balance_op_count_per_round=10

[nfs]
+ max_send_rate_megabytes=500

[replication]
+ reject_write_when_disk_insufficient=true
+ disk_min_available_space_ratio=10
+ ignore_broken_disk=true

[pegasus.server]
+ read_amp_bytes_per_bit = 0 # 0 means disble read amp counter
- update_rdb_stat_interval = 600
+ update_rdb_stat_interval = 60

// Add drop timeout request config for the specific task code
[task.RPC_RRDB_RRDB_PUT]
+  rpc_request_dropped_before_execution_when_timeout = true

[task.RPC_RRDB_RRDB_GET]
+  rpc_request_dropped_before_execution_when_timeout = true

Perf-Counters

// partition split related
+ replica*eon.replica_stub*replicas.splitting.count
+ replica*eon.replica_stub*replicas.splitting.max.duration.time(ms)
+ replica*eon.replica_stub*replicas.splitting.max.async.learn.time(ms)
+ replica*eon.replica_stub*replicas.splitting.max.copy.file.size
+ replica*eon.replica_stub*replicas.splitting.recent.start.count
+ replica*eon.replica_stub*replicas.splitting.recent.copy.file.count
+ replica*eon.replica_stub*replicas.splitting.recent.copy.file.size
+ replica*eon.replica_stub*replicas.splitting.recent.copy.mutation.count
+ replica*eon.replica_stub*replicas.splitting.succ.count
+ replica*eon.replica_stub*replicas.splitting.fail.count
+ replica*eon.replica*recent.write.splitting.reject.count@[gpid]
+ replica*eon.replica*recent.read.splitting.reject.count@[gpid]
+ collector*app.pegasus*app.stat.recent_write_splitting_reject_count#[table_name]
+ collector*app.pegasus*app.stat.recent_read_splitting_reject_count#[table_name]

// backup request throttling
+ replica*eon.replica*recent.backup.request.throttling.delay.count@[table_name]
+ replica*eon.replica*recent.backup.request.throttling.reject.count@[table_name]
+ collector*app.pegasus*app.stat.recent_backup_request_throttling_delay_count#[table_name]
+ collector*app.pegasus*app.stat.recent_backup_request_throttling_reject_count#[table_name]

// table-level hotpot partition count
+ collector*app.pegasus*app.stat.hotspots.temp.read.total#[table_name]
+ collector*app.pegasus*app.stat.hotspots.temp.write.total#[table_name]

// backup request size
+ replica*app.pegasus*backup_request_bytes@[gpid]
+ collector*app.pegasus*backup_request_bytes@[table_name]

// rocksdb read write amplification, hit count
+ replica*app.pegasus*rdb.read_amplification@[gpid]
+ replica*app.pegasus*rdb.write_amplification@[gpid]
+ replica*app.pegasus.rdb.read_memtable_total_count@[gpid]
+ replica*app.pegasus.rdb.read_memtable_hit_count@[gpid]
+ replica*app.pegasus*rdb.read_l0_hit_count@[gpid]
+ replica*app.pegasus*rdb.read_l1_hit_count@[gpid]
+ replica*app.pegasus*rdb.read_l2andup_hit_count@[gpid]
+ collector*app.pegasus*app.stat.rdb_read_amplification#[table_name]
+ collector*app.pegasus*app.stat.rdb.write_amplification#[table_name]
+ collector*app.pegasus*app.stat.rdb.read_memtable_hit_rate#[table_name]
+ collector*app.pegasus*app.stat.rdb.read_l0_hit_rate#[table_name]
+ collector*app.pegasus*app.stat.rdb.read_l1_hit_rate#[table_name]
+ collector*app.pegasus*app.stat.rdb.read_l2andup_hit_rate#[table_name]

// session count 
+ server*network*client_session_count

// bulk load reject write request
- replica_stub.bulk.load.ingestion.reject.write.count
+ replica*eon.replica*recent.write.bulk.load.ingestion.reject.count@[gpid]
+ collector*app.pegasus*app.stat.recent_write_bulk_load_ingestion_reject_count#[table_name]

// RocksDB compaction
+ collector*app.pegasus*app.stat.recent_rdb_compaction_input_bytes#[table_name]
+ collector*app.pegasus*app.stat.recent_rdb_compaction_output_bytes#[table_name]

// unmarshall failed count
+ replica*app.pegasus*recent_corrupt_write_count@[gpid]

// If drop timeout request for task, the counter will be added, for example(RPC_RRDB_RRDB_PUT):
+ zion*profiler*RPC_RRDB_RRDB_PUT.rpc.dropped

Performance

The following result is tested by YCSB, and the latency unit is us.

Case	client and thread	R:W	R-QPS	R-Avg	R-P99	W-QPS	W-Avg	W-P99
Write Only	3 clients * 15 threads	0:1	-	-	-	42386	1060	6628
Read Only	3 clients * 50 threads	1:0	331623	585	2611	-	-	-
Read Write	3 clients * 30 threads	1:1	38766	1067	15521	38774	1246	7791
Read Write	3 clients * 15 threads	1:3	13140	819	11460	39428	863	4884
Read Write	3 clients * 15 threads	1:30	1552	937	9524	46570	930	5315
Read Write	3 clients * 30 threads	3:1	93746	623	6389	31246	996	5543
Read Write	3 clients * 50 threads	30:1	254534	560	2627	8481	901	3269

Contributors

acelyc111
cauchy1988
empiredan
hycdong
levy5307
lidingshengHHU
neverchanje
padmejin
Shuo-Jia
Smityz
zhangyifan27
ZhongChaoqiang

The text was updated successfully, but these errors were encountered:

hycdong added type/enhancement Indicates new feature requests release-note Notes on the version release and removed type/enhancement Indicates new feature requests labels Sep 22, 2021

hycdong changed the title ~~Prepare to Release 2.3.0~~ Release 2.3.0 Dec 2, 2021

hycdong closed this as completed Dec 30, 2021

acelyc111 pushed a commit that referenced this issue Jun 23, 2022

fix: support retry when io write incompletely (#818)

b4b9a4e

foreverneverer added the type/incompatible Changes that introduced incompatibility to Pegasus. label Aug 9, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Release 2.3.0 #818

Release 2.3.0 #818

hycdong commented Sep 22, 2021 •

edited

Loading

Release 2.3.0 #818

Release 2.3.0 #818

Comments

hycdong commented Sep 22, 2021 • edited Loading

New features

Partition split

User-defined compaction strategy

Cluster load balance

One time backup

Enhancement

New perf-counters

Bug Fix

Duplication related fix

Graceful exit

Thrift unmarshall fix

Asan fix

Others

Refactor

Refactor load balance

Refactor pegasus_value_schema

Others

No code update

Configurations

Perf-Counters

Performance

Contributors

hycdong commented Sep 22, 2021 •

edited

Loading