Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

High Service Latency: SQL in Cockroach DB v2.0.2 #26830

Closed
akshath opened this issue Jun 19, 2018 · 27 comments
Closed

High Service Latency: SQL in Cockroach DB v2.0.2 #26830

akshath opened this issue Jun 19, 2018 · 27 comments
Assignees
Labels
A-kv-client Relating to the KV client and the KV interface. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. C-investigation Further steps needed to qualify. C-label will change.
Milestone

Comments

@akshath
Copy link

akshath commented Jun 19, 2018

Hi Team,

I am having very high 'Service Latency: SQL' on my Cockroach DB from a couple of days and I am not able to find the cause.

The application is during about 50 inserts/updates per second and has a 6 node cluster in Digital Ocean. Total records in the system are around 22M.

screen shot 2018-06-19 at 8 41 38 pm

Please do find the debug info, that may help.

Node2# free -m
              total        used        free      shared  buff/cache   available
Mem:          32174       13746         261         216       18166       17788
Swap:             0           0           0
Node2# top

top - 15:25:02 up 49 days,  6:41,  1 user,  load average: 0.65, 0.57, 0.63
Tasks: 156 total,   1 running, 155 sleeping,   0 stopped,   0 zombie
%Cpu(s): 18.5 us,  1.9 sy,  0.0 ni, 74.9 id,  4.0 wa,  0.0 hi,  0.3 si,  0.5 st
KiB Mem : 32946972 total,   357452 free, 14073752 used, 18515768 buff/cache
KiB Swap:        0 total,        0 free,        0 used. 18217832 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                                                                                                 
10610 root      20   0 47.357g 0.013t  26480 S   6.2 42.0  26681:01 cockroach                                                                                                                                                                                               
    1 root      20   0  185160   5040   3268 S   0.0  0.0   0:30.06 systemd                                                                                                                                                                                                 
    2 root      20   0       0      0      0 S   0.0  0.0   0:00.07 kthreadd                                                                                                                                                                                                
    3 root      20   0       0      0      0 S   0.0  0.0   2:05.38 ksoftirqd/0                                                                                                                                                                                             
    5 root       0 -20       0      0      0 S   0.0  0.0   0:00.00 kworker/0:0H                                                                                                                                                                                            
    7 root      20   0       0      0      0 S   0.0  0.0  10:11.21 rcu_sched                                                                                                                                                                                               
    8 root      20   0       0      0      0 S   0.0  0.0   0:00.00 rcu_bh                                                                                                                                                                                                  
    9 root      rt   0       0      0      0 S   0.0  0.0   0:00.57 migration/0                                                                                                                                                                                             
   10 root      rt   0       0      0      0 S   0.0  0.0   0:14.38 watchdog/0                                                                                                                                                                                              
   11 root      rt   0       0      0      0 S   0.0  0.0   0:13.54 watchdog/1                                                                                                                                                                                              
   12 root      rt   0       0      0      0 S   0.0  0.0   0:00.58 migration/1                                                                                                                                                                                             
   13 root      20   0       0      0      0 S   0.0  0.0  11:16.16 ksoftirqd/1                                                                                                                                                                                             
   15 root       0 -20       0      0      0 S   0.0  0.0   0:00.00 kworker/1:0H                                                                                                                                                                                            
   16 root      rt   0       0      0      0 S   0.0  0.0   0:12.37 watchdog/2                                                                                                                                                                                              
   17 root      rt   0       0      0      0 S   0.0  0.0   0:00.74 migration/2                                                                                                                                                                                             
   18 root      20   0       0      0      0 S   0.0  0.0   2:06.27 ksoftirqd/2                                                                                                                                                                                             
   20 root       0 -20       0      0      0 S   0.0  0.0   0:00.00 kworker/2:0H                                                                                                                                                                                            
   21 root      rt   0       0      0      0 S   0.0  0.0   0:12.84 watchdog/3                                                                                                                                                                                              
   22 root      rt   0       0      0      0 S   0.0  0.0   0:00.49 migration/3                                                                                                                                                                                             
   23 root      20   0       0      0      0 S   0.0  0.0   2:00.40 ksoftirqd/3                                                                                                                                                                                             
   25 root       0 -20       0      0      0 S   0.0  0.0   0:00.00 kworker/3:0H                                                                                                                                                                                            
   26 root      rt   0       0      0      0 S   0.0  0.0   0:13.15 watchdog/4                                                                                                                                                                                              
   27 root      rt   0       0      0      0 S   0.0  0.0   0:00.46 migration/4                                                                                                                                                                                             
   28 root      20   0       0      0      0 S   0.0  0.0   1:59.53 ksoftirqd/4                                                                                                                                                                                             
   30 root       0 -20       0      0      0 S   0.0  0.0   0:00.00 kworker/4:0H                                                                                                                                                                                            
   31 root      rt   0       0      0      0 S   0.0  0.0   0:13.33 watchdog/5                                                                                                                                                                                              
   32 root      rt   0       0      0      0 S   0.0  0.0   0:00.47 migration/5                                                                                                                                                                                             
   33 root      20   0       0      0      0 S   0.0  0.0   1:59.90 ksoftirqd/5                                                                                                                                                                                             
   35 root       0 -20       0      0      0 S   0.0  0.0   0:00.00 kworker/5:0H                                                                                                                                                                                            
   36 root      rt   0       0      0      0 S   0.0  0.0   0:13.02 watchdog/6                                                                                                                                                                                              
   37 root      rt   0       0      0      0 S   0.0  0.0   0:00.88 migration/6                                                                                                                                                                                             
   38 root      20   0       0      0      0 S   0.0  0.0   1:59.48 ksoftirqd/6                                                                                                                                                                                             
   40 root       0 -20       0      0      0 S   0.0  0.0   0:00.00 kworker/6:0H
Node3# free -m
              total        used        free      shared  buff/cache   available
Mem:          32174       13190         280         192       18703       18364
Swap:             0           0           0
Node3#top - 15:24:11 up 49 days,  6:41,  1 user,  load average: 0.61, 0.81, 0.86
Tasks: 155 total,   1 running, 154 sleeping,   0 stopped,   0 zombie
%Cpu(s): 16.1 us,  1.7 sy,  0.0 ni, 77.7 id,  3.9 wa,  0.0 hi,  0.2 si,  0.3 st
KiB Mem : 32946972 total,   270080 free, 13514464 used, 19162428 buff/cache
KiB Swap:        0 total,        0 free,        0 used. 18798336 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                                                                                                 
10296 root      20   0 35.452g 0.012t  21772 S  12.5 40.3  14836:24 cockroach                                                                                                                                                                                               
 6153 root      20   0   40520   3624   3028 R   6.2  0.0   0:00.01 top                                                                                                                                                                                                     
    1 root      20   0  185220   4668   2832 S   0.0  0.0   0:27.76 systemd                                                                                                                                                                                                 
    2 root      20   0       0      0      0 S   0.0  0.0   0:00.08 kthreadd                                                                                                                                                                                                
    3 root      20   0       0      0      0 S   0.0  0.0   1:53.68 ksoftirqd/0                                                                                                                                                                                             
    5 root       0 -20       0      0      0 S   0.0  0.0   0:00.00 kworker/0:0H                                                                                                                                                                                            
    7 root      20   0       0      0      0 S   0.0  0.0   9:10.20 rcu_sched                                                                                                                                                                                               
    8 root      20   0       0      0      0 S   0.0  0.0   0:00.00 rcu_bh                                                                                                                                                                                                  
    9 root      rt   0       0      0      0 S   0.0  0.0   0:00.73 migration/0                                                                                                                                                                                             
   10 root      rt   0       0      0      0 S   0.0  0.0   0:15.12 watchdog/0                                                                                                                                                                                              
   11 root      rt   0       0      0      0 S   0.0  0.0   0:13.91 watchdog/1                                                                                                                                                                                              
   12 root      rt   0       0      0      0 S   0.0  0.0   0:00.70 migration/1                                                                                                                                                                                             
   13 root      20   0       0      0      0 S   0.0  0.0  10:00.63 ksoftirqd/1                                                                                                                                                                                             
   15 root       0 -20       0      0      0 S   0.0  0.0   0:00.00 kworker/1:0H                                                                                                                                                                                            
   16 root      rt   0       0      0      0 S   0.0  0.0   0:12.78 watchdog/2                                                                                                                                                                                              
   17 root      rt   0       0      0      0 S   0.0  0.0   0:00.70 migration/2                                                                                                                                                                                             
   18 root      20   0       0      0      0 S   0.0  0.0   1:49.24 ksoftirqd/2                                                                                                                                                                                             
   20 root       0 -20       0      0      0 S   0.0  0.0   0:00.00 kworker/2:0H                                                                                                                                                                                            
   21 root      rt   0       0      0      0 S   0.0  0.0   0:12.91 watchdog/3                                                                                                                                                                                              
   22 root      rt   0       0      0      0 S   0.0  0.0   0:00.73 migration/3                                                                                                                                                                                             
   23 root      20   0       0      0      0 S   0.0  0.0   1:49.52 ksoftirqd/3                                                                                                                                                                                             
   25 root       0 -20       0      0      0 S   0.0  0.0   0:00.00 kworker/3:0H                                                                                                                                                                                            
   26 root      rt   0       0      0      0 S   0.0  0.0   0:12.14 watchdog/4                                                                                                                                                                                              
   27 root      rt   0       0      0      0 S   0.0  0.0   0:00.63 migration/4                                                                                                                                                                                             
   28 root      20   0       0      0      0 S   0.0  0.0   1:50.19 ksoftirqd/4                                                                                                                                                                                             
   30 root       0 -20       0      0      0 S   0.0  0.0   0:00.00 kworker/4:0H                                                                                                                                                                                            
   31 root      rt   0       0      0      0 S   0.0  0.0   0:12.91 watchdog/5                                                                                                                                                                                              
   32 root      rt   0       0      0      0 S   0.0  0.0   0:00.66 migration/5                                                                                                                                                                                             
   33 root      20   0       0      0      0 S   0.0  0.0   1:49.96 ksoftirqd/5                                                                                                                                                                                             
   35 root       0 -20       0      0      0 S   0.0  0.0   0:00.00 kworker/5:0H                                                                                                                                                                                            
   36 root      rt   0       0      0      0 S   0.0  0.0   0:12.88 watchdog/6                                                                                                                                                                                              
   37 root      rt   0       0      0      0 S   0.0  0.0   0:01.00 migration/6                                                                                                                                                                                             
   38 root      20   0       0      0      0 S   0.0  0.0   1:51.36 ksoftirqd/6
# cockroach version
Build Tag:    v2.0.2
Build Time:   2018/05/21 14:55:20
Distribution: CCL
Platform:     linux amd64 (x86_64-unknown-linux-gnu)
Go Version:   go1.10
C Compiler:   gcc 6.3.0
Build SHA-1:  fa2b2bfd802ebe8150030a14b0a36c25fef90b91
Build Type:   release

Also I am unable to get 'profile?debug' has req. is timing out on the server when I issue wget for the same.

cockroach-last500line-node2.log
cockroach-last500line-node3.log

@tbg
Copy link
Member

tbg commented Jun 19, 2018

Hi @akshath, the logs look OK. What is the output of select * from crdb_internal.node_statement_statistics order by service_lat_avg desc limit 10;? That query should print out the queries that take longest on average.

You need to run it against each node to get the full view; each node only records the queries which it received.

@akshath
Copy link
Author

akshath commented Jun 20, 2018

Hi @tschottdorf Thanks for looking into this. Please do find the output from 2 of our nodes. Thanks

.63:26257/> select node_id, application_name, flags, anonymized, count, first_attempt_count, max_retries, last_error, rows_avg, rows_var, parse_lat_avg, parse_lat_var, plan_lat_avg, plan_lat_var, run_lat_avg, run_lat_var, service_lat_avg, service_lat_var, overhead_lat_avg, overhead_lat_var   from crdb_internal.node_statement_statistics order by service_lat_avg desc limit 10;
+---------+------------------------+-------+--------------------------------------------------------------------------+-------+---------------------+-------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+----------+---------------+---------------+-----------------------+------------------------+--------------------+--------------------+-------------------+--------------------+---------------------+-------------------+
| node_id |    application_name    | flags |                                anonymized                                | count | first_attempt_count | max_retries |                                                                              last_error                                                                              | rows_avg | rows_var | parse_lat_avg | parse_lat_var |     plan_lat_avg      |      plan_lat_var      |    run_lat_avg     |    run_lat_var     |  service_lat_avg  |  service_lat_var   |  overhead_lat_avg   | overhead_lat_var  |
+---------+------------------------+-------+--------------------------------------------------------------------------+-------+---------------------+-------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+----------+---------------+---------------+-----------------------+------------------------+--------------------+--------------------+-------------------+--------------------+---------------------+-------------------+
|       3 | PostgreSQL JDBC Driver | !     | UPSERT INTO _(_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,   |   449 |                 449 |           0 | waiting on split that failed: split at key                                                                                                                           |        0 |        0 |             0 |             0 | 0.0005864656926503344 | 1.1959763134486078e-07 | 12.334324954247206 | 17.552177148488607 | 13.09610419302672 | 25.400026600190156 |  0.7611927730868592 | 9.798498873023515 |
|         |                        |       | _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _) VALUES ($1, $2,    |       |                     |             | /Table/53/3/"Wibmo-User-Services"/1921-07-18T20:34:22.803999999Z/"IAP-Promocode-Failure"/0/0/"Wibmo-User-Services:201806160325:3a2170f5-44ab-4850-8ac7-21532f436ba8" |          |          |               |               |                       |                        |                    |                    |                   |                    |                     |                   |
|         |                        |       | $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, |       |                     |             | failed: command is too large: 411135097 bytes (max: 67108864)                                                                                                        |          |          |               |               |                       |                        |                    |                    |                   |                    |                     |                   |
|         |                        |       | $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, $29, $30, $31, $32,    |       |                     |             |                                                                                                                                                                      |          |          |               |               |                       |                        |                    |                    |                   |                    |                     |                   |
|         |                        |       | $33, $34, $35, $36, $37)                                                 |       |                     |             |                                                                                                                                                                      |          |          |               |               |                       |                        |                    |                    |                   |                    |                     |                   |
|       3 | PostgreSQL JDBC Driver | !     | INSERT INTO _(_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,   |    62 |                  62 |           0 | waiting on split that failed: split at key                                                                                                                           |        0 |        0 |             0 |             0 | 0.0002502923548387096 | 2.0648623864949776e-09 | 11.222318867516126 | 17.484553463098802 | 11.63914844672581 | 24.227832210938242 | 0.41657928685483836 | 5.303991335186809 |
|         |                        |       | _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _) VALUES ($1, $2,    |       |                     |             | /Table/53/3/"Wibmo-User-Services"/1921-07-18T20:34:22.803999999Z/"IAP-Promocode-Failure"/0/0/"Wibmo-User-Services:201806160325:3a2170f5-44ab-4850-8ac7-21532f436ba8" |          |          |               |               |                       |                        |                    |                    |                   |                    |                     |                   |
|         |                        |       | $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, |       |                     |             | failed: command is too large: 411135097 bytes (max: 67108864)                                                                                                        |          |          |               |               |                       |                        |                    |                    |                   |                    |                     |                   |
|         |                        |       | $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, $29, $30, $31, $32,    |       |                     |             |                                                                                                                                                                      |          |          |               |               |                       |                        |                    |                    |                   |                    |                     |                   |
|         |                        |       | $33, $34, $35, $36, $37)                                                 |       |                     |             |                                                                                                                                                                      |          |          |               |               |                       |                        |                    |                    |                   |                    |                     |                   |
+---------+------------------------+-------+--------------------------------------------------------------------------+-------+---------------------+-------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+----------+---------------+---------------+-----------------------+------------------------+--------------------+--------------------+-------------------+--------------------+---------------------+-------------------+
(2 rows)

Time: 30.33142ms

--
.37:26257/> select node_id, application_name, flags, anonymized, count, first_attempt_count, max_retries, last_error, rows_avg, rows_var, parse_lat_avg, parse_lat_var, plan_lat_avg, plan_lat_var, run_lat_avg, run_lat_var, service_lat_avg, service_lat_var, overhead_lat_avg, overhead_lat_var   from crdb_internal.node_statement_statistics order by service_lat_avg desc limit 10;
+---------+------------------------+-------+--------------------------------------------------------------------------+-------+---------------------+-------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+----------+---------------+---------------+-----------------------+-----------------------+--------------------+--------------------+--------------------+--------------------+---------------------+--------------------+
| node_id |    application_name    | flags |                                anonymized                                | count | first_attempt_count | max_retries |                                                                              last_error                                                                              | rows_avg | rows_var | parse_lat_avg | parse_lat_var |     plan_lat_avg      |     plan_lat_var      |    run_lat_avg     |    run_lat_var     |  service_lat_avg   |  service_lat_var   |  overhead_lat_avg   |  overhead_lat_var  |
+---------+------------------------+-------+--------------------------------------------------------------------------+-------+---------------------+-------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+----------+---------------+---------------+-----------------------+-----------------------+--------------------+--------------------+--------------------+--------------------+---------------------+--------------------+
|       4 | PostgreSQL JDBC Driver | !     | UPSERT INTO _(_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,   |   605 |                 605 |           0 | waiting on split that failed: split at key                                                                                                                           |        0 |        0 |             0 |             0 | 0.0009641634231404957 | 5.119412022820149e-06 | 12.374603077419838 | 14.550700033699048 | 13.448744997242972 | 26.040339129397303 |  1.0731777564000007 | 13.313399120084355 |
|         |                        |       | _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _) VALUES ($1, $2,    |       |                     |             | /Table/53/3/"Wibmo-User-Services"/1921-07-18T20:34:22.803999999Z/"IAP-Promocode-Failure"/0/0/"Wibmo-User-Services:201806160325:3a2170f5-44ab-4850-8ac7-21532f436ba8" |          |          |               |               |                       |                       |                    |                    |                    |                    |                     |                    |
|         |                        |       | $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, |       |                     |             | failed: command is too large: 411176996 bytes (max: 67108864)                                                                                                        |          |          |               |               |                       |                       |                    |                    |                    |                    |                     |                    |
|         |                        |       | $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, $29, $30, $31, $32,    |       |                     |             |                                                                                                                                                                      |          |          |               |               |                       |                       |                    |                    |                    |                    |                     |                    |
|         |                        |       | $33, $34, $35, $36, $37)                                                 |       |                     |             |                                                                                                                                                                      |          |          |               |               |                       |                       |                    |                    |                    |                    |                     |                    |
|       4 | PostgreSQL JDBC Driver | !     | INSERT INTO _(_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,   |    88 |                  88 |           0 | waiting on split that failed: split at key                                                                                                                           |        0 |        0 |             0 |             0 |  0.000422673659090909 | 4.875683078479517e-07 | 11.126149902465908 | 11.903821223902712 |       11.559220605 | 16.126650320411954 | 0.43264802887499987 |  5.969340121018091 |
|         |                        |       | _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _) VALUES ($1, $2,    |       |                     |             | /Table/53/3/"Wibmo-User-Services"/1921-07-18T20:34:22.803999999Z/"IAP-Promocode-Failure"/0/0/"Wibmo-User-Services:201806160325:3a2170f5-44ab-4850-8ac7-21532f436ba8" |          |          |               |               |                       |                       |                    |                    |                    |                    |                     |                    |
|         |                        |       | $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, |       |                     |             | failed: command is too large: 411176996 bytes (max: 67108864)                                                                                                        |          |          |               |               |                       |                       |                    |                    |                    |                    |                     |                    |
|         |                        |       | $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, $29, $30, $31, $32,    |       |                     |             |                                                                                                                                                                      |          |          |               |               |                       |                       |                    |                    |                    |                    |                     |                    |
|         |                        |       | $33, $34, $35, $36, $37)                                                 |       |                     |             |                                                                                                                                                                      |          |          |               |               |                       |                       |                    |                    |                    |                    |                     |                    |
|       4 | cockroach sql          |       | SELECT * FROM _.crdb_internal.node_build_info                            |     1 |                   1 |           0 | NULL                                                                                                                                                                 |        6 | NaN      |   0.000194389 | NaN           |           0.000229029 | NaN                   | 2.1821e-05         | NaN                |        0.000717841 | NaN                |         0.000272602 | NaN                |
+---------+------------------------+-------+--------------------------------------------------------------------------+-------+---------------------+-------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+----------+---------------+---------------+-----------------------+-----------------------+--------------------+--------------------+--------------------+--------------------+---------------------+--------------------+
(3 rows)

Time: 31.613941ms

--

@akshath
Copy link
Author

akshath commented Jun 20, 2018

Please find attached the output of https://<node-hostname>:8080/#/reports/range/199
Cockroach Console.pdf

@akshath
Copy link
Author

akshath commented Jun 21, 2018

Let me know if you need any more info from my side.

@tbg
Copy link
Member

tbg commented Jun 21, 2018

Hmm @akshath this seems like you're running into #25233. Let me figure out the next steps.

@tbg tbg added the A-kv-client Relating to the KV client and the KV interface. label Jun 21, 2018
@tbg
Copy link
Member

tbg commented Jun 21, 2018

@akshath are you sure the PDF is for the right range (unfortunately the key boundaries are cut off). I'm asking because I'd expect a larger range size (especially system bytes) but am not seeing that for the range you provided.

If you're having trouble finding the right range, try select * from crdb_internal.ranges and find the correct ID (you can use a where clause).

@a-robinson
Copy link
Contributor

You can also hover over the table entry on the UI page to see the full keys.

@akshath
Copy link
Author

akshath commented Jun 22, 2018

Yes I think its the wrong range, will try to get you the right range.

In the meanwhile, I tried to increase raft range using below command.

SET CLUSTER SETTING kv.raft.command.max_size='100000000';

But now one of my node is stopping continuously after starting for few mins. Please do find the logs of that node here.
cockroach.log.zip

.37:26257/> select node_id, application_name, flags, anonymized, count, first_attempt_count, max_retries, last_error, rows_avg, rows_var, parse_lat_avg, parse_lat_var, plan_lat_avg, plan_lat_var, run_lat_avg, run_lat_var, service_lat_avg, service_lat_var, overhead_lat_avg, overhead_lat_var  from crdb_internal.node_statement_statistics order by service_lat_avg desc limit 10;
+---------+------------------------+-------+--------------------------------------------------------------------------+-------+---------------------+-------------+--------------------------------------------------------------------------+----------+----------+---------------+---------------+-----------------------+------------------------+-------------------+-----------------------+--------------------+-----------------------+------------------------+-----------------------+
| node_id |    application_name    | flags |                                anonymized                                | count | first_attempt_count | max_retries |                                last_error                                | rows_avg | rows_var | parse_lat_avg | parse_lat_var |     plan_lat_avg      |      plan_lat_var      |    run_lat_avg    |      run_lat_var      |  service_lat_avg   |    service_lat_var    |    overhead_lat_avg    |   overhead_lat_var    |
+---------+------------------------+-------+--------------------------------------------------------------------------+-------+---------------------+-------------+--------------------------------------------------------------------------+----------+----------+---------------+---------------+-----------------------+------------------------+-------------------+-----------------------+--------------------+-----------------------+------------------------+-----------------------+
|       4 | PostgreSQL JDBC Driver | !     | UPSERT INTO _(_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,   |   397 |                 397 |           0 | failed to send RPC: sending to all 2 replicas failed; last error: { |        0 |        0 |             0 |             0 | 0.0004124487329974809 | 9.894478819524491e-09  | 60.00055264033502 | 5.612509102459291e-06 | 60.002029279753145 | 3.729748127091705e-06 |  0.0010641906851387034 | 1.827346683951616e-06 |
|         |                        |       | _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _) VALUES ($1, $2,    |       |                     |             | initial connection heartbeat failed: rpc error: code = Unavailable desc  |          |          |               |               |                       |                        |                   |                       |                    |                       |                        |                       |
|         |                        |       | $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, |       |                     |             | = all SubConns are in TransientFailure}                                  |          |          |               |               |                       |                        |                   |                       |                    |                       |                        |                       |
|         |                        |       | $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, $29, $30, $31, $32,    |       |                     |             |                                                                          |          |          |               |               |                       |                        |                   |                       |                    |                       |                        |                       |
|         |                        |       | $33, $34, $35, $36, $37)                                                 |       |                     |             |                                                                          |          |          |               |               |                       |                        |                   |                       |                    |                       |                        |                       |
|       4 | PostgreSQL JDBC Driver | !     | INSERT INTO _(_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,   |    59 |                  59 |           0 | failed to send RPC: sending to all 2 replicas failed; last error: { |        0 |        0 |             0 |             0 | 0.0001890023389830509 | 1.6361461264003518e-09 | 60.00109911883051 | 8.380717212976845e-07 | 60.002010979355916 | 7.45170076613103e-07  |  0.0007228581864416222 | 1.599297494696807e-08 |
|         |                        |       | _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _) VALUES ($1, $2,    |       |                     |             | initial connection heartbeat failed: rpc error: code = Unavailable desc  |          |          |               |               |                       |                        |                   |                       |                    |                       |                        |                       |
|         |                        |       | $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, |       |                     |             | = all SubConns are in TransientFailure}                                  |          |          |               |               |                       |                        |                   |                       |                    |                       |                        |                       |
|         |                        |       | $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, $29, $30, $31, $32,    |       |                     |             |                                                                          |          |          |               |               |                       |                        |                   |                       |                    |                       |                        |                       |
|         |                        |       | $33, $34, $35, $36, $37)                                                 |       |                     |             |                                                                          |          |          |               |               |                       |                        |                   |                       |                    |                       |                        |                       |
|       4 | cockroach sql          |       | SELECT * FROM _.crdb_internal.node_statement_statistics ORDER BY _ DESC  |     1 |                   1 |           0 | NULL                                                                     |        3 | NaN      | 5.3535e-05    | NaN           |           0.000487278 | NaN                    | 5.3257e-05        | NaN                   |        0.000743264 | NaN                   | 0.00014919399999999995 | NaN                   |
|         |                        |       | LIMIT _                                                                  |       |                     |             |                                                                          |          |          |               |               |                       |                        |                   |                       |                    |                       |                        |                       |
|       4 | cockroach sql          |       | SELECT * FROM _.crdb_internal.node_build_info                            |     1 |                   1 |           0 | NULL                                                                     |        6 | NaN      |   0.000145061 | NaN           |           0.000151945 | NaN                    | 1.7204e-05        | NaN                   |        0.000458473 | NaN                   | 0.00014426299999999994 | NaN                   |
+---------+------------------------+-------+--------------------------------------------------------------------------+-------+---------------------+-------------+--------------------------------------------------------------------------+----------+----------+---------------+---------------+-----------------------+------------------------+-------------------+-----------------------+--------------------+-----------------------+------------------------+-----------------------+
(4 rows)

Time: 8.336417ms

@tbg
Copy link
Member

tbg commented Jun 22, 2018

@akshath Please restore it to the defaults. Unfortunately with this setting it is possible to add commands to the Raft log that exceed what can be held in memory. This is not easy to undo, so let's hope that it eventually stops crash-looping.

@akshath
Copy link
Author

akshath commented Jun 22, 2018

Thank you @tschottdorf for the quick revert.

I have rolled-back that setting with the following command.

SET CLUSTER SETTING kv.raft.command.max_size='62914560';

But still, one of my node is not coming up and keeps exiting. Please do find the latest logs. Thanks
cockroach.log.zip

@tbg
Copy link
Member

tbg commented Jun 23, 2018

@akshath on the machine which keeps crashing, run this (you may have to adjust the url):

while true; do f=$(date '+heap.%Y%m%d-%H%M%S'); curl http://localhost:8080/debug/pprof/heap?debug=1 > $f || rm $f; sleep 10; done

When it crashes again, upload the heap.* files (they don't contain sensitive information).

Thanks!

tbg added a commit to tbg/cockroach that referenced this issue Jun 23, 2018
We are suspecting that some workloads create a sufficient number of
abort span records to cause split commands that are too large to be
proposed to Raft, in effect rendering the range permanently
unsplittable. This is a problem since the range will at some point
start to backpressure writes (and even without that, it's a resource
hog). Most of the problematic abort span records would very likely
be removed during garbage collection; however, abort span records
aren't counted in any quantity that triggers GC.

Instead of copying the entirety of the abort span, restrict to the
entries that would not be removed by a GC operation. In practice,
this means that unless a high number of abort span records are
created every hour, very few records will actually be copied, and
in return the split command size should be small.

See cockroachdb#26830.

Release note (bug fix): Avoid a situation in which ranges repeatedly
fail to perform a split.
@akshath
Copy link
Author

akshath commented Jun 23, 2018

Please do find the last two heap files. Thanks

heap.zip

tbg added a commit to tbg/cockroach that referenced this issue Jun 24, 2018
We are suspecting that some workloads create a sufficient number of
abort span records to cause split commands that are too large to be
proposed to Raft, in effect rendering the range permanently
unsplittable. This is a problem since the range will at some point
start to backpressure writes (and even without that, it's a resource
hog). Most of the problematic abort span records would very likely
be removed during garbage collection; however, abort span records
aren't counted in any quantity that triggers GC.

Instead of copying the entirety of the abort span, restrict to the
entries that would not be removed by a GC operation. In practice,
this means that unless a high number of abort span records are
created every hour, very few records will actually be copied, and
in return the split command size should be small.

See cockroachdb#26830.

Release note (bug fix): Avoid a situation in which ranges repeatedly
fail to perform a split.
craig bot pushed a commit that referenced this issue Jun 25, 2018
26934: storage: filter abort span during split copy r=nvanbenschoten a=tschottdorf

We are suspecting that some workloads create a sufficient number of
abort span records to cause split commands that are too large to be
proposed to Raft, in effect rendering the range permanently
unsplittable. This is a problem since the range will at some point
start to backpressure writes (and even without that, it's a resource
hog). Most of the problematic abort span records would very likely
be removed during garbage collection; however, abort span records
aren't counted in any quantity that triggers GC.

Instead of copying the entirety of the abort span, restrict to the
entries that would not be removed by a GC operation. In practice,
this means that unless a high number of abort span records are
created every hour, very few records will actually be copied, and
in return the split command size should be small.

See #26830.

Release note (bug fix): Avoid a situation in which ranges repeatedly
fail to perform a split.

Co-authored-by: Tobias Schottdorf <tobias.schottdorf@gmail.com>
tbg added a commit to tbg/cockroach that referenced this issue Jun 25, 2018
We are suspecting that some workloads create a sufficient number of
abort span records to cause split commands that are too large to be
proposed to Raft, in effect rendering the range permanently
unsplittable. This is a problem since the range will at some point
start to backpressure writes (and even without that, it's a resource
hog). Most of the problematic abort span records would very likely
be removed during garbage collection; however, abort span records
aren't counted in any quantity that triggers GC.

Instead of copying the entirety of the abort span, restrict to the
entries that would not be removed by a GC operation. In practice,
this means that unless a high number of abort span records are
created every hour, very few records will actually be copied, and
in return the split command size should be small.

See cockroachdb#26830.

Release note (bug fix): Avoid a situation in which ranges repeatedly
fail to perform a split.
@tbg
Copy link
Member

tbg commented Jun 25, 2018

@akshath this looks like what I expected. You have a long tail of uncommitted very large Raft proposals in the Raft log (presumably due to incrementing the max proposal size as you did temporarily). During startup, the Raft group is elected leader (presumably because it calls elections for some high term and has the highest log, but @bdarnell, shouldn't there be a leader alredy, and shouldn't that prevent the candidate state in the first place?) and that means that it has to load all of the entries from storage, which causes the out of memory error.

@bdarnell is this expected behavior? I'm surprised that this replica calls an election. I think the other replicas are online (please confirm, @akshath -- and please also supply the range info page).

profile001.pdf

@bdarnell
Copy link
Contributor

Yeah, I'm confused about why we're seeing elections here. Maybe the range is dormant and for some reason this node is the only one that wakes it up? If the other nodes are up, one of them should become leader and invalidate the failing nodes' uncommitted entries.

In any case, I thought this was fixed in 2.0, but looking back at the fix (etcd-io/etcd#9073), my change doesn't actually do what I meant for it to do.

@tbg
Copy link
Member

tbg commented Jun 25, 2018

Is it possible not to call entries here?

https://github.com/cockroachdb/vendored/blob/1fb3eecf3ee812eafd14b224ed1cfce284494986/github.com/coreos/etcd/raft/raft.go#L696

From the code it looks like all entries is used for is to get the last entry (and make it the pending conf index if it's ahead of the commit index).

@akshath we'll need to see the Range status page for the correct RangeID to make future suggestions. It can be retrieved from one of the other nodes while your problematic node is down. I would leave that node down, by the way.

craig bot pushed a commit that referenced this issue Jun 25, 2018
26944: backport-2.0: storage: filter abort span during split copy r=bdarnell a=tschottdorf

Backport 1/1 commits from #26934.

Needed slight adjustments to apply, but nothing in the split logic.

/cc @cockroachdb/release

---

We are suspecting that some workloads create a sufficient number of
abort span records to cause split commands that are too large to be
proposed to Raft, in effect rendering the range permanently
unsplittable. This is a problem since the range will at some point
start to backpressure writes (and even without that, it's a resource
hog). Most of the problematic abort span records would very likely
be removed during garbage collection; however, abort span records
aren't counted in any quantity that triggers GC.

Instead of copying the entirety of the abort span, restrict to the
entries that would not be removed by a GC operation. In practice,
this means that unless a high number of abort span records are
created every hour, very few records will actually be copied, and
in return the split command size should be small.

See #26830.

Release note (bug fix): Avoid a situation in which ranges repeatedly
fail to perform a split.


Co-authored-by: Tobias Schottdorf <tobias.schottdorf@gmail.com>
@bdarnell
Copy link
Contributor

Yes, we should be able to remove the call to entries (I meant to do so in etcd-io/etcd#9073). Filed #26946 to track this.

@akshath
Copy link
Author

akshath commented Jun 25, 2018

Right now only 3 of 6 nodes are up. Other 3 keep going down if I try to restart.

When I try to access 'admin ui' I keep getting the 'Connection to CockroachDB node lost.' error. Can you please guide me on how to get correct range info page? Also, the application is no longer able to connect to the DB.

Please do find the current output from SQL window.

37:26257/> select node_id, application_name, flags, anonymized, count, first_attempt_count, max_retries, last_error, rows_avg, rows_var, parse_lat_avg, parse_lat_var, plan_lat_avg, plan_lat_var, run_lat_avg, run_lat_var, service_lat_avg, service_lat_var, overhead_lat_avg, overhead_lat_var  from crdb_internal.node_statement_statistics order by service_lat_avg desc limit 10;
+---------+------------------------+-------+--------------------------------------------------------------------------+-------+---------------------+-------------+--------------------------------------------------------------------------+----------+----------+---------------+---------------+------------------------+------------------------+--------------------+------------------------+-------------------+------------------------+------------------------+------------------------+
| node_id |    application_name    | flags |                                anonymized                                | count | first_attempt_count | max_retries |                                last_error                                | rows_avg | rows_var | parse_lat_avg | parse_lat_var |      plan_lat_avg      |      plan_lat_var      |    run_lat_avg     |      run_lat_var       |  service_lat_avg  |    service_lat_var     |    overhead_lat_avg    |    overhead_lat_var    |
+---------+------------------------+-------+--------------------------------------------------------------------------+-------+---------------------+-------------+--------------------------------------------------------------------------+----------+----------+---------------+---------------+------------------------+------------------------+--------------------+------------------------+-------------------+------------------------+------------------------+------------------------+
|       4 | PostgreSQL JDBC Driver | !     | INSERT INTO _(_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,   |   154 |                 154 |           0 | failed to send RPC: sending to all 1 replicas failed; last error: { |        0 |        0 |             0 |             0 | 0.00028536728571428564 | 1.8641165387438835e-07 | 60.003291508188305 | 3.4200656321583525e-05 |  60.0046154097013 | 3.286236984590149e-05  |  0.0010385342272728945 | 2.782355718221276e-07  |
|         |                        |       | _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _) VALUES ($1, $2,    |       |                     |             | rpc error: code = Canceled desc = context canceled}                      |          |          |               |               |                        |                        |                    |                        |                   |                        |                        |                        |
|         |                        |       | $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, |       |                     |             |                                                                          |          |          |               |               |                        |                        |                    |                        |                   |                        |                        |                        |
|         |                        |       | $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, $29, $30, $31, $32,    |       |                     |             |                                                                          |          |          |               |               |                        |                        |                    |                        |                   |                        |                        |                        |
|         |                        |       | $33, $34, $35, $36, $37)                                                 |       |                     |             |                                                                          |          |          |               |               |                        |                        |                    |                        |                   |                        |                        |                        |
|       4 | PostgreSQL JDBC Driver | !     | SELECT * FROM _ WHERE _ = $1                                             |     1 |                   1 |           0 | failed to send RPC: sending to all 1 replicas failed; last error: { |        0 | NaN      |             0 | NaN           |            0.000424678 | NaN                    |       60.003456095 | NaN                    |      60.004421181 | NaN                    |   0.000540407999999104 | NaN                    |
|         |                        |       |                                                                          |       |                     |             | rpc error: code = Canceled desc = context canceled}                      |          |          |               |               |                        |                        |                    |                        |                   |                        |                        |                        |
|       4 | PostgreSQL JDBC Driver | !     | UPSERT INTO _(_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,   |   718 |                 718 |           0 | failed to send RPC: sending to all 1 replicas failed; last error: { |        0 |        0 |             0 |             0 |  0.0006447936197771588 | 2.2162686355907175e-06 |  59.99943388852792 | 2.258043766237384e-05  | 60.00151220645542 | 1.6787926880634934e-05 |  0.0014335243077994557 | 1.6650101094467024e-06 |
|         |                        |       | _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _) VALUES ($1, $2,    |       |                     |             | rpc error: code = Canceled desc = context canceled}                      |          |          |               |               |                        |                        |                    |                        |                   |                        |                        |                        |
|         |                        |       | $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16, $17, $18, |       |                     |             |                                                                          |          |          |               |               |                        |                        |                    |                        |                   |                        |                        |                        |
|         |                        |       | $19, $20, $21, $22, $23, $24, $25, $26, $27, $28, $29, $30, $31, $32,    |       |                     |             |                                                                          |          |          |               |               |                        |                        |                    |                        |                   |                        |                        |                        |
|         |                        |       | $33, $34, $35, $36, $37)                                                 |       |                     |             |                                                                          |          |          |               |               |                        |                        |                    |                        |                   |                        |                        |                        |
|       4 | cockroach sql          |       | SELECT * FROM _.crdb_internal.node_statement_statistics ORDER BY _ DESC  |     1 |                   1 |           0 | NULL                                                                     |        4 | NaN      | 6.7415e-05    | NaN           |            0.001236047 | NaN                    | 8.2888e-05         | NaN                    |       0.001572622 | NaN                    | 0.00018627200000000013 | NaN                    |
|         |                        |       | LIMIT _                                                                  |       |                     |             |                                                                          |          |          |               |               |                        |                        |                    |                        |                   |                        |                        |                        |
|       4 | cockroach sql          |       | SELECT * FROM _.crdb_internal.node_build_info                            |     1 |                   1 |           0 | NULL                                                                     |        6 | NaN      |   0.000128333 | NaN           |            0.000147583 | NaN                    | 3.1358e-05         | NaN                    |       0.000454601 | NaN                    |            0.000147327 | NaN                    |
+---------+------------------------+-------+--------------------------------------------------------------------------+-------+---------------------+-------------+--------------------------------------------------------------------------+----------+----------+---------------+---------------+------------------------+------------------------+--------------------+------------------------+-------------------+------------------------+------------------------+------------------------+
(5 rows)

Time: 12.07971ms

@akshath
Copy link
Author

akshath commented Jun 27, 2018

Hi Team, Any advice on how I should proceed? Thanks

@tbg
Copy link
Member

tbg commented Jun 27, 2018

@bdarnell has an upstream fix merged, which we should be able to cherry-pick onto the release-2.0 branch today. You should try running with that binary when it's available; it should mitigate this specific out of memory error, but there is a possibility that you will see more problems due to the large Raft log that has been built up.

@akshath
Copy link
Author

akshath commented Jun 28, 2018

Thanks, @tschottdorf for getting back. Waiting for the new release.

@tbg
Copy link
Member

tbg commented Jul 2, 2018

@akshath if you're able to run a custom build, commit ce128dd from the release-2.0 branch has the mitigation. The easiest way to build it is via ./build/builder.sh make build TYPE=release-linux-gnu. Note that the resulting binary will not be named cockroach, but will have some suffix.

@tbg
Copy link
Member

tbg commented Jul 17, 2018

@akshath v2.0.4 was released today, with the mitigation for this problem. Please give it a try!

@akshath
Copy link
Author

akshath commented Jul 17, 2018

Thank you, will upgrade and update back.

@tbg tbg added the C-investigation Further steps needed to qualify. C-label will change. label Jul 22, 2018
@tbg tbg added this to the 2.1 milestone Jul 22, 2018
@tbg
Copy link
Member

tbg commented Jul 31, 2018

@akshath, any news from your end? In other cases, we have seen that it was still difficult to revive the server due to other locations in the code that struggle with the large Raft log.
On the plus side, we have fixed the root cause in #27774 (i.e., you wouldn't get into this situation in the future, assuming you're running v2.0.5 once that comes out in a few weeks).

@tbg tbg self-assigned this Jul 31, 2018
@tbg tbg added S-1-stability Severe stability issues that can be fixed by upgrading, but usually don’t resolve by restarting and removed C-investigation Further steps needed to qualify. C-label will change. labels Jul 31, 2018
@akshath
Copy link
Author

akshath commented Aug 2, 2018

not much.. 1 node of 6 keeps quitting

@tbg
Copy link
Member

tbg commented Aug 7, 2018

Could you provide more information? This isn't expected.

@tbg tbg added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Aug 7, 2018
@tbg tbg added C-investigation Further steps needed to qualify. C-label will change. and removed S-1-stability Severe stability issues that can be fixed by upgrading, but usually don’t resolve by restarting labels Aug 16, 2018
@tbg
Copy link
Member

tbg commented Aug 16, 2018

Closing as the known problem here was a large Raft log, and @bdarnell and @nvanbenschoten have since landed a series of fixes that we believe eliminates both the root cause and mitigates the effects.

@tbg tbg closed this as completed Aug 16, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-kv-client Relating to the KV client and the KV interface. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. C-investigation Further steps needed to qualify. C-label will change.
Projects
None yet
Development

No branches or pull requests

4 participants