
x/bank: CPU: exorcise app.Deliver->runTx as it consumes more CPU cycles than app.Commit per BenchmarkOneBankMultiSendTxPerBlock #8697

Closed
odeke-em opened this issue Feb 25, 2021 · 9 comments
Labels
T: Performance Performance improvements


@odeke-em
Collaborator

Summary of Bug

As part of cosmos-sdk benchmarking, this issue provides a guide to identifying the culprits and what needs to be investigated and improved. Inside x/bank/bench_test.go there is a looming need to figure out which phase consumes more CPU:

```go
// Run this with a profiler, so it's easy to distinguish what time comes from
// committing, and what time comes from Check/Deliver Tx.
for i := 0; i < b.N; i++ {
	benchmarkApp.BeginBlock(abci.RequestBeginBlock{Header: tmproto.Header{Height: height}})
	_, _, err := benchmarkApp.Check(txGen.TxEncoder(), txs[i])
	if err != nil {
		panic("something is broken in checking transaction")
	}
	_, _, err = benchmarkApp.Deliver(txGen.TxEncoder(), txs[i])
	require.NoError(b, err)
	benchmarkApp.EndBlock(abci.RequestEndBlock{Height: height})
	benchmarkApp.Commit()
	height++
}
```

The target here is to figure out how to improve throughput and what is worth optimizing. We've built continuous benchmarking infrastructure that extracts a git commit from a PR, runs the benchmarks, and posts the results so cosmos-sdk engineers can see what could have changed.

Results

app.Deliver->runTx consumes far more CPU cycles (10.30s) than app.Commit (1.16s)

Commit CPU graph: (pprof CPU graph for app.Commit)

Deliver CPU graph: (pprof CPU graph for app.Deliver)

Version

Latest at 90d799f

Steps to Reproduce

To reproduce, please run

```shell
go test -run='^$' -bench=OneBankMultiSendTxPerBlock -cpuprofile=ms.cpu -memprofile=ms.mem
```

and when it completes, run pprof to examine the respective profiles (e.g. `go tool pprof ms.cpu` and `go tool pprof ms.mem`).

/cc @ethanfrey @cuonglm @okwme


@odeke-em odeke-em added the T: Performance Performance improvements label Feb 25, 2021
@github-actions github-actions bot added the stale label Aug 11, 2021
@odeke-em odeke-em reopened this Aug 21, 2021
@ethanfrey
Contributor

> app.Deliver->runTx consumes far more CPU cycles (10.30s) than app.Commit (1.16s)

Yes, this seems like a reasonable conclusion.

Commit may have more I/O (it has all the writes, while Deliver has the possibly cached reads). It would be interesting to have a graph of I/O time here as well.

Also, are you using LevelDB with disk-backed storage? That is what makes Commit and Deliver reads slower.

@github-actions github-actions bot removed the stale label Aug 22, 2021
@alexanderbez
Contributor

alexanderbez commented Aug 23, 2021


Snapshot from Osmosis. Not sure if it's helpful or relevant here. Note: Osmosis has distribution epochs, which is why you see the uniform spikes.

@ethanfrey
Contributor

Interesting real-world graph, thank you @alexanderbez.

Are the commit times maxing out at 100ms, or is that an average? I have heard of multi-second EndBlock times for the distribution (though that might be EndBlock rather than Commit).

@alexanderbez
Contributor

This is an average. I'll post the Prom query shortly.

@odeke-em
Collaborator Author

> This is an average. I'll post the Prom query shortly.

Averages are fallacious -- imagine my sister, Bill Gates, my 8 friends, and I all walked into a room and someone asked for the average wealth; you can see how skewed that is.

I'd highly encourage using percentiles rather than averages. For example, here is how to get the 95th percentile for a metric named "metric" in Prometheus:

```
histogram_quantile(0.95,
    sum(rate(metric_bucket[5m])) by (job, le))
```

cc @kirbyquerby and @cuonglm

@alexanderbez
Contributor

> Averages are fallacious -- imagine my sister, Bill Gates, my 8 friends, and I all walked into a room and someone asked for the average wealth; you can see how skewed that is.

Sure, but that's not at all what's happening here.

The query is:

```
rate(abci_deliver_tx_sum{job="tendermint", hostname="$hostname", host="", chain_id="$chain_id"}[5m]) / rate(abci_deliver_tx_count{job="tendermint", hostname="$hostname", host="", chain_id="$chain_id"}[5m])
```

But this thread isn't about Prom, so use this how you wish.

@odeke-em
Collaborator Author

> > Averages are fallacious -- imagine my sister, Bill Gates, my 8 friends, and I all walked into a room and someone asked for the average wealth; you can see how skewed that is.
>
> Sure, but that's not at all what's happening here.
>
> The query is:
>
> rate(abci_deliver_tx_sum{job="tendermint", hostname="$hostname", host="", chain_id="$chain_id"}[5m]) / rate(abci_deliver_tx_count{job="tendermint", hostname="$hostname", host="", chain_id="$chain_id"}[5m])
>
> But this thread isn't about Prom, so use this how you wish.

@alexanderbez I raised that because we are dealing with latencies. Could you help us produce the bucketized p95 latency graph?

@alexanderbez
Contributor

Yup, I'll run the query in a bit! I also have a public Gaia instance if you want to run your own Prom queries.

@elias-orijtech
Contributor

Gentle ping. Still relevant?
