update en doc #180

Merged 1 commit on Aug 16, 2021
5 changes: 2 additions & 3 deletions docs/.vuepress/config.js
@@ -134,9 +134,8 @@ module.exports = {
title: 'Operation',
collapsable: false,
children: [
-  ['', 'OM Advanced'],
-  ['Efficiency_of_sealing.md', 'Improve the efficiency of the sealing sector'],
-  ['System_monitor_of_Zabbix.md', 'System monitoring installation and use of Zabbix'],
+  ['', 'Finding optimal configurations'],
+  // ['System_monitor_of_Zabbix.md', 'System monitoring installation and use of Zabbix'],
]
}
],
309 changes: 304 additions & 5 deletions docs/operation/README.md
@@ -1,9 +1,308 @@
# Advanced operations for standalone venus components
## Find your optimal configurations

This series of documents records how to operate the standalone venus components: keeping storage power growing steadily, making the most of system resources, configuring venus-worker tasks, and so on.
Getting your Filecoin mining operation up and running is hard. Growing it is even harder. It takes a lot of time to scale up power while making sure your setup keeps running without errors.

## Overview

General guidelines to follow when optimizing your sealing pipeline.

- Pledge 2 to 4 sectors and record the exact time each task (AP, P1, P2, C2) takes to finish
- Make sure all your boxes have tasks assigned to them at all times
- Automate your `sector pledge` command with a [script](https://filecoinproject.slack.com/archives/CPFTWMY7N/p1628092388117700?thread_ts=1628092099.117600&cid=CPFTWMY7N) or cron (see the sketch after this list)
- Use `MaxSealingSectors` to cap the maximum number of sectors sealing in parallel
- Assign each worker a subset of tasks (AP, P1, P2, C2) to specialize in
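
As a rough illustration of the cron approach above, the entry below pledges on a fixed schedule. The binary path and the `sectors pledge` subcommand are assumptions here; adapt them to your own setup or to the script linked above.

```bash
# Hypothetical crontab entry: pledge every 30 minutes and let MaxSealingSectors cap parallelism.
# The binary path and subcommand are placeholders; adjust them to your deployment.
*/30 * * * * /usr/local/bin/venus-sealer sectors pledge >> /var/log/pledge-cron.log 2>&1
```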

## Record time for each task

Types of tasks that a worker can do:

```go
TTAddPiece TaskType = "seal/v0/addpiece"
TTPreCommit1 TaskType = "seal/v0/precommit/1"
TTPreCommit2 TaskType = "seal/v0/precommit/2"
TTCommit1 TaskType = "seal/v0/commit/1" // NOTE: We use this to transfer the sector into miner-local storage for now; Don't use on workers!
TTCommit2 TaskType = "seal/v0/commit/2"

TTFinalize TaskType = "seal/v0/finalize"
TTFetch TaskType = "seal/v0/fetch"
TTUnseal TaskType = "seal/v0/unseal"
```

Each task first appears in the log with the keyword `prepare` (start and end), then with the keyword `work` as separate log entries (also start and end).

```bash
# seal/v0/fetch
2021-08-03T14:00:07.925+0800 INFO advmgr sector-storage/sched_worker.go:401 Sector 7 prepare for seal/v0/fetch ...
2021-08-03T14:05:36.772+0800 INFO advmgr sector-storage/sched_worker.go:403 Sector 7 prepare for seal/v0/fetch end ...

2021-08-03T14:05:36.772+0800 INFO advmgr sector-storage/sched_worker.go:442 Sector 7 work for seal/v0/fetch ...
2021-08-03T14:05:36.774+0800 INFO advmgr sector-storage/sched_worker.go:444 Sector 7 work for seal/v0/fetch end ...

# seal/v0/addpiece
2021-08-03T13:38:37.977+0800 INFO advmgr sector-storage/sched_worker.go:401 Sector 8 prepare for seal/v0/addpiece ...
2021-08-03T13:38:37.978+0800 INFO advmgr sector-storage/sched_worker.go:403 Sector 8 prepare for seal/v0/addpiece end ...

2021-08-03T13:38:37.978+0800 INFO advmgr sector-storage/sched_worker.go:442 Sector 8 work for seal/v0/addpiece ...
2021-08-03T13:44:26.295+0800 INFO advmgr sector-storage/sched_worker.go:444 Sector 8 work for seal/v0/addpiece end ...

# seal/v0/commit/2
2021-08-03T13:26:02.119+0800 INFO advmgr sector-storage/sched_worker.go:401 Sector 7 prepare for seal/v0/commit/2 ...
2021-08-03T13:26:02.119+0800 INFO advmgr sector-storage/sched_worker.go:403 Sector 7 prepare for seal/v0/commit/2 end ...

2021-08-03T13:26:02.119+0800 INFO advmgr sector-storage/sched_worker.go:442 Sector 7 work for seal/v0/commit/2 ...
2021-08-03T13:49:46.180+0800 INFO advmgr sector-storage/sched_worker.go:444 Sector 7 work for seal/v0/commit/2 end ...

# seal/v0/finalize
2021-08-03T13:54:17.414+0800 INFO advmgr sector-storage/sched_worker.go:401 Sector 7 prepare for seal/v0/finalize ...
2021-08-03T13:59:30.471+0800 INFO advmgr sector-storage/sched_worker.go:403 Sector 7 prepare for seal/v0/finalize end ...

2021-08-03T13:59:30.471+0800 INFO advmgr sector-storage/sched_worker.go:442 Sector 7 work for seal/v0/finalize ...
2021-08-03T14:00:07.915+0800 INFO advmgr sector-storage/sched_worker.go:444 Sector 7 work for seal/v0/finalize end ...
```

Some tasks spend more time in `prepare` than in `work`, and some the other way around. Generally speaking, a task that mainly needs network transfer/bandwidth spends more time in `prepare`, while a task that mainly needs computation resources (e.g. AP, P1, P2, C2) spends more time in `work`.

To record the time of core tasks like AP, P1, P2 and C2, add the time of the `fetch` that precedes the task to the time of the task itself. For example, recorded time of P1 = time of P1 + time of the fetch before P1.
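
You can also script this instead of timing tasks by hand. Below is a minimal sketch, assuming the log format shown above and GNU `date`; the log path, sector number and task type are placeholders, and the preceding `fetch` can be measured the same way and added in.

```bash
# Rough sketch: time the `work` phase of one task by diffing its start/end log timestamps.
log=venus-worker.log            # placeholder log path
sector=8
task="seal/v0/addpiece"
start=$(grep "Sector $sector work for $task \.\.\." "$log" | head -1 | awk '{print $1}' | cut -d. -f1 | tr T ' ')
end=$(grep "Sector $sector work for $task end" "$log" | head -1 | awk '{print $1}' | cut -d. -f1 | tr T ' ')
echo "work phase: $(( ($(date -d "$end" +%s) - $(date -d "$start" +%s)) / 60 )) minutes"
```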

## Performance factors

There are many factors that contribute to the performance of your sealing pipeline.

### Sealing storage

During sealing of a sector, cache files are generated by the proof algorithm, which requires high disk IO speed. Low IO speed may leave your computation resources (CPUs/GPUs) idle.

Choose appropriate hardware using the formula below.

```bash
file size * number of parallel threads / operation time = average file IO speed
```

To get a more precise estimate, sum up the IO throughput of each task.

```bash
AP IO throughput = AP read + AP write
P1 IO throughput = P1 read + P1 write
P2 IO throughput = P2 read + P2 write
C2 IO throughput = C2 read + C2 write
```
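
As a quick illustration of the formula (the numbers here are made up, not measurements): with 7 P1 tasks in parallel, each moving roughly 450G of layer data over 240 minutes, the sealing storage needs to sustain roughly:

```bash
# 7 parallel tasks * ~450G each, over 240 minutes, expressed in MB/s
echo "scale=1; 7 * 450 * 1024 / (240 * 60)" | bc    # ~224.0 MB/s
```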

SSDs and NVMe drives are commonly used for sealing storage. To make efficient use of these faster drives, it is recommended to combine them with software RAID.

```bash
mdadm -C /dev/md1 -l 0 -n 2 /dev/sdb1 /dev/sdc1
mdadm -C /dev/md2 -l 5 -n 6 /dev/sd[b-g]1
# Options
-C, --create
Create a new array.
-l, --level=
Set RAID level.
-n, --raid-devices=
Specify the number of active devices in the array.
-x, --spare-devices=
Specify the number of spare (eXtra) devices in the initial array.
-A, --assemble
Assemble a pre-existing array.
```

For more on `mdadm`, see [here](http://raid.wiki.kernel.org/). Get the latest version from [here](http://www.kernel.org/pub/linux/utils/raid/mdadm/).
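
After creating an array, it is worth checking its status and putting a filesystem on it before use; the device name and mount point below are only examples.

```bash
cat /proc/mdstat                                  # overall software-RAID status
mdadm --detail /dev/md1                           # details of a single array
mkfs.xfs /dev/md1                                 # or mkfs.ext4, depending on preference
mkdir -p /mnt/sealing && mount /dev/md1 /mnt/sealing
```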

### Permanent storage

Possible challenges to overcome when setting up permanent storage.

1. When a sector is sealed, it is transferred from the sealer to permanent storage, which takes up network bandwidth and disk IO.
2. During a `windowPoSt`, a large number of randomly selected files are read. Slow reads may result in a failed `windowPoSt`.
3. Choose a RAID level with redundancy when possible, e.g. RAID5, RAID6, RAID10.
4. Monitor the usage of your disk array (see the example below).
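
For the monitoring in point 4, standard Linux tools are enough; the mount point below is an example.

```bash
df -h /mnt/permanent-storage    # remaining capacity of the array
iostat -x 5                     # per-device utilization and latency (sysstat package)
```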

### Network transfer

During sealing, if you specialize each worker in one type of task (to increase the efficiency of your resources), files will be transferred over the network. If files are copied too slowly, they will drag down the speed of your sealing pipeline, so closely monitor your computation resources for idling. For example, if P2 takes 25 minutes, reads ~440G and writes ~100G, the required IO throughput is ~368 MB/s (`440 * 1024 / 25 / 60 + 100 * 1024 / 25 / 60`).

After sealing, the sealed sector needs to be transferred to permanent storage, which can be bottlenecked by the network bandwidth between your `venus-sealer` and your HDD disk array.
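
To check whether the link between the sealer and the storage box is the bottleneck, a plain bandwidth test is a reasonable first step; `iperf3` is one common choice, and the host name below is an example.

```bash
# On the storage box
iperf3 -s
# On the sealer, measure throughput towards the storage box for 30 seconds
iperf3 -c storage-box.local -t 30
```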

### Environment variables

SHA extensions make a huge difference when computing P1 tasks. P1 may take around 250 minutes with SHA extensions enabled, but 420+ minutes without them.
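
On Linux you can check whether a CPU exposes the SHA extensions before deciding where to run P1; the flag shows up in `/proc/cpuinfo`.

```bash
# Prints "sha_ni" once if the SHA extensions are available, nothing otherwise
grep -m1 -o 'sha_ni' /proc/cpuinfo
```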

When compiling `venus-sealer`, make sure you have set the `RUSTFLAGS="-C target-cpu=native -g" FFI_BUILD_FROM_SOURCE="1"` flags; you should see output similar to the following.

```bash
+ trap '{ rm -f $__build_output_log_tmp; }' EXIT
+ local '__rust_flags=--print native-static-libs -C target-feature=+sse2'
+ RUSTFLAGS='--print native-static-libs -C target-feature=+sse2'
+ cargo +nightly-2021-04-24 build --release --no-default-features --features multicore-sdr --features pairing,gpu
+ tee /tmp/tmp.IYtnd3xka9
Compiling autocfg v1.0.1
Compiling libc v0.2.97
Compiling cfg-if v1.0.0
Compiling proc-macro2 v1.0.27
Compiling unicode-xid v0.2.2
Compiling syn v1.0.73
Compiling lazy_static v1.4.0
Compiling cc v1.0.68
Compiling typenum v1.13.0
Compiling serde_derive v1.0.126
Compiling serde v1.0.126
```

### Core restriction

When running two types of tasks on the same box, you may want to restrict which CPU cores each task may use so that they do not compete for each other's resources.

One option is `taskset`. Note that you cannot change core restrictions while the program is running.

```bash
TRUST_PARAMS=1 nohup taskset -c 0-32 ./venus-worker run
# Non-consecutive core selection
taskset -c 0-9,19-29,39-49 <command>
```

Or through `cgroups` (cpuset), which allows changing core restrictions while the program is running.

```bash
sudo mkdir -p /sys/fs/cgroup/cpuset/Pre1-worker
echo 0-31 | sudo tee /sys/fs/cgroup/cpuset/Pre1-worker/cpuset.cpus
# cpuset.mems must also be set before processes can be attached
echo 0 | sudo tee /sys/fs/cgroup/cpuset/Pre1-worker/cpuset.mems
echo <PID> | sudo tee /sys/fs/cgroup/cpuset/Pre1-worker/cgroup.procs
```
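
To confirm the restriction actually applies to the worker process, inspect its CPU affinity afterwards.

```bash
taskset -cp <PID>                            # current CPU affinity list of the process
grep Cpus_allowed_list /proc/<PID>/status    # should report 0-31 after the steps above
```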

## Worker optimization

All numbers below are for 32G sectors. For 64G sectors, double them.

### P1 optimization

Set the following environment variables to speed up P1.

```bash
# Store cache files in RAM; for 32G sectors, it will cost 56G RAM
export FIL_PROOFS_MAXIMIZE_CACHING=1
# Use multiple cores for P1
export FIL_PROOFS_USE_MULTICORE_SDR=1
```

P1 RAM usage includes the 56G cache file, plus 2 layers of the sector for each sector sealing in parallel.

```bash
# Assume 10 sectors sealing in parallel
56G + 32G * 2 * 10 = 696G
```

P1 SSD usage includes 11 layers of the sector, the 64G `tree-d` file and the 32G unsealed sector.

```bash
# For 1 sector
11 * 32G + 64G + 32G = 448G
```

### P2 optimization

Set the following environment variables to speed up P2.

```bash
# Use GPU to build column hashes (tree-c)
export FIL_PROOFS_USE_GPU_COLUMN_BUILDER=1
# Use GPU to build tree-r-last
export FIL_PROOFS_USE_GPU_TREE_BUILDER=1
```

P2 RAM usage is about 96G per sector.

```bash
# Assume 10 sectors sealing in parallel
96G * 10 = 960G
```

P2 SSD usage includes eight 4.6G tree-c files, eight 9.2M tree-r-last files, a 4K t_aux file, a 4K p_aux file and the 32G unsealed sector file.

```bash
4.6G * 8 + 8 * 9.2M + 4K * 2 + 32G = ~70G
```

### Commit

C1 costs little CPU, but requires the sum of P1 and P2 SSD usage.

```bash
P1 448G + P2 ~70G = ~518G
```

C2 environment variables.

```bash
# Disable GPU for C2 (CPU-only proving)
BELLMAN_NO_GPU=1
# Declare a custom GPU and its core count; for example, if you are using a 3090
BELLMAN_CUSTOM_GPU="GeForce RTX 3090:10496"
```

C2 RAM usage.

```bash
128G + 64G = 192G
```

## Optimize sealing pipeline

### Calculate your daily growth

Calculate how many tasks your sealing pipeline can process.

```bash
# for each type of task
tasks done / time = production rate
daily production rate * (32G OR 64G) = daily growth in power
```

For example, if one box can finish P1 in 240 minutes, P2 in 30 minutes and Commit in 35 minutes, you can derive daily growth from the following table.

| Task   | Minutes | Parallel | Hourly production rate |
| ------ | ------- | -------- | ---------------------- |
| P1     | 240     | 1        | 0.25 = 1 / (240 / 60)  |
| P2     | 30      | 1        | 2 = 1 / (30 / 60)      |
| Commit | 35      | 1        | 1.71 = 1 / (35 / 60)   |
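
Putting these numbers together, the slowest stage bounds the whole pipeline: with a single P1 at 0.25 sectors per hour, daily growth for this box is roughly:

```bash
# 0.25 sectors/hour * 24 hours * 32G per sector
echo "0.25 * 24 * 32" | bc    # prints 192.00, i.e. ~192G of new power per day
```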

### Finding optimal task configurations

From the table above, we can see that daily growth will be bottlenecked by P1. Adjust the number of parallel tasks for each task type to achieve maximum efficiency.

| Task   | Minutes | Parallel | Hourly production     | Output | Memory consumption |
| ------ | ------- | -------- | --------------------- | ------ | ------------------ |
| P1     | 240     | 7        | 1.75 = 7 / (240 / 60) | 1344 G | 504 G = 7*64+56    |
| P2     | 30      | 1        | 2 = 1 / (30 / 60)     | 1536 G | 96 G = 1*96        |
| Commit | 35      | 1        | 1.71 = 1 / (35 / 60)  | 1316 G | 192 G = 1*128+64   |

The goal is to keep the `output` of each task as close as possible so that the sealing pipeline runs at maximum efficiency. Things to watch out for include:

1. `hourly production` for Commit is lower than for P1, which may result in tasks backing up in the Commit phase.
2. When one type of task is much more efficient than the others, its resources may become idle.
3. Micro-management is needed to reach the highest possible efficiency.

### Finding optimal pledging

For example, if you find 7 parallel P1 tasks to be optimal for your system, change the following venus-sealer configuration.

```toml
[Sealing]
MaxSealingSectors = 7
```

## Stop-loss

If one type of task fails too many times, manual intervention is needed to get the sealing pipeline back to its normal output.

Remove sectors when you have the following issues.

1. Expired ticket
2. Expired Commit
3. Corrupted proof params

To remove incomplete sectors.

```bash
venus-sealer sectors remove --really-do-it <sectorNum>
```

## Table of contents

1. [Improve the efficiency of the sealing sector](Efficiency_of_sealing.md)

2. [System monitoring with Zabbix](System_monitor_of_Zabbix.md)