Merge pull request #1908 from filecoin-project/docs/cpu-choice
add a doc on amd vs intel cpus
# Why does Filecoin mining work best on AMD?
Currently, Filecoin's Proof of Replication (PoRep) runs best on AMD
processors. More accurately, it runs much slower on Intel CPUs. It runs
competitively fast on some ARM processors, like the ones in newer Samsung
phones, but they lack the RAM to seal the larger sector sizes. The main reason
we see this benefit on AMD processors is their implementation of the SHA
hardware instructions. So why do we use the SHA instructions?

## PoRep security assumptions
Our research team has two different models for the security of Proofs of
Replication: the Latency Assumption and the Cost Assumption. Each is an
argument for why an attacker cannot pull off a 'regeneration attack'. That is,
the attacker cannot seal and commit random data (generated by a function),
delete it, and then reseal it on the fly to respond to PoSt challenges, without
actually storing the data for that time period.

### Cost Assumptions
The cost assumption states that the real money cost (hardware, electricity,
etc.) of generating a sector is higher than the real money cost of simply
storing it on disks. NSE is a new PoRep our research team is working on that is
based on the cost assumption, and is thus able to be highly parallelizable (in
comparison to schemes based on a latency assumption, as explained next).
However, cost assumptions vary greatly with available and hypothetical
hardware. For example, someone making an ASIC for NSE could break the cost
assumption by lowering the cost of sealing too much. This is one of our main
hesitations around shipping NSE.

### Latency Assumptions
A Proof of Replication that relies on a latency assumption is secure because
an attacker cannot regenerate the data in time. We use this assumption for SDR,
where we assume that an attacker cannot regenerate enough of a sector fast
enough to respond to a PoSt. We achieve this through the use of depth-robust
graphs. Without going into too much detail, depth-robust graphs guarantee a
minimum number of serial operations to compute an encoding based on the graph.
Each edge in the graph represents an operation we need to perform. We thus have
a guarantee that someone has to perform some operation N times in a row in
order to compute the encoding. That means that computing the encoding must take
at least N times as long as the fastest possible execution of that operation.
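
The serial-dependency argument above can be illustrated with a plain hash chain. This is only a sketch: it is not the SDR depth-robust graph, and the names `hash_chain` and `serial_lower_bound_seconds` are hypothetical helpers invented for this example. It shows why N dependent operations take at least N times the cost of one operation: each step needs the previous digest, so the steps cannot be spread across cores.

```python
import hashlib
import time


def hash_chain(seed: bytes, n: int) -> bytes:
    """Apply SHA-256 n times, feeding each digest into the next call."""
    digest = seed
    for _ in range(n):
        # Step i consumes the output of step i-1, so the loop is inherently serial.
        digest = hashlib.sha256(digest).digest()
    return digest


def serial_lower_bound_seconds(n: int, fastest_op_seconds: float) -> float:
    """Lower bound on wall-clock time for n dependent operations."""
    return n * fastest_op_seconds


if __name__ == "__main__":
    n = 100_000  # illustrative chain length, not a Filecoin parameter
    start = time.perf_counter()
    hash_chain(b"example-seed", n)
    elapsed = time.perf_counter() - start
    print(f"{n} chained SHA-256 calls took {elapsed:.3f}s on this machine")
```

Running the script on different CPUs gives a rough feel for how much single-threaded SHA-256 speed varies, although Python overhead dominates the timing and it should not be read as a sealing benchmark.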

Now, to make this secure, we need to choose an operation that can't be made
much faster. There are many potential candidates here, depending on what
hardware you want to require. We opted not to require ASICs in order to mine
Filecoin, which limits our choices severely. We have to look at what operations
CPUs are really good at. One candidate was AES encryption, which also has
hardware instructions. However, the difference between the performance of CPU
AES instructions and the hypothetical 'best' performance you could get was
still too great. This gap is generally called 'Amax', an attacker's maximum
advantage. The higher the Amax of the algorithm we choose, the more expensive
the overall process has to become in order to bound how fast the attacker could
do it.
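
To make the effect of Amax concrete, here is a deliberately simplified model, not the actual security analysis. It assumes the attacker must redo the whole serial computation inside a fixed response window and can run each operation `amax` times faster than honest hardware. Under those assumptions, honest sealing has to take at least `amax` times the window. The function name and the 60-second window are hypothetical.

```python
def min_honest_seal_seconds(response_window_seconds: float, amax: float) -> float:
    """Minimum honest sealing time under the simplified model.

    If the attacker is `amax` times faster per operation, regeneration takes
    honest_time / amax. Requiring that to be at least the response window gives
    honest_time >= response_window * amax.
    """
    if response_window_seconds <= 0:
        raise ValueError("response window must be positive")
    if amax < 1:
        raise ValueError("amax is a speedup factor and must be at least 1")
    return response_window_seconds * amax


if __name__ == "__main__":
    window = 60.0  # hypothetical response window in seconds
    for amax in (3, 10, 30):
        needed = min_honest_seal_seconds(window, amax)
        print(f"Amax={amax:>2}: honest sealing must take at least {needed:.0f}s")
```

The linear relationship is the point: every increase in the attacker's advantage has to be paid for by honest miners in extra sealing time.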

As we were doing our research, we noticed that AMD shipped their new processors
with built-in SHA instructions, and we looked into how fast someone could
possibly compute a SHA hash. We found that AMD's implementation is only around
3 times slower than the fastest anyone could reasonably achieve (according to
estimates by the hardware engineers at
[Supranational](https://www.supranational.net/)). This is incredibly impressive
for something you can get in consumer hardware. With this, we were able to make
SDR sealing reasonably performant for people with off-the-shelf hardware.

## Super Optimized CPUs

Given all of the above, and the latency assumption that our proofs are
currently based on, you need a processor that can do iterated SHA hashes really
fast. As mentioned earlier, this isn't limited to AMD processors; many ARM
processors also support this. Hopefully, new Intel processors will follow suit.
But for now, Filecoin works best on AMD processors.
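
As a rough way to check a Linux machine for SHA hardware instructions, the sketch below scans `/proc/cpuinfo` text for the `sha_ni` flag (x86) or the `sha2` feature (64-bit ARM). The helper name `has_sha_extensions` is hypothetical, and the presence of the flag only says the instructions exist, not how fast a given CPU executes them.

```python
from pathlib import Path

# Flag names as Linux reports them in /proc/cpuinfo:
# "sha_ni" on x86 ("flags" line), "sha2" on 64-bit ARM ("Features" line).
SHA_FLAGS = {"sha_ni", "sha2"}


def has_sha_extensions(cpuinfo_text: str) -> bool:
    """Return True if any CPU entry advertises SHA hardware instructions."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() in ("flags", "features"):
            if SHA_FLAGS & set(value.split()):
                return True
    return False


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        print("SHA extensions present:", has_sha_extensions(path.read_text()))
    else:
        print("No /proc/cpuinfo here; this sketch only covers Linux.")
```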