Arm64 Raspberry Pi Build Failure #2176

Closed
deefactorial opened this issue Jun 29, 2020 · 60 comments

@deefactorial

deefactorial commented Jun 29, 2020

Describe the bug
When trying to build the lotus project from source files I run into this error:

    error[E0433]: failed to resolve: use of undeclared type or module `cc`
     --> /root/.cargo/registry/src/github.com-1ecc6299db9ec823/fil-sapling-crypto-0.6.0/build.rs:6:9
      |
    6 |         cc::Build::new()
      |         ^^ use of undeclared type or module `cc`
    
    error: aborting due to previous error

To Reproduce
Steps to reproduce the behavior:

  1. Followed the steps in this issue:
     lotus aarch64 make all error #1779
  2. Ran the following commands:

         export RUSTFLAGS="-C target-cpu=native -g"
         export FFI_BUILD_FROM_SOURCE=1
         make clean deps bench

Expected behavior
The lotus project to build without errors

Screenshots
Someone already reported the issue on an upstream repo.
zcash/sapling-crypto#104

Version (run lotus --version):
unable to compile the latest version

Additional context
Arm64 Architecture


@magik6k magik6k added the help wanted Extra attention is needed label Jul 1, 2020
@LyleLee

LyleLee commented Jul 6, 2020

Hi developers, any suggestions on this issue?

I am seeing the same problem when compiling filecoin-ffi.

Steps to reproduce:

cd lotus/extern/filecoin-ffi
export RUSTFLAGS="-C target-cpu=native -g"
export FFI_BUILD_FROM_SOURCE=1
make clean
make all

The errors:

  Compiling half v1.6.0
   Compiling termcolor v1.1.0
   Compiling base64 v0.12.3
   Compiling humansize v1.1.0
   Compiling tee v0.1.0
error[E0433]: failed to resolve: use of undeclared type or module `cc`
 --> /home/user1/.cargo/registry/src/github.com-1ecc6299db9ec823/fff-0.2.2/build.rs:4:9
  |
4 |         cc::Build::new()
  |         ^^ use of undeclared type or module `cc`

error: aborting due to previous error

For more information about this error, try `rustc --explain E0433`.
error: could not compile `fff`.

To learn more, run the command again with --verbose.
warning: build failed, waiting for other jobs to finish...
error[E0433]: failed to resolve: use of undeclared type or module `cc`
 --> /home/user1/.cargo/registry/src/github.com-1ecc6299db9ec823/fil-sapling-crypto-0.6.2/build.rs:6:9
  |
6 |         cc::Build::new()
  |         ^^ use of undeclared type or module `cc`

error: aborting due to previous error

For more information about this error, try `rustc --explain E0433`.
error: could not compile `fil-sapling-crypto`.

To learn more, run the command again with --verbose.
warning: build failed, waiting for other jobs to finish...
error: build failed
+ rm -f /tmp/tmp.KivouBKvHt
Makefile:11: recipe for target '.install-filcrypto' failed
make: *** [.install-filcrypto] Error 101

@deefactorial
Author

@dignifiedquire would you have any suggestions, or know how to contact the original commit authors https://github.com/DrPeterVanNostrand or https://github.com/sean-sn to ask them how to fix it?

@deefactorial
Author

I looked into PR filecoin-project/sapling-crypto#10, which introduced the code in question. It appears to add Pedersen hash precomputation for the amd64 architecture, based on the reasoning in this document: https://github.com/filecoin-project/lotus/blob/master/documentation/en/sealing-procs.md. What seems to be needed is the same for other architectures such as arm64. I found a research document that describes the implementation: https://iden3-docs.readthedocs.io/en/latest/iden3_repos/research/publications/zkproof-standards-workshop-2/pedersen-hash/pedersen.html
However, implementing the Pedersen hash for arm64 in Rust is beyond my expertise. Is there any way we could use an architecture-independent implementation of the Pedersen hash function to get this to compile?

@dignifiedquire
Contributor

The default implementation is 100% Rust. It fails to compile because of a bug in how the target architecture is detected and in deciding when to link the assembly.
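For readers who want to confirm which architecture the Rust compiler believes it is targeting, `rustc --print cfg` lists the active configuration values, including `target_arch`. The helper below is only a diagnostic sketch: the function name `target_arch_from_cfg` is made up for illustration, and it neither reproduces nor fixes the detection bug described above.

```shell
# Extract the target_arch value from `rustc --print cfg`-style output on stdin.
# Hypothetical helper name; the input format is one cfg value per line,
# e.g. target_arch="aarch64".
target_arch_from_cfg() {
    sed -n 's/^target_arch="\(.*\)"$/\1/p'
}

# Usage (assumes rustc is installed):
#   rustc --print cfg | target_arch_from_cfg
#   rustc --print cfg --target aarch64-unknown-linux-gnu | target_arch_from_cfg
```

On an aarch64 host the first usage line would be expected to print `aarch64`. A mismatch between that value and what a crate's build script assumes is the kind of situation this issue is about.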

@deefactorial
Author

I've tried to follow this guide for cross-compiling Rust for the Raspberry Pi on Linux, but I have been unsuccessful.

@dignifiedquire
Contributor

@deefactorial this should fix it: filecoin-project/sapling-crypto#18

@deefactorial
Author

Awesome, I will test asap.

@dignifiedquire
Contributor

There is some more work needed: filecoin-project/rust-fil-proofs#1204

@magik6k
Contributor

magik6k commented Jul 8, 2020

Related rust-fil-proofs epic issue: filecoin-project/rust-fil-proofs#1205

@ribasushi
Collaborator

@deefactorial using the awesome work above, I am currently able to build the entirety of ./lotus using:

~/lotus$ rustup toolchain install nightly
~/lotus$ git submodule update --init --recursive
~/lotus$ patch -p0 <<"EOP"
--- ./extern/filecoin-ffi/rust/Cargo.toml
+++ ./extern/filecoin-ffi/rust/Cargo.toml
@@ -31,2 +31,7 @@ serde_json = "1.0.46"
 
+[patch.crates-io]
+filecoin-proofs = { git = "https://github.com/filecoin-project/rust-fil-proofs", branch = "feat/aarch64" }
+fil-sapling-crypto = { git = "https://github.com/filecoin-project/sapling-crypto", branch = "fix/compile-nonx86" }
+fff = { git = "https://github.com/filecoin-project/ff", branch = "fix-arch" }
+
 [dependencies.filecoin-proofs-api]
--- ./extern/filecoin-ffi/rust/rust-toolchain
+++ ./extern/filecoin-ffi/rust/rust-toolchain
@@ -1 +1 @@
-1.43.1
+nightly
EOP
~/lotus$ RUSTFLAGS="-C target-cpu=native -g" FFI_BUILD_FROM_SOURCE=1 make all

Please try this and report back when time permits!

@RobQuistNL
Contributor

@ribasushi awesome! It's almost there I think, but I'm still getting a failure :(

 Compiling miniz_oxide v0.4.0
error: failed to run custom build command for `storage-proofs v4.0.3 (https://github.com/filecoin-project/rust-fil-proofs?branch=feat/aarch64#8c3b8e36)`

Caused by:
  process didn't exit successfully: `/home/pi/lotus/extern/filecoin-ffi/rust/target/release/build/storage-proofs-bf3f6056318ac680/build-script-build` (exit code: 101)
  --- stderr
  thread 'main' panicked at 'must be built for 64-bit architectures', /home/pi/.cargo/git/checkouts/rust-fil-proofs-0521dfc2b7dd3149/8c3b8e3/storage-proofs/build.rs:6:5
  note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
warning: build failed, waiting for other jobs to finish...
error: build failed
+ rm -f /tmp/tmp.JqsMsay37Q
make[1]: *** [Makefile:11: .install-filcrypto] Error 101
make[1]: Leaving directory '/home/pi/lotus/extern/filecoin-ffi'
make: *** [Makefile:37: build/.filecoin-install] Error 2

@ribasushi
Collaborator

@RobQuistNL What arch is your pi? The underlying crypto primitive implementations really need a 64-bit register set. There may be a way to recode the internals to fall back on 32-bit-based emulation, but I am pretty sure the crypto team has zero bandwidth to attempt this.

TLDR: At present aarch64 is a hard requirement I am afraid.

Patches welcome though: there is a decent test suite, so if you try to dig into this - there is a decent amount of guard-railing!

@RobQuistNL
Contributor

Hey, it's a Raspberry Pi 4 Model B Rev 1.2, installed through NOOBS. I'm really unfamiliar with running different archs (or RPIs, or Go, or Rust), as you might have noticed ;p I assume I'm going to have to manually install an aarch64 OS?

pi@raspberrypi:~/lotus $ uname -a
Linux raspberrypi 4.19.97-v7l+ #1294 SMP Thu Jan 30 13:21:14 GMT 2020 armv7l GNU/Linux
processor	: 0
model name	: ARMv7 Processor rev 3 (v7l)
BogoMIPS	: 108.00
Features	: half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 

@ribasushi
Collaborator

@RobQuistNL apologies I used too much jargon: aarch64 is the CPU architecture - your pi has armv7l ( it's in the uname output ). You can't "fix" that.

https://en.wikipedia.org/wiki/List_of_ARM_microarchitectures

@LyleLee

LyleLee commented Jul 10, 2020

@ribasushi @deefactorial Nice work! I verified the steps mentioned above, the build is successful.

banana@771c7e128ca9:~/lotus$ uname -a
Linux 771c7e128ca9 4.15.0-109-generic #110-Ubuntu SMP Tue Jun 23 02:40:18 UTC 2020 aarch64 aarch64 aarch64 GNU/Linux
banana@771c7e128ca9:~/lotus$ ls lotus* -d
lotus  lotus-seal-worker  lotus-storage-miner  lotuspond
banana@771c7e128ca9:~/lotus$

But there are some warnings; I don't know whether I should worry about them:

go build  -ldflags="-X=github.com/filecoin-project/lotus/build.CurrentCommit=+git.b245fd0b.dirty" -o lotus ./cmd/lotus
# github.com/filecoin-project/filecoin-ffi/generated
/usr/bin/ld: warning: extern/filecoin-ffi/generated/../libfilcrypto.a(sha2raw-b988280a5635ee43.sha2raw.758cccdw-cgu.10.rcgu.o): unsupported GNU_PROPERTY_TYPE (5) type: 0xc0000000
/usr/bin/ld: warning: extern/filecoin-ffi/generated/../libfilcrypto.a(ryu-f0ab4abe3b711f21.ryu.ae3lcmct-cgu.5.rcgu.o): unsupported GNU_PROPERTY_TYPE (5) type: 0xc0000000
/usr/bin/ld: warning: extern/filecoin-ffi/generated/../libfilcrypto.a(ryu-f0ab4abe3b711f21.ryu.ae3lcmct-cgu.9.rcgu.o): unsupported GNU_PROPERTY_TYPE (5) type: 0xc0000000
/usr/bin/ld: warning: extern/filecoin-ffi/generated/../libfilcrypto.a(aes_soft-4a30a7b3b229b50c.aes_soft.16f24ml4-cgu.8.rcgu.o): unsupported GNU_PROPERTY_TYPE (5) type: 0xc0000000
/usr/bin/ld: warning: extern/filecoin-ffi/generated/../libfilcrypto.a(sha2-4156ae170d32d32b.sha2.2xmfubcj-cgu.4.rcgu.o): unsupported GNU_PROPERTY_TYPE (5) type: 0xc0000000
/usr/bin/ld: warning: extern/filecoin-ffi/generated/../libfilcrypto.a(sha2ni-76c413449a7e37b2.sha2ni.8aflayr4-cgu.13.rcgu.o): unsupported GNU_PROPERTY_TYPE (5) type: 0xc0000000

@RobQuistNL
Contributor

@LyleLee aaah, check - makes sense.

Is aarch64 only available on older models then? I should have an RPI3 and an RPI2 lying around somewhere..

@dignifiedquire
Contributor

I think it is an issue of the operating system. There are some 64-bit OS options for the Raspberry Pi out there, AFAICT from a Google search.

@ribasushi
Collaborator

@RobQuistNL yes, @dignifiedquire is correct, I missed your raspi model. The 4 definitely has a v8 cpu: https://en.wikipedia.org/wiki/Raspberry_Pi#Specifications

I suspect you need to get a "64bit OS" or somesuch. @LyleLee can you share your lscpu output?
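`uname -m` reports the architecture of the running kernel rather than the capabilities of the CPU, which is why a Raspberry Pi 4 booted with a 32-bit OS shows `armv7l`, as in the output above. A minimal sketch for checking this before attempting a build; the function name `classify_machine` and the category labels are illustrative only:

```shell
# Classify a `uname -m` machine string. Only the values relevant to this
# thread are covered; anything else is reported as "unknown".
classify_machine() {
    case "$1" in
        aarch64|arm64)        echo "64-bit ARM" ;;
        armv6l|armv7l|armv8l) echo "32-bit ARM" ;;
        x86_64|amd64)         echo "64-bit x86" ;;
        *)                    echo "unknown" ;;
    esac
}

# Classify the machine this script is running on.
classify_machine "$(uname -m)"
```

If this prints `32-bit ARM`, the `must be built for 64-bit architectures` panic shown earlier is to be expected, and a 64-bit OS image is needed first.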

@deefactorial
Author

deefactorial commented Jul 10, 2020

I'm running mine on a 64bit ubuntu server instance and it seems to have compiled. I'm just getting some permission errors, I will have to install go for root and retry the build process.
test.txt

@deefactorial
Author

deefactorial commented Jul 10, 2020

Success!

root@ubuntu:~/lotus# lscpu
Architecture:                    aarch64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
CPU(s):                          4
On-line CPU(s) list:             0-3
Thread(s) per core:              1
Core(s) per socket:              4
Socket(s):                       1
Vendor ID:                       ARM
Model:                           3
Model name:                      Cortex-A72
Stepping:                        r0p3
CPU max MHz:                     1500.0000
CPU min MHz:                     600.0000
BogoMIPS:                        108.00
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Not affected
Vulnerability Spec store bypass: Vulnerable
Vulnerability Spectre v1:        Mitigation; __user pointer sanitization
Vulnerability Spectre v2:        Vulnerable
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected
Flags:                           fp asimd evtstrm crc32 cpuid


root@ubuntu:~/lotus# ./lotus --version
lotus version 0.4.1+git.24d8a84a.dirty

So applying the patch above on an Ubuntu Server 64-bit instance as root did the trick.
Thanks for your support filecoin team!
You Rock!

@dignifiedquire
Contributor

great news @deefactorial 🎉

@LyleLee

LyleLee commented Jul 11, 2020

@RobQuistNL I am not using a Raspberry Pi actually, but an ARM64 server.
@ribasushi Happy to share my lscpu output.

banana@771c7e128ca9:~/lotus$ lscpu
Architecture:        aarch64
Byte Order:          Little Endian
CPU(s):              128
On-line CPU(s) list: 0-127
Thread(s) per core:  4
Core(s) per socket:  16
Socket(s):           2
NUMA node(s):        4
Vendor ID:           0x48
Model:               0
Stepping:            0x1
CPU max MHz:         2600.0000
CPU min MHz:         200.0000
BogoMIPS:            200.00
L1d cache:           64K
L1i cache:           64K
L2 cache:            512K
L3 cache:            32768K
NUMA node0 CPU(s):   0-31
NUMA node1 CPU(s):   32-63
NUMA node2 CPU(s):   64-95
NUMA node3 CPU(s):   96-127
Flags:               fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm

@RobQuistNL
Contributor

Aaah that makes sense :)

It's time to phase out x86 anyway and go for ARM ;p

@RobQuistNL
Contributor

Really wondering what openssl speed sha256 returns for you (as well as the 32GB benchmarks, by the way)

@ribasushi
Collaborator

> Really wondering what openssl speed sha256 returns for you

https://github.com/ribasushi/basic-aws-io-benchmark/compare/iobench_r5ad.2xlarge_2020-07-07_18-15-56..iobench_r6g.2xlarge_2020-07-07_17-19-25?diff=split&expand=1#diff-97c0aa1c594f9e7395e0162afef3d28f

There are 512MiB benches if you scroll higher up, the 32/64GiB will show up Mon-ish, check the repo for new commits

@RobQuistNL
Contributor

wow

PC1 is so much faster :O really wonder what the 32gb looks like.

To compare:
512M sectors with a 2xE5-2680 v3 took 6.57m (1.23MiB/s)
512M sectors with an AMD 3900X took 3.26m (2.482MiB/s)
32GB sectors with an AMD Threadripper (don't know exact model) took 4h2m (2.25MiB/s)
32GB sectors with an AMD 3900X took 3h41m (2.46MiB/s)

Looks like ARM is going to be the way to go for PC1.... Awesome data!

The sha256 is actually slower on that ARM compared to my 3900X, so it might have to do with something else. I get about 2.2M for 16kb whereas your machine shows 1.5M for 16kb.

Both still much faster than some intel x86 tho
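The MiB/s figures above follow from dividing the sector size by the sealing time. A minimal sketch of that arithmetic, assuming that "6.57m" means 6 min 57 s and that a "32GB" sector is 32 GiB (both are assumptions, as is the helper name `throughput_mib_s`):

```shell
# Throughput in MiB/s for SIZE_MIB mebibytes processed in SECONDS seconds.
# LC_ALL=C keeps the decimal separator a dot regardless of locale.
throughput_mib_s() {
    LC_ALL=C awk -v size="$1" -v secs="$2" 'BEGIN { printf "%.3f\n", size / secs }'
}

throughput_mib_s 512 417      # 512 MiB in 6 min 57 s  -> prints 1.228
throughput_mib_s 32768 14520  # 32 GiB in 4 h 2 min    -> prints 2.257
```

Under those assumptions the results round to the 1.23 MiB/s and 2.25 MiB/s quoted for the 2xE5-2680 v3 and the Threadripper, to within rounding.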

@ribasushi
Collaborator

> whereas your machine shows...

It's Jeff Bezos' machine :) The reason I compare the boxes core-to-core within AWS is because they both sit behind a common set of "handicaps" ( various vuln. mitigations, hypervisors, etc ).

There is an "arm metal" benchmark in the works, not sure whether it will matter much ( not sure how aws provisions metal instances, but it takes them considerably longer to boot, so it may be real hw ).

In any case: comparing apples to apples is prudent here, there are a few more branches in that repo ( and more will show up soon ) : click around.

@RobQuistNL
Contributor

Yeah, I saw that's the r6x machine. I'm thinking it might be feasible to have an off-site (read: AWS) setup that handles the PC1 for me. Probably will have to squeeze PC2 in there too, because otherwise it's going to be hundreds of GBs that need to cross the network. But if a remote setup that costs ~200EUR / month can produce 1 sealed sector (2 if you'd go for the r6x2 with 128GB ram) per 2 hours, and then transfer those 32GB sectors over the internet to a simpler machine - it might very well be worth it.. :)

@RobQuistNL
Contributor

I installed Archlinux on my Raspberry Pi 4, and managed to compile Lotus on it. However, when running the benchmarks, this is what happened:

2020-07-13T13:08:22.852 INFO filecoin_proofs::pieces > verifying 128 pieces
2020-07-13T13:08:22.853 INFO filecoin_proofs::api::seal > compute_comm_d:finish
2020-07-13T13:08:22.853 INFO filcrypto::proofs::api > generate_data_commitment: finish
2020-07-13T13:08:22.853Z	INFO	lotus-bench	lotus-bench/main.go:518	[1] Running replication(1)...
2020-07-13T13:08:27.037 INFO filcrypto::proofs::api > seal_pre_commit_phase1: start
2020-07-13T13:08:30.737 INFO filecoin_proofs::api::seal > seal_pre_commit_phase1:start
2020-07-13T13:09:44.162 INFO filecoin_proofs::api::seal > building merkle tree for the original data
2020-07-13T13:10:44.893 INFO filecoin_proofs::api::seal > verifying pieces
2020-07-13T13:10:44.894 INFO filecoin_proofs::pieces > verifying 1 pieces
2020-07-13T13:10:44.894 INFO storage_proofs_porep::stacked::vanilla::proof > replicate_phase1
2020-07-13T13:10:44.894 INFO storage_proofs_porep::stacked::vanilla::proof > generate labels
2020-07-13T13:10:44.894 INFO storage_proofs_porep::stacked::vanilla::graph > using parent_cache[2048 / 16777216]
2020-07-13T13:10:46.228 INFO storage_proofs_porep::stacked::vanilla::cache > parent cache: generating /var/tmp/filecoin-parents/v27-sdr-parent-2aa9c77c3e58259481351cc4be2079cc71e1c9af39700866545c043bfa30fb42.cache
2020-07-13T13:14:00.887 INFO storage_proofs_porep::stacked::vanilla::cache > parent cache: generated
2020-07-13T13:14:18.403 INFO storage_proofs_porep::stacked::vanilla::cache > parent cache: written to disk
2020-07-13T13:14:18.404 INFO storage_proofs_porep::stacked::vanilla::proof > generating layer: 1
SIGILL: illegal instruction
PC=0x1466c3c m=3 sigcode=1

goroutine 0 [idle]:
runtime: unknown pc 0x1466c3c
stack: frame={sp:0x7f6fffd250, fp:0x0} stack=[0x7f6f7fe990,0x7f6fffe590)
0000007f6fffd150:  0000000000000003  0000000000000000 
(....)
0000007f6fffd340:  0000007f17fff010  0000000000e9d35c 

goroutine 343 [syscall, 6 minutes]:
runtime.cgocall(0xdb5630, 0x400070d4f8, 0x0)
	/usr/lib/go/src/runtime/cgocall.go:133 +0x50 fp=0x400070d490 sp=0x400070d450 pc=0x4fe6f0
github.com/filecoin-project/filecoin-ffi/generated._Cfunc_fil_seal_pre_commit_phase1(0x4000000002, 0x400023e700, 0x400023e840, 0x400023e880, 0x1, 0x7e8, 0x0, 0x0, 0x0, 0xb243e526c051570e, ...)
	_cgo_gotypes.go:1640 +0x44 fp=0x400070d4f0 sp=0x400070d490 pc=0xa5d184
github.com/filecoin-project/filecoin-ffi/generated.FilSealPreCommitPhase1(0x4000000002, 0x400023e480, 0x32, 0x400023e180, 0x35, 0x400023e300, 0x33, 0x1, 0x7e8, 0x0, ...)
	/root/lotus/extern/filecoin-ffi/generated/generated.go:600 +0x2ec fp=0x400070d6c0 sp=0x400070d4f0 pc=0xa69a0c
github.com/filecoin-project/filecoin-ffi.SealPreCommitPhase1(0x2, 0x400023e480, 0x32, 0x400023e180, 0x35, 0x400023e300, 0x33, 0x1, 0x3e8, 0x40001880c0, ...)
	/root/lotus/extern/filecoin-ffi/proofs.go:285 +0x1f4 fp=0x400070d8c0 sp=0x400070d6c0 pc=0xa718f4
github.com/filecoin-project/sector-storage/ffiwrapper.(*Sealer).SealPreCommit1(0x40009588a0, 0x1efccc0, 0x4000034090, 0x3e8, 0x1, 0x40001880c0, 0x20, 0x20, 0x40002a2860, 0x1, ...)
	/root/go/pkg/mod/github.com/filecoin-project/sector-storage@v0.0.0-20200630180318-4c1968f62a8f/ffiwrapper/sealer_cgo.go:438 +0x314 fp=0x400070da80 sp=0x400070d8c0 pc=0xa7b944
main.runSeals.func1.1(0x0, 0x1, 0x3e8, 0x3bce110, 0x0, 0x0, 0x4000297bc0, 0x40009588a0, 0x4000103bc0, 0x4000297be0, ...)
	/root/lotus/cmd/lotus-bench/main.go:520 +0x358 fp=0x400070de40 sp=0x400070da80 pc=0xdb26c8
main.runSeals.func1(0x1, 0x3e8, 0x3bce110, 0x0, 0x0, 0x4000297bc0, 0x40009588a0, 0x4000103bc0, 0x4000297be0, 0x1, ...)
	/root/lotus/cmd/lotus-bench/main.go:641 +0xd0 fp=0x400070df20 sp=0x400070de40 pc=0xdb34d0
runtime.goexit()
	/usr/lib/go/src/runtime/asm_arm64.s:1148 +0x4 fp=0x400070df20 sp=0x400070df20 pc=0x55fc34
created by main.runSeals
	/root/lotus/cmd/lotus-bench/main.go:502 +0x540

If I'm not mistaken it originates from here: https://github.com/filecoin-project/filecoin-ffi/blob/master/generated/generated.go#L587

Would that mean that this CPU is missing a certain instruction set?
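One thing that can be checked, although this thread does not confirm it as the cause of the SIGILL: the Cortex-A72 `lscpu` output posted earlier in this thread lists only `fp asimd evtstrm crc32 cpuid`, while the ARM64 server output also lists `aes pmull sha1 sha2`. The sketch below tests a flag list for a given feature; the function name `has_cpu_feature` is hypothetical.

```shell
# Return 0 if the space-separated flag list in $1 contains the feature $2.
has_cpu_feature() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Flag lists copied from the two lscpu outputs in this thread.
pi4_flags="fp asimd evtstrm crc32 cpuid"
server_flags="fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm"

has_cpu_feature "$pi4_flags" sha2    && echo "pi4: sha2 present"    || echo "pi4: sha2 missing"
has_cpu_feature "$server_flags" sha2 && echo "server: sha2 present" || echo "server: sha2 missing"

# On a live aarch64 Linux system the list can be read from /proc/cpuinfo:
#   has_cpu_feature "$(grep -m1 '^Features' /proc/cpuinfo | cut -d: -f2)" sha2
```

If a binary was built to use instructions from a feature that is missing on the machine running it, an illegal-instruction fault is the typical symptom, so comparing these lists is a cheap first step.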

@ribasushi
Collaborator

@RobQuistNL are you using any special flags/envvars when compiling? Is your golang downloaded or built from source? There are some weird stories about SIGILL when debugger shims are active...

TLDR: need more info to help you with this

@RobQuistNL
Contributor

I installed golang through pacman if I'm not mistaken. But no worries - I don't think it makes any sense to run on a Raspberry anyway...

@ribasushi
Collaborator

@RobQuistNL fair, for v0 it's not "critical" for sure ;)

@deefactorial @LyleLee we are tracking the merging of #2176 (comment) in a different ticket. Aside from that: can this (build) issue be considered resolved? If you experience errors like @RobQuistNL perhaps open new ones so that we can keep the conversation on a specific subtopic?

@LyleLee

LyleLee commented Jul 20, 2020

> @RobQuistNL fair, for v0 it's not "critical" for sure ;)
>
> @deefactorial @LyleLee we are tracking the merging of #2176 (comment) in a different ticket. Aside from that: can this (build) issue be considered resolved? If you experience errors like @RobQuistNL perhaps open new ones so that we can keep the conversation on a specific subtopic?

It is a fix for lotus, but the fix affects compiling go-filecoin, which I have not been able to work through.

@LyleLee

LyleLee commented Jul 25, 2020

@ribasushi There is a new situation here. I modified the patch a little, since fff and fil-sapling-crypto have merged the aarch64 commits into their master branches, and I now get a compile error on both x86 and arm64. Can you help get a stabilized patch into lotus master?

banana@28b9f781d968:~/lotus/extern/filecoin-ffi$ git diff --cached

diff --git a/rust/Cargo.toml b/rust/Cargo.toml
index 648eef4..a4b6092 100644
--- a/rust/Cargo.toml
+++ b/rust/Cargo.toml
@@ -16,7 +16,7 @@ crate-type = ["rlib", "staticlib"]
 bls-signatures = "0.6.0"
 byteorder = "1.2"
 drop_struct_macro_derive = "0.4.0"
-ff = { version = "0.2.1", package = "fff" }
+ff = { version = "0.2.3", package = "fff" }
 ffi-toolkit = "0.4.0"
 libc = "0.2.58"
 log = "0.4.7"
@@ -39,3 +39,6 @@ cbindgen = "= 0.14.0"
 [dev-dependencies]
 tempfile = "3.0.8"

+[patch.crates-io]
+filecoin-proofs = { git = "https://github.com/filecoin-project/rust-fil-proofs" , branch = "feat/aarch64" }
+fil-sapling-crypto = { git = "https://github.com/filecoin-project/sapling-crypto", branch = "master" }
diff --git a/rust/rust-toolchain b/rust/rust-toolchain
index 3987c47..bf867e0 100644
--- a/rust/rust-toolchain
+++ b/rust/rust-toolchain
@@ -1 +1 @@
-1.43.1
+nightly
banana@28b9f781d968:~/lotus$ RUSTFLAGS="-C target-cpu=native -g" FFI_BUILD_FROM_SOURCE=1 make all
make -C extern/filecoin-ffi/ .install-filcrypto
make[1]: Entering directory '/home/banana/lotus/extern/filecoin-ffi'
make[1]: '.install-filcrypto' is up to date.
make[1]: Leaving directory '/home/banana/lotus/extern/filecoin-ffi'
rm -f lotus
go build  -ldflags="-X=github.com/filecoin-project/lotus/build.CurrentCommit=+git.39216238.dirty" -o lotus ./cmd/lotus
# github.com/filecoin-project/sector-storage/zerocomm
../go/pkg/mod/github.com/filecoin-project/sector-storage@v0.0.0-20200630180318-4c1968f62a8f/zerocomm/zerocomm.go:54:2: too many arguments to return
        have (cid.Cid, error)
        want (cid.Cid)
Makefile:67: recipe for target 'lotus' failed
make: *** [lotus] Error 2

@ribasushi
Collaborator

@cryptonemo could you please comment on whether a new tag for rust-ffi is upcoming, so I can bump it in lotus properly? It is fine if not planned, just checking what's the best way forward.

@cryptonemo
Contributor

@ribasushi We'll be releasing a new rust-fil-proofs release this week (likely tomorrow) and I'll PR something for ffi after that.

@ribasushi
Collaborator

@LyleLee ^^ let's wait for the above and then everything will get fixed for good. Too many moving parts at present.

@LyleLee

LyleLee commented Jul 28, 2020

> @LyleLee ^^ let's wait for the above and then everything will get fixed for good. Too many moving parts at present.

@ribasushi @cryptonemo Good job! Looking forward to that.

@cryptonemo
Contributor

The ffi PR containing the latest release that supports aarch64 (via filecoin-project/rust-fil-proofs#1204) is here: filecoin-project/filecoin-ffi#124

@LyleLee

LyleLee commented Jul 30, 2020

> The ffi PR containing the latest release that supports aarch64 (via filecoin-project/rust-fil-proofs#1204) is here: filecoin-project/filecoin-ffi#124

I tested pull/124 and am still seeing errors that I need help with:

git checkout master
git fetch origin pull/124/head:fauxrep2
git merge fauxrep2
git submodule update
RUSTFLAGS="-C target-cpu=native -g" FFI_BUILD_FROM_SOURCE=1 make all
  Compiling termcolor v1.1.0
   Compiling base64 v0.12.3
   Compiling half v1.6.0
   Compiling number_prefix v0.3.0
   Compiling tee v0.1.0
   Compiling humansize v1.1.0
error[E0433]: failed to resolve: use of undeclared type or module `cc`
 --> /home/banana/.cargo/registry/src/github.com-1ecc6299db9ec823/fff-0.2.2/build.rs:4:9
  |
4 |         cc::Build::new()
  |         ^^ use of undeclared type or module `cc`

error: aborting due to previous error

For more information about this error, try `rustc --explain E0433`.
error: could not compile `fff`.

To learn more, run the command again with --verbose.
warning: build failed, waiting for other jobs to finish...
error[E0433]: failed to resolve: use of undeclared type or module `cc`
 --> /home/banana/.cargo/registry/src/github.com-1ecc6299db9ec823/fil-sapling-crypto-0.6.2/build.rs:6:9
  |
6 |         cc::Build::new()
  |         ^^ use of undeclared type or module `cc`

error: aborting due to previous error


@cryptonemo
Contributor

@ribasushi The aarch64 platform you were using is not related to Raspberry Pi 64, correct? Does that actual target platform still build? I'm not specifically aware of Raspberry Pi 64 being a supported target; can you comment there?

@RobQuistNL
Contributor

He was running on AARCH64 platforms hosted by Amazon - I'm the idiot trying to run it on a raspberry pi :)

@ribasushi
Collaborator

@LyleLee @RobQuistNL error on our side, we'll update when ready for retest

@cryptonemo
Contributor

@ribasushi set me up with an aarch64 node and I'm unable to get this to compile. Strangely, I'm also unable to get the feat/aarch64 branch building (merged in filecoin-project/rust-fil-proofs#1204). It was reported that at one point this worked, so either 1) I'm missing something about how whoever tested it was building it, or 2) Some dependencies have changed that are now broken since then (unlikely). Anyone have any info on how the original testing was done, or if those setups still work?

@ribasushi
Collaborator

@cryptonemo it was me that got it to work previously. I'll get things to build in the next ~6 hours and post exact branch/instructions here.

@cryptonemo
Contributor

> @cryptonemo it was me that got it to work previously. I'll get things to build in the next ~6 hours and post exact branch/instructions here.

I do believe it was working -- as you know, just trying to figure out what's changed or if I'm missing something. Thanks!

@ribasushi
Collaborator

@LyleLee @RobQuistNL there has been some movement on this, should be resolved by EoW

@LyleLee

LyleLee commented Aug 6, 2020

@ribasushi Thanks. I set up builds on Travis CI, which supports both x86 and Arm64. The x86 build succeeds, but the Arm64 build fails.

I think the patches to filecoin-proofs and fil-sapling-crypto are still needed.

@ribasushi
Collaborator

@LyleLee errors in the release process need to be addressed, so that no patches are needed. I can provide you with a recipe for building lotus against all the master branches, but it's better to just fix this for good.

@LyleLee

LyleLee commented Aug 7, 2020

@ribasushi The Arm64 job failed. Here are the last few lines of the Travis CI job log (https://travis-ci.org/github/LyleLee/lotus/jobs/715412869):

   Compiling adler32 v1.1.0
   Compiling subtle v1.0.0
error[E0433]: failed to resolve: use of undeclared type or module `cc`
 --> /home/travis/.cargo/registry/src/github.com-1ecc6299db9ec823/fff-0.2.2/build.rs:4:9
  |
4 |         cc::Build::new()
  |         ^^ use of undeclared type or module `cc`
error: aborting due to previous error
For more information about this error, try `rustc --explain E0433`.
error: could not compile `fff`.
To learn more, run the command again with --verbose.
warning: build failed, waiting for other jobs to finish...
error: build failed
+rm -f /tmp/tmp.VdqFI1l3IA
Makefile:11: recipe for target '.install-filcrypto' failed
make[1]: Leaving directory '/home/travis/gopath/src/github.com/LyleLee/lotus/extern/filecoin-ffi'
make[1]: *** [.install-filcrypto] Error 101
Makefile:37: recipe for target 'build/.filecoin-install' failed
make: *** [build/.filecoin-install] Error 2
The command "RUSTFLAGS="-C target-cpu=native -g" FFI_BUILD_FROM_SOURCE=1 make all" exited with 2.

Yes, we can manage to fix this and build Lotus master branch by crafting a specific recipe. But it will be great if lotus master naturally builds successfully on Arm64.

@cryptonemo
Contributor

@ribasushi I think we're resolved on this now?

@ribasushi ribasushi self-assigned this Aug 13, 2020
@ribasushi
Collaborator

@cryptonemo @LyleLee I need to perform one more test on this, will update/close when ready ( EoW )

@ribasushi
Collaborator

@LyleLee AT LAST! On your aarch64 machine you should be able to:

git clone https://github.com/filecoin-project/lotus.git

cd lotus

git checkout ntwk-calibration

git submodule update --init --recursive

patch -p0 <<"EOP"
diff --git ./extern/filecoin-ffi/rust/rust-toolchain b/rust/rust-toolchain
index 3987c47..bf867e0 100644
--- ./extern/filecoin-ffi/rust/rust-toolchain
+++ ./extern/filecoin-ffi/rust/rust-toolchain
@@ -1 +1 @@
-1.43.1
+nightly
EOP

RUSTFLAGS="-C target-cpu=native -g" FFI_BUILD_FROM_SOURCE=1 make all lotus-bench lotus-shed

The only difference from "official compilation" ( hence the required patch ) is that the build system of filecoin-ffi does not honor RUSTUP_TOOLCHAIN=xxx, but instead requires an explicit spec in the file we patch above. Rust-stable does not provide CPU intrinsic access on non-x86 architectures, thus the need to switch to nightly.
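Since the patch above only changes a one-line file, the same edit can be sketched without `patch`. This assumes the `rust-toolchain` file uses the single-line format shown in the diff (a bare toolchain name); the helper name `set_rust_toolchain` is made up for illustration.

```shell
# Overwrite the toolchain FILE so that CHANNEL is its only line.
# Assumes the legacy single-line rust-toolchain format shown in the diff.
set_rust_toolchain() {
    file=$1
    channel=$2
    printf '%s\n' "$channel" > "$file"
}

# Usage from the root of the lotus checkout (path taken from the patch above):
#   set_rust_toolchain extern/filecoin-ffi/rust/rust-toolchain nightly
#   cat extern/filecoin-ffi/rust/rust-toolchain
```

Either way, the result is a local modification inside the filecoin-ffi submodule, so the build is reported as `dirty`, as in the version strings seen earlier in this thread.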

@LyleLee

LyleLee commented Aug 17, 2020

@ribasushi Awesome! Verified! Both X86 and aarch64 are now building smooth as silver.

@ribasushi
Collaborator

@LyleLee note that lotus v1.2.1 now builds and works on aarch64 out of the box ( no patches needed )
