
feat: support build tabby on windows #948

Merged
merged 4 commits into TabbyML:main from build-on-windows on Dec 11, 2023

Conversation

darknight
Contributor

For #909

The trial is partially successful; raising this PR for further discussion.

See comments inline.

darknight marked this pull request as draft on December 6, 2023 02:46
@darknight
Contributor Author

darknight commented Dec 6, 2023

My test env:

OS:
Windows 11

rustup default:
1.73.0-x86_64-pc-windows-msvc (default)

C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.35.32215\bin\Hostx64\x64> .\cl.exe
Microsoft (R) C/C++ Optimizing Compiler Version 19.35.32217.1 for x64

Local test:
Both `cargo build` and `cargo build --release` completed successfully, but the tabby.exe produced by `cargo build` seemed corrupted and could not be run at all. The release build works as expected.

I suspected this was probably due to an unclean local development environment, so I set up a Windows runner on GitHub; see test-rust-windows.yml for more info.

CI test:
You can take a look at the latest build: https://github.com/darknight/tabby/actions/runs/7109544886/job/19354636544

It's the same result as my local testing: tabby.exe in release mode is able to execute and print "enter main then exit", while debug mode produces an error:

error: process didn't exit successfully: `target\debug\tabby.exe` (exit code: 0xc000001d, STATUS_ILLEGAL_INSTRUCTION)

With some experiments and googling I didn't have much luck, so I'm raising a PR for public discussion. But I do suspect this corruption is due to an incorrect build of llama-cpp-bindings (either llama.cpp or the binding itself).

WSL test:

For the change, I also tested on Ubuntu 22.04 via WSL, and it works fine. This is expected and shows the changes have no impact on already supported platforms (I don't have a Mac, but I think it should work as well).

cc: @wsxiaoys

@wsxiaoys
Member

wsxiaoys commented Dec 6, 2023

For the change, I also tested on Ubuntu 22.04 via WSL, and it works fine. This is expected and shows the changes have no impact on already supported platforms (I don't have a Mac, but I think it should work as well).

I haven't gone through all the details, but one thing worth trying is building the chat example on Windows following the llama.cpp instructions. As long as that works, we should have a way to make it behave properly in tabby as well.

@ichDaheim
Contributor

First of all: I'm really looking forward to the possibility of using tabby under Windows. This would be fantastic and a great help to me. I have to admit that I have no insight into the code here and do not know Rust at all, so please forgive me if my suggestion is nonsense. But from my understanding, your main problem is getting llama.cpp to build on different OSes?

If my assumption is correct: you may want to consider this repo from Mozilla for future builds: https://github.com/Mozilla-Ocho/llamafile

If not: sorry for not keeping my mouth shut. ;-)

@darknight
Contributor Author

darknight commented Dec 7, 2023

First of all: I'm really looking forward to the possibility of using tabby under Windows. This would be fantastic and a great help to me. I have to admit that I have no insight into the code here and do not know Rust at all, so please forgive me if my suggestion is nonsense. But from my understanding, your main problem is getting llama.cpp to build on different OSes?

Your understanding is correct. The main problem I currently face is that if I build llama.cpp in Release mode, everything is fine, but if I build in Debug mode, the generated llama libraries don't seem to work. I'm currently investigating the root cause.

If my assumption is correct: you may want to consider this repo from Mozilla for future builds: https://github.com/Mozilla-Ocho/llamafile

I took a quick look at this repo; it seems its goal is to generate a single-file executable that runs everywhere. What we need here are the libraries built from llama.cpp (more specifically, llama.lib & ggml_static.lib). Those libs will be used in tabby.

If not: sorry for not keeping my mouth shut. ;-)

Don't say that, man; any idea/discussion/suggestion is welcome. @ichDaheim

@wsxiaoys
Member

wsxiaoys commented Dec 7, 2023

Your understanding is correct. The main problem I currently face is that if I build llama.cpp in Release mode, everything is fine, but if I build in Debug mode, the generated llama libraries don't seem to work. I'm currently investigating the root cause.

One workaround I would like to take is to build llama.cpp in release mode regardless of the Rust build profile; I think this is also the current behavior on Linux / Mac.
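
For illustration, a minimal build.rs sketch of that idea using the cmake crate could look roughly like the following; the CMake define and library names here are assumptions for the sake of the example, not the actual bindings code:

// build.rs sketch (illustrative only): pin llama.cpp to a Release CMake
// profile regardless of the Rust profile, so `cargo build` (debug) still
// links against release llama libraries.
fn main() {
    let dst = cmake::Config::new("llama.cpp")
        .profile("Release") // force CMAKE_BUILD_TYPE=Release
        .define("BUILD_SHARED_LIBS", "OFF") // assumed define; the real flags may differ
        .build();

    println!("cargo:rustc-link-search=native={}/lib", dst.display());
    println!("cargo:rustc-link-lib=static=llama");
    println!("cargo:rustc-link-lib=static=ggml_static");
}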

darknight force-pushed the build-on-windows branch 2 times, most recently from fd1e5f7 to b3f4816 on December 7, 2023 14:25
darknight changed the title from "Try build tabby on windows" to "feat: support build tabby on windows" on Dec 7, 2023
@darknight
Contributor Author

Your understanding is correct. The main problem I currently face is that if I build llama.cpp in Release mode, everything is fine, but if I build in Debug mode, the generated llama libraries don't seem to work. I'm currently investigating the root cause.

One workaround I would like to take is to build llama.cpp in release mode regardless of the Rust build profile; I think this is also the current behavior on Linux / Mac.

If this is acceptable, I think the PR is ready for review.

Essentially, I just added conditional compilation settings for Windows, so the changes should be completely backward compatible on *nix platforms.

Both builds:

cargo build --package llama-cpp-bindings
-OR-
cargo build --package llama-cpp-bindings --release

succeed on my local laptop.
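
As a rough illustration of what "conditional compilation settings for Windows" means here, the build script can branch on the target OS roughly like this (simplified sketch; the actual search paths in the PR differ):

fn main() {
    if cfg!(target_os = "windows") {
        // MSVC builds of llama.cpp produce llama.lib / ggml_static.lib
        // (the search path below is only a placeholder for this sketch)
        println!(r"cargo:rustc-link-search=native=llama.cpp\build\Release");
        println!("cargo:rustc-link-lib=llama");
        println!("cargo:rustc-link-lib=ggml_static");
    } else {
        // existing *nix behaviour stays unchanged
        println!("cargo:rustc-link-search=native=llama.cpp/build");
        println!("cargo:rustc-link-lib=static=llama");
        println!("cargo:rustc-link-lib=static=ggml_static");
    }
}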

Be aware that this is only the first step towards enabling tabby to build on Windows.

cargo build still fails because the tabby crate depends on the tabby-download crate, which has a platform-specific dependency.

I plan to raise 2 more PRs separately:

  1. Find an alternative to aim and replace it in tabby-download.
  2. Update build.rs again to enable the cuda & rocm features; currently both features assume *nix-style lib paths, which needs to change accordingly.

darknight marked this pull request as ready for review on December 7, 2023 15:14
@wsxiaoys
Member

wsxiaoys commented Dec 7, 2023

aim itself is open source, and it looks like its implementation is straightforward to port out of it.

@wsxiaoys
Member

wsxiaoys commented Dec 10, 2023

Please add windows build to release.yml: https://github.com/TabbyML/tabby/blob/main/.github/workflows/release.yml (with cuda on)

@darknight
Contributor Author

Please add windows build to release.yml: https://github.com/TabbyML/tabby/blob/main/.github/workflows/release.yml (with cuda on)

The cuda settings need to be updated to reflect the cuda lib path on Windows, and I'm working on that.

Is it OK if I raise a separate PR to include the cuda change + workflow update?

@wsxiaoys
Member

please rebase against main

println!(r"cargo:rustc-link-search=native={}\lib\x64", cuda_path);
} else {
println!("cargo:rustc-link-search=native=/usr/local/cuda/lib64");
}
println!("cargo:rustc-link-lib=cudart");
println!("cargo:rustc-link-lib=culibos");
@darknight
Contributor Author

darknight commented Dec 10, 2023

I did some searching and compared the cuda docs carefully, and found:

Up to 12.0.0, culibos does indeed exist regardless of OS type; it is a thread abstraction layer, as mentioned here: https://docs.nvidia.com/cuda/archive/12.0.0/cusparse/index.html#static-library-support

Since 12.0.1, the corresponding page no longer mentions it: https://docs.nvidia.com/cuda/archive/12.0.1/cusparse/index.html#static-library-support

I suspect this lib was removed in 12.0.1 and later, which is why I failed to compile on my laptop, since I installed the latest cuda toolkit (12.3.1).

We're using 11.7 during the release build, so I think this line should continue to work, even with the new Windows settings (although I didn't test locally).

As for local development, we can raise a separate PR to fix it for both Windows & *nix platforms. What do you think? @wsxiaoys
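
For concreteness, one possible shape for that separate fix (just a sketch under the assumption that CUDA_PATH points at the toolkit root; this PR does not do this) would be to emit the culibos link directive only when the static library actually exists, so local builds against CUDA >= 12.0.1 still compile:

use std::env;
use std::path::PathBuf;

// Hypothetical helper for build.rs: link culibos only if the CUDA toolkit ships it.
fn maybe_link_culibos() {
    if let Ok(cuda_path) = env::var("CUDA_PATH") {
        let root = PathBuf::from(cuda_path);
        let candidates = [
            root.join("lib64").join("libculibos.a"),           // typical *nix layout
            root.join("lib").join("x64").join("culibos.lib"),  // typical Windows layout
        ];
        if candidates.iter().any(|p| p.exists()) {
            println!("cargo:rustc-link-lib=culibos");
        }
    } else {
        // Without CUDA_PATH, keep the current unconditional behaviour.
        println!("cargo:rustc-link-lib=culibos");
    }
}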

Member

As long as the CI build passes, we're good.

Whether to fix the build for cuda12 can be discussed separately.

@@ -0,0 +1,3 @@
Echo "install protocol buffer compiler..."
choco install protoc
choco install cuda --version=11.7.1.51694
@darknight
Contributor Author

darknight commented Dec 10, 2023

One more note: I installed cuda on my laptop by downloading the complete .exe file (3.05 GB); it installed to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3 and set CUDA_PATH by default.

Here I just use choco and assume it has similar behaviour, but be aware it may cause issues here.
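
To make that assumption concrete, the kind of lookup the build script can do (a sketch; the fallback path and version below are placeholders, not what this PR ships) is to prefer CUDA_PATH and fall back to the default install location:

use std::env;

// Hypothetical build.rs fragment: locate the CUDA toolkit root.
fn cuda_root() -> String {
    // Both the .exe installer and (hopefully) the choco package set CUDA_PATH;
    // the fallback below is only a placeholder for this sketch.
    env::var("CUDA_PATH")
        .unwrap_or_else(|_| r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7".to_string())
}

fn main() {
    println!(r"cargo:rustc-link-search=native={}\lib\x64", cuda_root());
    println!("cargo:rustc-link-lib=cudart");
}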

@wsxiaoys
Member

debugging in #1009

@wsxiaoys
Member

fixed, please rebase

@wsxiaoys
Member

Otherwise LGTM!

@wsxiaoys
Member

Thanks for the effort!

wsxiaoys merged commit e0d0133 into TabbyML:main on Dec 11, 2023
8 checks passed
wsxiaoys pushed a commit that referenced this pull request Dec 11, 2023
* feat: update config to support build on windows

* resolve comment

* update release.yml

* resolve comment
darknight deleted the build-on-windows branch on December 11, 2023 04:51
@ichDaheim
Contributor

Hi @darknight @wsxiaoys,
the Windows version does not start on my machine. In PowerShell, the command ".\tabby_x86_64-pc-windows-msvc.exe serve --model TabbyML/CodeLlama-7B --chat-model TabbyML/Mistral-7B" does not do anything at all.
Double-clicking the .exe gives me the error message that "cublas64_11.dll" and "cudart64.dll" are not found (I do not have an NVIDIA GPU). I guess this means the current version is only for users with an NVIDIA graphics card?

@wsxiaoys
Member

Yes, the distributed binary requires the cuda runtime (and thus an NVIDIA GPU). If you're interested in a CPU-only binary distribution, feel free to file an issue. Thanks!
