
whisper : add context param for disable gpu #1293

Merged: 17 commits, Nov 6, 2023

Conversation

@jhen0409 (Contributor) commented on Sep 15, 2023

Currently the Metal backend uses some SIMD operations that are only supported on Apple7+ family devices (ref: Metal-Feature-Set-Tables.pdf). To allow most older devices to run whisper.cpp normally, we can provide a param like use_gpu.
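For context, a minimal Objective-C sketch of how an application can probe that capability at runtime with standard Metal APIs (the helper name is mine, not part of this PR):

#import <Metal/Metal.h>

// Hypothetical helper: true when the default Metal device supports the
// Apple7 GPU family, i.e. the SIMD-group operations the backend relies on.
// -supportsFamily: needs iOS 13 / macOS 10.15; MTLGPUFamilyApple7 needs a newer SDK.
static BOOL device_supports_metal_simd(void) {
    id<MTLDevice> device = MTLCreateSystemDefaultDevice();
    if (device == nil) {
        return NO;
    }
    return [device supportsFamily:MTLGPUFamilyApple7];
}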

I think this will be helpful for whisper.cpp bindings that enable Metal at build time but still need to support old devices, or that simply don't want to use GPU resources in some cases.

In this PR, I've added a new struct whisper_context_params { use_gpu = true } for this. I think it could also hold params like use_mmap in the future.
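A minimal sketch of what the new API looks like in use, based on the names here and in the commit list (whisper_context_default_params, whisper_init_from_file_with_params); the model path is a placeholder:

#include "whisper.h"

int main(void) {
    // Start from the library defaults, then opt out of GPU usage.
    struct whisper_context_params cparams = whisper_context_default_params();
    cparams.use_gpu = false;

    // New init entry point that takes the context params.
    struct whisper_context * ctx =
        whisper_init_from_file_with_params("models/ggml-base.en.bin", cparams);
    if (ctx == NULL) {
        return 1;
    }

    // ... run whisper_full(...) as usual ...

    whisper_free(ctx);
    return 0;
}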

@ggerganov please let me know whether you think this is a good idea; if so, I will update the other backends and all the bindings.

@bobqianic (Collaborator)

It seems that this modification could introduce some backward compatibility issues, which would necessitate refactoring many of the APIs. I recall the last time I attempted to introduce a debug-mode option in whisper.cpp. It proved challenging to implement without disrupting the existing API. Generally speaking, I believe the current API design lacks elegance. A well-designed API should ensure both forward and backward compatibility, so that minor changes don't inconvenience or disrupt the users.

@jhen0409 (Contributor Author)

> It seems that this modification could introduce some backward compatibility issues, which would necessitate refactoring many of the APIs. I recall the last time I attempted to introduce a debug-mode option in whisper.cpp. It proved challenging to implement without disrupting the existing API. Generally speaking, I believe the current API design lacks elegance. A well-designed API should ensure both forward and backward compatibility, so that minor changes don't inconvenience or disrupt the users.

We could consider adding new methods and marking the old ones as deprecated; that would retain some flexibility for users.
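For illustration, the usual shape of that pattern in a C header and implementation; a sketch assuming a WHISPER_DEPRECATED helper macro along the lines of what the merged commits introduce:

// whisper.h: the old entry point stays, but warns at compile time.
WHISPER_DEPRECATED(
    WHISPER_API struct whisper_context * whisper_init_from_file(const char * path_model),
    "use whisper_init_from_file_with_params instead"
);

// whisper.cpp: forward to the params-taking variant with the defaults.
struct whisper_context * whisper_init_from_file(const char * path_model) {
    return whisper_init_from_file_with_params(path_model, whisper_context_default_params());
}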

(Some issues commenting from my phone made me delete the previous comment; sorry for the double notifications.)

@bobqianic (Collaborator)

@ggerganov ping

@ggerganov (Owner)

Will take a look this week - been travelling for a few days

@ggerganov (Owner)

@jhen0409 Yes, this is a good idea

@jhen0409 (Contributor Author) commented on Oct 5, 2023

I'm looking for a way to disable the CUDA backend (and OpenCL) via the param. Should we just check GGML_BACKEND_GPU, or do we need to add a param to ggml?

I found that none of the tensors have their backend set to GGML_BACKEND_GPU, and only ggml_cuda_mul_mat is used (when can_mul_mat(...) is true). Compared to llama.cpp, I think setup such as a call to ggml_cuda_assign_buffers is missing, but I'm still not familiar with it. It also seems related to #1179.
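One possible shape for the gating itself (the helper below is hypothetical, not code from this PR): consult the context param before taking the CUDA path at all, so no extra ggml flag is needed.

#include <stdbool.h>

// Hypothetical sketch: skip the CUDA mul_mat fast path when the user
// disabled the GPU, falling back to the regular CPU implementation.
static bool whisper_can_use_cuda(const struct whisper_context_params * cparams) {
#if defined(GGML_USE_CUBLAS)
    return cparams->use_gpu;
#else
    (void) cparams;
    return false;
#endif
}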

ggml-metal.m (outdated) · review comment on lines 156 to 178
#ifdef GGML_SWIFT
        bundle = SWIFTPM_MODULE_BUNDLE;
#else
        UNUSED(msl_library_source);
        bundle = [NSBundle bundleForClass:[GGMLMetalClass class]];
#endif

        // read the source from "ggml-metal.metal" into a string and use newLibraryWithSource
        {
            NSError * error = nil;
            NSString * libPath = [bundle pathForResource:@"default" ofType:@"metallib"];
            if (libPath != nil) {
                NSURL * libURL = [NSURL fileURLWithPath:libPath];
                metal_printf("%s: loading '%s'\n", __func__, [libPath UTF8String]);
                ctx->library = [ctx->device newLibraryWithURL:libURL error:&error];
            } else {
                metal_printf("%s: default.metallib not found, loading from source\n", __func__);

                NSString * sourcePath = [bundle pathForResource:@"ggml-metal" ofType:@"metal"];
                metal_printf("%s: loading '%s'\n", __func__, [sourcePath UTF8String]);
                NSString * src = [NSString stringWithContentsOfFile:sourcePath encoding:NSUTF8StringEncoding error:&error];
                if (error) {
                    metal_printf("%s: error: %s\n", __func__, [[error description] UTF8String]);
                    return NULL;
                }
@jhen0409 (Contributor Author) commented on Oct 6, 2023

I enabled Metal to test whisper.swiftui, and the project needs to load the compiled default.metallib, so I made this change. @bachittle I think this should also help with ggerganov/llama.cpp#3284.

For GGML_SWIFT I use SWIFTPM_MODULE_BUNDLE instead and reuse the default.metallib loading code. I think it should also work in the llama.cpp Swift package (needs some tests later).

@jhen0409 (Contributor Author) commented on Oct 6, 2023

Aside from the GPU backend questions, I think the other things are ready to review.

jhen0409 marked this pull request as ready for review on October 6, 2023, 06:33
@paulocoutinhox

Does this solve #1386?

@bobqianic (Collaborator)

> I found that none of the tensors have their backend set to GGML_BACKEND_GPU, and only ggml_cuda_mul_mat is used (when can_mul_mat(...) is true). Compared to llama.cpp, I think setup such as a call to ggml_cuda_assign_buffers is missing, but I'm still not familiar with it. It also seems related to #1179.

I'm in a similar situation. I'm completely lost in the llama.cpp code. Can't figure out how to offload the other operations to the GPU.

@ggerganov (Owner)

> Aside from the GPU backend questions, I think the other things are ready to review.

I will implement full GPU offloading in the following days.
Turning off CUDA is currently being discussed in ggerganov/llama.cpp#3946

@ggerganov (Owner) left a comment

@jhen0409 This should be good to merge, correct?

@jhen0409 (Contributor Author) commented on Nov 5, 2023

> @jhen0409 This should be good to merge, correct?

Yes, it should be ready.

ggerganov merged commit 0463028 into ggerganov:master on Nov 6, 2023
39 checks passed
vonstring pushed a commit to vonstring/whisper.cpp that referenced this pull request Nov 7, 2023
* whisper : check state->ctx_metal not null

* whisper : add whisper_context_params { use_gpu }

* whisper : new API with params & deprecate old API

* examples : use no-gpu param && whisper_init_from_file_with_params

* whisper.objc : enable metal & disable on simulator

* whisper.swiftui, metal : enable metal & support load default.metallib

* whisper.android : use new API

* bindings : use new API

* addon.node : fix build & test

* bindings : update java binding

* bindings : add missing whisper_context_default_params_by_ref WHISPER_API for java

* metal : use SWIFTPM_MODULE_BUNDLE for GGML_SWIFT and reuse library load

* metal : move bundle var into block

* metal : use SWIFT_PACKAGE instead of GGML_SWIFT

* style : minor updates

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
felrock pushed a commit to felrock/whisper.cpp that referenced this pull request on Nov 18, 2023 (same commit list as above)
landtanin pushed a commit to landtanin/whisper.cpp that referenced this pull request on Dec 16, 2023 (same commit list as above)
iThalay pushed a commit to iThalay/whisper.cpp that referenced this pull request on Sep 23, 2024 (same commit list as above)
Labels: need feedback (Testing and feedback with results are needed)
4 participants