
WIP: New python based entry point for containers #1686

Closed
wants to merge 1 commit

Conversation

jpodivin
Contributor

@jpodivin jpodivin commented Jun 3, 2023

No description provided.

@jpodivin
Contributor Author

jpodivin commented Jun 5, 2023

@SlyEcho This is the endpoint change I've mentioned.

@SlyEcho
Collaborator

SlyEcho commented Jun 5, 2023

Does it allow passing arguments with spaces?

@jpodivin
Contributor Author

jpodivin commented Jun 6, 2023

Does it allow passing arguments with spaces?

I'm not sure exactly what you mean. My implementation changes some things, but these, for example, work:

docker run -v /local/model/path/:/models llama.cpp:endpoint --all-in-one /models/7B
docker run -v /local/model/path/:/models llama.cpp:endpoint --run /models/7B/ggml-model-q4_0.bin -p "Building a website can be done in 10 simple steps:" -n 512

But I wouldn't really like to get this specific version merged. I would prefer subcommands over these args, as I wouldn't have to worry about overloading arguments between run/quantize/etc.

Not to say that it's impossible, but subcommands do offer flexibility you don't have with multiple arguments.

@SlyEcho
Collaborator

SlyEcho commented Jun 6, 2023

That's what I meant.

I would prefer subcommands over those args, as I wouldn't have to worry about overloading arguments between run/quantize/etc

Couldn't it be done by just checking argv[1]? I don't know how to do it in Python, but there is not really any "parsing" needed, I think, just switching on the string value, kind of like the old script did.
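
A minimal sketch of what switching on argv[1] could look like; the handler functions here are placeholders, not the real tool invocations, and the flag names are illustrative:

```python
import sys

# Placeholder handlers; the real script would exec the llama.cpp binaries.
def run(args): print("run", args)
def convert(args): print("convert", args)
def quantize(args): print("quantize", args)

COMMANDS = {"--run": run, "--convert": convert, "--quantize": quantize}

def main(argv):
    if len(argv) < 2 or argv[1] not in COMMANDS:
        print(f"usage: {argv[0]} {{{'|'.join(COMMANDS)}}} ...", file=sys.stderr)
        return 1
    # Remaining args (including ones with spaces) are passed through untouched.
    COMMANDS[argv[1]](argv[2:])
    return 0

if __name__ == "__main__":
    sys.exit(main(sys.argv))
```

This covers the plain dispatch case, but validation, help text, and per-command defaults would all have to be written by hand.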

@jpodivin
Contributor Author

jpodivin commented Jun 7, 2023

The advantage of Python's argparse is that it takes care of those checks for you, including edge cases.
You can define a CLI of arbitrary complexity using a combination of commands, subcommands, and args. Help, error handling, type checks, and defaults come in the bargain.

You could do that with plain conditional checks on argv elements, but you would also have to implement extra logic to make it robust, and that puts an unnecessary burden on maintainers.

What I would propose is something like this:

usage: tools.py [-h] model {run,convert,quantize,all-in-one} ...

positional arguments:
  model                 Directory containing model file, or model file itself
                        (*.pth, *.pt, *.bin)
  {run,convert,quantize,all-in-one}
    run                 Run a model previously converted into ggml
    convert             Convert a llama model into ggml
    quantize            Quantize a converted ggml model
    all-in-one          Execute convert & quantize

options:
  -h, --help            show this help message and exit

Each of those would have its own args, so no chance of overloading. Actually, I'm a bit ashamed I haven't proposed it already.
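
A rough sketch of how the proposed CLI could be built with argparse subparsers; the per-command flags (-p, -n, -t) are assumptions for illustration, not the final interface:

```python
import argparse

def build_parser():
    parser = argparse.ArgumentParser(prog="tools.py")
    parser.add_argument(
        "model",
        help="Directory containing model file, or model file itself")
    sub = parser.add_subparsers(dest="command", required=True)

    run = sub.add_parser("run", help="Run a model previously converted into ggml")
    run.add_argument("-p", "--prompt", default="")
    run.add_argument("-n", "--n-predict", type=int, default=128)

    sub.add_parser("convert", help="Convert a llama model into ggml")

    quantize = sub.add_parser("quantize", help="Quantize a converted ggml model")
    quantize.add_argument("-t", "--type", default="q4_0")

    sub.add_parser("all-in-one", help="Execute convert & quantize")
    return parser
```

Because each subparser owns its own flags, run's -p and a hypothetical quantize -p could coexist without conflict, and argparse generates the help text and error messages automatically.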

Edit: I'll make an asciinema demo later today, when I have more time.

@jpodivin
Contributor Author

jpodivin commented Jun 7, 2023

I have the promised demo.[0] It's an asciinema[1] recording, so you can play it in a terminal. I recommend using -i 1 when playing it, as otherwise you will have to endure long gaps in activity while my desktop struggles. Also, I was doing other things, and after the quantization command I forgot I was recording for a couple of minutes.

[0] https://gist.github.com/jpodivin/ef4d037c21bfc2ce0a9f91b1d3f29ea5
[1] https://asciinema.org/docs/usage

@jpodivin jpodivin force-pushed the python-endpoint branch 2 times, most recently from 5020781 to 89e7976 on June 8, 2023 18:30
@jpodivin
Contributor Author

jpodivin commented Jun 8, 2023

@SlyEcho wdyt?

@SlyEcho
Collaborator

SlyEcho commented Jun 8, 2023

I have not had time to test it yet.

@jpodivin jpodivin marked this pull request as ready for review June 15, 2023 06:37
@SlyEcho SlyEcho mentioned this pull request Aug 9, 2023
@jpodivin
Contributor Author

@SlyEcho So are we moving this out of WIP? Or should I just close it?

@SlyEcho
Collaborator

SlyEcho commented Aug 29, 2023

I think it's still worth it.

There are a couple of things that have changed:

  1. A lot more Dockerfiles.
  2. The model format changed, so now we have .gguf files.

@jpodivin
Contributor Author

jpodivin commented Aug 29, 2023 via email

Signed-off-by: Jiri Podivin <jpodivin@gmail.com>
@jpodivin
Contributor Author

@SlyEcho I've updated the script to work with the new model format and the server binary. I've also replaced the endpoint in the other container files. I've tested standard file quantization, conversion, running, and the server binary with open-llama.

@jpodivin jpodivin closed this Mar 18, 2024