
Replacing call to convert-pth-to-ggml.py with convert.py #1641

Merged (1 commit) Jun 3, 2023

Conversation

jpodivin (Contributor)

A deprecation disclaimer was added to convert-pth-to-ggml.py.
I'm keeping the script around for now, as it may be needed for backward compatibility.

That being said, removing it in the near future would be prudent.

I've verified that convert.py can take other arguments when called in a container, although I haven't tested all possible combinations.

#1628

Deprecation disclaimer was added to convert-pth-to-ggml.py

Signed-off-by: Jiri Podivin <jpodivin@gmail.com>
windprak commented Jun 3, 2023

Convert.py gives me this error consistently on many different computers and operating systems:

  File "convert.py", line 868, in lazy_load_file
    elif struct.unpack('<Q', first8)[0] < 16 * 1024 * 1024:
struct.error: unpack requires a buffer of 8 bytes

jpodivin (Contributor, Author) commented Jun 3, 2023

Convert.py gives me this error consistently on many different computers and operating systems: line 868, in lazy_load_file elif struct.unpack('<Q', first8)[0] < 16 * 1024 * 1024: struct.error: unpack requires a buffer of 8 bytes

Hi. Could you give me the full trace of this? Along with the arguments used in CLI?

I haven't actually changed the convert.py script. So whatever this issue is, it must have been there before, but I can try to fix it in a new PR if I get enough info.

SlyEcho (Collaborator) commented Jun 3, 2023

Convert.py gives me this error consistently on many different computers and operating systems

That probably means the file you're trying to convert is faulty.
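For context (editor's note, not from the thread): convert.py reads the first 8 bytes of each model file and unpacks them as a little-endian u64, so an empty or truncated file fails in exactly the reported way. A minimal standalone reproduction:

```python
import struct

first8 = b""  # simulating a read from a 0-byte .pth file
error_message = None
try:
    # convert.py's lazy_load_file does this on the first 8 bytes of the file
    struct.unpack('<Q', first8)
except struct.error as exc:
    error_message = str(exc)

print(error_message)  # unpack requires a buffer of 8 bytes
```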

windprak commented Jun 3, 2023

Convert.py gives me this error consistently on many different computers and operating systems

That probably means the file you're trying to convert is faulty.

But the pth file still has the same checksum as in my checklist.chk.

windprak commented Jun 3, 2023

Convert.py gives me this error consistently on many different computers and operating systems: line 868, in lazy_load_file elif struct.unpack('<Q', first8)[0] < 16 * 1024 * 1024: struct.error: unpack requires a buffer of 8 bytes

Hi. Could you give me the full trace of this? Along with the arguments used in CLI?

I haven't actually changed the convert.py script. So whatever this issue is, it must have been there before, but I can try to fix it in a new PR if I get enough info.

I checked my files and the checksums look fine. It happens with every model size. Here is my CLI call and output:

python3 -m trace --count -C . convert.py models/7B
Loading model file models/7B/consolidated.00.pth
Loading model file models/7B/consolidated.01.pth
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/usr/lib/python3.10/trace.py", line 740, in <module>
    main()
  File "/usr/lib/python3.10/trace.py", line 728, in main
    t.runctx(code, globs, globs)
  File "/usr/lib/python3.10/trace.py", line 450, in runctx
    exec(cmd, globals, locals)
  File "convert.py", line 1168, in <module>
    main()
  File "convert.py", line 1148, in main
    model_plus = load_some_model(args.model)
  File "convert.py", line 1074, in load_some_model
    models_plus.append(lazy_load_file(path))
  File "convert.py", line 868, in lazy_load_file
    elif struct.unpack('<Q', first8)[0] < 16 * 1024 * 1024:
struct.error: unpack requires a buffer of 8 bytes

Interestingly, it loads model file models/7B/consolidated.01.pth, which is 0 KB. That is the reason for the error... when the 0 KB pth files are moved into another directory, it works!
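Since the culprit turned out to be stray zero-byte shards, a pre-flight check along these lines (a hypothetical helper sketched here, not part of convert.py) can surface them before conversion:

```python
from pathlib import Path

def find_empty_shards(model_dir):
    """Return any consolidated.*.pth shard files that are zero bytes."""
    return [p for p in sorted(Path(model_dir).glob("consolidated.*.pth"))
            if p.stat().st_size == 0]

# Example: warn before running convert.py on a model directory
for shard in find_empty_shards("models/7B"):
    print(f"warning: {shard} is empty and will break conversion")
```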

SlyEcho (Collaborator) commented Jun 3, 2023

7B has only one pth file: consolidated.00.pth

windprak commented Jun 3, 2023

7B has only one pth file: consolidated.00.pth

I have no idea where the 0 KB pth parts came from, but they did cost me a lot of time...

jpodivin (Contributor, Author) commented Jun 3, 2023

In that case I suppose it really is not related to this patch and we can get it merged. Unless there is something else going wrong?

SlyEcho (Collaborator) commented Jun 3, 2023

Tested, it works.

However I can't seem to pass an argument with spaces to --run -p but it's an unrelated issue.

SlyEcho merged commit b5c8546 into ggerganov:master Jun 3, 2023
jpodivin (Contributor, Author) commented Jun 3, 2023

Tested, it works.

However I can't seem to pass an argument with spaces to --run -p but it's an unrelated issue.

Yes, I believe that is due to the way the entrypoint argument processing works. Shell isn't exactly known for dealing well with multiple levels of quoting. I do have a rewrite in Python in the works, though.
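To illustrate the quoting pitfall (a generic shell sketch, not the actual entrypoint script in the image): expanding arguments unquoted re-splits them on whitespace, while "$@" forwards each argument intact:

```shell
# Simulate an entrypoint that received the args:  -p "hello world"
set -- -p "hello world"

count_args() { echo "$#"; }

unquoted=$(count_args $*)    # re-split on whitespace: 3 words
quoted=$(count_args "$@")    # arguments preserved intact: 2 words

echo "unquoted=$unquoted quoted=$quoted"
```

This is why an entrypoint that builds its command with unquoted expansion mangles a prompt containing spaces.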

SlyEcho (Collaborator) commented Jun 3, 2023

I don't really see the need for an entrypoint script at all, actually.

you could put something like ENV PATH="/app/build/bin:$PATH" in the end and then the user can run docker run llama.cpp main -m ... -p ... etc.
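Sketching that suggestion as a Dockerfile fragment (illustrative only; the image's real build layout may differ):

```dockerfile
# Tail of a hypothetical llama.cpp image; build steps omitted.
# Putting the build output on PATH lets users call tools directly, e.g.:
#   docker run llama.cpp main -m /models/7B/ggml-model.bin -p "Hello"
ENV PATH="/app/build/bin:$PATH"
```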

jpodivin (Contributor, Author) commented Jun 3, 2023

I don't really see the need for an entrypoint script at all, actually.

you could put something like ENV PATH="/app/build/bin:$PATH" in the end and then the user can run docker run llama.cpp main -m ... -p ... etc.

TBH, this is a UI/UX question. Adopting the container paradigm fundamentally means there is now an additional layer between the user and the tool. How that layer should behave depends on the use case. For llama.cpp we are talking about a set of CLI tools, each with a special purpose. It could be argued that at some point, although not necessarily today, it may make sense to unify them all behind one CLI wrapper, which is how a container with an entrypoint effectively behaves.

jpodivin deleted the convert-pth-to-ggml-deprecation branch June 3, 2023 14:14
SlyEcho (Collaborator) commented Jun 3, 2023 via email

jpodivin (Contributor, Author) commented Jun 3, 2023

I would say that if there should be a unified CLI for the entire toolchain, that is convert, quantize, main and checksum, we should probably start building it outside of the container. When it is finished we can always just add the magic line to the container file.

But in that case it would probably be better to use proper Python bindings, rather than an ad hoc script calling binaries.
#1660

SlyEcho (Collaborator) commented Jun 3, 2023

Well, we have a cool server coming in #1570 that doesn't need Python or any specific language.

jpodivin (Contributor, Author) commented Jun 3, 2023 via email

jpodivin (Contributor, Author) commented Jun 3, 2023

If a Python script as an entrypoint for the container does make sense, I would keep it simple.

Have it implement the absolute minimum of logic needed to call other scripts/binaries and rely on the user to invoke the right args. After all, it is probably going to be used in some sort of semi-automated, or even fully automated, manner.

Like this #1686

CLI proper is a different thing. In that case the existing convert.py would be a good guideline.
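A minimal entrypoint in that spirit might look like this (a hypothetical sketch of the idea, not the code in the linked PR; the tool names and paths are assumptions):

```python
#!/usr/bin/env python3
"""Hypothetical minimal container entrypoint: dispatch argv to a tool."""
import os
import sys

# Assumed tool locations inside the image; the real layout may differ.
TOOLS = {
    "main": "/app/main",
    "quantize": "/app/quantize",
}

def build_command(argv):
    """Pick the tool from the first argument; pass the rest through as-is."""
    if not argv or argv[0] not in TOOLS:
        raise SystemExit("usage: entrypoint.py {%s} [args...]" % "|".join(TOOLS))
    return [TOOLS[argv[0]], *argv[1:]]

if __name__ == "__main__" and len(sys.argv) > 1:
    cmd = build_command(sys.argv[1:])
    # execv replaces this process; the arguments are never re-split by a
    # shell, so prompts containing spaces survive intact.
    os.execv(cmd[0], cmd)
```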

AbdirayimovS commented

Convert.py gives me this error consistently on many different computers and operating systems: line 868, in lazy_load_file elif struct.unpack('<Q', first8)[0] < 16 * 1024 * 1024: struct.error: unpack requires a buffer of 8 bytes

Hi. Could you give me the full trace of this? Along with the arguments used in CLI?
I haven't actually changed the convert.py script. So whatever this issue is, it must have been there before, but I can try to fix it in a new PR if I get enough info.

I checked my files and the checksums look fine. It happens with every model size. Here is my CLI call and output:

python3 -m trace --count -C . convert.py models/7B
Loading model file models/7B/consolidated.00.pth
Loading model file models/7B/consolidated.01.pth
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/usr/lib/python3.10/trace.py", line 740, in <module>
    main()
  File "/usr/lib/python3.10/trace.py", line 728, in main
    t.runctx(code, globs, globs)
  File "/usr/lib/python3.10/trace.py", line 450, in runctx
    exec(cmd, globals, locals)
  File "convert.py", line 1168, in <module>
    main()
  File "convert.py", line 1148, in main
    model_plus = load_some_model(args.model)
  File "convert.py", line 1074, in load_some_model
    models_plus.append(lazy_load_file(path))
  File "convert.py", line 868, in lazy_load_file
    elif struct.unpack('<Q', first8)[0] < 16 * 1024 * 1024:
struct.error: unpack requires a buffer of 8 bytes

Interestingly, it loads model file models/7B/consolidated.01.pth, which is 0 KB. That is the reason for the error... when the 0 KB pth files are moved into another directory, it works!

I am also getting the same error:

I inserted the downloaded model into the directory -> models/llama-2-7b-chat

Then I copied and pasted: python convert-pth-to-ggml.py models/llama-2-7b-chat/ 1

It returns the same error!

Looking forward to your help.
