
Replacing call to convert-pth-to-ggml.py with convert.py #1641

Merged (1 commit) Jun 3, 2023

Conversation

jpodivin (Contributor)

A deprecation disclaimer was added to convert-pth-to-ggml.py.
I'm keeping the script around for now, as it may be needed for backward compatibility.

That being said, removing it in the near future would be prudent.

I've verified that convert.py can take other arguments when called in a container, although I haven't tested all possible combinations.

#1628

Deprecation disclaimer was added to convert-pth-to-ggml.py

Signed-off-by: Jiri Podivin <jpodivin@gmail.com>
windprak commented Jun 3, 2023

Convert.py gives me this error consistently on many different computers and operating systems:

  File "convert.py", line 868, in lazy_load_file
    elif struct.unpack('<Q', first8)[0] < 16 * 1024 * 1024:
struct.error: unpack requires a buffer of 8 bytes

jpodivin (Contributor, Author) commented Jun 3, 2023

Convert.py gives me this error consistently on many different computers and operating systems: line 868, in lazy_load_file elif struct.unpack('<Q', first8)[0] < 16 * 1024 * 1024: struct.error: unpack requires a buffer of 8 bytes

Hi. Could you give me the full trace of this? Along with the arguments used in CLI?

I haven't actually changed the convert.py script. So whatever this issue is, it must have been there before, but I can try to fix it in a new PR if I get enough info.

SlyEcho (Collaborator) commented Jun 3, 2023

Convert.py gives me this error consistently on many different computers and operating systems

That probably means the file you're trying to convert is faulty.
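For context (editor's note, not from the thread): convert.py reads the first 8 bytes of each model file and unpacks them as a little-endian u64, so an empty or truncated file fails in exactly the reported way. A minimal standalone reproduction:

```python
import struct

first8 = b""  # simulating a read from a 0-byte .pth file
error_message = None
try:
    # convert.py's lazy_load_file does this on the first 8 bytes of the file
    struct.unpack('<Q', first8)
except struct.error as exc:
    error_message = str(exc)

print(error_message)  # unpack requires a buffer of 8 bytes
```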

windprak commented Jun 3, 2023

Convert.py gives me this error consistently on many different computers and operating systems

That probably means the file you're trying to convert is faulty.

But the pth file still has the same checksum as in my checklist.chk.

windprak commented Jun 3, 2023

Convert.py gives me this error consistently on many different computers and operating systems: line 868, in lazy_load_file elif struct.unpack('<Q', first8)[0] < 16 * 1024 * 1024: struct.error: unpack requires a buffer of 8 bytes

Hi. Could you give me the full trace of this? Along with the arguments used in CLI?

I haven't actually changed the convert.py script. So whatever this issue is, it must have been there before, but I can try to fix it in a new PR if I get enough info.

I checked my files and the checksums look fine. It happens with every model size. Here is my CLI call and output:

python3 -m trace --count -C . convert.py models/7B
Loading model file models/7B/consolidated.00.pth
Loading model file models/7B/consolidated.01.pth
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/usr/lib/python3.10/trace.py", line 740, in <module>
    main()
  File "/usr/lib/python3.10/trace.py", line 728, in main
    t.runctx(code, globs, globs)
  File "/usr/lib/python3.10/trace.py", line 450, in runctx
    exec(cmd, globals, locals)
  File "convert.py", line 1168, in <module>
    main()
  File "convert.py", line 1148, in main
    model_plus = load_some_model(args.model)
  File "convert.py", line 1074, in load_some_model
    models_plus.append(lazy_load_file(path))
  File "convert.py", line 868, in lazy_load_file
    elif struct.unpack('<Q', first8)[0] < 16 * 1024 * 1024:
struct.error: unpack requires a buffer of 8 bytes

Interestingly, it loads model file models/7B/consolidated.01.pth, which is 0 KB. That is the reason for the error... when the 0 KB pth files are moved into another directory, it works!
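Since the culprit turned out to be stray zero-byte shards, a pre-flight check along these lines (a hypothetical helper sketched here, not part of convert.py) can surface them before conversion:

```python
from pathlib import Path

def find_empty_shards(model_dir):
    """Return any consolidated.*.pth shard files that are zero bytes."""
    return [p for p in sorted(Path(model_dir).glob("consolidated.*.pth"))
            if p.stat().st_size == 0]

# Example: warn before running convert.py on a model directory
for shard in find_empty_shards("models/7B"):
    print(f"warning: {shard} is empty and will break conversion")
```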

SlyEcho (Collaborator) commented Jun 3, 2023

7B has only one pth file: consolidated.00.pth

windprak commented Jun 3, 2023

7B has only one pth file: consolidated.00.pth

I have no idea where the 0 KB pth parts came from, but they did cost me a lot of time...

jpodivin (Contributor, Author) commented Jun 3, 2023

In that case I suppose it really is not related to this patch and we can get it merged. Unless there is something else going wrong?

SlyEcho (Collaborator) commented Jun 3, 2023

Tested, it works.

However I can't seem to pass an argument with spaces to --run -p but it's an unrelated issue.

SlyEcho merged commit b5c8546 into ggerganov:master Jun 3, 2023
jpodivin (Contributor, Author) commented Jun 3, 2023

Tested, it works.

However I can't seem to pass an argument with spaces to --run -p but it's an unrelated issue.

Yes, I believe that is due to the way the entrypoint argument processing works. Shell isn't exactly known for dealing well with multiple levels of quoting. I do have a rewrite in Python in the works, though.
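To illustrate the quoting pitfall (a generic shell sketch, not the actual entrypoint script in the image): expanding arguments unquoted re-splits them on whitespace, while "$@" forwards each argument intact:

```shell
# Simulate an entrypoint that received the args:  -p "hello world"
set -- -p "hello world"

count_args() { echo "$#"; }

unquoted=$(count_args $*)    # re-split on whitespace: 3 words
quoted=$(count_args "$@")    # arguments preserved intact: 2 words

echo "unquoted=$unquoted quoted=$quoted"
```

This is why an entrypoint that builds its command with unquoted expansion mangles a prompt containing spaces.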

SlyEcho (Collaborator) commented Jun 3, 2023

I don't really see the need for an entrypoint script at all, actually.

you could put something like ENV PATH="/app/build/bin:$PATH" in the end and then the user can run docker run llama.cpp main -m ... -p ... etc.
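Sketching that suggestion as a Dockerfile fragment (illustrative only; the image's real build layout may differ):

```dockerfile
# Tail of a hypothetical llama.cpp image; build steps omitted.
# Putting the build output on PATH lets users call tools directly, e.g.:
#   docker run llama.cpp main -m /models/7B/ggml-model.bin -p "Hello"
ENV PATH="/app/build/bin:$PATH"
```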

jpodivin (Contributor, Author) commented Jun 3, 2023

I don't really see the need for an entrypoint script at all, actually.

you could put something like ENV PATH="/app/build/bin:$PATH" in the end and then the user can run docker run llama.cpp main -m ... -p ... etc.

TBH, this is a UI/UX question. Adopting the container paradigm fundamentally means there is now an additional layer between the user and the tool. How that layer should behave depends on the use case. For llama.cpp we are talking about a set of CLI tools, each with a special purpose. It could be argued that at some point, although not necessarily today, it may make sense to unify them all behind one CLI wrapper, which is how a container with an entrypoint effectively behaves.

jpodivin deleted the convert-pth-to-ggml-deprecation branch June 3, 2023 14:14
SlyEcho (Collaborator) commented Jun 3, 2023 via email

jpodivin (Contributor, Author) commented Jun 3, 2023

I would say that if there should be a unified CLI for the entire toolchain, that is convert, quantize, main and checksum, we should probably start building it outside of the container. When it is finished we can always just add the magic line to the container file.

But in that case it would probably be better to use proper Python bindings, rather than an ad hoc script calling binaries.
#1660

SlyEcho (Collaborator) commented Jun 3, 2023

Well, we have a cool server coming in #1570 that doesn't need Python or any specific language.

jpodivin (Contributor, Author) commented Jun 3, 2023 via email

jpodivin (Contributor, Author) commented Jun 3, 2023

If a Python script as an entrypoint for the container does make sense, I would keep it simple.

Have it implement the absolute minimum of logic needed to call other scripts/binaries and rely on the user to invoke the right args. After all, it is probably going to be used in some sort of semi-automated, or even fully automated, manner.

Like this #1686

CLI proper is a different thing. In that case the existing convert.py would be a good guideline.
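A minimal entrypoint in that spirit might look like this (a hypothetical sketch of the idea, not the code in the linked PR; the tool names and paths are assumptions):

```python
#!/usr/bin/env python3
"""Hypothetical minimal container entrypoint: dispatch argv to a tool."""
import os
import sys

# Assumed tool locations inside the image; the real layout may differ.
TOOLS = {
    "main": "/app/main",
    "quantize": "/app/quantize",
}

def build_command(argv):
    """Pick the tool from the first argument; pass the rest through as-is."""
    if not argv or argv[0] not in TOOLS:
        raise SystemExit("usage: entrypoint.py {%s} [args...]" % "|".join(TOOLS))
    return [TOOLS[argv[0]], *argv[1:]]

if __name__ == "__main__" and len(sys.argv) > 1:
    cmd = build_command(sys.argv[1:])
    # execv replaces this process; the arguments are never re-split by a
    # shell, so prompts containing spaces survive intact.
    os.execv(cmd[0], cmd)
```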

AbdirayimovS commented

Convert.py gives me this error consistently on many different computers and operating systems: line 868, in lazy_load_file elif struct.unpack('<Q', first8)[0] < 16 * 1024 * 1024: struct.error: unpack requires a buffer of 8 bytes

Hi. Could you give me the full trace of this? Along with the arguments used in CLI?
I haven't actually changed the convert.py script. So whatever this issue is, it must have been there before, but I can try to fix it in a new PR if I get enough info.

I checked my files and the checksums look fine. It happens with every model size. Here is my CLI call and output:

python3 -m trace --count -C . convert.py models/7B
Loading model file models/7B/consolidated.00.pth
Loading model file models/7B/consolidated.01.pth
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/usr/lib/python3.10/trace.py", line 740, in <module>
    main()
  File "/usr/lib/python3.10/trace.py", line 728, in main
    t.runctx(code, globs, globs)
  File "/usr/lib/python3.10/trace.py", line 450, in runctx
    exec(cmd, globals, locals)
  File "convert.py", line 1168, in <module>
    main()
  File "convert.py", line 1148, in main
    model_plus = load_some_model(args.model)
  File "convert.py", line 1074, in load_some_model
    models_plus.append(lazy_load_file(path))
  File "convert.py", line 868, in lazy_load_file
    elif struct.unpack('<Q', first8)[0] < 16 * 1024 * 1024:
struct.error: unpack requires a buffer of 8 bytes

Interestingly, it loads model file models/7B/consolidated.01.pth, which is 0 KB. That is the reason for the error... when the 0 KB pth files are moved into another directory, it works!

I am also getting the same error:

I inserted the downloaded model into the directory -> models/llama-2-7b-chat

Then I copied and pasted: python convert-pth-to-ggml.py models/llama-2-7b-chat/ 1

It returns the same error!

Looking forward to your help.
