
Directly load pth/PyTorch tensor model files #21

Open
philpax opened this issue Mar 16, 2023 · 2 comments · Fixed by #83
Labels: issue:enhancement (New feature or request)

Comments

philpax commented Mar 16, 2023

At present, llama.cpp contains a Python script that converts pth models to ggml format.

It would be nice to build this conversion into the CLI directly, so that the original model files can be loaded. The Python script could also be ported to Rust, giving us a fully-Rust path from pth to ggml models.
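
For illustration, a minimal sketch of what such a built-in subcommand might look like, assuming the `clap` crate; the names, flags, and structure here are hypothetical, not the actual CLI:

```rust
// Hypothetical `convert` subcommand sketch using the `clap` derive API.
use std::path::PathBuf;

use clap::Parser;

#[derive(Parser)]
enum Cli {
    /// Convert a PyTorch `.pth` checkpoint to a ggml model file.
    Convert {
        /// Path to the source `.pth` file.
        source: PathBuf,
        /// Path to write the converted ggml model to.
        destination: PathBuf,
    },
}

fn main() {
    match Cli::parse() {
        Cli::Convert { source, destination } => {
            println!(
                "converting {} -> {}",
                source.display(),
                destination.display()
            );
            // ... parse the pickle, then write the tensors in ggml format ...
        }
    }
}
```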

philpax commented Mar 16, 2023

Re: loading pth: serde-pickle looks quite promising, but we would need to figure out whether it can load PyTorch tensors.
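
One way to probe this, as a minimal sketch assuming the `zip` and `serde_pickle` crates: a PyTorch `.pth` checkpoint is a ZIP archive whose `data.pkl` member pickles the state dict. Note that PyTorch uses pickle "persistent IDs" to reference external tensor storage, which a generic pickle parser may not support, so this may well fail on a real checkpoint (the filename below is just an example):

```rust
use std::fs::File;
use std::io::Read;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let file = File::open("consolidated.00.pth")?;
    let mut archive = zip::ZipArchive::new(file)?;

    // The member is usually `<prefix>/data.pkl`; the prefix varies by file.
    let name = archive
        .file_names()
        .find(|n| n.ends_with("data.pkl"))
        .ok_or("no data.pkl member found")?
        .to_string();

    // Read the raw pickle stream out of the archive.
    let mut pickle = Vec::new();
    archive.by_name(&name)?.read_to_end(&mut pickle)?;

    // Attempt to decode it into a generic pickle value.
    let value = serde_pickle::value_from_slice(&pickle, serde_pickle::DeOptions::new())?;
    println!("{:?}", value);
    Ok(())
}
```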

@philpax philpax changed the title Port llama.cpp utilities to Rust Directly load pth/PyTorch tensor model files Mar 18, 2023
@philpax philpax added the issue:enhancement New feature or request label Mar 24, 2023
@philpax philpax reopened this Apr 6, 2023
philpax commented Apr 6, 2023

This is not complete yet. We've merged in the start of a converter, but more work is required to convert the weights.

Luckily, @KerfuffleV2 has developed a pickle parser that can handle PyTorch tensors: https://github.com/KerfuffleV2/repugnant-pickle

We should be able to use this to convert the tensors to GGML format. In the future, we could load tensors directly (I may split that out into a new issue), but for now our focus is on loading the tensors so that they can be quantised by #84 and used by llama-cli.
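
For reference, a rough sketch of the per-tensor record such a converter would need to emit. This assumes the legacy ggml layout written by llama.cpp's convert-pth-to-ggml.py at the time (the real file also carries a magic number, hyperparameters, and the vocabulary before any tensor records); `write_tensor` is a hypothetical helper:

```rust
use std::io::{self, Write};

// Legacy ggml per-tensor record (assumed layout): n_dims, name length,
// ftype, then the dims (llama.cpp writes them in reverse order), the
// name bytes, and finally the raw tensor data.
// In this layout, ftype 0 = f32 and 1 = f16.
fn write_tensor<W: Write>(
    w: &mut W,
    name: &str,
    dims: &[u32],
    ftype: u32,
    data: &[u8],
) -> io::Result<()> {
    w.write_all(&(dims.len() as u32).to_le_bytes())?;
    w.write_all(&(name.len() as u32).to_le_bytes())?;
    w.write_all(&ftype.to_le_bytes())?;
    for &dim in dims.iter().rev() {
        w.write_all(&dim.to_le_bytes())?;
    }
    w.write_all(name.as_bytes())?;
    w.write_all(data)?;
    Ok(())
}
```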
