fix #149 - load tensors by type, ignoring filetype #152
As noticed by @KerfuffleV2, both loaders incorrectly handle newer models that mix tensor types indiscriminately, due to a confusion between the `ftype`/`f16_` hyperparameter and the per-tensor element type. I was planning on bringing #84 up to date first, but realised that I needed to figure out what was going on with `ftype` before baking that in.

I've addressed the issue by decoupling the two properly, and then switching over to loading the tensors from the file instead of trying to preallocate them with the wrong type. This mirrors the changes made in ggerganov/llama.cpp#801.
I've tested this with Alpaca 7B GGML and `gpt4-x-alpaca-13b-native-4bit-128g`, the latter with and without `mmap`, and all seems to work.

I'm not happy with how I broke the isolation between the loader and the `Model` with `Model::new_loader2`, but I'm going to revisit this once I remove `loader1` as part of #150. `loader1` is still broken (i.e. it still uses the old behaviour), but that's OK because it's going away soon ^_^