
buffer not aligned #231

Closed
iacore opened this issue Apr 11, 2023 · 10 comments · Fixed by #235

@iacore (Contributor) commented Apr 11, 2023

I converted the PyTorch checkpoint to safetensors, but the buffer is not aligned.

RWKV-4-Pile-430M-20220808-8066.pth is from https://huggingface.co/BlinkDL/rwkv-4-pile-430m
The convert script is here: https://github.com/iacore/rwkv-np/blob/main/convert.py

> xxd RWKV-4-Pile-430M-20220808-8066.safetensors | head -n 2
00000000: 66a7 0000 0000 0000 7b22 626c 6f63 6b73  f.......{"blocks
00000010: 2e30 2e61 7474 2e6b 6579 2e77 6569 6768  .0.att.key.weigh

The tensor data are all f32.

0xa766 % 4 == 2

Why it's not aligned: the offsets count from the end of the metadata header (sized 0xa766), so even a 4-byte-aligned offset lands on an address that is 2 bytes off.
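To make that concrete, here is a minimal sketch (plain Python, relying only on the 8-byte little-endian header-length prefix at the start of a .safetensors file) that computes where the data section starts for this file:

import struct

# Read the 8-byte little-endian header length and check where the
# tensor data region starts relative to 4-byte (f32) alignment.
with open("RWKV-4-Pile-430M-20220808-8066.safetensors", "rb") as f:
    (header_len,) = struct.unpack("<Q", f.read(8))

data_start = 8 + header_len   # tensor offsets are counted from this position
print(hex(header_len))        # 0xa766 for this file
print(data_start % 4)         # 2, so every f32 tensor starts 2 bytes off a 4-byte boundary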

@Narsil (Collaborator) commented Apr 11, 2023

Which version are you using? Alignment was added in 0.3.0.

@iacore (Contributor, Author) commented Apr 11, 2023

0.3.0

@Narsil (Collaborator) commented Apr 13, 2023

Hey, I just took a look.

For this file: https://huggingface.co/BlinkDL/rwkv-4-pile-430m/blob/main/RWKV-4-Pile-430M-20220808-8066.pth

All the tensor data in that file is bfloat16, not f32, and the 2-byte alignment for bf16 is respected there, no?

@iacore (Contributor, Author) commented Apr 13, 2023

The convert script is here: https://github.com/iacore/rwkv-np/blob/main/convert.py

The file I used is .safetensors, which has only float32 data. Please use this script to convert the model first.

@iacore (Contributor, Author) commented Apr 19, 2023

The problem: offsets are calculated from the end of the header. If the header size is not a multiple of 4, then even if the offsets themselves are aligned, the actual memory addresses won't be.
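A sketch of the kind of fix this points at (just the general idea, not necessarily the exact scheme safetensors adopts): pad the serialized JSON header with trailing spaces so that the data section starts on an aligned boundary.

import json

def build_header(metadata: dict, alignment: int = 8) -> bytes:
    # Serialize the header, then pad it with spaces (trailing whitespace
    # that json.loads ignores) so that the data section, which starts at
    # 8 + len(header), lands on an `alignment`-byte boundary.
    header = json.dumps(metadata).encode("utf-8")
    padding = (-(8 + len(header))) % alignment
    return header + b" " * padding

With alignment = 8, this covers f64/i64 tensors as well as f32 and bf16.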

@Narsil (Collaborator) commented Apr 19, 2023

I'm not sure I understand. The header gets padded with empty spaces until the memory addresses are aligned.

The offsets' alignment doesn't matter; I could share a script to showcase address alignment if you want.
I've already used this alignment regularly to get truly zero-copy loads, mostly on f32, so it's possible that other alignments have issues.

@iacore (Contributor, Author) commented Apr 20, 2023

> I'm not sure I understand. The header gets padded with empty spaces until the memory addresses are aligned.
>
> The offsets' alignment doesn't matter; I could share a script to showcase address alignment if you want. I've already used this alignment regularly to get truly zero-copy loads, mostly on f32, so it's possible that other alignments have issues.

I know it's possible to get aligned mappings with mmap on Linux/POSIX. The problem is that a model created with safetensors' Python library doesn't load aligned with safetensors' Rust library.

@Narsil (Collaborator) commented Apr 20, 2023

You can check this:

from huggingface_hub import hf_hub_download
import torch
from safetensors.torch import load_file, save_file

filename = hf_hub_download("BlinkDL/rwkv-4-pile-430m", filename="RWKV-4-Pile-430M-20220808-8066.pth")

weights = torch.load(filename, map_location="cpu")
save_file(weights, "out.safetensors")

import mmap
import torch
import json
import os
from huggingface_hub import hf_hub_download


def load_file(filename, device):
    with open(filename, mode="r", encoding="utf8") as file_obj:
        with mmap.mmap(file_obj.fileno(), length=0, access=mmap.ACCESS_READ) as m:
            header = m.read(8)  # first 8 bytes: little-endian length of the JSON header
            n = int.from_bytes(header, "little")
            metadata_bytes = m.read(n)
            metadata = json.loads(metadata_bytes)

    size = os.stat(filename).st_size
    storage = torch.ByteStorage.from_file(filename, shared=False, size=size).untyped()
    offset = n + 8  # data section starts after the 8-byte length prefix and the JSON header
    return {name: create_tensor(storage, info, offset) for name, info in metadata.items() if name != "__metadata__"}


DTYPES = {"F32": torch.float32, "BF16": torch.bfloat16}
ALIGNMENT = {torch.float32: 4, torch.bfloat16: 2}

device = "cpu"


def create_tensor(storage, info, offset):
    dtype = DTYPES[info["dtype"]]
    shape = info["shape"]
    start, stop = info["data_offsets"]
    print((start + offset) % ALIGNMENT[dtype])  # non-zero means this tensor's data is misaligned for its dtype
    return torch.asarray(storage[start + offset : stop + offset], dtype=torch.uint8).view(dtype=dtype).reshape(shape)


weights = load_file("out.safetensors", device)

The loading is done in pure Python just so that you can mess with pointers easily.

The mmap initial pointer is always page-aligned, so what's important to check is storage + offset. Does this work correctly?

@iacore (Contributor, Author) commented Apr 20, 2023

Your example works correctly by coincidence. Please try this. I only changed a few lines to convert the weights to f32.

from huggingface_hub import hf_hub_download
import torch
from safetensors.torch import load_file, save_file

filename = hf_hub_download("BlinkDL/rwkv-4-pile-430m", filename="RWKV-4-Pile-430M-20220808-8066.pth")

weights = torch.load(filename, map_location="cpu")

for k in weights.keys():
    weights[k] = weights[k].float() # convert to float32

save_file(weights, "out.safetensors")

import mmap
import torch
import json
import os
from huggingface_hub import hf_hub_download


def load_file(filename, device):
    with open(filename, mode="r", encoding="utf8") as file_obj:
        with mmap.mmap(file_obj.fileno(), length=0, access=mmap.ACCESS_READ) as m:
            header = m.read(8)
            n = int.from_bytes(header, "little")
            metadata_bytes = m.read(n)
            metadata = json.loads(metadata_bytes)

    size = os.stat(filename).st_size
    storage = torch.ByteStorage.from_file(filename, shared=False, size=size).untyped()
    offset = n + 8
    # print(n)
    return {name: create_tensor(storage, info, offset) for name, info in metadata.items() if name != "__metadata__"}


DTYPES = {"F32": torch.float32, "BF16": torch.bfloat16}
ALIGNMENT = {torch.float32: 4, torch.bfloat16: 2}

device = "cpu"


def create_tensor(storage, info, offset):
    dtype = DTYPES[info["dtype"]]
    shape = info["shape"]
    start, stop = info["data_offsets"]
    print((start + offset) % ALIGNMENT[dtype])
    return torch.asarray(storage[start + offset : stop + offset], dtype=torch.uint8).view(dtype=dtype).reshape(shape)


weights = load_file("out.safetensors", device)

@Narsil (Collaborator) commented Apr 20, 2023

Indeed, that's pretty bad!

I created #235 to fix that.

I did some testing with various models on custom backends and I was pretty lucky I guess.
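For anyone who wants to sanity-check a file written after the fix, a quick check (assuming the fix keeps the start of the data section 8-byte aligned, which covers all the dtypes discussed here) could be:

import struct

def data_section_misalignment(path: str, alignment: int = 8) -> int:
    # Returns 0 if the tensor data region, which begins right after the
    # 8-byte length prefix and the JSON header, starts on an
    # `alignment`-byte boundary.
    with open(path, "rb") as f:
        (n,) = struct.unpack("<Q", f.read(8))
    return (8 + n) % alignment

print(data_section_misalignment("out.safetensors"))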

davidar added a commit to davidar/eigenGPT that referenced this issue Jun 1, 2023