A work-in-progress, from-scratch implementation of a generative pre-trained transformer (GPT) in vanilla PyTorch. This project is a personal sandbox and learning environment covering both training and inference, with the long-term aim of becoming a self-hosted language assistant running in a homelab environment.
Inspired by OpenAI's GPT-2, Andrej Karpathy's nanoGPT, and Hugging Face's GPT-2 implementation.
git clone git@github.com:bellthomas/gpt.local.git
cd gpt.local
# Download training data.
python -m data
> Downloading collection: bellthomas/herodotus
> ...
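The internals of the `data` module aren't shown in this README, so the snippet below is only a rough sketch of the kind of nanoGPT-style preparation this step typically performs: tokenise raw text with a byte-pair encoder and write the token ids for each split to disk (the actual output lands under ./gpt.local/data/<collection>/{training,validation}, as the training log further down shows). The tokenizer choice (tiktoken's GPT-2 encoding) and the file names `corpus.txt`, `training.bin`, and `validation.bin` are assumptions for illustration, not necessarily what `python -m data` produces.

# Illustrative sketch only; the repository's actual tokenizer and on-disk format may differ.
import numpy as np
import tiktoken

def prepare_split(text: str, out_path: str) -> None:
    enc = tiktoken.get_encoding("gpt2")              # GPT-2 byte-pair encoding
    ids = enc.encode_ordinary(text)                  # encode without special tokens
    np.array(ids, dtype=np.uint16).tofile(out_path)  # uint16 is enough: GPT-2 vocab (50,257) < 65,536

raw = open("corpus.txt", encoding="utf-8").read()    # hypothetical raw text dump
split = int(0.9 * len(raw))                          # simple 90/10 train/validation split
prepare_split(raw[:split], "training.bin")           # hypothetical output names
prepare_split(raw[split:], "validation.bin")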
# Train a GPT.
python -m train --collection "openwebtext" --experiment "openwebtext-1" --device cpu
> *Experiment: openwebtext-1
> Data: ./gpt.local/data/openwebtext/{validation,training}
> Training... (parameters: 124.11M, device: cpu)
> (0) loss 10.9385 (9715.26ms, ~0.51 tflops)
> ...
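For orientation, the kind of loop the `train` module runs is sketched below: sample random blocks from the tokenised split, compute the next-token cross-entropy, and update with AdamW. The step-0 loss of ~10.94 in the log above is roughly ln(50257) ≈ 10.8, i.e. about what a uniform guess over the GPT-2 vocabulary gives before any learning. Everything in the sketch is an assumption for illustration: the `TinyGPT` stand-in built from `nn.TransformerEncoder`, the hyperparameters, and the `training.bin` path are not the repository's actual 124M-parameter model or data layout.

# Minimal training-loop sketch, assuming the token-file format from the data sketch above.
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

block_size, batch_size, vocab_size, d_model = 256, 8, 50304, 384

class TinyGPT(nn.Module):
    # Stand-in decoder-only model; the repository's GPT implementation may differ.
    def __init__(self):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(block_size, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model, nhead=6, dim_feedforward=4 * d_model,
            batch_first=True, norm_first=True,
        )
        self.blocks = nn.TransformerEncoder(layer, num_layers=6)
        self.head = nn.Linear(d_model, vocab_size, bias=False)

    def forward(self, idx):
        t = idx.size(1)
        pos = torch.arange(t, device=idx.device)
        x = self.tok_emb(idx) + self.pos_emb(pos)
        # Additive causal mask: position i may only attend to positions <= i.
        mask = torch.triu(torch.full((t, t), float("-inf"), device=idx.device), diagonal=1)
        x = self.blocks(x, mask=mask)
        return self.head(x)

# Hypothetical token file produced by the data-preparation sketch above.
data = np.memmap("training.bin", dtype=np.uint16, mode="r")

def get_batch():
    ix = torch.randint(len(data) - block_size - 1, (batch_size,)).tolist()
    x = torch.stack([torch.from_numpy(data[i:i + block_size].astype(np.int64)) for i in ix])
    y = torch.stack([torch.from_numpy(data[i + 1:i + 1 + block_size].astype(np.int64)) for i in ix])
    return x, y  # inputs and next-token targets

model = TinyGPT()
optimiser = torch.optim.AdamW(model.parameters(), lr=3e-4)
for step in range(100):
    x, y = get_batch()
    logits = model(x)
    loss = F.cross_entropy(logits.view(-1, vocab_size), y.view(-1))
    optimiser.zero_grad(set_to_none=True)
    loss.backward()
    optimiser.step()
    if step % 10 == 0:
        print(f"({step}) loss {loss.item():.4f}")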