Smol GPT

Multimodal instruction-following model for text generation that runs on your CPU. Less than 14 GB of RAM required.

Demo

Disclaimer

Models hallucinate, and this one is meant for fun. We do not recommend using the model in production unless you hallucinate for a living or know what you are doing.

The implementation may contain bugs, and the int4 quantization performed is not optimal. This might lead to worse performance than the original model.
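
To illustrate why a simple int4 scheme loses accuracy, here is a minimal sketch of symmetric round-to-nearest quantization over one block of weights. It is not the quantization code used in this repository; the function names, the block of sample weights, and the single-scale-per-block layout are all assumptions made for the example.

```python
def quantize_int4(values):
    """Symmetric round-to-nearest quantization of one block to codes in [-8, 7]."""
    max_abs = max(abs(v) for v in values)
    # One scale per block; an all-zero block gets scale 1.0 to avoid dividing by zero.
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    codes = [max(-8, min(7, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int4(codes, scale):
    """Map the 4-bit codes back to approximate float values."""
    return [c * scale for c in codes]


# Hypothetical block of weights, chosen only for illustration.
weights = [0.02, -0.15, 0.31, -0.07, 0.70, -0.44, 0.09, 0.01]
codes, scale = quantize_int4(weights)
restored = dequantize_int4(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("codes:", codes)
print(f"scale={scale:.4f}, max abs error={max_error:.4f}")
```

Each weight can be off by up to half a scale step, and a single large weight in the block raises the scale for all of its neighbours. Schemes such as GPTQ (listed under possible future directions) try to reduce this error.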

Usage

  1. git clone https://github.com/nolanoOrg/smol-gpt
  2. cd smol-gpt
  3. pip install -r requirements.txt
  4. cd cpp && make
  5. cd ..
  6. python3 app.py (this may take a few minutes to download and load the model)
  7. Open http://127.0.0.1:4241/ in your browser.
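
Because the app can take a few minutes to come up, it may be handy to poll the address above instead of refreshing the browser. The helper below is a sketch that uses only the Python standard library; `wait_for_app` and its timeout values are hypothetical and not part of this repository.

```python
import time
import urllib.error
import urllib.request


def wait_for_app(url="http://127.0.0.1:4241/", timeout_s=300.0, poll_s=2.0):
    """Poll `url` until it answers an HTTP request or `timeout_s` elapses.

    Returns the HTTP status code, or None if the app never responded.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                return response.status
        except urllib.error.HTTPError as err:
            # The server is listening but returned an error status.
            return err.code
        except (urllib.error.URLError, OSError):
            # Nothing is listening yet; the model may still be loading.
            time.sleep(poll_s)
    return None
```

Calling `wait_for_app()` from a second terminal after starting `python3 app.py` returns the status code once the page is reachable, or `None` if the timeout expires first.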

Contributing (Possible future directions)

Contributions are welcome. Please open an issue or a PR. New features will be community driven. The following features should be easy to add:

Features

  • Chat/Conversation mode is supported by the model, but not the app.
  • Increase Input/Output length.
  • GPTQ quantization.
  • Interesting Prompts.

Performance (Speed and Memory)

  • Reduce RAM usage by 4x (down to ~4 GB)
    • The current Flask implementation loads the BERT and CLIP models twice for some reason.
    • Offload the T5 encoder after getting the hidden representations.
    • Shift the vision and BERT models to int4/int8 and offload them after use.
  • Speed up 4x:
    • MMap speed up.
  • Support Smoller GPT for running multimodal models in 4 GB of RAM.
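
As a rough illustration of the mmap idea in the list above, the sketch below maps a binary file into memory and decodes only the slice that is needed, rather than reading the whole file up front. The flat little-endian float32 layout, the file name, and both helper functions are assumptions for the example; the repository's actual weight format may differ.

```python
import mmap
import os
import struct
import tempfile


def write_weights(path, values):
    """Write float32 values as a flat little-endian binary file (assumed layout)."""
    with open(path, "wb") as f:
        f.write(struct.pack(f"<{len(values)}f", *values))


def read_weight_slice(path, start, count):
    """Decode `count` float32 values starting at index `start` via mmap."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            offset = start * 4  # 4 bytes per float32
            return list(struct.unpack_from(f"<{count}f", mapped, offset))


# Demo on a throwaway file; a real weights file would be far larger.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "weights.bin")
    write_weights(demo_path, [0.5, 1.5, 2.5, 3.5, 4.5])
    middle = read_weight_slice(demo_path, 1, 3)

print(middle)  # [1.5, 2.5, 3.5]
```

With a mapping, the operating system pages data in on demand and can share the cached pages across runs, which is where a load-time speed-up would be expected to come from.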

Unknowns about the model

  • Performance on multiple collated images.
  • Couple with OCR to reason about text from images.

License

MIT

Communication

Misc

The models used are CLIP and BERT, following BLIP-2, and Flan-T5 for instruction following.
