cudatiger

(Cover art: "Cute Llama")

An accelerated implementation of the Tiger optimizer for PyTorch, supercharged with Triton for improved CUDA GPU efficiency, in under 100 lines of Python/Triton. Tiger is an extremely memory-efficient optimizer and should also be slightly faster than counterparts such as Adam and SGD. Inspired by: bojone/tiger
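For intuition, Tiger's update (as described in the bojone/tiger repository) is essentially a sign-momentum rule: a single EMA buffer m ← β·m + (1−β)·g, then θ ← θ − lr·sign(m), optionally with decoupled weight decay. Keeping only one state tensor per parameter is what makes it memory-efficient. Below is a minimal NumPy sketch of that rule; the function name, default hyperparameters, and weight-decay placement are illustrative assumptions, not cudatiger's actual API:

```python
import numpy as np

def tiger_step(param, grad, m, lr=1e-3, beta=0.965, weight_decay=0.0):
    """One Tiger-style update (sketch): sign of a single momentum buffer."""
    m = beta * m + (1.0 - beta) * grad           # the only optimizer state tensor
    update = np.sign(m) + weight_decay * param   # decoupled weight decay (assumed placement)
    param = param - lr * update
    return param, m

# Toy usage: minimize f(x) = x^2, whose gradient is 2x.
x = np.array([5.0])
m = np.zeros_like(x)
for _ in range(200):
    x, m = tiger_step(x, 2.0 * x, m, lr=0.05)
# x is driven toward the minimum at 0; with a fixed lr, sign updates
# oscillate in a small band around it rather than converging exactly.
```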

Comparison

(image: Tiger update formula)
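The memory comparison is easy to quantify: Adam keeps two fp32 state tensors per parameter (first and second moments), while Tiger, like SGD with momentum, keeps only one, halving optimizer-state memory relative to Adam. A back-of-the-envelope calculation (the 7B parameter count and fp32 state dtype are illustrative assumptions):

```python
def optimizer_state_bytes(n_params, state_tensors, bytes_per_elem=4):
    """Bytes of fp32 optimizer state for a model with n_params parameters."""
    return n_params * state_tensors * bytes_per_elem

n = 7_000_000_000  # e.g. a 7B-parameter model (illustrative)
adam_gb  = optimizer_state_bytes(n, 2) / 1e9  # exp_avg + exp_avg_sq -> 56.0 GB
tiger_gb = optimizer_state_bytes(n, 1) / 1e9  # single momentum buffer -> 28.0 GB
```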

ToDo

  • Add benchmarks comparing Adam, Tiger, SGD, etc.
  • Provide more examples.
  • Introduce testing.
  • Improve this README.
  • Publish to PyPI.
  • Improve the Triton kernel.

Citations

@misc{tigeropt,
    title        = {Tiger: A Tight-fisted Optimizer},
    author       = {Jianlin Su},
    year         = {2023},
    howpublished = {\url{https://github.com/bojone/tiger}},
}

@article{Tillet2019TritonAI,
    title   = {Triton: an intermediate language and compiler for tiled neural network computations},
    author  = {Philippe Tillet and H. Kung and D. Cox},
    journal = {Proceedings of the 3rd ACM SIGPLAN International Workshop on Machine Learning and Programming Languages},
    year    = {2019}
}

Art

@Midjourney

License

This project is licensed under the MIT License.