    Repositories list

    • MagicPIG

      Public
      MagicPIG: LSH Sampling for Efficient LLM Generation
      Python
      Apache License 2.0
      612920 Updated Dec 12, 2024
    • S2FT-Page

      Public
      JavaScript
      0000 Updated Dec 10, 2024
    • S2FT

      Public
      Python
      0400 Updated Dec 9, 2024
    • MagicDec

      Public
      Breaking Throughput-Latency Trade-off for Long Sequences with Speculative Decoding
      Python
      Apache License 2.0
      410060 Updated Dec 4, 2024
    • JavaScript
      1000 Updated Dec 2, 2024
    • Factor

      Public
      0100 Updated Nov 7, 2024
    • Speculative decoding for high-throughput long-context inference
      JavaScript
      Apache License 2.0
      0000 Updated Sep 10, 2024
    • Sirius

      Public
      Sirius: an efficient correction mechanism that significantly boosts contextual sparsity models on reasoning tasks while maintaining their efficiency gain.
      Python
      42000 Updated Sep 10, 2024
    • MagicDec: Breaking the Latency-Throughput Tradeoff for Long Contexts with Speculative Decoding
      JavaScript
      Apache License 2.0
      0000 Updated Sep 5, 2024
    • TriForce

      Public
      [COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding
      Python
      1323470 Updated Aug 31, 2024
    • Sequoia

      Public
      Scalable and robust tree-based speculative decoding algorithm
      Python
      3632273 Updated Aug 13, 2024
    • A framework for few-shot evaluation of language models.
      Python
      MIT License
      1.9k000 Updated Jun 10, 2024
    • JavaScript
      1000 Updated May 21, 2024