Inference code for Mistral and Mixtral, hacked up into the original Llama implementation


Extremely hacky implementation of Mixtral 8x7B

New: API access

Try a much faster implementation of this model at https://app.fireworks.ai/

What is it?

Mistral dropped the new MoE model this morning: https://twitter.com/MistralAI/status/1733150512395038967

This is an attempt to hack the original Llama codebase to load it. The implementation is very naive and slow.

You need 2 x 80GB or 4 x 40GB cards to load it.

Implementation:

WARNING: There's no official reference model code. This implementation might be wrong. At least the generation looks coherent, which is a good sign :)
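
To make the "naive and slow" part concrete, here is a minimal sketch, assuming top-2 routing over 8 experts, of the kind of mixture-of-experts feed-forward block involved. It runs the selected experts one at a time instead of batching them, which is roughly why generation is slow. Class and parameter names are illustrative, not this repo's actual code.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class Expert(nn.Module):
        # Same SwiGLU feed-forward shape as Llama's FFN: w2(silu(w1(x)) * w3(x)).
        def __init__(self, dim: int, hidden_dim: int):
            super().__init__()
            self.w1 = nn.Linear(dim, hidden_dim, bias=False)
            self.w2 = nn.Linear(hidden_dim, dim, bias=False)
            self.w3 = nn.Linear(dim, hidden_dim, bias=False)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.w2(F.silu(self.w1(x)) * self.w3(x))

    class NaiveMoE(nn.Module):
        def __init__(self, dim: int, hidden_dim: int, num_experts: int = 8, top_k: int = 2):
            super().__init__()
            self.gate = nn.Linear(dim, num_experts, bias=False)
            self.experts = nn.ModuleList(Expert(dim, hidden_dim) for _ in range(num_experts))
            self.top_k = top_k

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (num_tokens, dim). Each token goes to its top-k experts and the
            # expert outputs are mixed with softmax-normalized gate weights.
            weights, selected = torch.topk(self.gate(x), self.top_k, dim=-1)
            weights = F.softmax(weights, dim=-1)
            out = torch.zeros_like(x)
            for e, expert in enumerate(self.experts):
                # Naive part: each expert is a separate dense pass over whichever
                # tokens picked it, one expert after another.
                token_idx, slot = torch.where(selected == e)
                if token_idx.numel() > 0:
                    out[token_idx] += weights[token_idx, slot, None] * expert(x[token_idx])
            return out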

Usage

Download the weights for Mixtral from HF or via torrent. HF is the easiest: https://huggingface.co/someone13574/mixtral-8x7b-32kseqlen/tree/main. Make sure you consolidate the weights (see the sketch below).
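
The HF mirror appears to ship the checkpoint as split files, hence the consolidation step. Below is a hedged sketch of downloading and stitching them back together; the split filename pattern is an assumption about the mirror's layout, so adjust the glob if it differs.

    import shutil
    from pathlib import Path
    from huggingface_hub import snapshot_download

    # Fetch the repo contents into a local folder.
    ckpt_dir = Path(snapshot_download("someone13574/mixtral-8x7b-32kseqlen",
                                      local_dir="mixtral"))

    # If the checkpoint came down as split parts, concatenate them back into a
    # single consolidated.00.pth (filename pattern assumed, not verified here).
    parts = sorted(ckpt_dir.glob("consolidated.00.pth-split*"))
    if parts:
        with open(ckpt_dir / "consolidated.00.pth", "wb") as out:
            for part in parts:
                with open(part, "rb") as f:
                    shutil.copyfileobj(f, out)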

Run with 2 GPUs (~45GB required in each):

python example_text_completion.py path/to/mixtral/ path/to/mixtral/tokenizer.model

To run with 4 GPUs, pass --num-gpus 4, e.g.:
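
python example_text_completion.py path/to/mixtral/ path/to/mixtral/tokenizer.model --num-gpus 4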

Edit the prompts in the example script if needed.
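
Assuming the script keeps the structure of the original Llama example, the prompts sit in a plain Python list inside example_text_completion.py, roughly like the sketch below; the entries are reconstructed from the sample output further down, so treat the exact contents as illustrative.

    # Hypothetical view of the prompts list in example_text_completion.py.
    # Edit these strings to change what the model completes.
    prompts = [
        "Mistral.ai is a company that",
        "Simply put, the theory of relativity states that",
        """A brief message congratulating the team on the launch:

        Hi everyone,

        I just""",
        """Translate English to French:

        sea otter => loutre de mer
        peppermint => menthe poivrée
        plush girafe => girafe peluche
        cheese =>""",
    ]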

Sample output

Mistral hallucinates about Mistral:

Mistral.ai is a company that
> provides a platform for building, training, and deploying AI models.

The platform offers a variety of tools and services that can help developers and data scientists build and train AI models.

Some of the key features of Mistral.ai's platform include:

- A drag-and-drop

==================================

Simply put, the theory of relativity states that
> 1) the laws of physics are the same for all observers in uniform motion relative to one another, and 2) the speed of light in a vacuum is the same for all observers, regardless of their relative motion or of the motion of the light source.

The first postulate, the principle of

==================================

A brief message congratulating the team on the launch:

        Hi everyone,

        I just
> wanted to say a big congratulations on the launch of your new website.

        I think it looks fantastic and I am sure it will be a great success.

        Well done everyone and keep up the good work.

        Best wishes,

        XXXX

==================================

Translate English to French:

        sea otter => loutre de mer
        peppermint => menthe poivrée
        plush girafe => girafe peluche
        cheese =>
> fromage
        teddy bear => ourson en peluche
        polar bear => ours polaire
        cuddly panda => panda câlin
        fluffy sheep => mouton fluffy
        furry kitten => chaton poilu
        fuzzy

==================================
