An implementation of the Mamba SSM with Hugging Face integration.

To use mamba-hf, follow these steps:
- Clone the repository to your local machine:

```bash
git clone https://github.com/LegallyCoder/mamba-hf
```
- Open a terminal or command prompt and navigate to the source directory:

```bash
cd mamba-hf/src
```
- Install the required packages with:

```bash
pip3 install -r requirements.txt
```
- Create a new Python file in the source directory and run the following:

```python
from modeling_mamba import MambaForCausalLM
from transformers import AutoTokenizer

# Load the pretrained 130M-parameter Mamba checkpoint and its tokenizer.
model = MambaForCausalLM.from_pretrained('Q-bert/Mamba-130M')
tokenizer = AutoTokenizer.from_pretrained('Q-bert/Mamba-130M')

text = "Hi"
input_ids = tokenizer.encode(text, return_tensors="pt")

# Beam-search decoding with a bigram no-repeat constraint.
output = model.generate(input_ids, max_length=20, num_beams=5, no_repeat_ngram_size=2)
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)
```
Example output:

```text
Hi, I'm looking for a new job. I've been working at a company for about a year now.
```
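The example above uses beam search. Since `MambaForCausalLM` follows the standard `transformers` generation API, sampling-based decoding and GPU execution should work the same way. The snippet below is a minimal sketch, assuming a CUDA device is available and that the model accepts the usual `generate()` sampling arguments:

```python
import torch
from modeling_mamba import MambaForCausalLM
from transformers import AutoTokenizer

# Fall back to CPU if no GPU is present.
device = "cuda" if torch.cuda.is_available() else "cpu"

model = MambaForCausalLM.from_pretrained('Q-bert/Mamba-130M').to(device)
tokenizer = AutoTokenizer.from_pretrained('Q-bert/Mamba-130M')

input_ids = tokenizer.encode("Hi", return_tensors="pt").to(device)

# Nucleus sampling instead of beam search: more varied, less deterministic output.
output = model.generate(
    input_ids,
    max_length=20,
    do_sample=True,
    top_p=0.9,
    temperature=0.7,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```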
You can find more checkpoints in the Mamba Models Collection on Hugging Face.
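Any checkpoint from the collection should load the same way; only the model name changes. A short sketch (the checkpoint name below is a placeholder, not a confirmed model ID; check the collection for actual names):

```python
from modeling_mamba import MambaForCausalLM
from transformers import AutoTokenizer

# Placeholder name for illustration only; substitute a real checkpoint from the collection.
model_name = 'Q-bert/Mamba-370M'
model = MambaForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```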
The Mamba architecture was introduced in *Mamba: Linear-Time Sequence Modeling with Selective State Spaces* (https://arxiv.org/abs/2312.00752) by Albert Gu and Tri Dao.
Thanks to johnma2006's mamba-minimal (https://github.com/johnma2006/mamba-minimal) for the simple reference implementation.
The official implementation is here: https://github.com/state-spaces/mamba/tree/main