Regarding the Model size
#1
by Prakh24s - opened
Thank you for the amazing paper and model weights.
The model seems to be twice the size of a Transformer-based model with the same parameter count (~5.9 GB for a 3B Transformer model vs. 11.1 GB for the Mamba model).
Is this expected?
It's a float32 model, hence the size difference. Transformers are usually float16 or bfloat16.
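For anyone who wants to check the arithmetic behind that answer, here is a rough sketch. It assumes the nominal 2.8 billion parameter count from the model name (the exact count is not stated in this thread) and ignores any file-format overhead, so the numbers are approximate.

```python
# Rough checkpoint-size estimate: parameters x bytes per parameter.
# NOMINAL_PARAMS is an assumption taken from the "2.8b" in the model name;
# the exact parameter count may differ slightly.
NOMINAL_PARAMS = 2.8e9

# Storage width of each dtype in bytes.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}


def checkpoint_size_gb(num_params: float, bytes_per_param: int) -> float:
    """Return the approximate weight size in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


sizes = {
    dtype: checkpoint_size_gb(NOMINAL_PARAMS, width)
    for dtype, width in BYTES_PER_PARAM.items()
}

for dtype, gb in sizes.items():
    print(f"{dtype}: ~{gb:.1f} GB")
```

Under these assumptions float32 comes out at roughly 11.2 GB and the 16-bit formats at roughly 5.6 GB, which is in line with the 11.1 GB and ~5.9 GB figures quoted in the question.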
Thank you for the answer!
Very excited for bigger/quantized models!
Prakh24s changed discussion status to closed
One more question: will the float16 model still outperform Transformers, as stated in the paper?