Instructions to use state-spaces/mamba-2.8b with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use state-spaces/mamba-2.8b with Transformers:
```python
# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("state-spaces/mamba-2.8b", dtype="auto")
```
- Notebooks
- Google Colab
- Kaggle
Loading the checkpoint can fail with a size-mismatch error:

```
File "/workspace/text-generation-webui/installer_files/env/lib/python3.11/site-packages/transformers/models/auto/auto_factory.py", line 563, in from_pretrained
    return model_class.from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspace/text-generation-webui/installer_files/env/lib/python3.11/site-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
    ) = cls._load_pretrained_model(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspace/text-generation-webui/installer_files/env/lib/python3.11/site-packages/transformers/modeling_utils.py", line 4155, in _load_pretrained_model
    raise RuntimeError(f"Error(s) in loading state_dict for {model.__class__.__name__}:\n\t{error_msg}")
RuntimeError: Error(s) in loading state_dict for MambaForCausalLM:
    size mismatch for backbone.embeddings.weight: copying a param with shape torch.Size([50280, 2560]) from checkpoint, the shape in current model is torch.Size([50277, 768]).
    size mismatch for backbone.layers.0.norm.weight: copying a param with shape torch.Size([2560]) from checkpoint, the shape in current model is torch.Size([768]).
    size mismatch for backbone.layers.0.mixer.A_log: copying a param with shape torch.Size([5120, 16]) from checkpoint, the shape in current model is torch.Size([1536, 16]).
    size mismatch for backbone.layers.0.mixer.D: copying a param with shape torch.Size([5120]) from checkpoint, the shape in current model is torch.Size([1536]).
```
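The shapes in the traceback suggest the 2.8b checkpoint (hidden size 2560, vocab size 50280) is being loaded into a model that was instantiated with a much smaller configuration (hidden size 768). The failure mode itself can be reproduced with plain PyTorch; the dimensions below are small stand-ins, not the real model's shapes:

```python
import torch.nn as nn

# "current" plays the role of the model the code instantiated; "checkpoint"
# plays the role of the saved weights, created with a different embedding dim.
current = nn.Embedding(503, 8)
checkpoint = nn.Embedding(503, 26).state_dict()

try:
    current.load_state_dict(checkpoint)
except RuntimeError as err:
    # PyTorch lists every mismatched tensor, as in the traceback above.
    print(err)
```

If a mismatched configuration is the cause, ensuring the config comes from the same checkpoint as the weights (rather than overriding hidden sizes or mixing repos) typically resolves the error.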