rkazants/tiny-mamba

rkazants committed on Jun 30, 2025

Commit 0f49643 · verified · 1 Parent(s): 0e429bb

Update README.md

Files changed (1)

README.md +34 -3

@@ -1,3 +1,34 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+---
+Here is the code used to create this tiny model:
+```python
+import os
+from transformers import MambaConfig, MambaForCausalLM, AutoTokenizer
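+# Reuse the tokenizer of the pretrained checkpoint; the model weights below are freshly initialized
+model_dir = "state-spaces/mamba-130m-hf"
+tokenizer = AutoTokenizer.from_pretrained(model_dir)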
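+# === Step 1: Define tiny model config ===
+# MambaConfig in transformers names these fields hidden_size, num_hidden_layers, and state_size
+config = MambaConfig(
+    hidden_size=64,  # Small hidden dimension (d_model)
+    num_hidden_layers=2,  # Two Mamba blocks (n_layer)
+    state_size=16,  # SSM state dimension (d_state)
+    expand=2,  # Inner dimension is 2x hidden_size
+    conv_kernel=3,  # Small convolution kernel
+    vocab_size=50280,
+)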
+# === Step 2: Create model from config ===
+model = MambaForCausalLM(config)
+# === Step 3: Save model and tokenizer to disk ===
+output_dir = "./tiny-mamba"
+os.makedirs(output_dir, exist_ok=True)
+model.save_pretrained(output_dir)
+tokenizer.save_pretrained(output_dir)
+print(f"Tiny Mamba model and tokenizer saved to: {output_dir}")
+```
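
To get a feel for how small this configuration is, the parameter count can be estimated by hand. The sketch below is not part of the commit. It assumes the layer layout of the reference Mamba block (input projection, depthwise convolution, `x_proj`, `dt_proj`, `A_log`, `D`, output projection, one RMSNorm weight per block), no bias on the input and output projections, a bias on the convolution, a time-step rank of `ceil(hidden_size / 16)`, and an output head that shares weights with the embedding table. The helper name `estimate_mamba_params` is hypothetical.

```python
import math


def estimate_mamba_params(hidden_size, num_hidden_layers, state_size,
                          expand, conv_kernel, vocab_size):
    """Estimate the parameter count of a Mamba language model.

    Assumes the reference Mamba block layout, bias-free input/output
    projections, a biased depthwise convolution, and tied embeddings.
    """
    inner = expand * hidden_size              # width inside each block
    dt_rank = math.ceil(hidden_size / 16)     # assumed "auto" time-step rank

    per_layer = (
        hidden_size * 2 * inner               # in_proj
        + inner * conv_kernel + inner         # depthwise conv weight + bias
        + inner * (dt_rank + 2 * state_size)  # x_proj
        + dt_rank * inner + inner             # dt_proj weight + bias
        + inner * state_size                  # A_log
        + inner                               # D
        + inner * hidden_size                 # out_proj
        + hidden_size                         # per-block RMSNorm weight
    )
    embedding = vocab_size * hidden_size      # shared with the output head
    final_norm = hidden_size
    return embedding + num_hidden_layers * per_layer + final_norm


# Values taken from the config in the diff above
total = estimate_mamba_params(64, 2, 16, 2, 3, 50280)
print(f"Estimated parameters: {total:,}")
```

Under these assumptions the estimate is about 3.28M parameters, of which roughly 3.22M sit in the embedding table. Comparing it with `sum(p.numel() for p in model.parameters())` on the created model is a quick way to check that the config values were actually applied.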