PierrunoYT/moondream3-preview

Image-Text-to-Text

text-generation

vikhyatk committed on Sep 17, 2025

Commit d1b7c10 · verified · 1 Parent(s): ca9c987

Update README.md

Files changed (1)

README.md +12 -0

@@ -17,4 +17,16 @@ For more details, please refer to our ||coming soon release blog post||. Or try
 ## Usage
+Load the model and prepare it for inference. We use [FlexAttention for inference](https://pytorch.org/blog/flexattention-for-inference/), so calling `.compile()` is critical for fast decoding. Our `compile` implementation also handles warmup, so you can start making requests directly once it returns.
+```
+    moondream = AutoModelForCausalLM.from_pretrained(
+        "moondream/moondream3-preview",
+        trust_remote_code=True,
+        dtype=torch.bfloat16,
+        device_map={"": "cuda"},
+    )
+    moondream.compile()
+```
 * TODO: Add usage examples
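
Since `compile()` also performs warmup, the call may take a while before it returns. The following is a minimal sketch of measuring that one-time cost. `timed_call` and `_StubModel` are hypothetical names introduced here for illustration and are not part of this repository; the stub only stands in for the loaded model so the snippet runs without a GPU.

```python
import time
from typing import Any, Callable, Tuple


def timed_call(fn: Callable[[], Any]) -> Tuple[Any, float]:
    """Run fn once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


class _StubModel:
    """Hypothetical stand-in for the loaded model (assumption, not the real API)."""

    def __init__(self) -> None:
        self.compiled = False

    def compile(self) -> None:
        # Pretend to compile and warm up; the real call would do the actual work.
        time.sleep(0.01)
        self.compiled = True


model = _StubModel()
_, compile_seconds = timed_call(model.compile)
print(f"compile() returned after {compile_seconds:.3f}s; compiled={model.compiled}")
```

With the real model, passing `moondream.compile` to `timed_call` in place of the stub's method should report how long compilation and warmup take on your hardware.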