Update README.md
README.md
CHANGED
|
@@ -32,8 +32,36 @@ moondream = AutoModelForCausalLM.from_pretrained(
|
|
| 32 |
moondream.compile()
|
| 33 |
```
|
| 34 |
|
| 35 |
* TODO: Add usage examples
|
| 36 |
-
* Query
|
| 37 |
-
* Caption
|
| 38 |
* Detect
|
| 39 |
* Point
|
|
|
|
| 32 |
moondream.compile()
|
| 33 |
```
|
| 34 |
|
| 35 |
+
The model comes with four skills, tailored towards different visual understanding tasks.
|
| 36 |
+
|
| 37 |
+
### Query
|
| 38 |
+
|
| 39 |
+
The `query` skill can be used to ask open-ended questions about images.
|
| 40 |
+
|
| 41 |
+
||TK -- code example for simple VQA||
|
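A minimal sketch of simple visual question answering with the `query` skill. It assumes `query` accepts `image` and `question` keyword arguments and returns a dict with an `"answer"` entry; this diff confirms neither, and `ask` is a hypothetical helper name.

```python
from typing import Any


def ask(model: Any, image: Any, question: str) -> str:
    """Ask an open-ended question about an image and return the answer text.

    Assumption: `model.query` takes `image` and `question` keyword arguments
    and returns a dict with an "answer" entry.
    """
    result = model.query(image=image, question=question)
    return result["answer"]
```

With the `moondream` object loaded earlier in the README and a Pillow image, this would be called as `ask(moondream, Image.open("photo.jpg"), "What is in this image?")`, where `photo.jpg` is a placeholder path.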
| 42 |
+
|
| 43 |
+
By default, `query` runs in reasoning mode, allowing the model to "think" about the question before generating an answer. This helps with more complicated tasks, but simple tasks often don't benefit from it. To save on inference cost in those cases, you can disable reasoning:
|
| 44 |
+
|
| 45 |
+
||TK -- example without reasoning||
|
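A sketch of a query with reasoning turned off. The text says reasoning can be disabled but does not name the switch, so the `reasoning=False` keyword here is an assumption, as is the `"answer"` entry in the result; `ask_without_reasoning` is a hypothetical helper name.

```python
from typing import Any


def ask_without_reasoning(model: Any, image: Any, question: str) -> str:
    """Answer a simple question while skipping the model's thinking step.

    Assumption: a `reasoning` keyword on `model.query` toggles reasoning mode,
    and the result is a dict with an "answer" entry.
    """
    result = model.query(image=image, question=question, reasoning=False)
    return result["answer"]
```

This suits short lookups such as counting or yes/no questions, where the extra reasoning tokens add cost without changing the answer much.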
| 46 |
+
|
| 47 |
+
If you want to stream outputs, pass in `stream=True`. You can control the temperature, top-p, and maximum number of tokens generated by passing in optional settings.
|
| 48 |
+
|
| 49 |
+
||TK -- stream + settings example||
|
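A sketch of streaming combined with sampling settings. `stream=True` comes from the text above; the `settings` keyword, its key names (`temperature`, `top_p`, `max_tokens`), the default values, and the idea that a streamed `"answer"` is an iterable of text chunks are all assumptions. `stream_answer` is a hypothetical helper name.

```python
from typing import Any, Iterator


def stream_answer(
    model: Any,
    image: Any,
    question: str,
    temperature: float = 0.5,
    top_p: float = 0.9,
    max_tokens: int = 256,
) -> Iterator[str]:
    """Yield answer text chunk by chunk as the model generates it.

    Assumptions: `model.query` accepts a `settings` dict with the keys used
    below, and with `stream=True` the "answer" entry is an iterable of chunks.
    """
    settings = {
        "temperature": temperature,
        "top_p": top_p,
        "max_tokens": max_tokens,
    }
    result = model.query(
        image=image, question=question, stream=True, settings=settings
    )
    for chunk in result["answer"]:
        yield chunk
```

A caller would print chunks as they arrive, for example `for chunk in stream_answer(moondream, image, "Describe the scene."): print(chunk, end="", flush=True)`.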
| 50 |
+
|
| 51 |
+
Note that `query` isn't just for images; Moondream is also a strong general-purpose text model.
|
| 52 |
+
|
| 53 |
+
||TK -- text only example||
|
| 54 |
+
|
| 55 |
+
### Caption
|
| 56 |
+
|
| 57 |
+
Whether you want short, normal-sized or long descriptions of images, the `caption` skill has you covered.
|
| 58 |
+
|
| 59 |
+
||TK -- captioning example||
|
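A sketch of captioning at the three lengths mentioned above. The `length` keyword, its values (`"short"`, `"normal"`, `"long"`), and the `"caption"` entry in the result are assumptions; `describe` is a hypothetical helper name.

```python
from typing import Any

# Assumed spellings of the three caption lengths named in the text.
CAPTION_LENGTHS = ("short", "normal", "long")


def describe(model: Any, image: Any, length: str = "normal") -> str:
    """Return a caption of the requested length for an image.

    Assumption: `model.caption` takes `image` and `length` keyword arguments
    and returns a dict with a "caption" entry.
    """
    if length not in CAPTION_LENGTHS:
        raise ValueError(f"length must be one of {CAPTION_LENGTHS}, got {length!r}")
    result = model.caption(image=image, length=length)
    return result["caption"]
```

Validating `length` before the call fails fast on a typo instead of spending an inference pass on it.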
| 60 |
+
|
| 61 |
+
It accepts the same streaming and sampling settings (temperature, top-p, maximum tokens) as the `query` skill.
|
| 62 |
+
|
| 63 |
+
---
|
| 64 |
+
|
| 65 |
* TODO: Add usage examples
|
| 66 |
* Detect
|
| 67 |
* Point
|