Instructions for using gaodrew/moondream-image2prompt-v1 with libraries and local apps.

Libraries

How to use gaodrew/moondream-image2prompt-v1 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="gaodrew/moondream-image2prompt-v1", trust_remote_code=True)

# Load model directly
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("gaodrew/moondream-image2prompt-v1", trust_remote_code=True, dtype="auto")
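Because this model takes an image as input, a text-only call is not enough to get a predicted prompt. The following is a minimal sketch, assuming the fine-tune keeps the `encode_image` and `answer_question` methods that Moondream's remote code exposed in its 5/20/24 version. The question string and the `predict_prompt` helper are illustrative assumptions; the card does not state the question used during fine-tuning.

```python
from pathlib import Path

MODEL_ID = "gaodrew/moondream-image2prompt-v1"
# Assumption: the card does not say which question was used in fine-tuning,
# so this wording is only a placeholder.
QUESTION = "What prompt was used to generate this image?"


def predict_prompt(image_path, question=QUESTION):
    """Return the model's guess at the prompt behind the image at image_path."""
    # Heavy dependencies are imported lazily so the module loads without them.
    from PIL import Image
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True)
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

    image = Image.open(Path(image_path)).convert("RGB")
    # encode_image / answer_question come from Moondream's remote code
    # (assumed unchanged in this fine-tune).
    encoded = model.encode_image(image)
    return model.answer_question(encoded, question, tokenizer)
```

Calling `predict_prompt("example.png")` (a hypothetical local file) downloads the weights on first use and returns the predicted prompt as a string.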

Local Apps

vLLM

How to use gaodrew/moondream-image2prompt-v1 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "gaodrew/moondream-image2prompt-v1"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "gaodrew/moondream-image2prompt-v1",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
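The same request can be sent from Python. This is a sketch using only the standard library, assuming the server started above is listening on `localhost:8000`; the helper names `build_completion_request` and `complete` are hypothetical.

```python
import json
import urllib.request

MODEL = "gaodrew/moondream-image2prompt-v1"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build a POST request for the OpenAI-compatible /v1/completions endpoint."""
    payload = {
        "model": MODEL,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the text of the first completion."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With a server running, `complete("Once upon a time,")` returns the generated text. Passing `base_url="http://localhost:30000"` targets an SGLang server launched on port 30000 instead.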

Use Docker

docker model run hf.co/gaodrew/moondream-image2prompt-v1

SGLang

How to use gaodrew/moondream-image2prompt-v1 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "gaodrew/moondream-image2prompt-v1" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "gaodrew/moondream-image2prompt-v1",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "gaodrew/moondream-image2prompt-v1" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "gaodrew/moondream-image2prompt-v1",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use gaodrew/moondream-image2prompt-v1 with Docker Model Runner:
```
docker model run hf.co/gaodrew/moondream-image2prompt-v1
```

Thank you to the Akash Network for sponsoring this project and providing A100s/H100s for compute!

Predict the prompt used to generate an AI image

This is a fine-tune of Moondream (5/20/24 version), a tiny vision language model created by the amazing vik. It was fine-tuned on 35,000 image-prompt pairs from DiffusionDB, a dataset of Stable Diffusion images. To an extent, it can predict the prompt used to generate an image: it usually gets the style right and names an artist whose work or subject matter resembles the image.

Settings:

Batch Size: 16
Learning Rate: 5e-5

Thank you to Akash.net for providing A100s that I used in the process of this project and fine-tuning the model.

Colab

Fine-tuning Script

Based on the code provided by vik, this is the script I used for fine-tuning.


Safetensors: model size 2B params, tensor type F16
