How to use Loke-60000/rin-mobile-preview with libraries and local apps.

Libraries

How to use Loke-60000/rin-mobile-preview with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="Loke-60000/rin-mobile-preview")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForMultimodalLM

processor = AutoProcessor.from_pretrained("Loke-60000/rin-mobile-preview")
model = AutoModelForMultimodalLM.from_pretrained("Loke-60000/rin-mobile-preview")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
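
Both snippets above build the same `messages` structure by hand. If that structure is needed in several places, a small helper along the following lines may be convenient. This is only a sketch: the name `build_image_message` is hypothetical and is not part of Transformers.

```python
from typing import Any


def build_image_message(image_url: str, question: str) -> list[dict[str, Any]]:
    """Return a single-turn chat in the image + text format used above.

    Hypothetical helper for illustration; not part of the Transformers API.
    """
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "url": image_url},
                {"type": "text", "text": question},
            ],
        }
    ]


messages = build_image_message(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG",
    "What animal is on the candy?",
)
```

The resulting list can be passed to `pipe(text=messages)` or to `processor.apply_chat_template(...)` in the same way as the hand-written version.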

Local Apps

vLLM

How to use Loke-60000/rin-mobile-preview with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Loke-60000/rin-mobile-preview"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Loke-60000/rin-mobile-preview",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
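
The curl call above can also be issued from Python. The sketch below uses only the standard library and mirrors the same JSON body; the helper name `build_chat_request` is hypothetical, and the host and port assume the default `vllm serve` address shown above.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, prompt: str, image_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST to the OpenAI-compatible chat endpoint."""
    payload = {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_chat_request(
    "http://localhost:8000",
    "Loke-60000/rin-mobile-preview",
    "Describe this image in one sentence.",
    "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg",
)

# Sending requires the vLLM server from the previous step to be running:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```

Building the request separately from sending it keeps the payload easy to inspect before any network call is made.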

Use Docker

docker model run hf.co/Loke-60000/rin-mobile-preview

SGLang

How to use Loke-60000/rin-mobile-preview with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Loke-60000/rin-mobile-preview" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Loke-60000/rin-mobile-preview",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
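
Because the endpoint is OpenAI-compatible, the JSON it returns should carry the assistant text under `choices[0].message.content`. The sketch below pulls that field out of a decoded response; `extract_reply` is a hypothetical helper, and `sample_response` is a hand-written stand-in rather than real output from this model.

```python
from typing import Any


def extract_reply(response: dict[str, Any]) -> str:
    """Return the assistant text from a chat completion response body."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# Hand-written stand-in for a server response; the text is a placeholder,
# not actual model output.
sample_response = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "<model reply>"},
            "finish_reason": "stop",
        }
    ]
}
reply = extract_reply(sample_response)
```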

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Loke-60000/rin-mobile-preview" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Loke-60000/rin-mobile-preview",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Docker Model Runner
How to use Loke-60000/rin-mobile-preview with Docker Model Runner:
```
docker model run hf.co/Loke-60000/rin-mobile-preview
```

Rin-mobile is a compact model designed to run agentic workloads directly on a phone or laptop, with no server and no cloud. It was trained on about 895,000 tokens to give it one steady voice, named Rin, that is clear and composed.

What it does

Text to text. Chat, coding, technical help, and long-horizon agentic tasks that hold together across many steps.
Image to text. Look at a picture and describe or reason about it.
Speech to text. Take an audio clip and transcribe it or answer questions about it.

It also supports private step-by-step reasoning and tool calls.

Run it on device

Quantized for phone-class hardware (about 4.4 GB):

ollama pull Loke-60000/rin-mobile-preview
ollama run Loke-60000/rin-mobile-preview "what is in this photo? image.png"
ollama run Loke-60000/rin-mobile-preview "transcribe this clip.wav"

Run it with transformers

from transformers import AutoProcessor, AutoModelForImageTextToText

# Load the model and its matching processor from the Hugging Face Hub
model = AutoModelForImageTextToText.from_pretrained("Loke-60000/rin-mobile-preview")
processor = AutoProcessor.from_pretrained("Loke-60000/rin-mobile-preview")

Safetensors

Model size: 5B params
Tensor type: BF16
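
The figures on this page allow a rough size check. The arithmetic below assumes that "5B params" means about 5 × 10^9 parameters and that "about 4.4 GB" uses 1 GB = 10^9 bytes; both are rounded values, so the results are estimates only, and the quantized build may keep some components at higher precision than others.

```python
PARAMS = 5e9             # "5B params" as listed on this page (likely rounded)
BF16_BYTES = 2           # bfloat16 stores each value in 16 bits = 2 bytes
QUANTIZED_BYTES = 4.4e9  # "about 4.4 GB", assuming 1 GB = 10**9 bytes

# Size of the unquantized BF16 checkpoint
bf16_size_gb = PARAMS * BF16_BYTES / 1e9

# Average storage per parameter in the quantized build
avg_bits_per_param = QUANTIZED_BYTES * 8 / PARAMS

print(f"BF16 checkpoint: ~{bf16_size_gb:.0f} GB")
print(f"Quantized build: ~{avg_bits_per_param:.1f} bits per parameter on average")
```

Under these assumptions the BF16 weights come to roughly 10 GB, so the quantized build is a little under half that size.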