Instructions for using rhymes-ai/Aria with libraries and local apps.

Libraries

How to use rhymes-ai/Aria with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="rhymes-ai/Aria")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForMultimodalLM

processor = AutoProcessor.from_pretrained("rhymes-ai/Aria")
model = AutoModelForMultimodalLM.from_pretrained("rhymes-ai/Aria")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
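
The slice in the last line is there because `generate` returns the prompt tokens followed by the newly generated ones, so decoding `outputs[0]` directly would echo the prompt back. A minimal sketch of that slicing, with plain Python lists standing in for tensors (the token ids and the `strip_prompt` helper are made up for illustration and are not part of Transformers):

```python
def strip_prompt(sequence, prompt_length):
    """Return only the tokens that follow the first `prompt_length` positions."""
    return sequence[prompt_length:]


# Hypothetical token ids: the first four stand in for the prompt,
# the last three for the tokens the model generated.
prompt_ids = [101, 7592, 2088, 102]
generated_ids = [3000, 3001, 3002]
full_sequence = prompt_ids + generated_ids

# Same idea as outputs[0][inputs["input_ids"].shape[-1]:] in the snippet above.
new_tokens = strip_prompt(full_sequence, len(prompt_ids))
print(new_tokens)
```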

Local Apps

vLLM

How to use rhymes-ai/Aria with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "rhymes-ai/Aria"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "rhymes-ai/Aria",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
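
The same request can also be sent from Python. The sketch below uses only the standard library and mirrors the curl payload above; `build_payload` and `chat` are hypothetical helper names, and the default base URL assumes the server started by `vllm serve` is listening on port 8000, as in the curl example.

```python
import json
import urllib.request


def build_payload(model, prompt, image_url):
    """Build the chat-completions request body used in the curl example."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def chat(payload, base_url="http://localhost:8000"):
    """POST the payload to the OpenAI-compatible endpoint and return the parsed JSON reply."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


payload = build_payload(
    "rhymes-ai/Aria",
    "Describe this image in one sentence.",
    "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg",
)
print(json.dumps(payload, indent=2))

# With a server running, the request would be sent with:
# reply = chat(payload)
```

Passing a different `base_url`, for example `"http://localhost:30000"`, would point the same helper at a server listening on another port.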

Use Docker

docker model run hf.co/rhymes-ai/Aria

SGLang

How to use rhymes-ai/Aria with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "rhymes-ai/Aria" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "rhymes-ai/Aria",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "rhymes-ai/Aria" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "rhymes-ai/Aria",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Docker Model Runner
How to use rhymes-ai/Aria with Docker Model Runner:
```
docker model run hf.co/rhymes-ai/Aria
```

Base model not released

by adamo1139 - opened Oct 10, 2024

Discussion

adamo1139

Oct 10, 2024

edited Oct 10, 2024

Hi Rhymes Team,

Thank you for releasing a model with a permissive license. This model has the potential to disrupt workflows in many use cases after fine-tuning. However, the base model has not been released, which will likely make fine-tuning for downstream tasks more challenging for developers. Could you please release the weights of the pre-trained model before it was subjected to multimodal post-training data?

gopi87

Oct 10, 2024

yep, cool model! a little bit of fine-tuning will bring this model close to gpt o level performance!!

@MaziyarPanahi

MaziyarPanahi

Oct 10, 2024

This is a lovely model! Never done RLHF on a multimodal model, but there is always a first! :)

JunnanLi

Rhymes.AI org Oct 11, 2024

Thanks for your feedback!

We found that our post-training does not hinder fine-tuning performance on downstream tasks.

Araki

Oct 11, 2024

Please consider releasing the base model. It's not about the benchmark results. For things outside the box that are not designed to work in question/answer pairs, an instruct-tuned model cannot and should not be used, as it will by design always have the assistant-like bias.

An Apache 2.0 licensed base model that is both competitive and has only ~4B active parameters would be very nice.

Delta36652

Oct 11, 2024

I support this initiative. Base model will be valuable on its own.

muratowski

Oct 12, 2024

On the Rhymes.ai website, when I ask which model it is, it replies: GPT-4

Icecream102

Oct 13, 2024

nina-summer

Oct 14, 2024

@Icecream102 Due to the more recent knowledge cutoff and the use of some open-source synthetic data during instruction fine-tuning, Aria occasionally experiences confusion in its self-identity.

Icecream102

Oct 16, 2024

So it's not Reflection 70B all over again? Assuming this is not the case, the only way a model would claim to be GPT-4 (other than simply being instructed to, which is irrelevant) is that the training data makes it believe so. Now, I can fully see this happening in several ways, ranging from benign to problematic. GPT-4 is such a dominant entity, discussed extensively online as well as in books, news, scientific papers, benchmarks, etc., that many weak signals about self-identity as an LLM could add up to hallucinating about being GPT-4. However, to minimize the risk of trouble, please dig thoroughly and prune out and/or expose anything you can find in whatever public open-source datasets you are referring to. The community would benefit from weeding out anything that strengthens this effect, since it would be easy enough for lawyers to "jump to conclusions", to put it mildly. Please help the community keep any open-source data of consequence clean from this type of contamination, even if the data is made open-source by some 3rd party. /Gabriel

Icecream102

Oct 16, 2024

Also - thank you for your generosity in making this model open-source! (base-model would be great as well! ;-) )

adamo1139

Dec 13, 2024

Rhymes.AI released base models for Aria about two weeks ago; I only noticed just now.

Aria-Base-8K
Aria-Base-64K

Very cool! Thank you ♥️♥️♥️

adamo1139 changed discussion status to closed Dec 14, 2024
