How to use FINAL-Bench/Darwin-31B-Opus with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

# The checkpoint is multimodal (see the AutoModelForImageTextToText snippet below),
# so use the image-text-to-text pipeline task.
pipe = pipeline("image-text-to-text", model="FINAL-Bench/Darwin-31B-Opus")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]

pipe(text=messages)
# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("FINAL-Bench/Darwin-31B-Opus")
model = AutoModelForImageTextToText.from_pretrained("FINAL-Bench/Darwin-31B-Opus")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]

inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
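A 31-billion-parameter checkpoint generally will not fit in float32 on a single consumer GPU. The sketch below is one way to load the same model in bfloat16 with automatic device placement and to sample during generation; it assumes accelerate is installed (needed for device_map="auto") and that the published weights are bfloat16-friendly, so treat it as a starting point rather than the reference recipe.

import torch
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("FINAL-Bench/Darwin-31B-Opus")
model = AutoModelForImageTextToText.from_pretrained(
    "FINAL-Bench/Darwin-31B-Opus",
    torch_dtype=torch.bfloat16,  # assumption: the checkpoint works in bf16
    device_map="auto",           # requires accelerate; shards across available GPUs
)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]

inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

# Sample instead of greedy decoding, and drop special tokens when decoding.
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7, top_p=0.9)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))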
How to use FINAL-Bench/Darwin-31B-Opus with vLLM:
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "FINAL-Bench/Darwin-31B-Opus"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "FINAL-Bench/Darwin-31B-Opus",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
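Because the vLLM server speaks the OpenAI-compatible API, any OpenAI client can call it instead of curl. A minimal sketch with the official openai Python package, assuming the server started above is running on localhost:8000 (vLLM's default port); the api_key is a placeholder, since a local server does not check it:

from openai import OpenAI

# Point the client at the local vLLM server instead of api.openai.com.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="FINAL-Bench/Darwin-31B-Opus",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(response.choices[0].message.content)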
How to use FINAL-Bench/Darwin-31B-Opus with SGLang:
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "FINAL-Bench/Darwin-31B-Opus" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "FINAL-Bench/Darwin-31B-Opus",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
# Or launch the SGLang server with Docker:
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
  --model-path "FINAL-Bench/Darwin-31B-Opus" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "FINAL-Bench/Darwin-31B-Opus",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
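The Transformers snippets above treat this as an image-text-to-text model, so the OpenAI-compatible endpoint can also be sent images using the standard image_url content parts, provided the serving backend has multimodal support enabled for this architecture (an assumption worth verifying against the SGLang logs). A minimal sketch against the SGLang server on localhost:30000:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:30000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="FINAL-Bench/Darwin-31B-Opus",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"}},
                {"type": "text", "text": "What animal is on the candy?"},
            ],
        }
    ],
    max_tokens=64,
)
print(response.choices[0].message.content)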
How to use FINAL-Bench/Darwin-31B-Opus with Docker Model Runner:

docker model run hf.co/FINAL-Bench/Darwin-31B-Opus
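Docker Model Runner also exposes an OpenAI-compatible endpoint. The sketch below is only an illustration: it assumes host-side TCP access is enabled on Docker Model Runner's default port 12434 and that the /engines/v1 path is available, both of which depend on your Docker configuration, so check your local settings before relying on it.

from openai import OpenAI

# Assumed default host endpoint for Docker Model Runner with TCP access enabled;
# adjust the port and path to match your local configuration.
client = OpenAI(base_url="http://localhost:12434/engines/v1", api_key="docker")

response = client.chat.completions.create(
    model="hf.co/FINAL-Bench/Darwin-31B-Opus",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(response.choices[0].message.content)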
Thank you very much! This is a very interesting model. I tested it for several hours on my own tasks, and the result is better than the original Gemma4. There were no tool call errors at all, and the model's reasoning is more accurate.