Minami-su/Amara-o1-dataset
“What was written there is far beyond what our civilization can comprehend.”
(Original: 何が綴られていたのか、私たちの文明では到底理解できない)
— sasakure.UK
Fine-tuned from Qwen2.5-7B-Instruct.
# Use a pipeline as a high-level helper
from transformers import pipeline
messages = [
{"role": "user", "content": "Who are you?"},
]
pipe = pipeline("text-generation", model="Minami-su/Amara-o1-7B-Qwen")
pipe(messages)
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("Minami-su/Amara-o1-7B-Qwen")
model = AutoModelForCausalLM.from_pretrained("Minami-su/Amara-o1-7B-Qwen")
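Loading the tokenizer and model directly does not produce a reply by itself. The following is a minimal sketch of one way to generate one, assuming the tokenizer ships a chat template (as Qwen2.5-Instruct models generally do) and that `torch` is installed. The helper names `build_messages` and `generate_reply` and the `max_new_tokens` default are illustrative and not part of this model card.

```python
def build_messages(user_prompt, system_prompt=None):
    """Build a chat-format message list (hypothetical helper for illustration)."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_prompt})
    return messages


def generate_reply(user_prompt, model_id="Minami-su/Amara-o1-7B-Qwen", max_new_tokens=256):
    """Generate one reply; downloads the model weights on first use."""
    # Imported lazily so build_messages can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

    # Assumption: the tokenizer provides a chat template for this model.
    inputs = tokenizer.apply_chat_template(
        build_messages(user_prompt),
        add_generation_prompt=True,
        return_tensors="pt",
        return_dict=True,
    ).to(model.device)

    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the newly generated reply is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Example call (commented out because it downloads a 7B-parameter model):
# print(generate_reply("Who are you?"))
```

Slicing off the prompt tokens before decoding keeps the returned string limited to the model's answer rather than echoing the full conversation.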
| Model | Arena-Hard | AlpacaEval 2.0 |
|---|---|---|
| DeepSeek-V2.5-0905 | 76.2 | 50.5 |
| Qwen2.5-72B-Instruct | 81.2 | 49.1 |
| LLaMA-3.1 405B | 69.3 | 40.5 |
| Amara-o1-7B-Qwen | ? | 42.12 |
| GPT-4o-0513 | 80.4 | 51.1 |
| Claude-Sonnet-3.5-1022 | 85.2 | 52.0 |
| DeepSeek-V3 | 85.5 | 70.0 |
Note: Both benchmarks evaluate English open-ended conversation. For AlpacaEval 2.0, the reported metric is the length-controlled win rate.