lbourdois committed
Commit 043fe2d · verified · 1 Parent(s): 493f1bb

Improve language tag


Hi! As the model is multilingual, this PR adds languages other than English to the language tag to improve referencing. Note that 29 languages are announced in the README, but only 13 are explicitly listed, so I was only able to add those 13 languages.
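For reference, the `language` field this PR puts in the README's YAML frontmatter would look like the sketch below (the 13 explicitly listed languages, using the ISO 639-3 codes applied in this PR):

```yaml
language:
- zho  # Chinese
- eng  # English
- fra  # French
- spa  # Spanish
- por  # Portuguese
- deu  # German
- ita  # Italian
- rus  # Russian
- jpn  # Japanese
- kor  # Korean
- vie  # Vietnamese
- tha  # Thai
- ara  # Arabic
```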

Files changed (1)
  1. README.md +95 -83
README.md CHANGED
@@ -1,84 +1,96 @@
- ---
- license: other
- language:
- - en
- library_name: transformers
- tags:
- - RLHF
- - Nexusflow
- - Athene
- - Chat Model
- base_model:
- - Qwen/Qwen2.5-72B-Instruct
- ---
- # Athene-V2-Chat-72B: Rivaling GPT-4o across Benchmarks
-
- <p align="center">
- <a href="https://huggingface.co/Nexusflow" target="_blank">Nexusflow HF</a> - <a href="https://discord.gg/HDSVmNAs3y" target="_blank">Nexusflow Discord</a> - <a href="https://nexusflow.ai/blogs/athene-v2" target="_blank">Athene-V2 Blogpost</a>
- </p>
-
-
- We introduce Athene-V2-Chat-72B, an open-weights LLM on-par with GPT-4o across benchmarks. It is currently the best open model according to [Chatbot Arena](https://lmarena.ai/?leaderboard), where it beats GPT-4o-0513 (the best GPT-4o model on Arena) in hard and math category, and is on-par with GPT-4o-0513 in coding, instruction following, longer query and multi-turn.
-
- It is trained through RLHF with Qwen-2.5-72B-Instruct as base model. Athene-V2-Chat-72B excels in chat, math, and coding. Its sister model, [Athene-V2-Agent-72B](https://huggingface.co/Nexusflow/Athene-V2-Agent), surpasses GPT-4o in complex function calling and agentic applications.
-
-
- <p align="center" width="100%">
- <a><img src="arena.png" alt="Arena" style="width: 100%; min-width: 300px; display: block; margin: auto;"></a>
- </p>
-
- <p align="center" width="100%">
- <a><img src="benchmark.png" alt="Benchmark" style="width: 100%; min-width: 300px; display: block; margin: auto;"></a>
- </p>
-
- - **Developed by:** The Nexusflow Team
- - **Model type:** Chat Model
- - **Finetuned from model:** [Qwen 2.5 72B-Instruct](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct)
- - **License**: [Nexusflow Research License](https://huggingface.co/Nexusflow/Athene-V2-Chat/blob/main/Nexusflow_Research_License_.pdf)
- - **Blog**: https://nexusflow.ai/blogs/athene-v2
-
- ## Usage
- Athene-V2-Chat uses the same chat template as Qwen2.5-72B-Instruct. Below is an example simple usage using the Transformers library.
-
- ```Python
- from transformers import AutoModelForCausalLM, AutoTokenizer
-
- model_name = "Nexusflow/Athene-V2-Chat"
-
- model = AutoModelForCausalLM.from_pretrained(
-     model_name,
-     torch_dtype="auto",
-     device_map="auto"
- )
- tokenizer = AutoTokenizer.from_pretrained(model_name)
-
- prompt = "Write a Python function to return the nth Fibonacci number in log n runtime."
-
- messages = [
-     {"role": "user", "content": prompt}
- ]
-
- text = tokenizer.apply_chat_template(
-     messages,
-     tokenize=False,
-     add_generation_prompt=True
- )
-
- model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
-
- generated_ids = model.generate(
-     **model_inputs,
-     max_new_tokens=2048
- )
-
- generated_ids = [
-     output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
- ]
-
- response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
- ```
-
- Note that by adding a system prompt that encourages the model to think step by step, the model can improve further on difficult math queries and problems like counting `r`s in strawberry. For fairness consideration we **do not** include such system prompt during chat evaluation.
-
- ## Acknowledgment
  We would like to thank the [LMSYS Organization](https://lmsys.org/) for their support of testing the model. We would like to thank Qwen Team and the open source community for their efforts in providing the datasets and base models.
 
+ ---
+ license: other
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ library_name: transformers
+ tags:
+ - RLHF
+ - Nexusflow
+ - Athene
+ - Chat Model
+ base_model:
+ - Qwen/Qwen2.5-72B-Instruct
+ ---
+ # Athene-V2-Chat-72B: Rivaling GPT-4o across Benchmarks
+
+ <p align="center">
+ <a href="https://huggingface.co/Nexusflow" target="_blank">Nexusflow HF</a> - <a href="https://discord.gg/HDSVmNAs3y" target="_blank">Nexusflow Discord</a> - <a href="https://nexusflow.ai/blogs/athene-v2" target="_blank">Athene-V2 Blogpost</a>
+ </p>
+
+
+ We introduce Athene-V2-Chat-72B, an open-weights LLM on-par with GPT-4o across benchmarks. It is currently the best open model according to [Chatbot Arena](https://lmarena.ai/?leaderboard), where it beats GPT-4o-0513 (the best GPT-4o model on Arena) in hard and math category, and is on-par with GPT-4o-0513 in coding, instruction following, longer query and multi-turn.
+
+ It is trained through RLHF with Qwen-2.5-72B-Instruct as base model. Athene-V2-Chat-72B excels in chat, math, and coding. Its sister model, [Athene-V2-Agent-72B](https://huggingface.co/Nexusflow/Athene-V2-Agent), surpasses GPT-4o in complex function calling and agentic applications.
+
+
+ <p align="center" width="100%">
+ <a><img src="arena.png" alt="Arena" style="width: 100%; min-width: 300px; display: block; margin: auto;"></a>
+ </p>
+
+ <p align="center" width="100%">
+ <a><img src="benchmark.png" alt="Benchmark" style="width: 100%; min-width: 300px; display: block; margin: auto;"></a>
+ </p>
+
+ - **Developed by:** The Nexusflow Team
+ - **Model type:** Chat Model
+ - **Finetuned from model:** [Qwen 2.5 72B-Instruct](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct)
+ - **License**: [Nexusflow Research License](https://huggingface.co/Nexusflow/Athene-V2-Chat/blob/main/Nexusflow_Research_License_.pdf)
+ - **Blog**: https://nexusflow.ai/blogs/athene-v2
+
+ ## Usage
+ Athene-V2-Chat uses the same chat template as Qwen2.5-72B-Instruct. Below is an example simple usage using the Transformers library.
+
+ ```Python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_name = "Nexusflow/Athene-V2-Chat"
+
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     torch_dtype="auto",
+     device_map="auto"
+ )
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+
+ prompt = "Write a Python function to return the nth Fibonacci number in log n runtime."
+
+ messages = [
+     {"role": "user", "content": prompt}
+ ]
+
+ text = tokenizer.apply_chat_template(
+     messages,
+     tokenize=False,
+     add_generation_prompt=True
+ )
+
+ model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
+
+ generated_ids = model.generate(
+     **model_inputs,
+     max_new_tokens=2048
+ )
+
+ generated_ids = [
+     output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
+ ]
+
+ response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
+ ```
+
+ Note that by adding a system prompt that encourages the model to think step by step, the model can improve further on difficult math queries and problems like counting `r`s in strawberry. For fairness consideration we **do not** include such system prompt during chat evaluation.
+
+ ## Acknowledgment
  We would like to thank the [LMSYS Organization](https://lmsys.org/) for their support of testing the model. We would like to thank Qwen Team and the open source community for their efforts in providing the datasets and base models.
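
The README's closing note says a system prompt that encourages step-by-step thinking can further improve the model on hard math queries. A minimal sketch of how such a prompt would be prepended to the `messages` list from the usage example above (the system prompt wording here is hypothetical, not taken from the model card):

```python
# Hypothetical step-by-step system prompt; the model card does not specify
# the exact wording it used.
system_prompt = "Think through the problem step by step before answering."

prompt = "How many r's are in the word strawberry?"

# Prepend the system message to the usual user message; the rest of the
# pipeline (tokenizer.apply_chat_template, model.generate) is unchanged.
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": prompt},
]
```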