lbourdois committed · commit b880c2e · verified · 1 parent: c4af684

Improve language tag (#3)

- Improve language tag (07605ba9ba29648b825bc11e9230b464d5b30e32)


Co-authored-by: Loïck BOURDOIS <lbourdois@users.noreply.huggingface.co>

Files changed (1): README.md (+48, −37)
README.md CHANGED
@@ -1,37 +1,48 @@
- ---
- license: other
- license_name: nexusflowresearchlicense
- license_link: >-
-   https://huggingface.co/Nexusflow/Athene-V2-Chat/resolve/main/Nexusflow_Research_License_.pdf
- language:
- - en
- library_name: transformers
- tags:
- - RLHF
- - Nexusflow
- - Athene
- - Chat Model
- base_model:
- - Qwen/Qwen2.5-72B-Instruct
- ---
- # Athene-V2-Chat-72B: Rivaling GPT-4o across Benchmarks
-
- - AWQ 4-bit version of [Nexusflow/Athene-V2-Chat](https://huggingface.co/Nexusflow/Athene-V2-Chat)
- - [Quantization code](https://docs.vllm.ai/en/latest/quantization/auto_awq.html)
- - This model [fits on only 1 GPU](https://huggingface.co/radm/Athene-V2-Chat-AWQ/discussions/2). Use [kosbu/Athene-V2-Chat-AWQ](https://huggingface.co/kosbu/Athene-V2-Chat-AWQ) for multi-GPU support.
-
- ## Eval of the AWQ version
-
- Evaluation results on [ZebraLogic](https://github.com/WildEval/ZeroEval/blob/main/result_dirs/zebra-grid.summary.md):
-
- ```
- │ Model                            │ Mode   │ N_Mode │ N_Size │ Puzzle Acc │ Easy Puzzle Acc │ Hard Puzzle Acc │ Cell Acc │ No answer │ Total Puzzles │ Reason Lens │
- │ o1-preview-2024-09-12            │ greedy │ single │ 1      │ 71.4       │ 98.57           │ 60.83           │ 75.14    │ 0.3       │ 1000          │ 1565.88     │
- │ claude-3-5-sonnet-20241022       │ greedy │ single │ 1      │ 36.2       │ 91.07           │ 14.86           │ 54.27    │ 0         │ 1000          │ 861.18      │
- │ Llama-3.1-405B-Inst-fp8@together │ greedy │ single │ 1      │ 32.6       │ 87.14           │ 11.39           │ 45.8     │ 12.5      │ 1000          │ 314.66      │
- │ Athene-V2-Chat-AWQ               │ greedy │ single │ 1      │ 27.8       │ 77.14           │ 8.61            │ 45.83    │ 6.4       │ 1000          │ 1785.7      │
- │ Qwen2.5-72B-Instruct             │ greedy │ single │ 1      │ 26.6       │ 76.43           │ 7.22            │ 40.92    │ 11.9      │ 1000          │ 1795.9      │
- │ Qwen2.5-32B-Instruct             │ greedy │ single │ 1      │ 26.1       │ 77.5            │ 6.11            │ 43.39    │ 6.3       │ 1000          │ 1333.07     │
- │ Athene-70B                       │ greedy │ single │ 1      │ 16.7       │ 52.5            │ 2.78            │ 32.98    │ 21.1      │ 1000          │ 391.19      │
- ```
-
+ ---
+ license: other
+ license_name: nexusflowresearchlicense
+ license_link: https://huggingface.co/Nexusflow/Athene-V2-Chat/resolve/main/Nexusflow_Research_License_.pdf
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ library_name: transformers
+ tags:
+ - RLHF
+ - Nexusflow
+ - Athene
+ - Chat Model
+ base_model:
+ - Qwen/Qwen2.5-72B-Instruct
+ ---
+ # Athene-V2-Chat-72B: Rivaling GPT-4o across Benchmarks
+
+ - AWQ 4-bit version of [Nexusflow/Athene-V2-Chat](https://huggingface.co/Nexusflow/Athene-V2-Chat)
+ - [Quantization code](https://docs.vllm.ai/en/latest/quantization/auto_awq.html)
+ - This model [fits on only 1 GPU](https://huggingface.co/radm/Athene-V2-Chat-AWQ/discussions/2). Use [kosbu/Athene-V2-Chat-AWQ](https://huggingface.co/kosbu/Athene-V2-Chat-AWQ) for multi-GPU support.
+
+ ## Eval of the AWQ version
+
+ Evaluation results on [ZebraLogic](https://github.com/WildEval/ZeroEval/blob/main/result_dirs/zebra-grid.summary.md):
+
+ ```
+ │ Model                            │ Mode   │ N_Mode │ N_Size │ Puzzle Acc │ Easy Puzzle Acc │ Hard Puzzle Acc │ Cell Acc │ No answer │ Total Puzzles │ Reason Lens │
+ │ o1-preview-2024-09-12            │ greedy │ single │ 1      │ 71.4       │ 98.57           │ 60.83           │ 75.14    │ 0.3       │ 1000          │ 1565.88     │
+ │ claude-3-5-sonnet-20241022       │ greedy │ single │ 1      │ 36.2       │ 91.07           │ 14.86           │ 54.27    │ 0         │ 1000          │ 861.18      │
+ │ Llama-3.1-405B-Inst-fp8@together │ greedy │ single │ 1      │ 32.6       │ 87.14           │ 11.39           │ 45.8     │ 12.5      │ 1000          │ 314.66      │
+ │ Athene-V2-Chat-AWQ               │ greedy │ single │ 1      │ 27.8       │ 77.14           │ 8.61            │ 45.83    │ 6.4       │ 1000          │ 1785.7      │
+ │ Qwen2.5-72B-Instruct             │ greedy │ single │ 1      │ 26.6       │ 76.43           │ 7.22            │ 40.92    │ 11.9      │ 1000          │ 1795.9      │
+ │ Qwen2.5-32B-Instruct             │ greedy │ single │ 1      │ 26.1       │ 77.5            │ 6.11            │ 43.39    │ 6.3       │ 1000          │ 1333.07     │
+ │ Athene-70B                       │ greedy │ single │ 1      │ 16.7       │ 52.5            │ 2.78            │ 32.98    │ 21.1      │ 1000          │ 391.19      │
+ ```
+
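
The single-GPU note in the README comes down to weight-memory arithmetic. A back-of-the-envelope sketch (the 72B parameter count is taken from the base model's name; AWQ scale/zero-point overhead and activation/KV-cache memory are ignored, so real requirements are somewhat higher):

```python
def approx_weight_gib(n_params: float, bits_per_param: float) -> float:
    """Approximate weight-storage footprint in GiB (weights only)."""
    return n_params * bits_per_param / 8 / 2**30

N = 72e9  # ~72B parameters (from Qwen2.5-72B-Instruct; assumed, not measured)

fp16 = approx_weight_gib(N, 16)  # ~134 GiB: exceeds any single GPU today
awq4 = approx_weight_gib(N, 4)   # ~34 GiB: fits on one 48-80 GiB GPU

print(f"fp16: {fp16:.1f} GiB, AWQ 4-bit: {awq4:.1f} GiB")
```

This is only an illustration of why the 4-bit AWQ checkpoint can load on a single large GPU while the fp16 original cannot; actual serving memory depends on batch size, context length, and the inference engine.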