Improve language tag

#3
opened by lbourdois
Files changed (1)
  1. README.md +68 -56
README.md CHANGED
@@ -1,57 +1,69 @@
- ---
- license: apache-2.0
- datasets:
- - TIGER-Lab/WebInstruct-CFT
- language:
- - en
- base_model:
- - Qwen/Qwen2.5-32B-Instruct
- tags:
- - cft
- - math
- - reasoning
- pipeline_tag: text-generation
- library_name: transformers
- ---
-
- # Qwen2.5-32B-Instruct-CFT
-
- <div style="display: flex; gap: 4px; align-items: center">
- <a target="_blank" href="https://github.com/TIGER-AI-Lab/CritiqueFinetuning">
- <img style="height:18pt" src="https://img.shields.io/badge/-Code-black?style=flat&logo=github"/>
- </a>
- <a target="_blank" href="https://arxiv.org/abs/2501.17703">
- <img style="height:18pt" src="https://img.shields.io/badge/-Paper-green?style=flat&logo=arxiv"/>
- </a>
- <a target="_blank" href="https://tiger-ai-lab.github.io/CritiqueFineTuning">
- <img style="height:18pt" src="https://img.shields.io/badge/-📖%20Website-red?style=flat"/>
- </a>
- <a target="_blank" href="https://huggingface.co/datasets/TIGER-Lab/WebInstruct-CFT">
- <img style="height:18pt" src="https://img.shields.io/badge/-🤗%20Dataset-red?style=flat"/>
- </a>
- </div>
-
- ## Introduction
-
- Qwen2.5-32B-Instruct-CFT is a 32B parameter model fine-tuned using our novel Critique Fine-Tuning (CFT) approach. Built upon the Qwen2.5-32B-Instruct base model, this variant is trained to critique and analyze responses rather than simply imitate them, leading to enhanced reasoning capabilities.
-
- ## Key Features
-
- - Built on the powerful Qwen2.5-32B-Instruct foundation
- - Trained using Critique Fine-Tuning (CFT) methodology
- - Highly efficient training with minimal data requirements
- - Inherits the strong instruction-following capabilities of the base model
-
- ## Training Details
-
- ### Training Data
- - Dataset: [WebInstruct-CFT-4K](https://huggingface.co/datasets/TIGER-Lab/WebInstruct-CFT-4K)
- - Training format: (input=[query; noisy response], output=critique)
- - Teacher model: GPT-4o for generating critiques
-
- ### Training Infrastructure
- - Framework: LLaMA-Factory
- - Hardware: 8x NVIDIA H100 GPUs
- - Training time: ~1.5 hours with DeepSpeed Zero-3
-
 
+ ---
+ license: apache-2.0
+ datasets:
+ - TIGER-Lab/WebInstruct-CFT
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ base_model:
+ - Qwen/Qwen2.5-32B-Instruct
+ tags:
+ - cft
+ - math
+ - reasoning
+ pipeline_tag: text-generation
+ library_name: transformers
+ ---
+
+ # Qwen2.5-32B-Instruct-CFT
+
+ <div style="display: flex; gap: 4px; align-items: center">
+ <a target="_blank" href="https://github.com/TIGER-AI-Lab/CritiqueFinetuning">
+ <img style="height:18pt" src="https://img.shields.io/badge/-Code-black?style=flat&logo=github"/>
+ </a>
+ <a target="_blank" href="https://arxiv.org/abs/2501.17703">
+ <img style="height:18pt" src="https://img.shields.io/badge/-Paper-green?style=flat&logo=arxiv"/>
+ </a>
+ <a target="_blank" href="https://tiger-ai-lab.github.io/CritiqueFineTuning">
+ <img style="height:18pt" src="https://img.shields.io/badge/-📖%20Website-red?style=flat"/>
+ </a>
+ <a target="_blank" href="https://huggingface.co/datasets/TIGER-Lab/WebInstruct-CFT">
+ <img style="height:18pt" src="https://img.shields.io/badge/-🤗%20Dataset-red?style=flat"/>
+ </a>
+ </div>
+
+ ## Introduction
+
+ Qwen2.5-32B-Instruct-CFT is a 32B parameter model fine-tuned using our novel Critique Fine-Tuning (CFT) approach. Built upon the Qwen2.5-32B-Instruct base model, this variant is trained to critique and analyze responses rather than simply imitate them, leading to enhanced reasoning capabilities.
+
+ ## Key Features
+
+ - Built on the powerful Qwen2.5-32B-Instruct foundation
+ - Trained using Critique Fine-Tuning (CFT) methodology
+ - Highly efficient training with minimal data requirements
+ - Inherits the strong instruction-following capabilities of the base model
+
+ ## Training Details
+
+ ### Training Data
+ - Dataset: [WebInstruct-CFT-4K](https://huggingface.co/datasets/TIGER-Lab/WebInstruct-CFT-4K)
+ - Training format: (input=[query; noisy response], output=critique)
+ - Teacher model: GPT-4o for generating critiques
+
+ ### Training Infrastructure
+ - Framework: LLaMA-Factory
+ - Hardware: 8x NVIDIA H100 GPUs
+ - Training time: ~1.5 hours with DeepSpeed Zero-3
+
  For more details about the model architecture, methodology, and comprehensive evaluation results, please visit our [project webpage](https://tiger-ai-lab.github.io/CritiqueFineTuning).
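The card describes the training format as (input=[query; noisy response], output=critique). A minimal sketch of what one such training pair might look like — field names and prompt wording here are illustrative assumptions, not the actual WebInstruct-CFT schema:

```python
def build_cft_example(query: str, noisy_response: str, critique: str) -> dict:
    """Pack one critique-finetuning pair: the model is shown the query plus a
    candidate (possibly wrong) response, and is trained to emit a critique."""
    # Prompt layout is a guess at the [query; noisy response] concatenation.
    prompt = (
        f"Question: {query}\n\n"
        f"Candidate solution: {noisy_response}\n\n"
        "Critique the candidate solution above. Point out any errors and "
        "state whether the final answer is correct."
    )
    return {"input": prompt, "output": critique}


example = build_cft_example(
    query="What is 17 * 24?",
    noisy_response="17 * 24 = 398",
    critique="Incorrect: 17 * 24 = 408, so the final answer 398 is wrong.",
)
print(example["input"])
```

The key design point of CFT, per the card, is that the target side is the critique itself (here from a GPT-4o teacher), not the corrected answer, so the model learns to analyze responses rather than imitate them.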