dotvignesh/perry-7b

@@ -4,18 +4,19 @@ A generalist reasoning LLM trained on synthetic chain-of-thought traces over STE
 ## Overview
-Perry is a fine-tuned LLaMA model designed to improve reasoning capabilities through synthetic CoT supervision. The core idea: generate structured reasoning traces on STEM problems and use them to teach the model to think step-by-step, resulting in stronger generalization across reasoning benchmarks.
 Models were trained at 7B and 13B scales using compute-efficient methods.
 ## Results
-Improvements over baselines (as of Sep 2023):
-| Benchmark | Improvement |
-|-----------|-------------|
-| Winogrande | +4% |
-| ARC-Challenge | +6% |
 ## Usage
@@ -28,6 +29,6 @@ tokenizer = AutoTokenizer.from_pretrained("dotvignesh/perry-7b")
 ## Model Details
-- **Base model:** LLaMA
 - **Training data:** Synthetic CoT traces on STEM datasets
 - **Framework:** PyTorch / Transformers

 ## Overview
+Perry is a fine-tuned LLaMA 2 7B model designed to improve reasoning capabilities through synthetic chain-of-thought (CoT) supervision. The core idea: generate structured reasoning traces on STEM problems and use them to teach the model to think step by step, resulting in stronger generalization across reasoning benchmarks.
 Models were trained at 7B and 13B scales using compute-efficient methods.
 ## Results
+Improvements over LLaMA 2 7B (as of Sep 2023):
+| Benchmark | Perry-7B | LLaMA 2 7B | Delta |
+|-----------|----------|------------|-------|
+| MMLU (5-shot) | 46.18 | 43.80 | +2.38 |
+| TruthfulQA (0-shot) | 40.08 | 38.98 | +1.10 |
+| GSM8K (5-shot) | 10.31 | 5.38 | +4.93 |
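
The Delta column is the absolute difference in benchmark score between the two models. As a quick sanity check, the sketch below recomputes it from the table and also reports the gain relative to the LLaMA 2 7B score. The `SCORES` dictionary and the `deltas` helper are illustrative names, not part of the repository; the numbers are copied from the table above.

```python
# Scores copied from the results table: (Perry-7B, LLaMA 2 7B).
SCORES = {
    "MMLU (5-shot)": (46.18, 43.80),
    "TruthfulQA (0-shot)": (40.08, 38.98),
    "GSM8K (5-shot)": (10.31, 5.38),
}


def deltas(scores):
    """Return {benchmark: (absolute_delta, relative_gain_percent)}."""
    out = {}
    for name, (perry, base) in scores.items():
        absolute = round(perry - base, 2)                   # score points
        relative = round(100.0 * (perry - base) / base, 1)  # percent of the base score
        out[name] = (absolute, relative)
    return out


if __name__ == "__main__":
    for name, (absolute, relative) in deltas(SCORES).items():
        print(f"{name}: {absolute:+.2f} points ({relative:+.1f}% relative)")
```

Looking at relative gain puts the GSM8K row in context: the absolute gain is under five points, but the base score is low, so the fine-tuned score is almost double the baseline.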
 ## Usage
 ## Model Details
+- **Base model:** LLaMA 2 7B
 - **Training data:** Synthetic CoT traces on STEM datasets
 - **Framework:** PyTorch / Transformers
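
For the Usage section, a minimal loading-and-generation sketch might look like the following. It assumes the checkpoint loads through the standard `AutoTokenizer` / `AutoModelForCausalLM` classes in Transformers, which is typical for LLaMA 2 fine-tunes but is not confirmed by this diff. The `load_perry` and `generate` helpers, the example prompt, and the `max_new_tokens` value are hypothetical.

```python
MODEL_ID = "dotvignesh/perry-7b"  # repository id shown in this model card


def load_perry(model_id=MODEL_ID):
    """Load tokenizer and model (assumes a standard causal-LM checkpoint)."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    return tokenizer, model


def generate(prompt, max_new_tokens=256):
    """Generate a completion for `prompt`; the token budget is an arbitrary choice."""
    tokenizer, model = load_perry()
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Hypothetical prompt; the model card does not specify a prompt format.
    print(generate("A train travels 60 km in 45 minutes. "
                   "What is its average speed in km/h? Think step by step."))
```

Loading a 7B model in full precision needs substantial memory; any dtype or device settings would have to be chosen for the hardware at hand.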