# Perry-7B
A generalist reasoning LLM trained on synthetic chain-of-thought traces over STEM data. Led as a research project in September 2023, before reasoning-focused models became mainstream.
## Overview
Perry is a fine-tuned LLaMA 2 7B model designed to improve reasoning capabilities through synthetic CoT supervision. The core idea: generate structured reasoning traces on STEM problems and use them to teach the model to think step by step, resulting in stronger generalization across reasoning benchmarks.

Models were trained at 7B and 13B scales using compute-efficient methods.
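To make the supervision scheme concrete, here is a minimal sketch of how a STEM problem and its synthetic reasoning trace might be packaged into a single fine-tuning example. The field names and prompt template are hypothetical illustrations, not Perry's actual data format.

```python
# Sketch: turn a question plus a synthetic chain-of-thought trace into one
# supervised example. Template and field names are assumptions for illustration.

def build_cot_example(question: str, trace: list[str], answer: str) -> dict:
    """Join reasoning steps into a target string the model learns to emit."""
    reasoning = "\n".join(f"Step {i + 1}: {s}" for i, s in enumerate(trace))
    return {
        "prompt": f"Question: {question}\nLet's think step by step.\n",
        "completion": f"{reasoning}\nAnswer: {answer}",
    }

example = build_cot_example(
    "What is 12 * 8?",
    ["12 * 8 = 12 * (10 - 2)", "= 120 - 24", "= 96"],
    "96",
)
```

During fine-tuning, the loss would typically be computed on the `completion` tokens so the model learns to produce the reasoning steps before the final answer.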
## Results

Improvements over LLaMA 2 7B (as of September 2023):
| Benchmark | Perry-7B | LLaMA 2 7B | Delta |
|-----------|----------|------------|-------|
| MMLU (5-shot) | 46.18 | 43.80 | +2.38 |
| TruthfulQA (0-shot) | 40.08 | 38.98 | +1.10 |
| GSM8K (5-shot) | 10.31 | 5.38 | +4.93 |
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned weights and the matching tokenizer from the Hub
model = AutoModelForCausalLM.from_pretrained("dotvignesh/perry-7b")
tokenizer = AutoTokenizer.from_pretrained("dotvignesh/perry-7b")
```
## Model Details

- **Base model:** LLaMA 2 7B
- **Training data:** Synthetic CoT traces on STEM datasets
- **Framework:** PyTorch / Transformers