---
license: apache-2.0
language:
- en
base_model:
- meta-llama/Llama-3.2-3B-Instruct
---
<div align="center">
<img src="https://github.com/distil-labs/badges/blob/main/distillabs-logo.svg?raw=true" width="40%" alt="distil labs" />
</div>
---
<div align="center">
<table>
<tr>
<td align="center">
<a href="https://www.distillabs.ai/?utm_source=hugging-face&utm_medium=referral&utm_campaign=distil-resume-roast">
<img src="https://github.com/distil-labs/badges/blob/main/badge-distillabs-home.svg?raw=true" alt="Homepage"/>
</a>
</td>
<td align="center">
<a href="https://github.com/distil-labs">
<img src="https://github.com/distil-labs/badges/blob/main/badge-github.svg?raw=true" alt="GitHub"/>
</a>
</td>
<td align="center">
<a href="https://huggingface.co/distil-labs">
<img src="https://github.com/distil-labs/badges/blob/main/badge-huggingface.svg?raw=true" alt="Hugging Face"/>
</a>
</td>
</tr>
<tr>
<td align="center">
<a href="https://www.linkedin.com/company/distil-labs/">
<img src="https://github.com/distil-labs/badges/blob/main/badge-linkedin.svg?raw=true" alt="LinkedIn"/>
</a>
</td>
<td align="center">
<a href="https://distil-labs-community.slack.com/join/shared_invite/zt-36zqj87le-i3quWUn2bjErRq22xoE58g">
<img src="https://github.com/distil-labs/badges/blob/main/badge-slack.svg?raw=true" alt="Slack"/>
</a>
</td>
<td align="center">
<a href="https://x.com/distil_labs">
<img src="https://github.com/distil-labs/badges/blob/main/badge-twitter.svg?raw=true" alt="Twitter"/>
</a>
</td>
</tr>
</table>
</div>
# Resume Roaster AI
We trained an SLM (Small Language Model) assistant for automatic resume critique: a fine-tuned Llama-3.2-3B-Instruct model that generates "Roast Mode" feedback and professional improvement suggestions.
Run it locally to keep your personal data private, or deploy it for instant feedback!
### **1. Install Dependencies**
First, install **[Ollama](https://ollama.com/)** from their official website.
Then set up your Python environment:
```bash
# Create a virtual environment
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
# Install required tools
pip install huggingface_hub ollama rich pymupdf
```
Available models hosted on Hugging Face:
- **[distil-labs/Distil-Rost-Resume-Llama-3.2-3B-Instruct](https://huggingface.co/distil-labs/Distil-Rost-Resume-Llama-3.2-3B-Instruct)**
### **2. Setup the Model**
Download your fine-tuned GGUF model and register it with Ollama.
```bash
hf download distil-labs/Distil-Rost-Resume-Llama-3.2-3B-Instruct --local-dir distil-model
cd distil-model
# Create the Ollama model from the Modelfile
ollama create roast_master -f Modelfile
```
### **3. Usage**
Now you can roast any resume PDF instantly from your terminal.
```bash
# Syntax: python roast.py <path_to_resume.pdf>
python roast.py my_resume.pdf
```
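The repository ships `roast.py`; for orientation, here is a minimal sketch of what such a script could look like. The prompt wording, output handling, and lazy imports are assumptions for illustration, not the actual script (only the model name `roast_master` comes from the setup step above):

```python
import json
import sys

MODEL = "roast_master"  # the name registered with `ollama create`


def build_prompt(resume_text: str) -> str:
    """Wrap the extracted resume text in a roast instruction (wording is an assumption)."""
    return (
        "Roast the following resume and reply with JSON containing "
        "'roast_critique', 'professional_suggestions' (exactly 3) and 'rating' (1-10).\n\n"
        + resume_text
    )


def extract_pdf_text(path: str) -> str:
    """Concatenate the text of every page using PyMuPDF."""
    import fitz  # imported lazily so the pure helpers stay importable without pymupdf

    with fitz.open(path) as doc:
        return "\n".join(page.get_text() for page in doc)


def main() -> None:
    import ollama  # lazy import: only needed when actually calling the model

    resume_text = extract_pdf_text(sys.argv[1])
    response = ollama.chat(
        model=MODEL,
        messages=[{"role": "user", "content": build_prompt(resume_text)}],
    )
    result = json.loads(response["message"]["content"])
    print(json.dumps(result, indent=2))


if __name__ == "__main__":
    main()
```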
## ✨ Features
The assistant is trained to analyze resumes and output structured JSON containing:
- **💀 Roast Critique**
A sarcastic, humorous paragraph quoting specific problematic parts of the resume (typos, clichés, gaps).
- **✨ Professional Suggestions**
A list of **exactly 3** constructive, actionable tips to improve the resume.
- **📊 Rating**
An integer score **(1–10)** based on overall resume quality.
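If you consume the output programmatically, a small validator for this schema can catch malformed responses early. The field names below are taken from the example output in this card; the strictness of the checks is an assumption:

```python
import json


def validate_roast(raw: str) -> dict:
    """Parse and sanity-check the model's JSON output against the documented schema."""
    data = json.loads(raw)  # raises ValueError on invalid JSON
    assert isinstance(data["roast_critique"], str) and data["roast_critique"]
    suggestions = data["professional_suggestions"]
    assert isinstance(suggestions, list) and len(suggestions) == 3  # exactly 3 tips
    assert isinstance(data["rating"], int) and 1 <= data["rating"] <= 10
    return data


ok = validate_roast(
    '{"roast_critique": "Comic Sans? Really?", '
    '"professional_suggestions": ["a", "b", "c"], "rating": 6}'
)
print(ok["rating"])  # → 6
```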
## 📊 Model Evaluation & Fine-Tuning Results
To validate the necessity of fine-tuning, we performed a strict **A/B Test** comparing the **Base Model** (Llama-3.2-3B-Instruct) against our **Fine-Tuned Student** (the same Llama-3.2-3B-Instruct after distillation).
### 1. The Engineering Challenge
We needed the model to satisfy three conflicting requirements simultaneously:
1. **Strict JSON Schema:** Output *only* valid JSON (no Markdown wrappers like ` ```json `, no conversational filler).
2. **Persona Shift:** Move from the base model's "Helpful Assistant" tone to a "Ruthless Roaster" persona.
3. **Context Awareness:** Cite specific details from the resume rather than giving generic advice.
### 2. Quantitative Results
| Metric | 🤖 Base Model (Llama-3.2-3B-Instruct) | 👨🏫 Teacher Model (gpt-oss-120b) | 🔥 Fine-Tuned Student (Custom) |
| :--- | :--- | :--- | :--- |
| **JSON Valid Rate** | **70% (Failed)** | **100% (Passed)** | **100% (Passed)** <br>Matches Teacher performance. |
| **Persona Score** | **4/10 (Generic Snark)** <br>Often polite or apologetic. | **10/10 (Benchmark)** | **10/10 (Highly Contextual)** <br>Successfully mimicked the Teacher's tone. |
| **Hallucination** | **High** <br>(Invented formatting wrappers) | **None** | **None** <br>Strict adherence. |
> **Methodology:** A human evaluator reviewed outputs.
### **Evaluation Metrics**
| **Metric** | **Teacher** | **Student-Base (Untrained)** | **Student-Tuned (Ours)** |
|--------------------|-------------------|-------------------------------|---------------------------|
| **ROUGE-L** | 0.44 | 0.10 | **0.43 ⬆** |
| **METEOR** | 0.36 | 0.17 | **0.35 ⬆** |
### 3. Qualitative Comparison (Side-by-Side)
We fed the same resume ("Paritosh") to both models.
#### ❌ Base Model Output (Failure)
* **Status:** **CRASH** (JSON Parsing Error)
* **Why:** The model "hallucinated" a conversational intro and added a disclaimer at the end, breaking the JSON parser.
```text
Here is the critique for the resume:
{
"roast_critique": "The resume is okay but lacks specific details..."
}
Note: I tried to be critical but maintained a professional tone.
```
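When retraining is not an option, a common mitigation is to salvage the JSON object from such chatty output. The brace-matching sketch below is not part of this project's pipeline, just an illustration of the workaround (note it would be fooled by braces inside string values):

```python
import json


def salvage_json(text: str) -> dict:
    """Extract the first balanced {...} object from noisy model output.

    Naive sketch: counts braces without tracking string literals, so a
    brace inside a JSON string value would break the depth count.
    """
    start = text.index("{")
    depth = 0
    for i, ch in enumerate(text[start:], start):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return json.loads(text[start : i + 1])
    raise ValueError("no balanced JSON object found")


noisy = 'Here is the critique:\n{"roast_critique": "..."}\nNote: stay professional.'
print(salvage_json(noisy)["roast_critique"])  # → ...
```

Fine-tuning removes the need for this kind of brittle post-processing, which is the point of the comparison above.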
#### ✅ Fine-Tuned Model Output (Success)
* **Status:** **PASS** (Valid JSON)
* **Why:** The model output raw JSON immediately, and the tone was specific to the user's *"Income Inequality"* project.
#### JSON
```json
{
"roast_critique": "PARITOSH, the future of computer science is you... in a really long time. Your resume is a jumbled mess of buzzwords. 'Integrated Unstructured data' sounds like something a 5-year-old would say, and don't even get me started on the 'Income Inequality Analysis' - who doesn't love a good spreadsheet? Oh, and please stop using Comic Sans.",
"professional_suggestions": [
"Provide specific examples of real-world problems you solved instead of listing generic tech stacks.",
"Quantify accomplishments. Instead of 'reduced response time', say 'improved response time from 2s to 0.4s'.",
"Get rid of the 'Certifications' section and replace it with real project experience."
],
"rating": 6
}
```
#### Example 1
<p align="center">
<img src="https://github.com/distil-labs/distil-resume-roast/blob/main/examples/rr-1.png?raw=true" width="550" alt="Example 1" />
</p>
---
#### Example 2
<p align="center">
<img src="https://github.com/distil-labs/distil-resume-roast/blob/main/examples/rr-2.png?raw=true" width="550" alt="Example 2" />
</p>
---
#### Example 3
<p align="center">
<img src="https://github.com/distil-labs/distil-resume-roast/blob/main/examples/rr-3.png?raw=true" width="550" alt="Example 3" />
</p>
### **Training Config**
- **Student:** Llama-3.2-3B-Instruct
- **Teacher:** openai/gpt-oss-120b
- **Dataset:** 10,000 synthetic examples
### 4. Conclusion
The fine-tuning process **successfully eliminated the formatting hallucinations** present in the base model and **significantly enhanced the "Roaster" persona**, making the outputs more structured, consistent, and aligned with the intended tone.
## ❓ FAQ
---
<details>
<summary><strong>Q: Why not just use ChatGPT or Claude?</strong></summary>

**Privacy and cost.**
Resumes contain sensitive personal data (PII). Sending them to cloud APIs risks exposure.
Our model runs **fully locally**, ensuring zero data leaks and costs **nothing** to run.
</details>
---
<details>
<summary><strong>Q: How accurate is a 3B model compared to GPT-4?</strong></summary>

Surprisingly good for this specific task!
Because it’s fine-tuned on **10,000 high-quality, roast-style synthetic examples**, it handles this narrow task better than a generic prompt to GPT-4.
It captures the **roast persona** more consistently and is extremely fast.
</details>
---
<details>
<summary><strong>Q: Can I use this for serious resume reviews?</strong></summary>

Yes!
The **Professional Suggestions** section is trained on real career guidance data.
You can ignore the roast and only use the actionable tips.
</details>
---
<details>
<summary><strong>Q: The model is too mean! Can I change it?</strong></summary>

The model is intentionally “brutally honest.”
But since it outputs **structured JSON**, you can simply hide the `roast` field and show only the suggestions.
</details>
---
<details>
<summary><strong>Q: What hardware do I need?</strong></summary>

**Minimum:**
- 8GB RAM (CPU mode)
- Works well on modern laptops (Mac M1/M2/M3 recommended)

**Recommended:**
- NVIDIA GPU with **4GB+ VRAM** for 2–5s inference
</details>