---
license: apache-2.0
tags:
  - text-generation
  - instruction-following
  - llm
  - resume-parsing
  - unsloth
language:
  - en
library_name: unsloth
model_name: DeepSeek-R1-Distill-Llama-8B
pipeline_tag: text2text-generation
---

# Model Card: Resume Information Extractor (LLM-based)

## Overview

This model is an instruction-tuned version of the distilled `DeepSeek-R1-Distill-Llama-8B` language model, optimized for extracting structured information from English-language resumes. It was built with the [Unsloth](https://github.com/unslothai/unsloth) library for efficient fine-tuning and inference.

Given raw resume text, the model outputs structured JSON containing:
- `skills`: list of skills mentioned
- `education`: list of education entries in a simplified school-degree-major format
- `experience`: list of job roles

## Intended Uses

This model is designed for:
- HR software that parses applicant resumes automatically
- Applicant tracking systems (ATS)
- AI assistants that help with recruiting and screening
- EdTech or job board platforms that classify user profiles

Example Input Prompt:
```text
You are an experienced HR and now you will review a resume then extract key information from it.

# Input
Here is the resume text:
[PASTE RESUME TEXT HERE]

### Response
<think>
```

Expected Output:
```json
{
  "skills": [...],
  "education": [...],
  "experience": [...]
}
```
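The prompt and output shown above can be wired together with a small helper. The following is a minimal sketch, not the author's pipeline: the names `build_prompt` and `extract_json` are hypothetical, and it assumes the model closes its reasoning with a `</think>` tag before emitting the JSON object, as DeepSeek-R1 style models typically do.

```python
import json

# Prompt text copied from the example above; {resume} marks where the resume goes.
PROMPT_TEMPLATE = (
    "You are an experienced HR and now you will review a resume "
    "then extract key information from it.\n"
    "\n"
    "# Input\n"
    "Here is the resume text:\n"
    "{resume}\n"
    "\n"
    "### Response\n"
    "<think>"
)


def build_prompt(resume_text: str) -> str:
    """Fill the prompt template with the raw resume text (hypothetical helper)."""
    return PROMPT_TEMPLATE.format(resume=resume_text.strip())


def extract_json(generated: str) -> dict:
    """Return the first JSON object found after the reasoning block.

    Assumption: any reasoning ends with '</think>'. If the tag is absent,
    the whole string is searched.
    """
    answer = generated.split("</think>")[-1]
    decoder = json.JSONDecoder()
    pos = answer.find("{")
    while pos != -1:
        try:
            obj, _ = decoder.raw_decode(answer, pos)
            return obj
        except json.JSONDecodeError:
            # Not valid JSON at this brace; try the next one.
            pos = answer.find("{", pos + 1)
    raise ValueError("no JSON object found in model output")
```

Pass the text generated after the prompt to `extract_json`. Scanning for the first decodable object tolerates surrounding Markdown fences or stray text around the JSON.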

## Training & Technical Details

- **Base model**: `unsloth/DeepSeek-R1-Distill-Llama-8B`
- **Library**: `Unsloth` with support for 4-bit quantization (`bitsandbytes`)
- **Fine-tuning style**: Instruction-tuning using formatted HR task prompts
- **Max sequence length**: 8096 tokens
- **Hardware requirements**: ~16 GB of GPU memory (with 4-bit loading)
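
The settings above map onto Unsloth's loader roughly as follows. This is a sketch under stated assumptions: the card does not give the fine-tuned checkpoint's repository id, so the base model id is used as a placeholder, and `loading_kwargs` and `load_model` are hypothetical helper names. Running `load_model` requires a CUDA GPU with `unsloth` installed.

```python
def loading_kwargs(model_id: str = "unsloth/DeepSeek-R1-Distill-Llama-8B") -> dict:
    """Collect the loader settings listed in this card.

    model_id defaults to the base model as a placeholder (assumption);
    replace it with the repository id of the fine-tuned checkpoint.
    """
    return {
        "model_name": model_id,
        "max_seq_length": 8096,  # max sequence length stated in this card
        "load_in_4bit": True,    # 4-bit quantization via bitsandbytes
    }


def load_model(model_id: str = "unsloth/DeepSeek-R1-Distill-Llama-8B"):
    """Load the model and tokenizer with Unsloth and switch to inference mode."""
    # Imported lazily so the module can be inspected on machines without a GPU.
    from unsloth import FastLanguageModel

    model, tokenizer = FastLanguageModel.from_pretrained(**loading_kwargs(model_id))
    FastLanguageModel.for_inference(model)
    return model, tokenizer
```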

## Limitations

- Performance may degrade with non-English or poorly formatted resumes
- Extracts only job roles for `experience` (not company names or dates)
- Cannot handle multilingual documents
- Does not validate output schema; use external validators if needed
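
Because the model does not validate its own output, a lightweight check on the parsed JSON can catch malformed responses. The sketch below uses only the standard library. The function name `validate_resume` is hypothetical, and it checks only what this card states: the three top-level keys exist and hold lists. The card does not specify element types, so those are not checked.

```python
from typing import Any, List

# Top-level keys described in the Overview section of this card.
REQUIRED_KEYS = ("skills", "education", "experience")


def validate_resume(obj: Any) -> List[str]:
    """Return a list of schema problems; an empty list means the output passed."""
    if not isinstance(obj, dict):
        return ["top-level value must be a JSON object"]
    errors = []
    for key in REQUIRED_KEYS:
        if key not in obj:
            errors.append(f"missing key: {key}")
        elif not isinstance(obj[key], list):
            errors.append(f"{key} must be a list")
    return errors
```

A caller might retry generation or route the resume to manual review whenever the returned list is non-empty. For stricter checks, a dedicated schema validator can replace this function.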

## Citation

If you use this model, please cite the following components:
- Unsloth: https://github.com/unslothai/unsloth
- DeepSeek LLM: https://github.com/deepseek-ai

## License

Apache 2.0