job-parser-space / README.md

Rithankoushik

Update README.md

c1e5867 verified 6 months ago

metadata

title: Job Description Parser
emoji: 🧠
colorFrom: blue
colorTo: purple
sdk: streamlit
sdk_version: 1.48.0
app_file: app.py
pinned: false

Job Parser Model (Qwen Fine-Tuned)

This repository contains a fine-tuned version of Qwen3-0.6B, adapted to parse job descriptions into structured JSON.

💼 Use Case

The model takes a raw job description (JD) as input and outputs structured JSON containing:

Job Titles
Company Name & Website
Skills
Compensation
Location
Work Mode
Experience
Qualification
Industry
Posted Date
Notice Period
Job Type
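
To make the field list concrete, here is a minimal sketch of checking a parsed record for missing fields. The snake_case key names are hypothetical, derived from the list above; the model's actual JSON keys may differ, and the sample record is made up for illustration.

```python
import json

# Hypothetical snake_case keys derived from the field list above;
# the key names the model really emits may differ.
EXPECTED_FIELDS = [
    "job_titles", "company_name", "company_website", "skills",
    "compensation", "location", "work_mode", "experience",
    "qualification", "industry", "posted_date", "notice_period", "job_type",
]


def missing_fields(record: dict) -> list[str]:
    """Return the expected keys that are absent or empty in a parsed record."""
    empty_values = (None, "", [], {})
    return [key for key in EXPECTED_FIELDS if record.get(key) in empty_values]


# Made-up record for illustration only, not real model output.
sample = json.loads(
    '{"job_titles": ["Data Engineer"], "skills": ["Python", "SQL"], "work_mode": "Remote"}'
)
print(missing_fields(sample))
```

A check like this can flag job descriptions where the model left fields blank, so they can be reviewed before reaching downstream tools.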

Perfect for building:

Resume & JD analyzers
Job boards with smart filtering
HR automation tools
Job matching engines

🧠 Model Details

Base Model: Qwen (Qwen/Qwen3-0.6B)
Fine-tuned on: 80+ custom-labeled job descriptions
Trained using: Hugging Face Transformers & TRL's SFTTrainer
Dataset Format: Few-shot prompting with Qwen’s <|im_start|> / <|im_end|> chat template
Output: Structured JSON response
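
As a rough illustration of the dataset format, the sketch below renders one labeled job description as a single training string using the `<|im_start|>` / `<|im_end|>` markers. The helper name `to_chatml`, the exact whitespace around the markers, and the label keys are assumptions; the real training script is not shown in this README.

```python
import json

# System prompt taken from the usage example in this README.
SYSTEM_PROMPT = (
    "You are a helpful assistant that extracts structured information "
    "from job descriptions."
)


def to_chatml(job_description: str, labels: dict) -> str:
    """Render one (job description, labels) pair as a ChatML-style string.

    Hypothetical helper: the layout follows the common Qwen chat format,
    which may not match the original training data byte for byte.
    """
    target = json.dumps(labels, ensure_ascii=False)
    return (
        f"<|im_start|>system\n{SYSTEM_PROMPT}<|im_end|>\n"
        f"<|im_start|>user\n{job_description.strip()}<|im_end|>\n"
        f"<|im_start|>assistant\n{target}<|im_end|>\n"
    )


# Made-up example pair for illustration.
print(to_chatml("Hiring a backend developer, full-time.", {"job_type": "Full-time"}))
```

Strings in this shape could then be placed in a text column of a dataset for supervised fine-tuning.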

🚀 How to Use

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "Rithankoushik/job-parser-model-qwen"

# Load the tokenizer and the fine-tuned weights (bfloat16 to reduce memory use).
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

prompt = """<|im_start|>system
You are a helpful assistant that extracts structured information from job descriptions.
<|im_end|>
<|im_start|>user
[Paste job description here]
<|im_end|>
<|im_start|>assistant
"""

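inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1024)

# Decode only the newly generated tokens so the prompt is not echoed back.
generated = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(generated, skip_special_tokens=True))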
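
The decoded text may contain extra characters around the JSON, so it usually helps to extract and validate the object before using it. The following is a minimal sketch using only the standard library; `extract_json` is a hypothetical helper name, and the sample string stands in for real model output.

```python
import json


def extract_json(text: str):
    """Return the first JSON object found in `text`, or None if there is none."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores whatever follows it.
            obj, _ = decoder.raw_decode(text, start)
            if isinstance(obj, dict):
                return obj
        except json.JSONDecodeError:
            pass  # Not valid JSON at this brace; try the next one.
        start = text.find("{", start + 1)
    return None


# Made-up response text for illustration only.
response = 'Here is the result:\n{"job_type": "Contract", "skills": ["Go"]}\n'
parsed = extract_json(response)
print(parsed)
```

Returning `None` instead of raising lets the caller decide whether to retry generation or skip the record.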