title: Job Description Parser
emoji: 🧠
colorFrom: blue
colorTo: purple
sdk: streamlit
sdk_version: 1.48.0
app_file: app.py
pinned: false

Job Parser Model (Qwen Fine-Tuned)

This repository contains a fine-tuned version of Qwen3-0.6B, adapted to parse job descriptions into structured JSON.


💼 Use Case

The model takes a raw job description (JD) as input and outputs structured JSON containing:

  • Job Titles
  • Company Name & Website
  • Skills
  • Compensation
  • Location
  • Work Mode
  • Experience
  • Qualification
  • Industry
  • Posted Date
  • Notice Period
  • Job Type
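As a rough illustration of how downstream code might check a parsed record against the fields listed above, here is a minimal sketch. The key names in `EXPECTED_FIELDS` and the `missing_fields` helper are hypothetical; the model's actual JSON keys are not documented here and may differ.

```python
import json

# Hypothetical key names mirroring the fields listed above; the model's
# actual JSON keys may differ, so adjust them after inspecting real output.
EXPECTED_FIELDS = [
    "job_titles", "company_name", "company_website", "skills",
    "compensation", "location", "work_mode", "experience",
    "qualification", "industry", "posted_date", "notice_period", "job_type",
]


def missing_fields(record: dict) -> list[str]:
    """Return the expected fields that are absent or empty in a parsed record."""
    return [k for k in EXPECTED_FIELDS if record.get(k) in (None, "", [], {})]


# Made-up sample record, used only to exercise the helper.
sample = json.loads(
    '{"job_titles": ["Data Engineer"], "skills": ["Python", "SQL"], "work_mode": "Remote"}'
)
print(missing_fields(sample))
```

A check like this can flag incomplete extractions before they reach a job board filter or matching engine.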

Perfect for building:

  • Resume & JD analyzers
  • Job boards with smart filtering
  • HR automation tools
  • Job matching engines

🧠 Model Details

  • Base Model: Qwen3-0.6B (Qwen/Qwen3-0.6B)
  • Fine-tuned on: 80+ custom-labeled job descriptions
  • Trained using: Hugging Face Transformers & TRL's SFTTrainer
  • Dataset Format: Few-shot prompting with Qwen’s <|im_start|> / <|im_end|> chat template
  • Output: Structured JSON response
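To make the dataset format more concrete, the sketch below shows one way a labeled example could be serialized with the `<|im_start|>` / `<|im_end|>` markers. The `to_chatml` helper, the sample job description, and the label keys are illustrative assumptions, not the actual training script or data.

```python
import json

# System prompt taken from the usage example in this README.
SYSTEM_PROMPT = (
    "You are a helpful assistant that extracts structured information "
    "from job descriptions."
)


def to_chatml(system: str, user: str, assistant: str) -> str:
    """Serialize one training example using the <|im_start|>/<|im_end|> markers.

    The layout (content, newline, end marker) mirrors the prompt shown in
    the usage section; the real training data may be formatted differently.
    """
    return (
        f"<|im_start|>system\n{system}\n<|im_end|>\n"
        f"<|im_start|>user\n{user}\n<|im_end|>\n"
        f"<|im_start|>assistant\n{assistant}\n<|im_end|>\n"
    )


# Made-up job description and labels, for illustration only.
jd = "Hiring a Backend Developer (Remote). 3+ years of Python required."
labels = {"job_titles": ["Backend Developer"], "work_mode": "Remote"}

example = to_chatml(SYSTEM_PROMPT, jd, json.dumps(labels))
print(example)
```

Each resulting string would be one training record; a trainer such as TRL's SFTTrainer can consume plain-text records of this kind.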

🚀 How to Use

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "Rithankoushik/job-parser-model-qwen"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

prompt = """<|im_start|>system
You are a helpful assistant that extracts structured information from job descriptions.
<|im_end|>
<|im_start|>user
[Paste job description here]
<|im_end|>
<|im_start|>assistant
"""

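# Tokenize the prompt and move the tensors to the model's device
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1024)

# Decode only the newly generated tokens so the prompt is not echoed back
generated = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(generated, skip_special_tokens=True))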
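The decoded text is a string, so it usually needs to be parsed before use. The sketch below shows one possible way to pull the first JSON object out of the model output, tolerating any surrounding text. The `extract_json` helper and the sample string are hypothetical and not part of this repository.

```python
import json


def extract_json(text: str):
    """Return the first decodable JSON object found in text, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and
            # ignores whatever follows it in the string.
            obj, _ = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next one.
            start = text.find("{", start + 1)
    return None


# Made-up model output, for illustration only.
raw_output = 'Here is the result:\n{"job_type": "Full-time", "skills": ["Python"]}\nDone.'
parsed = extract_json(raw_output)
print(parsed)
```

Returning `None` instead of raising lets the caller decide whether to retry generation or skip the record.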