# Dolores AI - Immigration Case Manager (Qwen 2.5 LoRA)
Dolores is a specialized AI Immigration Case Manager fine-tuned with LoRA (Low-Rank Adaptation) on Qwen 2.5-3B-Instruct. Her mission is to demystify the complex immigration journey, breaking it into manageable, actionable steps with empathy and precision.
## Model Details
- Base Model: Qwen/Qwen2.5-3B-Instruct
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- LoRA Rank: 16
- LoRA Alpha: 32
- Training Data: Immigration law documents, case examples, and expert guidance
- Use Case: Immigration consultation, visa guidance, document preparation
- Model Size: ~3B parameters in the base model; this repository contains only the LoRA adapter weights, which are a small fraction of that
## Training Details

### Training Configuration
- Epochs: 3
- Batch Size: 4 (effective: 16 with gradient accumulation)
- Learning Rate: 2e-4
- Quantization: 4-bit (QLoRA) during training
- Max Sequence Length: 2048 tokens
- Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
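The hyperparameters above map directly onto `peft`'s `LoraConfig` and a 4-bit `BitsAndBytesConfig`. The sketch below shows what the training setup may have looked like; the dropout value and quantization type are assumptions not stated in this card:

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig

# 4-bit (QLoRA) quantization for the frozen base weights
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4",  # assumption: quant type not stated in this card
)

# LoRA adapter hyperparameters from the list above
lora_config = LoraConfig(
    r=16,                # LoRA rank
    lora_alpha=32,       # scaling factor
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.05,   # assumption: dropout not stated in this card
    task_type="CAUSAL_LM",
)
```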
### Training Data
Fine-tuned on curated immigration law datasets including:
- U.S. immigration policies and procedures
- Visa types and requirements (H-1B, O-1, EB-1, etc.)
- Green card processes
- Case examples and expert guidance
- Document preparation instructions
## Usage
This repository contains a LoRA adapter that must be loaded on top of the base model. For production use, see the merged version: JustiGuide/DoloresAI-Qwen25-Merged
### Loading the LoRA Adapter

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_id = "Qwen/Qwen2.5-3B-Instruct"
lora_adapter_id = "JustiGuide/DoloresAI-Qwen25"

# Load the base model
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Attach the LoRA adapter
model = PeftModel.from_pretrained(model, lora_adapter_id)
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
```
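For serving without the `peft` dependency, the adapter can be folded into the base weights; this is presumably how the merged repository referenced above was produced. A sketch using `peft`'s `merge_and_unload` (the local output path is illustrative):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-3B-Instruct", torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, "JustiGuide/DoloresAI-Qwen25")

# Fold the LoRA deltas into the base weights and drop the PEFT wrapper
merged = model.merge_and_unload()

# Save a standalone checkpoint that loads with plain transformers
merged.save_pretrained("dolores-qwen25-merged")  # illustrative local path
AutoTokenizer.from_pretrained("Qwen/Qwen2.5-3B-Instruct").save_pretrained(
    "dolores-qwen25-merged"
)
```

Note this requires enough memory to hold the full de-quantized base weights.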
### Inference Example

```python
system_prompt = "You are Dolores, a specialized AI Immigration Case Manager."
question = "What is an H-1B visa and who qualifies for it?"

# Qwen 2.5 uses the ChatML prompt format
prompt = f'''<|im_start|>system
{system_prompt}<|im_end|>
<|im_start|>user
{question}<|im_end|>
<|im_start|>assistant
'''

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    repetition_penalty=1.1,
)

# Decode only the newly generated tokens, skipping the echoed prompt
response = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(response)
```
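The prompt string above hand-assembles Qwen's ChatML format (in practice, `tokenizer.apply_chat_template` is the more robust way to produce it). A plain-Python helper makes the structure explicit; `build_chatml_prompt` is a hypothetical name introduced here for illustration:

```python
def build_chatml_prompt(system_prompt: str, user_message: str) -> str:
    """Assemble a Qwen-style ChatML prompt: a system turn, a user turn,
    then an open assistant turn for the model to complete."""
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt(
    "You are Dolores, a specialized AI Immigration Case Manager.",
    "What is an H-1B visa and who qualifies for it?",
)
```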
## Deployment
For production deployment, use the merged model: JustiGuide/DoloresAI-Qwen25-Merged
### HuggingFace Inference Endpoint
- GPU: Nvidia L4 (24GB VRAM)
- Scale to Zero: Enabled
- Region: us-east-1
## Performance
- Inference Speed: ~10-20 tokens/second (on L4 GPU)
- Context Length: Up to 2048 tokens
- Quality: High accuracy on immigration-specific questions
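The throughput figure above can be reproduced with a simple wall-clock measurement around a generation call. A sketch, where `generate_fn` is any zero-argument callable wrapping your `model.generate(...)` call (the `time.sleep` stand-in below is for illustration only):

```python
import time

def tokens_per_second(generate_fn, n_new_tokens: int) -> float:
    """Time one generation call and return decoded tokens per second."""
    start = time.perf_counter()
    generate_fn()
    elapsed = time.perf_counter() - start
    return n_new_tokens / elapsed

# Stand-in workload instead of a real model call:
rate = tokens_per_second(lambda: time.sleep(0.1), n_new_tokens=50)
```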
## Limitations
- Provides general immigration guidance, not legal advice
- Always consult with a licensed immigration attorney for specific cases
- Trained primarily on U.S. immigration law
- May not have information on very recent policy changes
## License
Apache 2.0 License (declared for this adapter; note that the base model Qwen2.5-3B-Instruct is distributed under the Qwen Research License rather than Apache 2.0, so review the base model card before commercial use)
## Contact
- Organization: JustiGuide
- Website: https://justi.guide
Built with ❤️ by JustiGuide to make immigration more accessible