# Dolores AI - Immigration Case Manager (Qwen 2.5 LoRA)
Dolores is a specialized AI Immigration Case Manager fine-tuned with LoRA (Low-Rank Adaptation) on Qwen 2.5-3B-Instruct. Her mission is to demystify the complex immigration journey, breaking it into manageable, actionable steps with empathy and precision.
## Model Details
- Base Model: Qwen/Qwen2.5-3B-Instruct
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- LoRA Rank: 16
- LoRA Alpha: 32
- Training Data: Immigration law documents, case examples, and expert guidance
- Use Case: Immigration consultation, visa guidance, document preparation
- Model Size: ~3B parameters in the base model; this repository contains only the LoRA adapter weights, which are a small fraction of that
## Training Details

### Training Configuration
- Epochs: 3
- Batch Size: 4 (effective: 16 with gradient accumulation)
- Learning Rate: 2e-4
- Quantization: 4-bit (QLoRA) during training
- Max Sequence Length: 2048 tokens
- Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
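The hyperparameters above map directly onto `peft`'s `LoraConfig` and a 4-bit `BitsAndBytesConfig`. The sketch below shows what the training setup may have looked like; the dropout value and quantization type are assumptions not stated in this card:

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig

# 4-bit (QLoRA) quantization for the frozen base weights
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4",  # assumption: quant type not stated in this card
)

# LoRA adapter hyperparameters from the list above
lora_config = LoraConfig(
    r=16,                # LoRA rank
    lora_alpha=32,       # scaling factor
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.05,   # assumption: dropout not stated in this card
    task_type="CAUSAL_LM",
)
```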
### Training Data
Fine-tuned on curated immigration law datasets including:
- U.S. immigration policies and procedures
- Visa types and requirements (H-1B, O-1, EB-1, etc.)
- Green card processes
- Case examples and expert guidance
- Document preparation instructions
## Usage
This repository contains a LoRA adapter that must be loaded on top of the base model. For production use, see the merged version: JustiGuide/DoloresAI-Qwen25-Merged
### Loading the LoRA Adapter

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_id = "Qwen/Qwen2.5-3B-Instruct"
lora_adapter_id = "JustiGuide/DoloresAI-Qwen25"

# Load the base model
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Attach the LoRA adapter
model = PeftModel.from_pretrained(model, lora_adapter_id)
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
```
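For serving without the `peft` dependency, the adapter can be folded into the base weights; this is presumably how the merged repository referenced above was produced. A sketch using `peft`'s `merge_and_unload` (the local output path is illustrative):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-3B-Instruct", torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, "JustiGuide/DoloresAI-Qwen25")

# Fold the LoRA deltas into the base weights and drop the PEFT wrapper
merged = model.merge_and_unload()

# Save a standalone checkpoint that loads with plain transformers
merged.save_pretrained("dolores-qwen25-merged")  # illustrative local path
AutoTokenizer.from_pretrained("Qwen/Qwen2.5-3B-Instruct").save_pretrained(
    "dolores-qwen25-merged"
)
```

Note this requires enough memory to hold the full de-quantized base weights.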
### Inference Example

```python
system_prompt = "You are Dolores, a specialized AI Immigration Case Manager."
question = "What is an H-1B visa and who qualifies for it?"

# Qwen 2.5 uses the ChatML prompt format
prompt = f'''<|im_start|>system
{system_prompt}<|im_end|>
<|im_start|>user
{question}<|im_end|>
<|im_start|>assistant
'''

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    repetition_penalty=1.1,
)

# Decode only the newly generated tokens, skipping the echoed prompt
response = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(response)
```
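The prompt string above hand-assembles Qwen's ChatML format (in practice, `tokenizer.apply_chat_template` is the more robust way to produce it). A plain-Python helper makes the structure explicit; `build_chatml_prompt` is a hypothetical name introduced here for illustration:

```python
def build_chatml_prompt(system_prompt: str, user_message: str) -> str:
    """Assemble a Qwen-style ChatML prompt: a system turn, a user turn,
    then an open assistant turn for the model to complete."""
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt(
    "You are Dolores, a specialized AI Immigration Case Manager.",
    "What is an H-1B visa and who qualifies for it?",
)
```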
## Deployment
For production deployment, use the merged model: JustiGuide/DoloresAI-Qwen25-Merged
### HuggingFace Inference Endpoint
- GPU: Nvidia L4 (24GB VRAM)
- Scale to Zero: Enabled
- Region: us-east-1
## Performance
- Inference Speed: ~10-20 tokens/second (on L4 GPU)
- Context Length: Up to 2048 tokens
- Quality: High accuracy on immigration-specific questions
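The throughput figure above can be reproduced with a simple wall-clock measurement around a generation call. A sketch, where `generate_fn` is any zero-argument callable wrapping your `model.generate(...)` call (the `time.sleep` stand-in below is for illustration only):

```python
import time

def tokens_per_second(generate_fn, n_new_tokens: int) -> float:
    """Time one generation call and return decoded tokens per second."""
    start = time.perf_counter()
    generate_fn()
    elapsed = time.perf_counter() - start
    return n_new_tokens / elapsed

# Stand-in workload instead of a real model call:
rate = tokens_per_second(lambda: time.sleep(0.1), n_new_tokens=50)
```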
## Limitations
- Provides general immigration guidance, not legal advice
- Always consult with a licensed immigration attorney for specific cases
- Trained primarily on U.S. immigration law
- May not have information on very recent policy changes
## License
Apache 2.0 License (declared for this adapter; note that the base model Qwen2.5-3B-Instruct is distributed under the Qwen Research License rather than Apache 2.0, so review the base model card before commercial use)
## Contact
- Organization: JustiGuide
- Website: https://justi.guide
Built with ❤️ by JustiGuide to make immigration more accessible