---
language:
- en
license: apache-2.0
tags:
- legal
- immigration
- assistant
- qwen2
- fine-tuned
base_model: Qwen/Qwen2-7B-Instruct
model_type: qwen2
pipeline_tag: text-generation
---
# DoloresAI - Immigration Law Assistant
DoloresAI is a specialized legal assistant fine-tuned on immigration law, designed to provide accurate and helpful information about U.S. immigration processes, visa types, and legal procedures.
## Model Details
- **Base Model**: Qwen/Qwen2-7B-Instruct
- **Model Type**: Qwen2ForCausalLM
- **Parameters**: 7B
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **Vocabulary Size**: 151,665 tokens
- **Precision**: FP16
- **Context Length**: 32,768 tokens
- **Fixed on**: 2026-01-11
## Changes in This Version
This is a fixed version of the DoloresAI merged model with the vocabulary mismatch resolved:
- Fixed vocabulary size mismatch between model (151,936) and tokenizer (151,665)
- Model embeddings properly resized to match tokenizer: 151,665 tokens
- Ready for deployment on HuggingFace Inference Endpoints without CUDA errors
## Training
This model was fine-tuned using LoRA adapters on immigration law data and then merged with the base model. The embeddings have been properly resized to match the tokenizer vocabulary size.
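The resize step above can be illustrated with a toy embedding matrix (the embedding dimension here is a made-up small value; `resize_token_embeddings` in transformers keeps the leading rows in the same way when shrinking):

```python
import torch

# Vocabulary sizes from this card; dim=8 is a toy value, not the real hidden size.
old_vocab, new_vocab, dim = 151_936, 151_665, 8
emb = torch.randn(old_vocab, dim)

# Shrinking the embedding matrix keeps the first new_vocab rows and drops the
# trailing rows the tokenizer never emits; indexing into those out-of-range
# rows is what triggered the CUDA errors mentioned above.
resized = emb[:new_vocab].clone()
print(resized.shape)  # torch.Size([151665, 8])
```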
## Intended Use
DoloresAI is designed to assist with:
- Immigration process information
- Visa type explanations
- Legal procedure guidance
- Document requirements
- Timeline estimates
- Form instructions
**Important**: This model provides information only and should not be considered legal advice. Always consult with a licensed immigration attorney for specific legal matters.
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "JustiGuide/DoloresAI-Merged"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Qwen2-Instruct models expect the chat template rather than a raw prompt.
messages = [
    {"role": "user", "content": "What are the requirements for an H-1B visa?"}
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
)

# Decode only the newly generated tokens.
response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(response)
```
## Deployment
### HuggingFace Inference Endpoints
For production deployment, use these environment variables to avoid CUDA errors:
```bash
PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
CUDA_LAUNCH_BLOCKING=1
TORCH_USE_CUDA_DSA=1
TRANSFORMERS_OFFLINE=0
HF_HUB_ENABLE_HF_TRANSFER=1
MODEL_LOAD_TIMEOUT=600
```
Recommended hardware: NVIDIA A10G or better.
## Verification
The vocabulary sizes have been verified to match:
- Model vocab size: 151,665 ✅
- Tokenizer vocab size: 151,665 ✅
- Match: ✅
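The comparison itself is trivial and can be reproduced locally; a minimal sketch (the actual sizes would come from `model.get_input_embeddings().weight.shape[0]` and `len(tokenizer)` after loading from the Hub, which is skipped here):

```python
def vocab_sizes_match(model_vocab_size: int, tokenizer_vocab_size: int) -> bool:
    """Return True when the embedding matrix and the tokenizer agree."""
    return model_vocab_size == tokenizer_vocab_size

print(vocab_sizes_match(151_665, 151_665))  # True  (this release)
print(vocab_sizes_match(151_936, 151_665))  # False (the pre-fix mismatch)
```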
## Limitations
- Trained primarily on U.S. immigration law
- Knowledge cutoff based on training data
- Not a replacement for legal counsel
- May require additional context for complex cases
## License
Apache 2.0
## Citation
```bibtex
@misc{doloresai2025,
  title        = {DoloresAI: Immigration Law Assistant},
  author       = {JustiGuide},
  year         = {2025},
  publisher    = {HuggingFace},
  howpublished = {\url{https://huggingface.co/JustiGuide/DoloresAI-Merged}}
}
```
## Model Card Authors
JustiGuide Team
## Model Card Contact
For questions or issues, please open an issue on the model repository.