File size: 3,388 Bytes
c6d9b01
 
 
 
 
 
 
 
 
 
 
 
 
 
59c0f4d
c6d9b01
59c0f4d
c6d9b01
59c0f4d
 
c6d9b01
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
9e0009b
 
 
c6d9b01
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
9e0009b
c6d9b01
 
9e0009b
 
c6d9b01
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
---
language:
- en
license: apache-2.0
tags:
- legal
- immigration
- assistant
- qwen2
- fine-tuned
base_model: Qwen/Qwen2-7B-Instruct
model_type: qwen2
pipeline_tag: text-generation
---

# DoloresAI - Immigration Law Assistant

DoloresAI is a specialized legal assistant fine-tuned on immigration law, designed to provide accurate and helpful information about U.S. immigration processes, visa types, and legal procedures.

## Model Details

- **Base Model**: Qwen/Qwen2-7B-Instruct
- **Model Type**: Qwen2ForCausalLM
- **Parameters**: 7B
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **Vocabulary Size**: 151,665 tokens
- **Precision**: FP16
- **Context Length**: 32,768 tokens
- **Fixed on**: 2026-01-11

## Changes in This Version

This is a fixed version of the DoloresAI merged model with vocabulary mismatch resolved:
- Fixed vocabulary size mismatch between model (151,936) and tokenizer (151,665)
- Model embeddings properly resized to match tokenizer: 151,665 tokens
- Ready for deployment on HuggingFace Inference Endpoints without CUDA errors

## Training

This model was fine-tuned using LoRA adapters on immigration law data and then merged with the base model. The embeddings have been properly resized to match the tokenizer vocabulary size.

## Intended Use

DoloresAI is designed to assist with:
- Immigration process information
- Visa type explanations
- Legal procedure guidance
- Document requirements
- Timeline estimates
- Form instructions

**Important**: This model provides information only and should not be considered legal advice. Always consult with a licensed immigration attorney for specific legal matters.

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "JustiGuide/DoloresAI-Merged"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto"
)

prompt = "What are the requirements for an H-1B visa?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True
)

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

## Deployment

### HuggingFace Inference Endpoints

For production deployment, use these environment variables to avoid CUDA errors:

```bash
PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True
CUDA_LAUNCH_BLOCKING=1
TORCH_USE_CUDA_DSA=1
TRANSFORMERS_OFFLINE=0
HF_HUB_ENABLE_HF_TRANSFER=1
MODEL_LOAD_TIMEOUT=600
```

Recommended hardware: Nvidia A10G or better

## Verification

The vocabulary sizes have been verified to match:
- Model vocab size: 151,665 ✅
- Tokenizer vocab size: 151,665 ✅
- Match: ✅

## Limitations

- Trained primarily on U.S. immigration law
- Knowledge cutoff based on training data
- Not a replacement for legal counsel
- May require additional context for complex cases

## License

Apache 2.0

## Citation

```bibtex
@misc{doloresai2025,
  title={DoloresAI: Immigration Law Assistant},
  author={JustiGuide},
  year={2025},
  publisher={HuggingFace},
  howpublished={\url{https://huggingface.co/JustiGuide/DoloresAI-Merged}}
}
```

## Model Card Authors

JustiGuide Team

## Model Card Contact

For questions or issues, please open an issue on the model repository.