---
license: apache-2.0
base_model: Qwen/Qwen3-0.6B
tags:
- legal
- sft
- lora
- trl
- eula
datasets:
- AxelDlv00/EULAI
language:
- fr
pipeline_tag: text-generation
---

<div align="center">

<img src="icons/icon-base.png" alt="EULAI Logo" width="120">

**You lie? EULAI!**

**Local AI Browser Assistant for Legal Document Analysis**

*[Axel Delaval](https://axeldlv00.github.io/axel-delaval-personal-page/) • 28 January 2026*

<br />

[GitHub](https://github.com/AxelDlv00/EULAI)

[License](./LICENSE) • [Model](https://huggingface.co/AxelDlv00/EULAI) • [MLC (q4f16_1)](https://huggingface.co/AxelDlv00/EULAI-q4f16_1-MLC) • [Dataset](https://huggingface.co/datasets/AxelDlv00/EULAI)

</div>

# EULAI - Fine-tuned Qwen3-0.6B for EULA Analysis

This model is a fine-tuned version of **Qwen/Qwen3-0.6B** using **LoRA (Low-Rank Adaptation)**. It specializes in analyzing End-User License Agreements (EULAs) and privacy policies, extracting structured, risk-based summaries.

## Training Configuration

Details extracted from the training configuration:

- **Base Model**: `Qwen/Qwen3-0.6B`
- **Method**: PEFT / LoRA
- **LoRA Hyperparameters**:
  - `r`: 16
  - `alpha`: 32
  - `dropout`: 0.05
  - `target_modules`: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- **Training**:
  - `epochs`: 3
  - `learning_rate`: 1e-4
  - `max_length`: 2048
  - `packing`: True
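
The hyperparameters above correspond roughly to the following PEFT configuration — a sketch for reproducibility, not the exact training script (which is not included in this card):

```python
# Hypothetical reconstruction of the LoRA configuration from the
# values listed above; the actual training script may differ.
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,                 # rank of the low-rank update matrices
    lora_alpha=32,        # scaling factor (alpha / r = 2.0)
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)
```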
## Usage (Inference)

To use this model, you need to load the adapter on top of the original base model.

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "Qwen/Qwen3-0.6B"
adapter_id = "AxelDlv00/EULAI"

tokenizer = AutoTokenizer.from_pretrained(adapter_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)  # load the LoRA adapter

prompt = "We collect your GPS data continuously even when the application is closed."
messages = [{"role": "user", "content": prompt}]

# enable_thinking=False disables Qwen3's reasoning mode for a direct answer.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, enable_thinking=False, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512, do_sample=True, temperature=0.1)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Output Format

The model generates structured bullet points in this format:

```
- [BLOCKER/BAD/GOOD/NEUTRAL] : Title : Explanation
```
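
A small helper like the following (illustrative only, not shipped with the model) can turn that bullet format into structured data:

```python
# Hypothetical parser for "- [LABEL] : Title : Explanation" bullets.
import re

BULLET_RE = re.compile(
    r"^-\s*\[(BLOCKER|BAD|GOOD|NEUTRAL)\]\s*:\s*(.+?)\s*:\s*(.+)$"
)

def parse_bullets(text):
    """Return a list of (label, title, explanation) tuples."""
    findings = []
    for line in text.splitlines():
        m = BULLET_RE.match(line.strip())
        if m:
            findings.append(m.groups())
    return findings

sample = "- [BAD] : Continuous tracking : GPS data is collected even when the app is closed."
print(parse_bullets(sample))
```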