# qlora_mistral_y02_V2: Y02 Green Patent Classifier
This is a QLoRA fine-tuned adapter for mistralai/Mistral-7B-Instruct-v0.2, trained to classify patent claims as GREEN (Y02) or NOT GREEN.
It was developed as the Judge agent's brain in a 3-agent MAS pipeline for Y02 green patent classification (M4 Final Assignment, AAU).
## Model Details
| Property | Value |
|---|---|
| Base model | mistralai/Mistral-7B-Instruct-v0.2 |
| Adapter type | QLoRA (4-bit NF4) |
| LoRA rank | r=16, alpha=32 |
| Target modules | `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj` |
| Max sequence length | 512 tokens |
| Training epochs | 1 (full) |
| Effective batch size | 32 (batch=8, grad_accum=4) |
| Learning rate | 2e-4 with cosine scheduler |
| Warmup ratio | 0.05 |
| Precision | bfloat16 |
| Hardware | NVIDIA L4 GPU |
| Trainer | SFTTrainer (trl 0.29.0) |
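The hyperparameters in the table map directly onto the standard bitsandbytes and PEFT configuration objects. A sketch of that mapping (not the exact training script; variable names here are illustrative):

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# QLoRA: base model quantized to 4-bit NF4, compute in bfloat16
bnb_cfg = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA adapter: r=16, alpha=32, applied to all attention and MLP projections
lora_cfg = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)
```

Targeting all seven projection modules (rather than only `q_proj`/`v_proj`) is the common QLoRA recipe and matches the table above.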
## Training Data
- Source: `patents_50k_green.parquet`, 50,000 patent claims with Y02 silver labels derived from CPC codes
- Train split: 28,500 rows (`train_silver`, 95%)
- Eval split: 1,500 rows (5% held-out, stratified)
- Label balance: 50% GREEN / 50% NOT GREEN
- Prompt format: Mistral `[INST]...[/INST]` chat template
- Target output: strict JSON, `{"is_green": 0/1, "rationale": "one sentence"}`
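Putting the pieces above together, one training row is the claim wrapped in the Mistral chat template with the JSON verdict as the completion. A minimal sketch (the exact instruction wording used in training may differ; `build_example` is an illustrative helper):

```python
import json

def build_example(claim: str, is_green: int, rationale: str) -> str:
    """Format one silver-labeled row into the Mistral [INST] template."""
    prompt = (
        "Classify the following patent claim as GREEN (1) or NOT GREEN (0). "
        'Return STRICT JSON only: {"is_green": 0 or 1, "rationale": "one sentence"}\n\n'
        f"Patent claim:\n{claim}"
    )
    # Target completion: the strict-JSON verdict the model learns to emit
    target = json.dumps({"is_green": is_green, "rationale": rationale})
    # Mistral-7B-Instruct-v0.2 template: <s>[INST] user [/INST] assistant</s>
    return f"<s>[INST] {prompt} [/INST] {target}</s>"

example = build_example(
    "A wind turbine blade with improved aerodynamics.",
    1,
    "Wind turbines generate renewable electricity.",
)
```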
## Training History
| Version | Epochs | Trainer | Result |
|---|---|---|---|
| V2 (this model) | 1.0 (full) | SFTTrainer | Stable; loss 0.87 → 0.83 |
V2 used the Mistral `[INST]` template and trained to completion, resuming from `checkpoint-800` after an SSH disconnection at step 725/891.
## Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

BASE_MODEL = "mistralai/Mistral-7B-Instruct-v0.2"
ADAPTER_DIR = "qlora_mistral_y02_V2"

# Load the base model in 4-bit NF4, matching the training configuration
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
base = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL, quantization_config=bnb, device_map="auto"
)
model = PeftModel.from_pretrained(base, ADAPTER_DIR)
model.eval()

claim = "A photovoltaic solar panel system for residential energy generation."
prompt = (
    "You are an expert patent examiner for Y02 green technology. "
    "Classify the following patent claim as GREEN (1) or NOT GREEN (0).\n\n"
    'Return STRICT JSON only: {"is_green": 0 or 1, "rationale": "one sentence"}\n\n'
    f"Patent claim:\n{claim}"
)

messages = [{"role": "user", "content": prompt}]
formatted = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(formatted, return_tensors="pt").to(model.device)

with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=150,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id,
    )

# Decode only the newly generated tokens, not the echoed prompt
response = tokenizer.decode(
    out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(response)
# {"is_green": 1, "rationale": "Photovoltaic system directly generates
# renewable electricity, qualifying under Y02E 10/50."}
```
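The model is trained to emit strict JSON, but greedy decoding can still produce stray surrounding text, so downstream code should parse defensively. A minimal sketch (the `parse_verdict` helper is illustrative, not part of the released pipeline):

```python
import json

def parse_verdict(response: str) -> dict:
    """Extract the JSON verdict from the model's raw output.

    Slices from the first '{' to the last '}' so that any stray
    text around the JSON object is ignored.
    """
    start = response.find("{")
    end = response.rfind("}")
    if start == -1 or end == -1 or end < start:
        return {"is_green": None, "rationale": "unparseable output"}
    try:
        return json.loads(response[start:end + 1])
    except json.JSONDecodeError:
        return {"is_green": None, "rationale": "unparseable output"}

verdict = parse_verdict(
    '{"is_green": 1, "rationale": "Solar PV generates renewable electricity."}'
)
# verdict["is_green"] == 1
```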
## Role in MAS Pipeline

This adapter was used as the Judge agent in a 3-agent pipeline:

1. Advocate (this model): argues FOR green classification
2. Skeptic (this model): argues AGAINST green classification
3. Judge (this model): weighs both sides and issues the final JSON verdict

All 3 agents share this same model, differentiated only by role-specific prompts. HITL (human-in-the-loop) review was triggered when confidence < 0.65 or `deadlock=True`.
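The HITL escalation rule stated above is simple enough to express as a guard function. A sketch (the function name and signature are illustrative; only the 0.65 threshold and the deadlock condition come from the pipeline description):

```python
def needs_human_review(confidence: float, deadlock: bool,
                       threshold: float = 0.65) -> bool:
    """HITL gate: escalate to a human when the Judge is not confident
    enough, or when the Advocate and Skeptic deadlock."""
    return deadlock or confidence < threshold

# needs_human_review(0.90, False) -> False  (confident, no deadlock)
# needs_human_review(0.50, False) -> True   (below threshold)
# needs_human_review(0.90, True)  -> True   (deadlock overrides confidence)
```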
## Framework Versions
| Library | Version |
|---|---|
| PEFT | 0.18.1 |
| TRL | 0.29.0 |
| Transformers | 4.57.6 |
| PyTorch | 2.9.1 |
| Datasets | 4.6.1 |
| Tokenizers | 0.22.2 |