qlora_mistral_y02_V2: Y02 Green Patent Classifier

This is a QLoRA fine-tuned adapter for mistralai/Mistral-7B-Instruct-v0.2, trained to classify patent claims as GREEN (Y02) or NOT GREEN.

It was developed as the Judge agent's brain in a 3-agent multi-agent system (MAS) pipeline for Y02 green patent classification (M4 Final Assignment, AAU).


Model Details

| Property | Value |
|---|---|
| Base model | mistralai/Mistral-7B-Instruct-v0.2 |
| Adapter type | QLoRA (4-bit NF4) |
| LoRA rank | r=16, alpha=32 |
| Target modules | q, k, v, o, gate, up, down projections |
| Max sequence length | 512 tokens |
| Training epochs | 1 (full) |
| Effective batch size | 32 (batch=8, grad_accum=4) |
| Learning rate | 2e-4 with cosine scheduler |
| Warmup ratio | 0.05 |
| Precision | bfloat16 |
| Hardware | NVIDIA L4 GPU |
| Trainer | SFTTrainer (trl 0.29.0) |
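The hyperparameters above map onto peft/trl configuration objects roughly as follows. This is a sketch reconstructed from the table, not the original training script; exact argument names (e.g. `max_length` vs. the older `max_seq_length`) vary across trl versions.

```python
# Sketch of the training configuration implied by the hyperparameter table.
# Reconstructed from the model card, not the original training script.
from peft import LoraConfig
from trl import SFTConfig

lora_config = LoraConfig(
    r=16,                        # LoRA rank
    lora_alpha=32,               # LoRA scaling alpha
    target_modules=[             # all attention + MLP projections
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)

training_args = SFTConfig(
    output_dir="qlora_mistral_y02_V2",
    num_train_epochs=1,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,   # effective batch = 8 * 4 = 32
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    bf16=True,
    max_length=512,                  # named max_seq_length in older trl releases
)
```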

Training Data

  • Source: patents_50k_green.parquet — 50,000 patent claims with Y02 silver labels derived from CPC codes
  • Train split: 28,500 rows (train_silver, 95%)
  • Eval split: 1,500 rows (5% held-out, stratified)
  • Label balance: 50% GREEN / 50% NOT GREEN
  • Prompt format: Mistral [INST]...[/INST] chat template
  • Target output: Strict JSON: {"is_green": 0/1, "rationale": "one sentence"}
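A minimal stdlib sketch of how one training example could be assembled in this format. The instruction wording and helper name are illustrative assumptions; only the [INST] template and strict-JSON target schema come from the card.

```python
import json

def build_example(claim: str, is_green: int, rationale: str) -> str:
    """Format one silver-labeled claim as a Mistral [INST] training example.

    Illustrative sketch: the prompt wording is an assumption; the strict-JSON
    target matches the schema described in the model card.
    """
    prompt = (
        "You are an expert patent examiner for Y02 green technology. "
        "Classify the following patent claim as GREEN (1) or NOT GREEN (0).\n\n"
        'Return STRICT JSON only: {"is_green": 0 or 1, "rationale": "one sentence"}\n\n'
        f"Patent claim:\n{claim}"
    )
    target = json.dumps({"is_green": is_green, "rationale": rationale})
    return f"<s>[INST] {prompt} [/INST] {target}</s>"

example = build_example(
    "A wind turbine blade with variable pitch control.",
    1,
    "Wind turbines generate renewable electricity.",
)
```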

Training History

| Version | Epochs | Trainer | Result |
|---|---|---|---|
| V2 (this model) | 1.0 (full) | SFTTrainer | Stable; loss 0.87 → 0.83 |

V2 used the Mistral [INST] chat template and trained to completion, resuming from checkpoint-800 after an SSH disconnection at step 725/891.


Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

BASE_MODEL  = "mistralai/Mistral-7B-Instruct-v0.2"
ADAPTER_DIR = "qlora_mistral_y02_V2"

# Load the base model in 4-bit NF4, matching the training configuration
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
base = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL, quantization_config=bnb, device_map="auto"
)
model = PeftModel.from_pretrained(base, ADAPTER_DIR)
model.eval()

claim = "A photovoltaic solar panel system for residential energy generation."

prompt = (
    "You are an expert patent examiner for Y02 green technology. "
    "Classify the following patent claim as GREEN (1) or NOT GREEN (0).\n\n"
    'Return STRICT JSON only: {"is_green": 0 or 1, "rationale": "one sentence"}\n\n'
    f"Patent claim:\n{claim}"
)

messages  = [{"role": "user", "content": prompt}]
formatted = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(formatted, return_tensors="pt").to(model.device)

with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=150,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id,
    )

# Decode only the newly generated tokens
response = tokenizer.decode(
    out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(response)
# {"is_green": 1, "rationale": "Photovoltaic system directly generates
#  renewable electricity, qualifying under Y02E 10/50."}
```
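Since generation can occasionally wrap the JSON in extra text, it is worth parsing the response defensively rather than calling `json.loads` directly. A small stdlib helper (a sketch; `parse_verdict` is a hypothetical name, not part of the adapter):

```python
import json
import re

def parse_verdict(response: str) -> dict:
    """Extract the first JSON object from the model's response.

    Falls back to a None verdict if no valid JSON is found, so the caller
    can route the claim to human review instead of crashing.
    """
    match = re.search(r"\{.*?\}", response, re.DOTALL)
    if match:
        try:
            verdict = json.loads(match.group(0))
            if "is_green" in verdict:
                return verdict
        except json.JSONDecodeError:
            pass
    return {"is_green": None, "rationale": "unparseable response"}
```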

Role in MAS Pipeline

This adapter was used as the Judge agent in a 3-agent pipeline:

Advocate (this model) → argues FOR green classification
Skeptic  (this model) → argues AGAINST green classification
Judge    (this model) → weighs both sides → final JSON verdict

All 3 agents share this same model, differentiated only by role-specific prompts. Human-in-the-loop (HITL) review was triggered when the Judge's confidence fell below 0.65 or when deadlock=True.
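The escalation rule can be sketched as a small routing function. The function and return-value names are illustrative; the 0.65 threshold and deadlock flag come from the pipeline description above.

```python
def route_verdict(confidence: float, deadlock: bool) -> str:
    """Decide whether the Judge's verdict is accepted automatically
    or escalated to a human reviewer (HITL).

    Escalates when the Judge's confidence is below 0.65 or the
    Advocate and Skeptic deadlock, per the pipeline description.
    """
    if deadlock or confidence < 0.65:
        return "human_review"
    return "auto_accept"
```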


Framework Versions

| Library | Version |
|---|---|
| PEFT | 0.18.1 |
| TRL | 0.29.0 |
| Transformers | 4.57.6 |
| PyTorch | 2.9.1 |
| Datasets | 4.6.1 |
| Tokenizers | 0.22.2 |
Model repository: alinashrestha/qlora-mistral-y02-v2 (adapter for mistralai/Mistral-7B-Instruct-v0.2)