Peter512/patents-50k-green
Binary classifier for green patent detection (Y02 CPC codes). Fine-tuned from AI-Growth-Lab/PatentSBERTa using a QLoRA-powered multi-agent system with exception-based human-in-the-loop (HITL) review.
is_green (Y02 CPC codes)

| Metric | Value |
|---|---|
| F1 | 0.8097 |
| Precision | 0.8213 |
| Recall | 0.7986 |
| Accuracy | 0.8126 |
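
As a quick sanity check on the table above, the sketch below recomputes F1 from the reported precision and recall. It assumes the reported F1 is the standard binary F1 (the harmonic mean of precision and recall for the green class); the card does not state which averaging was used, and the helper name `f1_from_precision_recall` is purely illustrative.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard binary F1)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both are zero
    return 2 * precision * recall / (precision + recall)


# Values copied from the metrics table in this card.
reported = {"precision": 0.8213, "recall": 0.7986, "f1": 0.8097}

recomputed = f1_from_precision_recall(reported["precision"], reported["recall"])
gap = abs(recomputed - reported["f1"])

print(f"recomputed F1 = {recomputed:.4f}, gap to reported = {gap:.4f}")
```

Under that assumption the recomputed value comes out at about 0.8098, which agrees with the reported 0.8097 to within the rounding of the four-decimal inputs.
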
Progression: Baseline F1=0.7696 → A2=0.8099 → A3=0.8115 → Final=0.8097
The QLoRA adapter (Llama-3.2-3B-Instruct) was trained on patent classification prompts, and its learned domain knowledge was encoded into the Advocate agent's system prompt. The slight regression from A3 (F1=0.8115) to Final (F1=0.8097) is within noise and reflects the 100-claim gold set being a small fraction of the 35k silver training data.
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Tokenizer comes from the base model; weights come from the fine-tuned checkpoint.
tokenizer = AutoTokenizer.from_pretrained("AI-Growth-Lab/PatentSBERTa", use_fast=False)
model = AutoModelForSequenceClassification.from_pretrained("Peter512/patentsbert-green-final")
model.eval()  # inference mode: disables dropout

text = "A photovoltaic cell comprising a perovskite absorber layer..."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)

with torch.no_grad():  # no gradients needed for inference
    logits = model(**inputs).logits

label = logits.argmax(dim=-1).item()  # 0=not_green, 1=green
```
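
The snippet above returns only a hard label. If a confidence score is useful, the two logits can be turned into a probability with a softmax. The following is a minimal, dependency-free sketch: the helper names `green_probability` and `is_green` and the `threshold` parameter are hypothetical and not part of the released model, and it assumes the label order given above (index 0 = not_green, index 1 = green).

```python
import math
from typing import Sequence


def green_probability(logits: Sequence[float]) -> float:
    """Softmax probability of the green class (index 1) from two raw logits."""
    if len(logits) != 2:
        raise ValueError("expected exactly two logits: [not_green, green]")
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    return exps[1] / sum(exps)


def is_green(logits: Sequence[float], threshold: float = 0.5) -> bool:
    """Hypothetical decision rule: green if the probability reaches the threshold."""
    return green_probability(logits) >= threshold


# Illustrative logits only, not real model output.
example_logits = [-0.4, 1.3]
print(green_probability(example_logits), is_green(example_logits))
```

With the model snippet above, the input would be `logits[0].tolist()`. At the default threshold of 0.5 this rule matches the argmax label except on exact ties; raising the threshold trades recall for precision.
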
Base model
AI-Growth-Lab/PatentSBERTa