PatentSBERTa Green Patent Classifier — Assignment 3
Binary classifier for green patent detection (Y02 CPC codes). Fine-tuned from AI-Growth-Lab/PatentSBERTa on silver labels plus gold labels produced by a 3-agent CrewAI debate system (Advocate / Skeptic / Judge).
Training
- Base model: AI-Growth-Lab/PatentSBERTa (MPNet-based)
- Task: Binary classification, `is_green` (Y02 CPC codes)
- Training data: 35,000 silver labels + 100 gold labels (CrewAI MAS)
- MAS process: 3-agent debate. Advocate argues green, Skeptic challenges, Judge produces `{"label": 0/1, "confidence": "low/medium/high", "rationale": "..."}`. 100% agent agreement (0 human overrides; no low-confidence outputs).
- Fine-tuning: 1 epoch, lr=2e-5, max_length=256, batch_size=16, fp16
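The Judge output format listed above can be checked mechanically before a verdict is accepted as a gold label. The following is a minimal sketch, not the pipeline actually used: the function name `parse_judge_output` and the `needs_human_review` flag are hypothetical, and routing low-confidence verdicts to a human is an assumption inferred from the "0 human overrides" note.

```python
import json

# Confidence levels named in the Judge output format.
ALLOWED_CONFIDENCE = {"low", "medium", "high"}


def parse_judge_output(raw: str) -> dict:
    """Validate one Judge verdict given as a JSON string.

    Hypothetical helper: raises ValueError on a malformed verdict and
    marks low-confidence verdicts for human review (an assumption).
    """
    verdict = json.loads(raw)
    label = verdict["label"]
    confidence = verdict["confidence"]

    # Reject booleans explicitly, because True == 1 in Python.
    if isinstance(label, bool) or label not in (0, 1):
        raise ValueError(f"label must be 0 or 1, got {label!r}")
    if confidence not in ALLOWED_CONFIDENCE:
        raise ValueError(f"unexpected confidence: {confidence!r}")

    return {
        "label": label,
        "confidence": confidence,
        "rationale": verdict.get("rationale", ""),
        # Assumption: only low-confidence verdicts go to a human.
        "needs_human_review": confidence == "low",
    }


example = '{"label": 1, "confidence": "high", "rationale": "Solar cell claim."}'
print(parse_judge_output(example))
```

Under this assumption, a run in which no verdict has `"confidence": "low"` produces no human overrides, which matches the outcome reported above.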
Evaluation (eval_silver, 5,000 claims)
| Metric | Value |
|---|---|
| F1 | 0.8115 |
| Precision | 0.8224 |
| Recall | 0.8010 |
| Accuracy | 0.8142 |
Assignment 2 baseline: F1=0.8099 | Original baseline: F1=0.7696
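As a quick sanity check, F1 can be recomputed from the reported precision and recall, and the gains over the two baselines follow by subtraction. This sketch uses only the rounded figures quoted above, so the recomputed F1 can differ from the reported one in the last decimal place.

```python
# Figures copied from the evaluation table and baseline line above.
precision = 0.8224
recall = 0.8010
reported_f1 = 0.8115
assignment2_f1 = 0.8099
original_f1 = 0.7696

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

gain_vs_a2 = reported_f1 - assignment2_f1
gain_vs_original = reported_f1 - original_f1

print(f"recomputed F1:             {f1:.4f}")
print(f"gain vs. Assignment 2:     {gain_vs_a2:+.4f}")
print(f"gain vs. original baseline: {gain_vs_original:+.4f}")
```

The improvement over Assignment 2 is small (about 0.002 F1), while the improvement over the original baseline is about 0.04 F1.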
Usage
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
tokenizer = AutoTokenizer.from_pretrained("AI-Growth-Lab/PatentSBERTa", use_fast=False)
model = AutoModelForSequenceClassification.from_pretrained("Peter512/patentsbert-green-a3")
model.eval()
text = "A photovoltaic cell comprising a perovskite absorber layer..."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)
with torch.no_grad():
    logits = model(**inputs).logits
label = logits.argmax(dim=-1).item()  # 0=not_green, 1=green
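The snippet above returns only the hard label. When a score is also wanted, the two logits can be turned into probabilities with a softmax. The sketch below is written in plain Python so it runs without the model; `ID2LABEL`, `predict_from_logits`, the `threshold` parameter, and the example logits are illustrative assumptions, with the 0/1 meaning taken from the comment in the usage code.

```python
import math

# Label meaning taken from the usage snippet: 0=not_green, 1=green.
ID2LABEL = {0: "not_green", 1: "green"}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_from_logits(logits, threshold=0.5):
    """Return (label name, probability of the green class).

    With threshold=0.5 this agrees with argmax for two classes;
    any other threshold is a choice made by the caller.
    """
    probs = softmax(logits)
    p_green = probs[1]
    label = 1 if p_green >= threshold else 0
    return ID2LABEL[label], p_green


# Made-up logits for illustration, not real model output.
example_logits = [-0.4, 1.1]
name, p = predict_from_logits(example_logits)
print(name, round(p, 3))
```

With the tensor from the usage snippet, `logits[0].tolist()` yields a list in the expected shape, and `torch.softmax(logits, dim=-1)` is the equivalent operation on tensors.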