Peter512/patents-50k-green
Binary classifier for green patent detection (Y02 CPC codes). Fine-tuned from AI-Growth-Lab/PatentSBERTa using active learning and human-in-the-loop (HITL) gold labels produced with GPT-4o-mini.
is_green (Y02 CPC codes)

| Metric | Value |
|---|---|
| F1 | 0.8099 |
| Precision | 0.8207 |
| Recall | 0.7994 |
| Accuracy | 0.8126 |
Baseline (frozen PatentSBERTa + Logistic Regression): F1=0.7696
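
As a quick sanity check on the numbers above, the reported F1 can be recomputed from the reported precision and recall as their harmonic mean, and the gain over the frozen baseline follows by subtraction. This is a minimal sketch that uses only the values from the table; the helper name `f1_score` is illustrative and not part of the model's code.

```python
# Values copied from the metrics table and the baseline line above.
precision, recall = 0.8207, 0.7994
reported_f1, baseline_f1 = 0.8099, 0.7696

def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)

f1 = f1_score(precision, recall)
gain = reported_f1 - baseline_f1

print(f"F1 recomputed from precision/recall: {f1:.4f}")
print(f"Absolute F1 gain over the baseline:  {gain:.4f}")
```

The recomputed value matches the reported F1 to four decimals, and the fine-tuned model improves on the baseline by roughly 0.04 F1.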
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# The tokenizer comes from the base model; the classifier weights come from the fine-tuned checkpoint.
tokenizer = AutoTokenizer.from_pretrained("AI-Growth-Lab/PatentSBERTa", use_fast=False)
model = AutoModelForSequenceClassification.from_pretrained("Peter512/patentsbert-green-a2")
model.eval()  # disable dropout for inference

text = "A photovoltaic cell comprising a perovskite absorber layer..."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)

# No gradients are needed for prediction.
with torch.no_grad():
    logits = model(**inputs).logits

label = logits.argmax(dim=-1).item()  # 0=not_green, 1=green
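
If a confidence score is more useful than a hard label, the two logits can be turned into a probability with a softmax. The sketch below is deliberately dependency-free and works on a plain list of two floats, for example `logits[0].tolist()` from the snippet above. The `classify` helper and its `threshold` parameter are hypothetical additions for illustration, and the example logits are made up rather than real model output.

```python
import math

# Label mapping taken from the comment in the usage snippet: 0=not_green, 1=green.
LABELS = {0: "not_green", 1: "green"}

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(logits, threshold=0.5):
    """Return (label_name, p_green). `threshold` is a hypothetical knob, not a model setting."""
    p_green = softmax(logits)[1]
    label = 1 if p_green >= threshold else 0
    return LABELS[label], p_green

# Made-up logits for illustration; in practice pass logits[0].tolist().
name, p = classify([-1.2, 0.8])
print(name, round(p, 3))
```

Raising `threshold` above 0.5 trades recall for precision, which may be preferable when false positives are costly; any such value would need to be tuned on held-out data.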
Base model: AI-Growth-Lab/PatentSBERTa