---
language:
- en
license: mit
library_name: transformers
pipeline_tag: text-classification
tags:
- patents
- green-tech
- qlora
- peft
- sequence-classification
model-index:
- name: assignment3-patentsberta-qlora-gold100
  results:
  - task:
      type: text-classification
      name: Green patent detection
    dataset:
      name: patents_50k_green (eval_silver)
      type: custom
    metrics:
    - type: f1
      value: 0.5006382068
      name: Macro F1
    - type: accuracy
      value: 0.5008
      name: Accuracy
---
# Assignment 3 Model — Green Patent Detection (QLoRA + PatentSBERTa)

## Model Summary
This repository contains the final downstream Assignment 3 classifier for green patent detection.

Workflow:
- Baseline uncertainty sampling on patent claims.
- QLoRA-based labeling/rationale generation on the top-100 high-risk examples.
- Final PatentSBERTa fine-tuning on `train_silver` + 100 gold high-risk examples.
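The uncertainty-sampling step above can be sketched as follows. This is a minimal illustration, not the project's exact code: the entropy-based scoring rule and the helper names (`entropy`, `select_high_risk`) are assumptions.

```python
import math

def entropy(p_green: float) -> float:
    """Binary predictive entropy; largest when p is near 0.5 (most uncertain)."""
    p = min(max(p_green, 1e-12), 1 - 1e-12)
    return -(p * math.log(p) + (1 - p) * math.log(1 - p))

def select_high_risk(probs, k=100):
    """Return indices of the k most uncertain examples, i.e. those
    closest to the decision boundary of the baseline classifier."""
    ranked = sorted(range(len(probs)), key=lambda i: entropy(probs[i]), reverse=True)
    return ranked[:k]

# Example: baseline probabilities of "green" for five claims
probs = [0.97, 0.51, 0.03, 0.48, 0.80]
print(select_high_risk(probs, k=2))  # -> [1, 3], the claims nearest 0.5
```

Selecting near-boundary examples is also why label noise in this set is so influential, as discussed in the extended reflection below.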
## Base Model

`AI-Growth-Lab/PatentSBERTa` (sequence classification head, 2 labels)
## Training Setup
- Seed: 42
- Train rows (augmented): 20,100
- Eval rows: 5,000
- Gold rows: 100
- Hardware used: NVIDIA L4
- Frameworks: `transformers`, `datasets`, `torch`, `peft`, `bitsandbytes`
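A QLoRA setup consistent with the frameworks listed above could look like the sketch below. All hyperparameters (`r`, `lora_alpha`, `lora_dropout`, `target_modules`) are illustrative assumptions, not the values used in this run.

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig, TaskType

# 4-bit quantization of the frozen base weights (the "Q" in QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,  # bf16 compute, supported on the L4 GPU
)

# Low-rank adapters on the attention projections; hyperparameters are assumed
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,  # keeps the classification head trainable
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["query", "key", "value"],  # BERT-style module names
)
```

The base model would then be loaded with `quantization_config=bnb_config` and wrapped with `peft.get_peft_model(model, lora_config)` before training.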
## Results

### Assignment 3 final model (this repo)
- Eval accuracy: 0.5008
- Eval macro F1: 0.5006382068
- Gold100 accuracy: 0.53
- Gold100 macro F1: 0.5037482842
### Comparison table (Assignment requirement)
| Model Version | Training Data Source | F1 Score (Eval Set) |
|---|---|---|
| 1. Baseline | Frozen Embeddings (No Fine-tuning) | 0.7727474956 |
| 2. Assignment 2 Model | Fine-tuned on Silver + Gold (Simple LLM) | 0.4975369710 |
| 3. Assignment 3 Model | Fine-tuned on Silver + Gold (QLoRA) | 0.5006382068 |
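The scores above are macro-averaged F1, i.e. the unweighted mean of per-class F1 over both labels; with balanced classes this tracks accuracy closely, which is why the eval accuracy (0.5008) and macro F1 (0.5006) nearly coincide. A minimal pure-Python sketch of the metric (function names are illustrative):

```python
def f1_for_class(y_true, y_pred, cls):
    """F1 for one class: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over labels 0 (non-green) and 1 (green)."""
    return (f1_for_class(y_true, y_pred, 0) + f1_for_class(y_true, y_pred, 1)) / 2

y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]
print(round(macro_f1(y_true, y_pred), 4))  # -> 0.6667
```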
## Reflection
Compared to Assignment 2, the Assignment 3 QLoRA workflow produced a small improvement in eval macro F1 (+0.0031).
This indicates that the advanced data-generation approach provided a measurable but modest downstream gain over the simpler Assignment 2 setup in this run.
However, both fine-tuned pipelines remained substantially below the frozen-embedding baseline, suggesting that data quality and labeling strategy still dominate final performance.
## Extended Reflection on Part E Results
The observed F1 score results show that downstream fine-tuning underperformed the frozen-embedding baseline in this run: Baseline macro F1 = 0.7727, Assignment 2 = 0.4975, and Assignment 3 = 0.5006 on the eval set. Although the advanced QLoRA workflow in Assignment 3 improved slightly over Assignment 2 (+0.0031), both fine-tuned models remained far below the baseline, indicating that additional training did not translate into better generalization here.
One plausible explanation is label quality in the high-risk set. In Assignment 3, the 100 uncertain examples were finalized using an auto-accept policy (no independent human correction), so potential labeling errors in the most ambiguous cases may have been passed directly into training. Because these examples are deliberately selected near the decision boundary, they are highly influential; if their labels are noisy, they can destabilize class boundaries and reduce macro F1 on eval data.
Another interpretation is that the fine-tuning stage is more sensitive to supervision quality and distribution mismatch than the linear baseline. A strong frozen-embedding + logistic model can be robust when labels are imperfect, while full downstream fine-tuning may overfit to noisy or weakly validated labels. Overall, the results suggest that the quality of gold labels on uncertain samples is a critical bottleneck, and that true human adjudication on high-risk claims is likely necessary to realize the intended gains from advanced workflows such as QLoRA.
## Intended Use
- Educational/research use for green patent classification experiments.
- Binary label output: non-green (0) vs green (1).
## Limitations

- Dataset and labels are project-specific and may not generalize broadly.
- Part C used an automated acceptance policy for gold labels in this run (no manual overrides).
- Model should not be used for legal/commercial patent decisions without human review.
## Files in this Repository

- `config.json`
- `model.safetensors`
- tokenizer files
- optional: training/evaluation summaries
## Example Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

repo_id = "<your-username>/assignment3-patentsberta-qlora-gold100"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSequenceClassification.from_pretrained(repo_id)
model.eval()

text = "A method for reducing CO2 emissions in industrial heat recovery systems..."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)

with torch.no_grad():
    logits = model(**inputs).logits

pred = torch.argmax(logits, dim=-1).item()
print({"label": int(pred)})  # 0 = non-green, 1 = green
```
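To report a probability alongside the predicted label, the two logits can be passed through a softmax. The sketch below is self-contained with illustrative logit values; in practice they would come from `model(**inputs).logits[0].tolist()`, and the `decode` helper is an assumption, not part of this repo.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def decode(logits, labels=("non-green", "green")):
    """Map a 2-logit output to the predicted label and its probability."""
    probs = softmax(logits)
    idx = probs.index(max(probs))
    return {"label": labels[idx], "score": round(probs[idx], 4)}

# Illustrative logits standing in for model(**inputs).logits[0].tolist()
print(decode([-0.3, 1.2]))  # -> {'label': 'green', 'score': 0.8176}
```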