Links

Model: https://huggingface.co/Signe22/patentsberta-green-hitl

Dataset: https://huggingface.co/datasets/Signe22/patents-50k-green-hitl

Video: https://aaudk-my.sharepoint.com/:v:/g/personal/de63sv_student_aau_dk/IQAhFz1G1dSGRq_XVpBwxJxiAa4zONceO5y_To505Y47h0A

Baseline Model (Frozen Embeddings)

As a starting point, I trained a fast baseline classifier using frozen PatentSBERTa embeddings. PatentSBERTa was used solely as a feature extractor, and a lightweight linear classifier was trained on top of the fixed embeddings. This baseline model provides initial performance estimates and probabilistic outputs needed for subsequent uncertainty sampling.
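The frozen-embedding baseline can be sketched as follows. The real pipeline would encode claims with PatentSBERTa via `sentence-transformers` (the base checkpoint id `AI-Growth-Lab/PatentSBERTa` is an assumption, shown only in a comment); here random vectors stand in for the fixed embeddings so the sketch is self-contained:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# In the real pipeline the features come from frozen PatentSBERTa, e.g.:
#   from sentence_transformers import SentenceTransformer
#   encoder = SentenceTransformer("AI-Growth-Lab/PatentSBERTa")  # assumed base checkpoint
#   X = encoder.encode(claim_texts)
# Random 768-d vectors stand in for those embeddings here.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 768))      # placeholder for frozen claim embeddings
y = (X[:, 0] > 0).astype(int)        # placeholder green / non-green labels

# Lightweight linear classifier on top of the fixed embeddings.
clf = LogisticRegression(max_iter=1000).fit(X, y)
proba = clf.predict_proba(X)[:, 1]   # P(green): the probabilistic output
                                     # consumed by uncertainty sampling later
```

Because the encoder stays frozen, only the linear head is trained, which keeps this step fast enough to run over the full 50k-claim dataset.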

Identify High-Risk Examples (Uncertainty Sampling)

To select candidates for human annotation, I applied uncertainty sampling to the predictions of the baseline classifier. High-risk examples were defined as claims for which the model was most uncertain about the green label.
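With binary probabilities, "most uncertain" reduces to P(green) closest to 0.5. A minimal sketch of that selection (least-confidence sampling over the baseline's predicted probabilities):

```python
import numpy as np

def least_confident(probs, k):
    """Return the indices of the k examples whose P(green) is closest to 0.5."""
    margin = np.abs(np.asarray(probs) - 0.5)
    return np.argsort(margin)[:k]

# Toy probabilities from the baseline classifier.
probs = np.array([0.98, 0.52, 0.03, 0.49, 0.71])
picked = least_confident(probs, 2)   # indices 3 and 1: closest to 0.5
```

In the actual run, `k` was set so that 100 high-risk claims were routed to annotation.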

Implement LLM → Human HITL (Gold Labels)

To improve label quality on uncertain examples, I implemented a Human-in-the-Loop (HITL) workflow where a large language model first evaluates patent claims and suggests a preliminary label. A human annotator then reviews the claim text together with the LLM’s suggestion and assigns the final gold label. This process ensures that high-risk samples are corrected using human judgment while benefiting from LLM guidance.
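The review step itself is just record-keeping: the human sees the claim plus the LLM's suggested label and assigns the final one, and an override is logged whenever the two disagree. A sketch (the helper and field names are hypothetical; the card does not describe the annotation tooling):

```python
def review(claim_id, llm_label, human_label):
    """Record one HITL decision: the human label is always the gold label."""
    return {
        "id": claim_id,
        "llm": llm_label,
        "gold": human_label,
        "override": human_label != llm_label,
    }

# The two confirmed cases reported below, where the human agreed with the LLM.
records = [review("claim-1", 1, 1), review("claim-2", 1, 1)]
overrides = sum(r["override"] for r in records)   # 0 overrides
```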

HITL review

In the HITL review, the human annotator agreed with the LLM on all 100 cases. Because no overrides occurred, I instead report two cases where the human explicitly reviewed and confirmed the LLM’s suggestion.

HITL Example: Green Patent Claim Review

Patent Claim (Full Text)

Claim 1:
A method of cleaning a side edge of a thin film photovoltaic substrate, wherein the substrate defines a face surface terminating at a first side edge, and wherein a thin film is present on the face surface and the first side edge of the substrate, the method comprising:

transporting the substrate in a machine direction to move the substrate past a first laser source; and
focusing a first laser beam generated by the first laser source onto the first side edge of the substrate such that the first laser beam removes the thin film present on the first side edge of the substrate, while the thin film layer on the face surface of the substrate is substantially unaffected by the first laser beam focused onto the first side edge of the substrate.

LLM Evaluation

  • Suggested label: 1 (Green technology)
  • Confidence: high
  • Rationale: “The claim directly concerns technology that mitigates environmental impact, specifically renewable energy (photovoltaic substrate).”

Human Review (Final Gold Label)

  • Final label: 1 (Green technology)
  • Human notes: “Material related to solar technology.”

HITL Outcome

  • Human override: No

Patent Claim (Full Text)

Claim 2:
A system for displaying braking information comprising:

a friction braking sensor configured to generate friction brake data;

a regenerative braking sensor configured to generate regenerative brake data;

a processor configured to receive the friction brake data and the regenerative brake data and determine a value or a percentage of application of friction braking based on the friction brake data; and

a display communicatively coupled to the processor, the display configured to display an image showing a first indicator for the regenerative brake data and a second indicator for the value or the percentage of the application of friction braking.

LLM Evaluation

  • Suggested label: 1 (Green technology)
  • Confidence: high
  • Rationale: “The claim directly concerns regenerative braking, which is a form of renewable energy and emissions reduction.”

Human Review (Final Gold Label)

  • Final label: 1 (Green technology)
  • Human notes: “Regenerative braking converts energy used during braking into electricity.”

HITL Outcome

  • Human override: No

Model training

The model was fine-tuned for one epoch using a maximum sequence length of 256 tokens and a learning rate of 2e-5, following the recommended settings to keep computation reasonable. Tokenization was performed using the PatentSBERTa tokenizer prior to training.

Model performance was evaluated on the held-out eval_silver split to assess generalization, and separately on the gold_100 set to analyze performance on human-labeled examples.

Evaluation Results

The final model was evaluated on both the held-out silver-labeled evaluation set and the human-labeled gold set.

Evaluation on eval_silver (Silver Labels)

| Metric    | Score |
|-----------|-------|
| Accuracy  | 0.807 |
| Precision | 0.815 |
| Recall    | 0.791 |
| F1-score  | 0.803 |

Evaluation on gold_100 (Human Labels)

| Metric    | Score |
|-----------|-------|
| Accuracy  | 0.610 |
| Precision | 0.093 |
| Recall    | 1.000 |
| F1-score  | 0.170 |
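The gold_100 numbers show a characteristic shape: recall of 1.000 with precision of 0.093 means every human-labeled green claim was recovered, but on a set where green claims are rare the model also flags many non-green claims. A small sklearn sketch with toy numbers (not the actual predictions) reproduces that trade-off:

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Toy confusion: one true green claim among ten, caught along with 3 false positives.
y_true = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

rec  = recall_score(y_true, y_pred)     # 1 TP / (1 TP + 0 FN) = 1.0
prec = precision_score(y_true, y_pred)  # 1 TP / (1 TP + 3 FP) = 0.25
acc  = accuracy_score(y_true, y_pred)   # 7 correct / 10 = 0.7
f1   = f1_score(y_true, y_pred)         # harmonic mean of 0.25 and 1.0 = 0.4
```

With a positive rate this low, precision dominates the F1-score, which is why the gold-set F1 (0.170) sits far below the silver-set F1 (0.803) despite perfect recall.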