Signe22 committed
Commit 21859f9 · verified · 1 Parent(s): f201ae2

Update README.md

Files changed (1): README.md +28 -1
README.md CHANGED
@@ -1,3 +1,7 @@
+ ---
+ language:
+ - en
+ ---
  # Links
  Model: https://huggingface.co/Signe22/patentsberta-green-hitl
 
@@ -47,6 +51,7 @@ In the HITL review, the human annotator agreed with the LLM on all 100 cases. Be
  - **Human override:** No
 
  ---
+ ### Patent Claim (Full Text)
  > **Claim 2:**
  > A system for displaying braking information comprising:
  >
@@ -71,4 +76,26 @@ In the HITL review, the human annotator agreed with the LLM on all 100 cases. Be
  # Model training
  The model was fine-tuned for one epoch using a maximum sequence length of 256 tokens and a learning rate of 2e-5, following the recommended settings to keep computation reasonable. Tokenization was performed using the PatentSBERTa tokenizer prior to training.
 
- Model performance was evaluated on the held-out eval_silver split to assess generalization, and separately on the gold_100 set to analyze performance on human-labeled examples.
+ Model performance was evaluated on the held-out eval_silver split to assess generalization, and separately on the gold_100 set to analyze performance on human-labeled examples.
+
+ ## Evaluation Results
+
+ The final model was evaluated on both the held-out silver-labeled evaluation set and the human-labeled gold set.
+
+ ### Evaluation on `eval_silver` (Silver Labels)
+
+ | Metric    | Score     |
+ |-----------|-----------|
+ | Accuracy  | **0.807** |
+ | Precision | **0.815** |
+ | Recall    | **0.791** |
+ | F1-score  | **0.803** |
+
+ ### Evaluation on `gold_100` (Human Labels)
+
+ | Metric    | Score     |
+ |-----------|-----------|
+ | Accuracy  | **0.610** |
+ | Precision | **0.093** |
+ | Recall    | **1.000** |
+ | F1-score  | **0.170** |
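The training settings described in the diff (one epoch, max sequence length 256, learning rate 2e-5, PatentSBERTa tokenizer) could be sketched as below. The base checkpoint id `AI-Growth-Lab/PatentSBERTa`, the dataset variables, and the `text` field name are assumptions not stated in this README; this is a minimal sketch of the setup, not the actual training script.

```python
# Hyperparameters as reported in the README.
MAX_LENGTH = 256        # maximum sequence length at tokenization time
LEARNING_RATE = 2e-5    # recommended setting, kept to limit compute
NUM_EPOCHS = 1          # single fine-tuning epoch

def tokenize_claims(batch, tokenizer):
    """Tokenize patent-claim text, truncating/padding to MAX_LENGTH tokens."""
    return tokenizer(batch["text"], truncation=True, padding="max_length",
                     max_length=MAX_LENGTH)

if __name__ == "__main__":
    # Heavy imports kept under the guard so the config above stays importable.
    from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                              Trainer, TrainingArguments)

    base = "AI-Growth-Lab/PatentSBERTa"   # assumed base checkpoint id
    tokenizer = AutoTokenizer.from_pretrained(base)
    model = AutoModelForSequenceClassification.from_pretrained(base, num_labels=2)

    args = TrainingArguments(output_dir="patentsberta-green-hitl",
                             num_train_epochs=NUM_EPOCHS,
                             learning_rate=LEARNING_RATE)
    # train_ds / eval_ds would be the tokenized silver-labeled splits
    # (hypothetical variables, not defined in the README):
    # Trainer(model=model, args=args, train_dataset=train_ds,
    #         eval_dataset=eval_ds).train()
```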
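The `gold_100` numbers in the new evaluation section can be cross-checked: assuming the set holds exactly 100 examples (as its name suggests; the README does not state the split size), a recall of 1.000 means there are no false negatives, which pins down the full confusion matrix from accuracy and precision alone. A small sketch of that arithmetic:

```python
def confusion_from_metrics(n: int, accuracy: float, precision: float, recall: float):
    """Recover (tp, fp, fn, tn) on n examples for the recall == 1.0 case."""
    assert recall == 1.0, "this sketch only covers the no-false-negative case"
    fn = 0
    # With fn == 0, every error is a false positive: fp = n * (1 - accuracy).
    fp = round(n * (1 - accuracy))
    # precision = tp / (tp + fp)  =>  tp = precision * fp / (1 - precision)
    tp = round(precision * fp / (1 - precision))
    tn = n - tp - fp - fn
    return tp, fp, fn, tn

tp, fp, fn, tn = confusion_from_metrics(100, accuracy=0.610, precision=0.093, recall=1.000)
print(tp, fp, fn, tn)                 # 4 39 0 57
f1 = 2 * tp / (2 * tp + fp + fn)
print(round(f1, 3))                   # 0.17
```

So the model appears to catch all 4 positive claims in `gold_100` but flags 39 negatives along with them, which is exactly why precision (0.093) and F1 (0.170) collapse even though recall is perfect.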