Amit5674 committed on
Commit
b2f876b
verified
1 Parent(s): 0f524ea

Update README.md

Files changed (1)
  1. README.md +213 -0
README.md CHANGED
@@ -32,6 +32,23 @@ Fine-tuned [dicta-il/neodictabert](https://huggingface.co/dicta-il/neodictabert)
  - **Accuracy:** 96.78%
  - **F1 Score:** 96.20%
 
+ ## Architecture
+
+ - **Base Model:** `dicta-il/neodictabert`
+ - **Classification Head:** Binary (softmax over 2 classes)
+ - **Input Format:** `[CLS] source_article [SEP] summary_claim [SEP]`
+ - **Output:** Probability distribution over [contradiction, entailment]
+
+ ## Training Configuration
+
+ - **Learning Rate:** 2e-5
+ - **Epochs:** 2
+ - **Batch Size:** 2 per device (effective: 16 with gradient accumulation)
+ - **Max Sequence Length:** 4,096 tokens
+ - **Learning Rate Scheduler:** Linear
+ - **Warmup Steps:** 500
+ - **Best Model Selection:** Based on eval_f1
+
  ## Usage
 
  from transformers import AutoTokenizer, AutoModelForSequenceClassification
@@ -40,3 +57,199 @@ import torch
  model_name = "Amit5674/NLI-hebrew-binary-correctness-metric"
  tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
  model = AutoModelForSequenceClassification.from_pretrained(model_name, trust_remote_code=True)
+ model.eval()
+
+ # Example usage
+ # Article (English gloss): "Israel began shelling moments after the ceasefire. The government announced new measures..."
+ article = "ישראל התחילה בהרעשה רגע אחרי הפסקת האש. הממשלה הודיעה על צעדים חדשים..."
+ # Summary (English gloss): "Israel began to get excited moments after the ceasefire"
+ summary = "ישראל התחילה להתרגש רגע אחרי הפסקת האש"
+
+ # Tokenize the (article, summary) pair
+ inputs = tokenizer(
+     article,
+     summary,
+     return_tensors="pt",
+     padding="max_length",
+     max_length=4096,
+     truncation=True
+ )
+
+ # Predict
+ with torch.no_grad():
+     outputs = model(**inputs)
+     logits = outputs.logits[0]
+     probs = torch.softmax(logits, dim=-1)
+     predicted_class_idx = torch.argmax(probs).item()
+     predicted_class = model.config.id2label[predicted_class_idx]
+     confidence = probs[predicted_class_idx].item()
+
+ probabilities = {
+     model.config.id2label[i]: float(probs[i].item())
+     for i in range(model.config.num_labels)
+ }
+
+ print(f"Prediction: {predicted_class}")
+ print(f"Confidence: {confidence:.4f}")
+ print(f"Probabilities: {probabilities}")
+
+ For detailed inference examples, see the inference scripts and server API documentation.
+
+ ## Input Format
+
+ - **Premise:** Source article text (full document)
+ - **Hypothesis:** Summary claim (can be full summary or individual claim)
+ - **Processing:** Binary classification (entailment vs contradiction)
+
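Because the hypothesis may be either a full summary or a single claim, one possible workflow is to split a summary into sentence-level claims and classify each against the full article. The sketch below is illustrative only: `split_claims`, `check_summary`, and the `classify` callable are hypothetical names, the sentence splitter is deliberately naive, and `toy_classify` is a stand-in that does not call the model.

```python
import re
from typing import Callable, Dict, List

# Hypothetical signature: classify(premise, hypothesis) -> label string.
Classifier = Callable[[str, str], str]


def split_claims(summary: str) -> List[str]:
    """Naively split a summary into claims after '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", summary.strip())
    return [p for p in parts if p]


def check_summary(article: str, summary: str, classify: Classifier) -> Dict[str, object]:
    """Classify each claim against the article; flag the summary if any claim contradicts."""
    claims = split_claims(summary)
    labels = [classify(article, claim) for claim in claims]
    return {
        "claims": list(zip(claims, labels)),
        "summary_label": "contradiction" if "contradiction" in labels else "entailment",
    }


def toy_classify(premise: str, hypothesis: str) -> str:
    """Stand-in classifier for demonstration; replace with a call to the real model."""
    return "entailment" if hypothesis.rstrip(".!?") in premise else "contradiction"


result = check_summary(
    "The cat sat on the mat. It was raining.",
    "The cat sat on the mat. It was sunny.",
    toy_classify,
)
print(result["summary_label"])
```

In real use, `classify` would wrap the tokenizer and model calls from the Usage section and return the predicted label.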
+ ## Output Format
+
+ - **Prediction:** String label (`"entailment"` or `"contradiction"`)
+ - **Confidence:** Probability of predicted class (0.0 to 1.0)
+ - **Probabilities:** Dictionary with probabilities for both classes:
+   - `{"entailment": 0.9678, "contradiction": 0.0322}`
+
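As a rough illustration of how the three output fields relate to the two raw logits, here is a dependency-free sketch. The label order in `ID2LABEL` is an assumption taken from the "[contradiction, entailment]" order in the Architecture section; in real use, read the mapping from `model.config.id2label`. The function name `format_output` and the example logits are made up for illustration.

```python
import math
from typing import Dict, List

# Assumed label order; the authoritative mapping is model.config.id2label.
ID2LABEL = {0: "contradiction", 1: "entailment"}


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a short list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def format_output(logits: List[float]) -> Dict[str, object]:
    """Build the prediction / confidence / probabilities fields from raw logits."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {
        "prediction": ID2LABEL[best],
        "confidence": probs[best],
        "probabilities": {ID2LABEL[i]: p for i, p in enumerate(probs)},
    }


out = format_output([-1.7, 1.7])  # illustrative logits, not real model output
print(out["prediction"], round(out["confidence"], 4))
```

The confidence is always the larger of the two probabilities, so for a binary head it lies between 0.5 and 1.0.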
+ ## Use Cases
+
+ - **Production Fact-Checking:** Fast yes/no contradiction detection for Hebrew summaries
+ - **Quality Control:** Automated validation of summary factuality
+ - **Batch Processing:** Efficient processing of large sets of document-summary pairs
+ - **Real-Time Validation:** Low-latency factuality checking in summary generation pipelines
+
+ ## Limitations
+
+ - Max sequence length: 4,096 tokens (very long articles may be truncated)
+ - Binary classification: Cannot identify specific error types (use multi-label models for detailed error analysis)
+ - Context dependency: Performance may vary with article length and complexity
+ - Hebrew-specific: Optimized for Hebrew text; may not generalize to other languages
+
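To make the sequence-length limitation concrete, the sketch below shows one way to budget tokens so that only the article is trimmed and the summary claim is kept whole. It is an assumption-laden illustration: `fit_pair` is a hypothetical helper, tokens are plain strings rather than real token IDs, and the count of three special tokens is inferred from the documented `[CLS] source_article [SEP] summary_claim [SEP]` layout.

```python
from typing import List, Tuple

MAX_LEN = 4096    # sequence limit stated in the limitations above
NUM_SPECIAL = 3   # assumption: [CLS], [SEP], [SEP] in the documented pair layout


def fit_pair(
    article_tokens: List[str],
    summary_tokens: List[str],
    max_len: int = MAX_LEN,
) -> Tuple[List[str], List[str], bool]:
    """Trim only the article so the pair plus special tokens fits in max_len."""
    budget = max_len - NUM_SPECIAL - len(summary_tokens)
    if budget < 0:
        raise ValueError("summary alone exceeds the sequence limit")
    truncated = len(article_tokens) > budget
    return article_tokens[:budget], summary_tokens, truncated


# Placeholder tokens stand in for a real tokenizer's output.
article, summary, was_truncated = fit_pair(["tok"] * 5000, ["tok"] * 93)
print(len(article), len(summary), was_truncated)
```

With a Hugging Face tokenizer, passing `truncation="only_first"` applies the same "trim the first sequence only" policy to real token IDs; checking `was_truncated` (or the tokenized length) before inference makes silent truncation visible.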
+ ## Citation
+
+ @misc{hebrew_binary_nli_classifier,
+ title={Hebrew Binary NLI Classifier for Factuality Checking},
+ author={Your Name},
+ year={2025},
+ publisher={Hugging Face}
+ }
+
+ ---
+ license: apache-2.0
+ language:
+ - he
+ base_model:
+ - dicta-il/neodictabert
+ tags:
+ - nli
+ - natural-language-inference
+ - hebrew
+ - fact-checking
+ - contradiction-detection
+ pipeline_tag: text-classification
+ library_name: transformers
+ metrics:
+ - accuracy
+ - f1
+ ---
+
+ # Hebrew Binary NLI Classifier for Factuality Checking
+
+ ## Model Description
+
+ Fine-tuned [dicta-il/neodictabert](https://huggingface.co/dicta-il/neodictabert) for binary Natural Language Inference in Hebrew. Detects whether a summary claim contradicts a source article.
+
+ **Task:** Entailment vs Contradiction Detection
+ **Language:** Hebrew
+ **Max Context:** 4,096 tokens
+
+ ## Performance
+
+ - **Accuracy:** 96.78%
+ - **F1 Score:** 96.20%
+
+ ## Architecture
+
+ - **Base Model:** `dicta-il/neodictabert`
+ - **Classification Head:** Binary (softmax over 2 classes)
+ - **Input Format:** `[CLS] source_article [SEP] summary_claim [SEP]`
+ - **Output:** Probability distribution over [contradiction, entailment]
+
+ ## Training Configuration
+
+ - **Learning Rate:** 2e-5
+ - **Epochs:** 2
+ - **Batch Size:** 2 per device (effective: 16 with gradient accumulation)
+ - **Max Sequence Length:** 4,096 tokens
+ - **Learning Rate Scheduler:** Linear
+ - **Warmup Steps:** 500
+ - **Best Model Selection:** Based on eval_f1
+
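The training settings above can be read as a learning-rate curve plus a gradient-accumulation factor. The sketch below assumes that "Linear" means linear warmup followed by linear decay to zero, as the Hugging Face Trainer's linear schedule does. `TOTAL_STEPS` and `NUM_DEVICES` are placeholders, since neither the number of optimizer steps nor the device count is stated here.

```python
PEAK_LR = 2e-5         # from the configuration above
WARMUP_STEPS = 500     # from the configuration above
TOTAL_STEPS = 10_000   # assumption: the real number of optimizer steps is not stated

PER_DEVICE_BATCH = 2   # from the configuration above
EFFECTIVE_BATCH = 16   # from the configuration above
NUM_DEVICES = 1        # assumption: the device count is not stated


def linear_lr(step: int) -> float:
    """Linear warmup to PEAK_LR, then linear decay to zero at TOTAL_STEPS."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


# Accumulation steps implied by the per-device and effective batch sizes.
grad_accum_steps = EFFECTIVE_BATCH // (PER_DEVICE_BATCH * NUM_DEVICES)

for step in (0, 250, 500, 5_250, 10_000):
    print(step, f"{linear_lr(step):.2e}")
print("gradient accumulation steps:", grad_accum_steps)
```

Under the single-device assumption, the effective batch of 16 corresponds to 8 accumulation steps; with more devices the factor shrinks proportionally.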
+ ## Usage
+
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
+ import torch
+
+ model_name = "Amit5674/NLI-hebrew-binary-correctness-metric"
+ tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
+ model = AutoModelForSequenceClassification.from_pretrained(model_name, trust_remote_code=True)
+ model.eval()
+
+ # Example usage
+ # Article (English gloss): "Israel began shelling moments after the ceasefire. The government announced new measures..."
+ article = "ישראל התחילה בהרעשה רגע אחרי הפסקת האש. הממשלה הודיעה על צעדים חדשים..."
+ # Summary (English gloss): "Israel began to get excited moments after the ceasefire"
+ summary = "ישראל התחילה להתרגש רגע אחרי הפסקת האש"
+
+ # Tokenize the (article, summary) pair
+ inputs = tokenizer(
+     article,
+     summary,
+     return_tensors="pt",
+     padding="max_length",
+     max_length=4096,
+     truncation=True
+ )
+
+ # Predict
+ with torch.no_grad():
+     outputs = model(**inputs)
+     logits = outputs.logits[0]
+     probs = torch.softmax(logits, dim=-1)
+     predicted_class_idx = torch.argmax(probs).item()
+     predicted_class = model.config.id2label[predicted_class_idx]
+     confidence = probs[predicted_class_idx].item()
+
+ probabilities = {
+     model.config.id2label[i]: float(probs[i].item())
+     for i in range(model.config.num_labels)
+ }
+
+ print(f"Prediction: {predicted_class}")
+ print(f"Confidence: {confidence:.4f}")
+ print(f"Probabilities: {probabilities}")
+
+ For detailed inference examples, see the inference scripts and server API documentation.
+
+ ## Input Format
+
+ - **Premise:** Source article text (full document)
+ - **Hypothesis:** Summary claim (can be full summary or individual claim)
+ - **Processing:** Binary classification (entailment vs contradiction)
+
+ ## Output Format
+
+ - **Prediction:** String label (`"entailment"` or `"contradiction"`)
+ - **Confidence:** Probability of predicted class (0.0 to 1.0)
+ - **Probabilities:** Dictionary with probabilities for both classes:
+   - `{"entailment": 0.9678, "contradiction": 0.0322}`
+
+ ## Use Cases
+
+ - **Production Fact-Checking:** Fast yes/no contradiction detection for Hebrew summaries
+ - **Quality Control:** Automated validation of summary factuality
+ - **Batch Processing:** Efficient processing of large sets of document-summary pairs
+ - **Real-Time Validation:** Low-latency factuality checking in summary generation pipelines
+
+ ## Limitations
+
+ - Max sequence length: 4,096 tokens (very long articles may be truncated)
+ - Binary classification: Cannot identify specific error types (use multi-label models for detailed error analysis)
+ - Context dependency: Performance may vary with article length and complexity
+ - Hebrew-specific: Optimized for Hebrew text; may not generalize to other languages
+
+ ## Citation
+
+ @misc{hebrew_binary_nli_classifier,
+ title={Hebrew Binary NLI Classifier for Factuality Checking},
+ author={Your Name},
+ year={2025},
+ publisher={Hugging Face}
+ }