Update README.md
README.md
CHANGED
|
@@ -1,3 +1,68 @@
|
|
| 1 |
-
---
|
| 2 |
-
license: apache-2.0
|
| 3 |
-
|
| 1 |
+
---
|
| 2 |
+
license: apache-2.0
|
| 3 |
+
base_model:
|
| 4 |
+
- dicta-il/neodictabert
|
| 5 |
+
---
|
| 6 |
+
------
|
| 7 |
+
license: apache-2.0
|
| 8 |
+
language:
|
| 9 |
+
- he
|
| 10 |
+
base_model:
|
| 11 |
+
- dicta-il/neodictabert
|
| 12 |
+
tags:
|
| 13 |
+
- multi-label-classification
|
| 14 |
+
- hebrew
|
| 15 |
+
- fact-checking
|
| 16 |
+
- error-detection
|
| 17 |
+
pipeline_tag: text-classification
|
| 18 |
+
library_name: transformers
|
| 19 |
+
---
|
| 20 |
+
|
| 21 |
+
# Hebrew Multi-Label Error Type Classifier (Setup 2: Claim-Level)
|
| 22 |
+
|
| 23 |
+
## Model Description
|
| 24 |
+
|
| 25 |
+
Fine-tuned [dicta-il/neodictabert](https://huggingface.co/dicta-il/neodictabert) for multi-label classification of factual error types in Hebrew text at the claim/sentence level.
|
| 26 |
+
|
| 27 |
+
**Task:** Identify specific error types in corrupted Hebrew claims
|
| 28 |
+
**Language:** Hebrew
|
| 29 |
+
**Max Context:** 3,072 tokens
|
| 30 |
+
**Granularity:** Sentence-level (individual claims)
|
| 31 |
+
|
| 32 |
+
## Input/Output
|
| 33 |
+
|
| 34 |
+
### Input
|
| 35 |
+
- **Premise:** Original correct claim (Hebrew sentence)
|
| 36 |
+
- **Hypothesis:** Corrupted/modified claim (Hebrew sentence)
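
A minimal sketch of how the two inputs might be fed to the model, assuming the premise and hypothesis are encoded together as a standard sentence pair (the card does not spell out the exact input format). The helper name `encode_claim_pair` is hypothetical, and `tokenizer` is assumed to be a Hugging Face tokenizer object loaded for this model.

```python
from typing import Any, Callable

MAX_TOKENS = 3072  # "Max Context" value stated in the model card


def encode_claim_pair(tokenizer: Callable[..., Any], premise: str, hypothesis: str) -> Any:
    """Encode (premise, hypothesis) as one sentence pair, truncated to MAX_TOKENS.

    Assumption: the classifier expects the original claim and the corrupted
    claim as a regular sentence pair; the model card does not confirm this.
    """
    return tokenizer(
        premise,            # original, correct Hebrew claim
        hypothesis,         # corrupted/modified Hebrew claim
        truncation=True,    # cut inputs that exceed the context window
        max_length=MAX_TOKENS,
        return_tensors="pt",
    )
```

With `transformers` installed, `tokenizer` would typically come from `AutoTokenizer.from_pretrained(...)` pointed at this model's repository.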
|
| 37 |
+
|
| 38 |
+
### Output
|
| 39 |
+
Multi-label classification with probability scores for each error type. Threshold: 0.5 (probabilities ≥ 0.5 indicate presence of that error type).
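
The thresholding rule can be sketched in plain Python. This assumes the model emits one raw logit per error type and that each logit is passed through an independent sigmoid, which is the usual multi-label setup but is not confirmed by the card. The function name `decode_predictions` and the example logits are illustrative only.

```python
import math
from typing import Dict, Sequence

THRESHOLD = 0.5  # decision threshold stated in the model card


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode_predictions(
    logits: Sequence[float],
    labels: Sequence[str],
    threshold: float = THRESHOLD,
) -> Dict[str, float]:
    """Return {label: probability} for every label whose probability >= threshold."""
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    predicted = {}
    for label, logit in zip(labels, logits):
        prob = sigmoid(logit)
        if prob >= threshold:  # the card's rule: >= 0.5 means the error type is present
            predicted[label] = prob
    return predicted


# Made-up logits for three of the error types, for illustration only.
example = decode_predictions(
    [2.0, -1.0, 0.0],
    ["number_swap", "entity_date_swap", "sentence_negation"],
)
```

Because the labels are scored independently, any number of error types (including none) can be returned for a single claim pair.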
|
| 40 |
+
|
| 41 |
+
## Supported Error Types
|
| 42 |
+
|
| 43 |
+
The model detects both **cross-language** and **Hebrew-specific** error types:
|
| 44 |
+
|
| 45 |
+
### Cross-Language Errors
|
| 46 |
+
- `entity_person_swap` - Swapping person names
|
| 47 |
+
- `entity_location_swap` - Swapping place names
|
| 48 |
+
- `entity_organization_swap` - Swapping organization names
|
| 49 |
+
- `entity_date_swap` - Changing dates
|
| 50 |
+
- `number_swap` - Altering numerical values
|
| 51 |
+
- `measure_unit_swap` - Changing units of measurement
|
| 52 |
+
- `sentence_negation` - Adding/removing negation
|
| 53 |
+
|
| 54 |
+
|
| 55 |
+
### Hebrew-Specific Errors
|
| 56 |
+
- `hebrew_root_pattern_confusion` - Morphological root/binyan errors
|
| 57 |
+
- `morphological_connective_confusion` - Logical connector errors (ืฉืื → ืื ืฉ)
|
| 58 |
+
- `verb_gender_swap` - Verb gender disagreement
|
| 59 |
+
- `noun_gender_swap` - Noun/pronoun gender errors
|
| 60 |
+
- `homographic_gender_errors` - Gender mismatches in homographic words (מורה, מרצה)
|
| 61 |
+
- `specificity_shift_errors` - Definite article changes (ה-)
|
| 62 |
+
- `construct_state_confusion` - Smichut (סמיכות) errors
|
| 63 |
+
- `subject_verb_person_mismatch` - Person agreement errors
|
| 64 |
+
- `verb_tense_swap` - Tense inconsistencies
|
| 65 |
+
- `evidentiality_source_attribution_collapse` - Attribution removal
|
| 66 |
+
- `impersonal_to_personal_verb_errors` - Voice changes (passive → active)
|
| 67 |
+
- `verb_template_agency_reversal` - Agency reversals (הלביש → התלבש)
|
| 68 |
+
- `directional_preposition_swap` - Directional errors (מ-X ל-Y → מ-Y ל-X)
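
As a rough illustration of how the two families above could be used downstream, the sketch below groups a set of predicted label names into cross-language and Hebrew-specific errors. The label names are copied from the lists above. The helper `group_errors` and the group keys are hypothetical, and the authoritative label set and order would come from the model's own configuration rather than from this snippet.

```python
from typing import Dict, Iterable, List

# Label names as listed in this model card.
CROSS_LANGUAGE = (
    "entity_person_swap",
    "entity_location_swap",
    "entity_organization_swap",
    "entity_date_swap",
    "number_swap",
    "measure_unit_swap",
    "sentence_negation",
)

HEBREW_SPECIFIC = (
    "hebrew_root_pattern_confusion",
    "morphological_connective_confusion",
    "verb_gender_swap",
    "noun_gender_swap",
    "homographic_gender_errors",
    "specificity_shift_errors",
    "construct_state_confusion",
    "subject_verb_person_mismatch",
    "verb_tense_swap",
    "evidentiality_source_attribution_collapse",
    "impersonal_to_personal_verb_errors",
    "verb_template_agency_reversal",
    "directional_preposition_swap",
)


def group_errors(predicted: Iterable[str]) -> Dict[str, List[str]]:
    """Split predicted label names into the two families; keep unknown names separate."""
    groups: Dict[str, List[str]] = {
        "cross_language": [],
        "hebrew_specific": [],
        "unknown": [],
    }
    for label in predicted:
        if label in CROSS_LANGUAGE:
            groups["cross_language"].append(label)
        elif label in HEBREW_SPECIFIC:
            groups["hebrew_specific"].append(label)
        else:
            groups["unknown"].append(label)  # label not listed in this card
    return groups


grouped = group_errors(["number_swap", "verb_tense_swap", "not_a_label"])
```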
|