Commit 80cb543 by SadeghK · verified · 1 parent: 696fc13

Update README.md

Files changed (1)
  1. README.md +29 -3
README.md CHANGED
@@ -1,3 +1,29 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ language:
+ - fa
+ metrics:
+ - accuracy
+ base_model:
+ - HooshvareLab/bert-fa-base-uncased
+ ---
+
+ # Hamnevise
+ Persian diacritization using a masked language model (MLM), `HooshvareLab/bert-fa-base-uncased`, for homographs: words that share the same undiacritized written form but take different diacritics depending on the sentence they appear in.
+
+ **Architecture:**
+ ```
+ Input → ParsBERT (context) + Char CNN (morphology)
+ → Shared Fusion
+ → Word-Specific Classifiers
+ → Only valid outputs per word
+ ```
+
+ ## 📊 Dataset Format
+ CSV file with three columns:
+
+ ```csv
+ sentence,word,word_with_diacritics
+ اشکال در سیستم گرمایش، باعث سرد شدن ساختمان شد.,اشکال,اِشکال
+ اشکال در فرهنگ‌های باستانی، نمادها و معانی خاصی داشته‌اند.,اشکال,اَشکال
+ ```
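
The dataset format and the "only valid outputs per word" step added in this README can be illustrated with a minimal sketch using only the Python standard library. It is not the repository's code: the helper names `load_rows`, `build_label_sets` and `pick_valid` are hypothetical, and the score dictionary stands in for whatever the word-specific classifiers actually produce. The sample rows are the two copied from the README.

```python
import csv
import io
from collections import defaultdict

# The two example rows from the README. A real file would be opened with
# open(path, newline="", encoding="utf-8") instead of io.StringIO.
SAMPLE_CSV = """sentence,word,word_with_diacritics
اشکال در سیستم گرمایش، باعث سرد شدن ساختمان شد.,اشکال,اِشکال
اشکال در فرهنگ‌های باستانی، نمادها و معانی خاصی داشته‌اند.,اشکال,اَشکال
"""


def load_rows(handle):
    """Read the three-column CSV into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


def build_label_sets(rows):
    """Map each undiacritized word to the diacritized forms seen for it."""
    labels = defaultdict(list)
    for row in rows:
        form = row["word_with_diacritics"]
        if form not in labels[row["word"]]:
            labels[row["word"]].append(form)
    return dict(labels)


def pick_valid(word, scores, label_sets):
    """Return the best-scoring form among those valid for this word.

    `scores` is a hypothetical mapping from candidate form to model score;
    forms that are not valid for `word` are ignored entirely.
    """
    candidates = label_sets[word]
    return max(candidates, key=lambda form: scores.get(form, float("-inf")))


rows = load_rows(io.StringIO(SAMPLE_CSV))
label_sets = build_label_sets(rows)
```

Here `label_sets` holds one entry for the homograph in the sample, with its two diacritized forms in order of first appearance. Restricting the choice to those forms is one simple way to guarantee that a prediction is always a form attested for that word. The sentences parse cleanly with the default CSV dialect because they use the Persian comma (،) rather than the ASCII comma.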