scoup123 committed on
Commit
3c7ff9e
·
verified ·
1 Parent(s): 3b0785d

Create README.md

  1. README.md +66 -0
README.md ADDED
@@ -0,0 +1,66 @@
1
+ ---
2
+ datasets:
3
+ - scoup123/AffixIdentifier
4
+ language:
5
+ - tr
6
+ metrics:
7
+ - accuracy
8
+ pipeline_tag: text-classification
9
+ ---
10
+ Model Description
11
+ Given two Turkish words, the model predicts whether they share an affix. It is fine-tuned from dbmdz/bert-base-turkish-cased on a task similar to NLI, but at the word level and with two labels. It was created as a final project for one of my classes.
12
+
13
+ Developed by: Scoup123
14
+ Model type: BERT
15
+ Language(s) (NLP): Turkish
16
+ Finetuned from model [optional]: dbmdz/bert-base-turkish-cased
17
+ Model Sources [optional]
18
+ Repository: [More Information Needed]
19
+ Paper [optional]: in progress
20
+ Uses
21
+ It can be used in morphological analysis tasks.
22
+
23
+ Direct Use
24
+ It can probably be used on Turkish word pairs without additional fine-tuning.
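A minimal inference sketch follows. It assumes the model is published on the Hub under the repository ID `scoup123/AffixIdentifier` (hypothetical: the card lists only a dataset under that name and leaves the repository unspecified), that the two words are encoded as a BERT sentence pair, and that the label order can be read from `model.config.id2label`. None of this is confirmed by the card.

```python
from typing import Tuple

# Hypothetical repository ID: the card does not state where the model is hosted.
MODEL_ID = "scoup123/AffixIdentifier"


def predict_shared_affix(word_a: str, word_b: str, model_id: str = MODEL_ID) -> Tuple[int, float]:
    """Return (predicted label index, probability) for a Turkish word pair."""
    # Imported lazily so this module loads even without the heavy dependencies.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # Assumption: the pair is encoded NLI-style as [CLS] word_a [SEP] word_b [SEP].
    inputs = tokenizer(word_a, word_b, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=-1)[0]
    label = int(torch.argmax(probs))
    return label, float(probs[label])
```

Calling `predict_shared_affix("evler", "kitaplar")` would score a pair in which both words carry the plural suffix. Which label index means "shares an affix" is not documented, so inspect `id2label` on the loaded model before interpreting the result.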
25
+
26
+ Training Details
27
+ Training Data
28
+ scoup123/affixfinder
29
+
30
+ The dataset was derived from a generated dataset described in the paper "Turkish language resources: Morphological parser, morphological disambiguator and web corpus" (Sak et al., 2008).
31
+
32
+ Evaluation
33
+ Test Accuracy: 0.9874 Precision: 0.9874 Recall: 0.9874 F1 Score: 0.9874
34
+
35
+ **It should be used with caution, as these scores are suspiciously high.**
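The four reported scores are identical. That is the pattern micro-averaging produces for single-label classification, where micro precision, recall, and F1 all equal accuracy. The card does not say which averaging was used, so the sketch below only illustrates how such numbers can arise. It uses made-up toy labels, not the model's predictions.

```python
from typing import Dict, Sequence


def micro_scores(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Micro-averaged accuracy, precision, recall and F1 for single-label data."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    # Pool true positives, false positives and false negatives over all labels.
    tp = fp = fn = 0
    for label in set(y_true) | set(y_pred):
        for t, p in zip(y_true, y_pred):
            if p == label and t == label:
                tp += 1
            elif p == label:
                fp += 1
            elif t == label:
                fn += 1

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels for illustration only (not the model's outputs).
toy_true = [1, 0, 1, 1, 0, 0, 1, 0]
toy_pred = [1, 0, 1, 0, 0, 1, 1, 0]
scores = micro_scores(toy_true, toy_pred)
```

With these toy labels all four values in `scores` coincide. Per-class or macro-averaged scores would reveal more about how the model handles each label.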
36
+
37
+ Testing Data, Factors & Metrics
38
+ Testing Data
39
+ A test split was created from the training data.
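The card does not describe how the split was made. As a sketch only, the code below shows one way to hold out a test split deterministically and to check how many words appear in both splits, since heavy overlap is one possible cause of inflated scores. The tuple layout, function names, and toy examples are assumptions for illustration.

```python
import random
from typing import List, Set, Tuple

Example = Tuple[str, str, int]  # (word_a, word_b, label); layout is assumed


def split_examples(
    examples: List[Example], test_fraction: float = 0.2, seed: int = 42
) -> Tuple[List[Example], List[Example]]:
    """Shuffle deterministically and hold out a fraction of examples for testing."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def shared_words(train: List[Example], test: List[Example]) -> Set[str]:
    """Words that occur in both splits; a large overlap can inflate test scores."""
    train_words = {w for a, b, _ in train for w in (a, b)}
    test_words = {w for a, b, _ in test for w in (a, b)}
    return train_words & test_words


# Toy pairs for illustration only; labels here are hand-assigned.
toy_examples: List[Example] = [
    ("evler", "kitaplar", 1),
    ("evde", "okulda", 1),
    ("evler", "evde", 0),
    ("geldi", "gitti", 1),
    ("geldi", "evler", 0),
]
train_split, test_split = split_examples(toy_examples)
overlap = shared_words(train_split, test_split)
```

If `overlap` covers most of the test vocabulary, splitting by word rather than by pair may give a more honest estimate.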
40
+
41
+ Summary
42
+ This model aims to serve as an affix identifier for Turkish.
43
+
44
+ Model Examination [optional]
45
+ I have just created it, so further testing is needed to check whether it actually works. You should also verify that it works for your use case before relying on it.
46
+
47
+ [More Information Needed]
48
+
49
+ Environmental Impact
50
+ Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
51
+
52
+ Hardware Type: Free Colab T4 GPU
53
+ Hours used: ~2.5 hours
54
+ Cloud Provider: Google
55
+ Compute Region: Europe
56
+ Carbon Emitted: [More Information Needed]
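A rough estimate in the spirit of the Lacoste et al. calculator multiplies power draw, runtime, and grid carbon intensity. The sketch below uses the T4's 70 W rated board power and the roughly 2.5 hours reported above to bound the GPU energy. The carbon intensity is left as an argument because the card gives no value for the region, and the function names are hypothetical. CPU, memory, and data-center overhead are ignored.

```python
def energy_kwh(power_watts: float, hours: float) -> float:
    """Energy in kWh for a device drawing power_watts for the given hours."""
    if power_watts < 0 or hours < 0:
        raise ValueError("power_watts and hours must be non-negative")
    return power_watts / 1000.0 * hours


def emissions_g(energy: float, intensity_g_per_kwh: float) -> float:
    """Estimated emissions in grams CO2-eq: energy (kWh) times grid intensity."""
    if energy < 0 or intensity_g_per_kwh < 0:
        raise ValueError("energy and intensity must be non-negative")
    return energy * intensity_g_per_kwh


T4_POWER_WATTS = 70.0  # NVIDIA T4 rated board power; actual draw is usually lower
TRAINING_HOURS = 2.5   # "~2.5 hours" from the card

# Upper bound on GPU energy, assuming full rated power for the whole run.
gpu_energy = energy_kwh(T4_POWER_WATTS, TRAINING_HOURS)
```

Passing `gpu_energy` and the carbon intensity of the actual compute region to `emissions_g` would give a figure for the "Carbon Emitted" field.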
57
+ Citation [optional]
58
+ APA:
59
+
60
+ Sak, H., Güngör, T., & Saraçlar, M. (2008). Turkish language resources: Morphological parser, morphological disambiguator and web corpus. In Advances in natural language processing (pp. 417-427). Springer Berlin Heidelberg.
61
+
62
+ Model Card Authors [optional]
63
+ Kaan Bayar
64
+
65
+ Model Card Contact
66
+ kaan.bayar13@gmail.com