scoup123 committed on
Commit
488c581
·
verified ·
1 Parent(s): 3c7ff9e

Update README.md

Files changed (1)
  1. README.md +68 -33
README.md CHANGED
@@ -7,60 +7,95 @@ metrics:
7
  - accuracy
8
  pipeline_tag: text-classification
9
  ---
10
- Model Description
11
- Given 2 words in Turkish, the model predicts whether they share an affix. It is fine-tuned from dbmdz/bert-base-turkish-cased on a task similar to NLI, but at the word level and with 2 labels. It was created as a final project for one of my classes.
12
-
13
- Developed by: Scoup123
14
- Model type: BERT
15
- Language(s) (NLP): Turkish
16
- Finetuned from model [optional]: dbmdz/bert-base-turkish-cased
17
- Model Sources [optional]
18
- Repository: [More Information Needed]
19
- Paper [optional]: in progress
20
- Uses
21
  It can be used in morphological analysis tasks.
 
22
 
23
- Direct Use
24
  It can probably be used on Turkish text without additional fine-tuning.
25
 
26
- Training Details
27
- Training Data
28
  scoup123/affixfinder
29
 
30
  The dataset was derived from a generated dataset described in the paper "Turkish language resources: Morphological parser, morphological disambiguator and web corpus" (Sak et al., 2008).
31
 
32
- Evaluation
33
- Test Accuracy: 0.9874 Precision: 0.9874 Recall: 0.9874 F1 Score: 0.9874
34
 
35
  **It should be used with caution, as these scores are suspiciously high.**
36
 
37
- Testing Data, Factors & Metrics
38
- Testing Data
39
  A test split was created from the training data.
40
 
41
- Summary
42
- This model aims to create an affix identifier for Turkish.
43
 
44
- Model Examination [optional]
45
  I have just created it, so further testing is needed to check whether it actually works. You should also verify that it works for your use case before relying on it.
46
 
47
  [More Information Needed]
48
 
49
- Environmental Impact
50
- Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
51
 
52
- Hardware Type: Free Colab T4 GPU
53
- Hours used: ~2.5 hours
54
- Cloud Provider: Google
55
- Compute Region: Europe
56
- Carbon Emitted: [More Information Needed]
57
- Citation [optional]
58
- APA:
59
 
60
- Sak, H., Güngör, T., & Saraçlar, M. (2008). Turkish language resources: Morphological parser, morphological disambiguator and web corpus. In Advances in natural language processing (pp. 417-427). Springer Berlin Heidelberg.
61
 
62
- Model Card Authors [optional]
63
  Kaan Bayar
64
 
65
- Model Card Contact
 
66
  kaan.bayar13@gmail.com
 
7
  - accuracy
8
  pipeline_tag: text-classification
9
  ---
10
+ # Model Card for Model ID
11
+
12
+ ### Model Description
13
+ Given 2 words in Turkish, the model predicts whether they share an affix. It is fine-tuned from dbmdz/bert-base-turkish-cased
14
+ on a task similar to NLI, but at the word level and with 2 labels. It was created as a final project for one of my classes.
15
+
16
+
17
+
18
+ - **Developed by:** Scoup123
19
+ - **Model type:** BERT
20
+ - **Language(s) (NLP):** Turkish
21
+ - **Finetuned from model [optional]:** dbmdz/bert-base-turkish-cased
22
+
23
+ ### Model Sources [optional]
24
+
25
+ <!-- Provide the basic links for the model. -->
26
+
27
+ - **Repository:** [More Information Needed]
28
+ - **Paper [optional]:** in progress
29
+ -
30
+
31
+ ## Uses
32
+
33
  It can be used in morphological analysis tasks.
34
+ ### Direct Use
35
 
 
36
  It can probably be used on Turkish text without additional fine-tuning.
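
A minimal usage sketch in Python, under several assumptions the card does not confirm: the repository ID below is a hypothetical placeholder (the card does not state one), the two words are assumed to be encoded as a sentence pair in the way NLI inputs usually are, and the label names are read from whatever the checkpoint's config provides. It uses the `transformers` and `torch` packages.

```python
import math

# Hypothetical placeholder: the card does not give the model's repository ID.
MODEL_ID = "your-username/your-affix-model"


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_shared_affix(word_a, word_b, model_id=MODEL_ID):
    """Return (label, probability) for a pair of Turkish words."""
    # Imported lazily so the softmax helper works without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # Assumption: the word pair is passed as (text, text_pair), as in NLI.
    inputs = tokenizer(word_a, word_b, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()

    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    # id2label comes from the checkpoint; its label names are not documented here.
    return model.config.id2label[best], probs[best]
```

With a real repository ID in place of the placeholder, `predict_shared_affix("evler", "kitaplar")` would return the predicted label and its probability for that pair.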
37
 
38
+ ## Training Details
39
+
40
+ ### Training Data
41
+
42
  scoup123/affixfinder
43
 
44
  The dataset was derived from a generated dataset described in the paper "Turkish language resources: Morphological parser, morphological disambiguator and web corpus" (Sak et al., 2008).
45
 
46
+
47
+ ## Evaluation
48
+
49
+ Test Accuracy: 0.9874
50
+ Precision: 0.9874
51
+ Recall: 0.9874
52
+ F1 Score: 0.9874
53
 
54
  **It should be used with caution, as these scores are suspiciously high.**
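
The four reported scores are identical. One possible explanation, which is an assumption and not something the card states, is that precision, recall and F1 were micro-averaged: for single-label classification, micro-averaged precision, recall and F1 always equal accuracy. The sketch below illustrates this on made-up labels (the label lists are illustrative only, not the model's outputs).

```python
from collections import Counter


def micro_scores(y_true, y_pred):
    """Compute accuracy and micro-averaged precision, recall and F1."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # a false positive for the predicted class
            fn[truth] += 1  # a false negative for the true class

    total_tp = sum(tp.values())
    total_fp = sum(fp.values())
    total_fn = sum(fn.values())

    precision = total_tp / (total_tp + total_fp)
    recall = total_tp / (total_tp + total_fn)
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = total_tp / len(y_true)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Illustrative labels only (1 = shares an affix, 0 = does not).
example_true = [1, 1, 0, 0, 1, 0, 1, 0]
example_pred = [1, 0, 0, 0, 1, 1, 1, 0]
example_scores = micro_scores(example_true, example_pred)
```

Because every misclassified example counts once as a false positive and once as a false negative, all four values in `example_scores` coincide. Per-class or macro-averaged scores would be more informative for spotting a weak class.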
55
 
56
+ ### Testing Data, Factors & Metrics
57
+
58
+ #### Testing Data
59
+
60
  A test split was created from the training data.
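
Since the test split comes from the same data as the training split, overlap between the two could partly explain the high scores. The sketch below is one way to check for it, assuming each example is a pair of words; the dataset's actual column layout is not described in the card, and the word pairs here are illustrative only.

```python
def normalize_pair(pair):
    """Treat (a, b) and (b, a) as the same pair."""
    first, second = pair
    return tuple(sorted((first, second)))


def leakage_report(train_pairs, test_pairs):
    """Measure how much of the test split was already seen in training."""
    if not test_pairs:
        raise ValueError("test_pairs must not be empty")

    train_set = {normalize_pair(p) for p in train_pairs}
    train_words = {word for pair in train_pairs for word in pair}
    test_list = [normalize_pair(p) for p in test_pairs]

    # Test pairs that also occur, in either order, in the training split.
    duplicates = sum(pair in train_set for pair in test_list)
    # Test pairs whose two words both occur somewhere in the training split.
    both_seen = sum(all(w in train_words for w in pair) for pair in test_list)

    total = len(test_list)
    return {
        "duplicate_pair_rate": duplicates / total,
        "both_words_seen_rate": both_seen / total,
    }


# Illustrative word pairs only, not taken from the dataset.
train_example = [("evler", "kitaplar"), ("geldim", "gittim")]
test_example = [("kitaplar", "evler"), ("evler", "gittim"), ("okulda", "evde")]
report = leakage_report(train_example, test_example)
```

A high `duplicate_pair_rate` would suggest that the test accuracy partly measures memorization, in which case a split that keeps each word on one side only may give a more realistic estimate.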
61
 
62
+ #### Summary
63
+
64
+ This model aims to create an affix identifier for Turkish.
65
+
66
+ ## Model Examination [optional]
67
 
 
68
  I have just created it, so further testing is needed to check whether it actually works. You should also verify that it works for your use case before relying on it.
69
 
70
  [More Information Needed]
71
 
72
+ ## Environmental Impact
73
+
74
+
75
+
76
+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
77
+
78
+ - **Hardware Type:** Free Colab T4 GPU
79
+ - **Hours used:** ~2.5 hours
80
+ - **Cloud Provider:** Google
81
+ - **Compute Region:** Europe
82
+ - **Carbon Emitted:** [More Information Needed]
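
The missing figure can be approximated from the values listed above as energy used times grid carbon intensity, which is roughly what the linked calculator does. The sketch below assumes the GPU ran at the T4's 70 W rated power for the whole 2.5 hours (an upper bound on GPU draw) and uses a purely hypothetical carbon intensity, because the card only gives "Europe" as the region; substitute the real value for the actual data center.

```python
T4_RATED_POWER_WATTS = 70.0  # NVIDIA T4 rated board power
TRAINING_HOURS = 2.5         # "~2.5 hours" from the card

# Hypothetical value for illustration only: grid intensity varies widely
# between European regions and is not stated in the card.
ASSUMED_GRID_G_CO2_PER_KWH = 300.0


def estimate_emissions_g(power_watts, hours, grid_g_co2_per_kwh):
    """Estimate emissions in grams of CO2-equivalent.

    energy (kWh) = power (W) / 1000 * time (h)
    emissions (g) = energy (kWh) * carbon intensity (g CO2eq / kWh)
    """
    if power_watts < 0 or hours < 0 or grid_g_co2_per_kwh < 0:
        raise ValueError("inputs must be non-negative")
    energy_kwh = power_watts / 1000.0 * hours
    return energy_kwh * grid_g_co2_per_kwh


estimate_g = estimate_emissions_g(
    T4_RATED_POWER_WATTS, TRAINING_HOURS, ASSUMED_GRID_G_CO2_PER_KWH
)
```

The result depends entirely on the assumed intensity, so it should not be reported as the model's carbon footprint without the real regional figure.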
83
 
84
 
85
+ ## Citation [optional]
86
+
87
+ **APA:**
88
+
89
+ Sak, H., Güngör, T., & Saraçlar, M. (2008). Turkish language resources: Morphological parser, morphological disambiguator and web corpus.
90
+ In Advances in natural language processing (pp. 417-427). Springer Berlin Heidelberg.
91
+
92
+
93
+
94
+
95
+ ## Model Card Authors [optional]
96
 
 
97
  Kaan Bayar
98
 
99
+ ## Model Card Contact
100
+
101
  kaan.bayar13@gmail.com