Kundyzka committed on
Commit
e318f4b
·
verified ·
1 Parent(s): 4ab4acb

Update README.md

  1. README.md +72 -3
README.md CHANGED
@@ -1,3 +1,72 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ datasets:
+ - Kundyzka/informatics_kaz
+ language:
+ - kk
+ metrics:
+ - name: F1 (Before Training)
+   type: F1 Score
+   value: 24.586
+ - name: Exact Match (Before Training)
+   type: Exact Match
+   value: 11.818
+ - name: F1 (After Training)
+   type: F1 Score
+   value: 63.317
+ - name: Exact Match (After Training)
+   type: Exact Match
+   value: 43.162
+ base_model:
+ - google-bert/bert-base-multilingual-cased
+ new_version: Kundyzka/bert-base-multilingual-informatics-kaz
+ pipeline_tag: question-answering
+ library_name: adapter-transformers
+ tags:
+ - computerscience
+ - informatics
+ ---
+
+ # Description
+
+ This model is `google-bert/bert-base-multilingual-cased` fine-tuned on the `Kundyzka/informatics_kaz` dataset. It was developed by **Kundyz Maksutova**, PhD Candidate, and is optimized for question answering in the Kazakh language, with a focus on computer science and informatics.
+
+ ### Key Features:
+ - **Developer**: Kundyz Maksutova, PhD Candidate
+ - **Base Model**: `google-bert/bert-base-multilingual-cased`
+ - **Dataset**: `Kundyzka/informatics_kaz`
+ - **Language**: Kazakh (`kk`)
+ - **Task**: Question Answering (`pipeline_tag: question-answering`)
+ - **Library**: `adapter-transformers`
+
+ ### Performance:
+ Fine-tuning raises the F1 score from 24.586 to 63.317 (+38.731) and Exact Match from 11.818 to 43.162 (+31.344):
+
+ - **Before Training**:
+   - F1 Score: 24.586
+   - Exact Match (EM): 11.818
+ - **After Training**:
+   - F1 Score: 63.317
+   - Exact Match (EM): 43.162
+
+ Both sets of scores were measured on the `Kundyzka/informatics_kaz` dataset, so the gain reflects the model’s improved handling of domain-specific questions.
+
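The card does not say which evaluation script produced the F1 and Exact Match figures. As a rough illustration only, the sketch below assumes the common SQuAD-style definitions: Exact Match compares normalized strings, and F1 is computed over overlapping tokens. The normalization rules and the function names (`normalize_answer`, `exact_match`, `token_f1`, `corpus_scores`) are assumptions for this example, not the author's code.

```python
import string
from collections import Counter
from typing import Dict, List


def normalize_answer(text: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse whitespace.

    Assumption: English article removal (used by the original SQuAD script)
    is skipped because it does not apply to Kazakh.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def corpus_scores(predictions: List[str], references: List[str]) -> Dict[str, float]:
    """Average both metrics over a dataset and scale them to 0-100."""
    if not references or len(predictions) != len(references):
        raise ValueError("predictions and references must be non-empty and equal in length")
    n = len(references)
    pairs = list(zip(predictions, references))
    return {
        "exact_match": 100 * sum(exact_match(p, r) for p, r in pairs) / n,
        "f1": 100 * sum(token_f1(p, r) for p, r in pairs) / n,
    }
```

Under these definitions, a prediction of `процессор` against the reference `орталық процессор` gets an Exact Match of 0 and an F1 of about 0.667 (precision 1.0, recall 0.5), which is why F1 is typically higher than Exact Match on the same data.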
+ ### Intended Use:
+ This model is intended for question-answering applications in the Kazakh language. Potential use cases include:
+ - **Educational Platforms**: Assisting students with questions about informatics and computer science.
+ - **Research Projects**: Supporting research in Kazakh natural language processing.
+ - **AI Applications**: Enhancing intelligent systems, chatbots, or virtual assistants that require Kazakh language support.
+
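For the use cases above, a minimal sketch of querying the model might look like the following. It assumes the checkpoint loads through the standard `transformers` `pipeline("question-answering", ...)` API, which the card does not confirm (it lists `adapter-transformers` as the library), and it assumes the repository id from the `new_version` metadata field. The helper names and the `min_score` threshold are hypothetical.

```python
from typing import Any, Callable, Dict, Optional

# Assumption: repository id copied from the card's `new_version` field;
# the id of the checkpoint you actually want may differ.
MODEL_ID = "Kundyzka/bert-base-multilingual-informatics-kaz"


def load_qa_pipeline(model_id: str = MODEL_ID):
    """Build an extractive QA pipeline (requires `transformers` and network access)."""
    # Imported lazily so answer_question() can be used and tested without it.
    from transformers import pipeline

    return pipeline("question-answering", model=model_id)


def answer_question(
    qa: Callable[..., Dict[str, Any]],
    question: str,
    context: str,
    min_score: float = 0.0,
) -> Optional[str]:
    """Return the extracted answer span, or None if the score is below min_score."""
    result = qa(question=question, context=context)
    if result["score"] < min_score:
        return None
    return result["answer"]


# Example usage (downloads the model on first call):
# qa = load_qa_pipeline()
# context = "Орталық процессор компьютердің негізгі есептеу құрылғысы болып табылады."
# question = "Компьютердің негізгі есептеу құрылғысы қандай?"
# print(answer_question(qa, question, context, min_score=0.1))
```

Passing the pipeline in as an argument keeps the helper independent of how the model is loaded, so the same function could wrap an adapter-based loader if the plain pipeline call turns out not to apply to this checkpoint.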
+ ### Limitations:
+ - **Domain-Specific Training**: The model is optimized for informatics and computer science topics, so performance may degrade on unrelated questions.
+ - **Language Support**: The model is fine-tuned only on Kazakh data and is not intended for multilingual tasks.
+ - **Bias**: Biases present in the dataset may influence model outputs.
+
+ ### Tags:
+ - `computerscience`
+ - `informatics`
+ - `question-answering`
+ - `Kazakh`
+ - `adapter-transformers`
+
+ This model is a step toward high-quality question-answering systems for low-resource languages such as Kazakh. For further details, customization, or fine-tuning, refer to the model repository.