Frinkles committed on
Commit 0277281 · verified · 1 Parent(s): 6ab90d2

Create README.md

Files changed (1)
  1. README.md +46 -0
README.md ADDED
@@ -0,0 +1,46 @@
1
+ ---
2
+ pipeline_tag: text-generation
3
+ tags:
4
+ - phi3
5
+ - LLM
6
+ - onnx
7
+ language:
8
+ - ja
9
+ library_name: transformers
10
+ ---
11
+ # Phi-3 Model with Extended Vocabulary and Fine-Tuning for Japanese
12
+
13
+ ## Overview
14
+
15
+ This project is a proof of concept: it extends the base vocabulary of the Phi-3 model with Japanese tokens, then applies supervised fine-tuning to teach the model Japanese. Although the custom dataset is very small (1,000 entries), Japanese language understanding improves substantially over the base model.
16
+
17
+ ## Model Details
18
+
19
+ - **Base Model**: Phi-3
20
+ - **Objective**: Extend the base vocabulary and fine-tune for Japanese language understanding.
21
+ - **Dataset**: Custom dataset of 1,000 entries generated using ChatGPT-4.
22
+ - **Language**: Japanese
23
+
24
+ ## Dataset
25
+
26
+ The dataset used for this project was generated with the assistance of ChatGPT-4. It comprises 1,000 entries, carefully curated to cover a diverse range of topics and linguistic structures.
27
+
28
+ ## Training
29
+
30
+ ### Vocabulary Extension
31
+
32
+ The base vocabulary of the Phi-3 model was extended with new Japanese tokens, a crucial step for enabling the model to comprehend and generate Japanese text more effectively.
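The README does not show how the vocabulary was extended. A minimal sketch of one common approach with the `transformers` library follows: filter candidate tokens against the existing vocabulary, add the remainder to the tokenizer, and resize the embedding matrix to match. The helper names (`select_new_tokens`, `extend_vocabulary`), the candidate tokens, and the checkpoint name in the usage comment are assumptions for illustration, not the author's actual setup.

```python
from typing import Iterable, List, Mapping


def select_new_tokens(candidates: Iterable[str], vocab: Mapping[str, int]) -> List[str]:
    """Return candidates missing from vocab, de-duplicated, in first-seen order."""
    seen = set()
    new_tokens = []
    for token in candidates:
        if token and token not in vocab and token not in seen:
            seen.add(token)
            new_tokens.append(token)
    return new_tokens


def extend_vocabulary(model_name: str, candidates: Iterable[str]):
    """Hypothetical helper: add tokens to a checkpoint and resize its embeddings."""
    # Imported lazily so the pure-Python helper above works without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    new_tokens = select_new_tokens(candidates, tokenizer.get_vocab())
    num_added = tokenizer.add_tokens(new_tokens)

    # Match the embedding matrix to the tokenizer size. Rows for the added
    # tokens carry no learned meaning until the model is fine-tuned.
    model.resize_token_embeddings(len(tokenizer))
    return tokenizer, model, num_added


# Usage (assumed checkpoint; the README does not name a specific variant):
# tokenizer, model, n = extend_vocabulary(
#     "microsoft/Phi-3-mini-4k-instruct", ["日本語", "こんにちは"]
# )
```

Because the added embedding rows start without learned meaning, the fine-tuning step is what makes the new tokens useful.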
33
+
34
+ ### Fine-Tuning
35
+
36
+ Supervised fine-tuning was performed on the extended model using the custom dataset. Despite the small dataset size, the model showed significant improvement in understanding and generating Japanese text.
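The training setup is not described here. As an illustration of what supervised fine-tuning typically involves, the sketch below builds one training example from already-tokenized prompt and response ids. It masks the prompt positions with `-100`, the index that PyTorch's cross-entropy loss ignores by default, so only response tokens contribute to the loss. The function name, the `max_length` default, and the choice to mask the prompt are assumptions, not details from this project.

```python
from typing import Dict, List

# Default ignore_index of torch.nn.CrossEntropyLoss; positions with this
# label are skipped when the loss is computed.
IGNORE_INDEX = -100


def build_sft_example(
    prompt_ids: List[int], response_ids: List[int], max_length: int = 512
) -> Dict[str, List[int]]:
    """Concatenate prompt and response ids and mask the prompt in the labels."""
    input_ids = (prompt_ids + response_ids)[:max_length]
    labels = ([IGNORE_INDEX] * len(prompt_ids) + response_ids)[:max_length]
    return {"input_ids": input_ids, "labels": labels}


example = build_sft_example(prompt_ids=[1, 2], response_ids=[3, 4])
print(example)
```

Masking the prompt is a design choice: it focuses the loss on the Japanese responses the model should learn to produce, which can matter when the dataset has only 1,000 entries.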
37
+
38
+ ## Results
39
+
40
+ Even with the limited dataset and vocabulary size, the fine-tuned model understood and generated Japanese substantially better than the base model.
41
+
42
+ ## Future Work
43
+
44
+ 1. **Dataset Expansion**: Increase the size and diversity of the dataset to further enhance model performance.
45
+ 2. **Evaluation**: Conduct comprehensive evaluation and benchmarking against standard Japanese language tasks.
46
+ 3. **Optimization**: Optimize the model for better performance and efficiency.
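For the evaluation item, one simple starting point is to score held-out Japanese text with both the base and fine-tuned models. The sketch below assumes you already have per-token natural-log probabilities from each model; the function names and inputs are hypothetical. Per-token perplexity is not directly comparable between models with different vocabularies, since the same text splits into a different number of tokens, so a per-character measure is included as well.

```python
import math
from typing import Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Per-token perplexity: exp of the mean negative log-probability."""
    if not token_log_probs:
        raise ValueError("token_log_probs must not be empty")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def bits_per_character(token_log_probs: Sequence[float], num_chars: int) -> float:
    """Total negative log-probability in bits, normalized by text length."""
    if num_chars <= 0:
        raise ValueError("num_chars must be positive")
    return -sum(token_log_probs) / (num_chars * math.log(2))


# Toy input: four tokens, each assigned probability 0.5 by the model.
log_probs = [math.log(0.5)] * 4
print(perplexity(log_probs))
print(bits_per_character(log_probs, num_chars=2))
```

Lower values are better for both measures. Bits per character stays comparable when the tokenizer changes, which is the situation after a vocabulary extension.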