---
pipeline_tag: text-generation
tags:
- phi3
- LLM
- onnx
language:
- ja
library_name: transformers
---
# Phi 3 Model with Extended Vocabulary and Fine-Tuning for Japanese

## Overview

This project is a proof of concept that extends the base vocabulary of the Phi 3 model and then applies supervised fine-tuning to teach it a new language (Japanese). Despite using a very small custom dataset, the improvement in Japanese language understanding is substantial.

## Model Details

- **Base Model**: Phi 3
- **Objective**: Extend the base vocabulary and fine-tune for Japanese language understanding.
- **Dataset**: Custom dataset of 1,000 entries generated using ChatGPT-4.
- **Language**: Japanese

## Dataset

The dataset used for this project was generated with the assistance of ChatGPT-4. It comprises 1,000 entries, carefully curated to cover a diverse range of topics and linguistic structures.
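A supervised fine-tuning corpus of this size is typically stored as JSON Lines, one instruction/response pair per line. The sketch below shows what one entry and a simple validation pass might look like; the field names (`instruction`, `output`) and the example text are illustrative assumptions, not the actual schema of this dataset.

```python
import json

# Hypothetical entry for the corpus; field names and text are assumptions
# for illustration, not the project's actual schema.
entry = {
    "instruction": "日本の首都はどこですか。",   # "What is the capital of Japan?"
    "output": "日本の首都は東京です。",           # "The capital of Japan is Tokyo."
}

def validate_entry(raw_line: str) -> dict:
    """Parse one JSONL line and require non-empty instruction/output fields."""
    record = json.loads(raw_line)
    for field in ("instruction", "output"):
        if not str(record.get(field, "")).strip():
            raise ValueError(f"missing or empty field: {field}")
    return record

line = json.dumps(entry, ensure_ascii=False)
record = validate_entry(line)
```

A check like this is cheap insurance on a small hand-curated corpus, where a single empty field can silently skew training.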

## Training

### Vocabulary Extension

The base vocabulary of the Phi 3 model was extended to include new Japanese tokens. This was a crucial step to enable the model to comprehend and generate Japanese text more effectively.
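With Hugging Face `transformers`, vocabulary extension usually comes down to selecting candidate tokens not already in the tokenizer, then calling `tokenizer.add_tokens(...)` and `model.resize_token_embeddings(len(tokenizer))`. A minimal sketch of the selection step, with illustrative token lists (the real calls are shown only as comments, since they require the model weights):

```python
def select_new_tokens(existing_vocab, candidates):
    """Return candidates absent from the vocabulary, deduplicated, order preserved."""
    seen = set(existing_vocab)
    new_tokens = []
    for tok in candidates:
        if tok not in seen:
            seen.add(tok)
            new_tokens.append(tok)
    return new_tokens

# Illustrative data: a tiny slice of an existing vocabulary plus Japanese candidates.
existing_vocab = {"hello", "world", "日"}
candidates = ["日本", "東京", "日", "日本"]  # "日" is already known, "日本" repeats

new_tokens = select_new_tokens(existing_vocab, candidates)

# In the real pipeline (assuming transformers), the follow-up would be roughly:
#   tokenizer.add_tokens(new_tokens)
#   model.resize_token_embeddings(len(tokenizer))
# The new embedding rows start untrained, which is why fine-tuning afterwards matters.
```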

### Fine-Tuning

Supervised fine-tuning was performed on the extended model using the custom dataset. Despite the small dataset size, the model showed significant improvement in understanding and generating Japanese text.
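For supervised fine-tuning, each instruction/response pair is serialized into a single training string. The Phi 3 instruct releases use special markers such as `<|user|>`, `<|assistant|>`, and `<|end|>`; whether this project used exactly that template is an assumption here, sketched below:

```python
def build_training_text(instruction: str, response: str) -> str:
    """Serialize one supervised pair into a Phi-3-style chat string.

    The <|user|>/<|assistant|>/<|end|> markers follow the Phi 3 instruct chat
    template; using it for this project's data is an assumption for illustration.
    """
    return (
        f"<|user|>\n{instruction}<|end|>\n"
        f"<|assistant|>\n{response}<|end|>"
    )

text = build_training_text("自己紹介をしてください。", "こんにちは、私はAIアシスタントです。")
```

In practice the loss is usually computed only on the assistant span, so the model learns to produce responses rather than to echo prompts.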

## Results

Even with the limited dataset and vocabulary size, the fine-tuned model demonstrated substantial improvements over the base model in terms of Japanese language understanding and generation.

## Future Work

1. **Dataset Expansion**: Increase the size and diversity of the dataset to further enhance model performance.
2. **Evaluation**: Conduct comprehensive evaluation and benchmarking against standard Japanese language tasks.
3. **Optimization**: Optimize the model for better performance and efficiency.