Instructions for using Frinkles/JapaneseModelV1-ONNX with libraries, inference providers, notebooks, and local apps.
- Libraries
- Transformers
How to use Frinkles/JapaneseModelV1-ONNX with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Frinkles/JapaneseModelV1-ONNX")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("Frinkles/JapaneseModelV1-ONNX", dtype="auto")
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use Frinkles/JapaneseModelV1-ONNX with vLLM:
Install from pip and serve model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "Frinkles/JapaneseModelV1-ONNX"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Frinkles/JapaneseModelV1-ONNX",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
Use Docker
```shell
docker model run hf.co/Frinkles/JapaneseModelV1-ONNX
```
- SGLang
How to use Frinkles/JapaneseModelV1-ONNX with SGLang:
Install from pip and serve model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "Frinkles/JapaneseModelV1-ONNX" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Frinkles/JapaneseModelV1-ONNX",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "Frinkles/JapaneseModelV1-ONNX" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Frinkles/JapaneseModelV1-ONNX",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
- Docker Model Runner
How to use Frinkles/JapaneseModelV1-ONNX with Docker Model Runner:
```shell
docker model run hf.co/Frinkles/JapaneseModelV1-ONNX
```
Phi 3 Model with Extended Vocabulary and Fine-Tuning for Japanese
Overview
This project is a proof of concept: it extends the base vocabulary of the Phi 3 model and then applies supervised fine-tuning to teach the model a new language, Japanese. Even with a very small custom dataset, the improvement in Japanese language understanding is substantial.
Model Details
- Base Model: Phi 3
- Objective: Extend the base vocabulary and fine-tune for Japanese language understanding.
- Dataset: Custom dataset of 1,000 entries generated using ChatGPT-4.
- Language: Japanese
Dataset
The dataset used for this project was generated with the assistance of ChatGPT-4. It comprises 1,000 entries, carefully curated to cover a diverse range of topics and linguistic structures.
Training
Vocabulary Extension
The base vocabulary of the Phi 3 model was extended with new Japanese tokens. This step gives the model dedicated tokens for Japanese text, which helps it comprehend and generate Japanese more effectively.
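The project's own extension script is not shown here, so the following is only a minimal sketch of how such a step is commonly done with Hugging Face Transformers. The helper name `extend_vocabulary` and the example tokens are hypothetical; the function assumes `tokenizer` and `model` behave like Transformers objects and relies on their `get_vocab`, `add_tokens`, and `resize_token_embeddings` methods.

```python
def extend_vocabulary(tokenizer, model, new_tokens):
    """Add tokens to a tokenizer and grow the model's embedding table to match.

    `tokenizer` and `model` are assumed to behave like Hugging Face
    Transformers objects (for example, ones loaded with AutoTokenizer and
    AutoModelForCausalLM). This is an illustration, not the project's code.
    """
    # Drop duplicates while preserving order, and skip tokens already known.
    existing = tokenizer.get_vocab()
    candidates = [t for t in dict.fromkeys(new_tokens) if t not in existing]

    # add_tokens returns how many tokens were actually added.
    num_added = tokenizer.add_tokens(candidates)

    if num_added > 0:
        # The embedding table must cover every token id the tokenizer can emit.
        # The newly created rows carry no learned meaning yet; fine-tuning
        # has to train them.
        model.resize_token_embeddings(len(tokenizer))

    return num_added
```

With real objects, a call might look like `extend_vocabulary(tokenizer, model, ["日本", "語"])`. Skipping the resize when nothing was added avoids touching the embedding table unnecessarily.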
Fine-Tuning
Supervised fine-tuning was then performed on the vocabulary-extended model using the custom 1,000-entry dataset. Despite the small dataset size, the model showed a marked improvement in understanding and generating Japanese text.
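The training recipe itself is not described, so the sketch below only illustrates one common convention in supervised fine-tuning: the loss is computed on the response tokens, while the prompt positions are masked with `-100`, the label value that PyTorch's cross-entropy loss ignores by default. The function name `build_sft_example` and all token ids are hypothetical.

```python
IGNORE_INDEX = -100  # label value ignored by PyTorch's cross-entropy loss by default


def build_sft_example(prompt_ids, response_ids, eos_id, max_len=None):
    """Build one supervised fine-tuning example from token ids.

    Only the response tokens and the closing EOS token contribute to the
    loss; every prompt position is masked with IGNORE_INDEX.
    """
    input_ids = list(prompt_ids) + list(response_ids) + [eos_id]
    labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids) + [eos_id]

    # Optional truncation keeps inputs and labels aligned position by position.
    if max_len is not None:
        input_ids = input_ids[:max_len]
        labels = labels[:max_len]

    return {"input_ids": input_ids, "labels": labels}


# Made-up token ids, purely for illustration.
example = build_sft_example(prompt_ids=[5, 6, 7], response_ids=[8, 9], eos_id=2)
print(example)
# {'input_ids': [5, 6, 7, 8, 9, 2], 'labels': [-100, -100, -100, 8, 9, 2]}
```

Masking the prompt means the model is trained on producing the answer rather than on reproducing the question, which matters more when the dataset is small.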
Results
Even with the limited dataset and vocabulary size, the fine-tuned model demonstrated substantial improvements over the base model in Japanese language understanding and generation. No benchmark figures are reported here; formal evaluation is listed under Future Work.
Future Work
- Dataset Expansion: Increase the size and diversity of the dataset to further enhance model performance.
- Evaluation: Conduct comprehensive evaluation and benchmarking against standard Japanese language tasks.
- Optimization: Optimize the model for better performance and efficiency.
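As a possible starting point for the evaluation and efficiency items above, one simple quantity to track is how many tokens a tokenizer spends per character of Japanese text. The sketch below is an assumption-laden illustration: `tokens_per_character` is a hypothetical helper, and the two toy tokenizers are stand-ins rather than the Phi 3 tokenizer or the extended one.

```python
def tokens_per_character(tokenize, texts):
    """Average number of tokens produced per character over `texts`.

    `tokenize` is any callable that maps a string to a list of tokens.
    """
    total_tokens = sum(len(tokenize(t)) for t in texts)
    total_chars = sum(len(t) for t in texts)
    if total_chars == 0:
        raise ValueError("texts must contain at least one character")
    return total_tokens / total_chars


# Toy stand-ins for real tokenizers (hypothetical, for illustration only).
def byte_tokenize(text):
    # Worst case: every UTF-8 byte becomes its own token.
    return list(text.encode("utf-8"))


def char_tokenize(text):
    # Every character has its own dedicated token.
    return list(text)


samples = ["こんにちは", "日本語"]
print(tokens_per_character(byte_tokenize, samples))  # 3.0
print(tokens_per_character(char_tokenize, samples))  # 1.0
```

With a Hugging Face tokenizer, `tokenizer.tokenize` could be passed as the `tokenize` argument to compare the base and extended vocabularies on the same sample texts.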