Vietnamese Fine-tuned Llama-2-7b-chat-hf
This repository contains a Vietnamese-tuned version of the Llama-2-7b-chat-hf model, fine-tuned with LoRA (Low-Rank Adaptation) on a Vietnamese translation of the Alpaca dataset.
Model Details
This model is a fine-tuned version of the Llama-2-7b-chat-hf model, specifically adapted for improved performance on Vietnamese language tasks. It uses LoRA fine-tuning to efficiently adapt the large language model to Vietnamese data while maintaining much of the original model's general knowledge and capabilities.
Model Description
- Developed by: Daniel Du
- Model type: Large Language Model
- Language(s) (NLP): Vietnamese
- License: [More Information Needed]
- Finetuned from model: meta-llama/Llama-2-7b-chat-hf
Direct Use
You can use this model directly with the Hugging Face Transformers library:
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
# Load the base model
base_model_id = "meta-llama/Llama-2-7b-chat-hf"
base_model = AutoModelForCausalLM.from_pretrained(base_model_id)
# Attach the LoRA adapter weights to the base model
peft_model_id = "CallMeMrFern/Llama-2-7b-chat-hf_vn"
model = PeftModel.from_pretrained(base_model, peft_model_id)
model.eval()
# Load the tokenizer of the base model
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
# Example usage (prompt: "Hello, how is the weather today?")
input_text = "Xin chào, hôm nay thời tiết thế nào?"
# Move the encoded inputs to the same device as the model
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
# max_new_tokens limits only the generated part, not prompt plus output
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Downstream Use [optional]
[More Information Needed]
Out-of-Scope Use
[More Information Needed]
Bias, Risks, and Limitations
- This model is specifically fine-tuned for Vietnamese and may not perform as well on other languages.
- The model inherits limitations from the base Llama-2-7b-chat-hf model.
- Performance may vary depending on the specific task and domain.
Recommendations
Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
How to Get Started with the Model
The Direct Use section above shows how to load the adapter and generate text.
Training Details
Training Data
Dataset: alpaca_translate_GPT_35_10_20k.json (Vietnamese translation of the Alpaca dataset)
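For orientation, records in Alpaca-style datasets carry `instruction`, `input`, and `output` fields. The sketch below shows how such a record is commonly rendered into a single training prompt. The template wording follows the original Stanford Alpaca format and the sample record is invented for illustration; the template actually used by `finetune/lora.py` is not documented in this card and may differ.

```python
# Illustrative only: the prompt template used to train this model is an assumption.
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
)
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)


def build_prompt(record: dict) -> str:
    """Render one Alpaca-style record into a prompt string."""
    if record.get("input"):
        return PROMPT_WITH_INPUT.format(
            instruction=record["instruction"], input=record["input"]
        )
    return PROMPT_NO_INPUT.format(instruction=record["instruction"])


# Hypothetical record in the Alpaca schema (not taken from the dataset).
sample = {
    "instruction": "Dịch câu sau sang tiếng Anh.",
    "input": "Xin chào!",
    "output": "Hello!",
}

# During training, the target output is appended to the rendered prompt.
print(build_prompt(sample) + sample["output"])
```

At inference time, a prompt built the same way (without the output appended) is likely to match the fine-tuning distribution more closely than a bare question, provided the training script used a comparable template.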
Training Hyperparameters
- Training regime: [More Information Needed]
Speeds, Sizes, Times [optional]
[More Information Needed]
Evaluation
Testing Data, Factors & Metrics
Testing Data
[More Information Needed]
Factors
[More Information Needed]
Metrics
[More Information Needed]
Results
Summary
Model Examination [optional]
[More Information Needed]
Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
- Hardware Type: [More Information Needed]
- Hours used: [More Information Needed]
- Cloud Provider: [More Information Needed]
- Compute Region: [More Information Needed]
- Carbon Emitted: [More Information Needed]
Model Architecture and Objective
[More Information Needed]
Citation
If you use this model in your research, please cite:
@misc{vietnamese_llama2_7b_chat,
  author = {Daniel Du},
  title = {Vietnamese Fine-tuned Llama-2-7b-chat-hf},
  year = {2023},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/CallMeMrFern/Llama-2-7b-chat-hf_vn}}
}
Training procedure
The following bitsandbytes quantization config was used during training:
- quant_method: bitsandbytes
- load_in_8bit: True
- load_in_4bit: False
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: fp4
- bnb_4bit_use_double_quant: False
- bnb_4bit_compute_dtype: float32
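The 8-bit settings listed above can be reproduced when loading the base model, for example to continue training or to run inference with less GPU memory. This is a minimal sketch, assuming `transformers`, `accelerate`, and `bitsandbytes` are installed and a CUDA GPU is available; the helper name `load_base_model_8bit` is hypothetical, and the 4-bit fields are left out because `load_in_4bit` is False.

```python
# Values copied from the quantization config listed in this card.
QUANT_KWARGS = {
    "load_in_8bit": True,
    "load_in_4bit": False,
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": None,
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
}


def load_base_model_8bit(model_id: str = "meta-llama/Llama-2-7b-chat-hf"):
    """Load the base model with 8-bit weights using the settings above."""
    # Imported lazily so the settings can be inspected without the heavy dependencies.
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(**QUANT_KWARGS)
    return AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",  # lets accelerate place the layers on available devices
    )
```

The returned model can be passed to `PeftModel.from_pretrained` in place of the full-precision base model.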
Framework versions
- PEFT 0.6.3.dev0
Fine-tuning Details
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- LoRA Config:
  - Target Modules: ["q_proj", "v_proj"]
  - Precision: 8-bit
- Dataset: alpaca_translate_GPT_35_10_20k.json (Vietnamese translation of the Alpaca dataset)
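To give a sense of why targeting only `q_proj` and `v_proj` is cheap, the arithmetic below counts the parameters LoRA would add. It relies on the Llama-2-7B architecture (32 decoder layers, hidden size 4096, so both projections are 4096 x 4096). The rank is not stated in this card, so the value 8 used here is purely an assumption.

```python
# Llama-2-7B architecture: 32 decoder layers, hidden size 4096.
# q_proj and v_proj are both 4096 x 4096 linear layers.
NUM_LAYERS = 32
HIDDEN_SIZE = 4096
TARGET_MODULES = ["q_proj", "v_proj"]


def lora_trainable_params(rank: int) -> int:
    """Count the parameters LoRA adds: A is (rank x in) and B is (out x rank) per module."""
    per_module = rank * HIDDEN_SIZE + HIDDEN_SIZE * rank
    return NUM_LAYERS * len(TARGET_MODULES) * per_module


ASSUMED_RANK = 8  # assumption: the LoRA rank is not stated in this card
print(f"{lora_trainable_params(ASSUMED_RANK):,} trainable parameters")
# prints "4,194,304 trainable parameters"
```

Under that assumption the adapter trains about 4.2 million parameters, a small fraction of the roughly 7 billion in the frozen base model.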
Training Procedure
The model was fine-tuned using the following command:
python finetune/lora.py \
--base_model meta-llama/Llama-2-7b-chat-hf \
--model_type llama \
--data_dir data/general/alpaca_translate_GPT_35_10_20k.json \
--output_dir finetuned/meta-llama/Llama-2-7b-chat-hf \
--lora_target_modules '["q_proj", "v_proj"]' \
--micro_batch_size 1
For multi-GPU training, a distributed training approach was used.
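The `finetune/lora.py` script itself is not shown in this card. As a rough equivalent, the sketch below expresses the same target modules with the PEFT library. Only `target_modules` comes from the command above; the rank, alpha, and dropout values are placeholders chosen for illustration, and the helper name `wrap_with_lora` is hypothetical.

```python
LORA_KWARGS = {
    "target_modules": ["q_proj", "v_proj"],  # taken from the training command
    "r": 8,                # assumption: not stated in this card
    "lora_alpha": 16,      # assumption: not stated in this card
    "lora_dropout": 0.05,  # assumption: not stated in this card
    "bias": "none",
    "task_type": "CAUSAL_LM",
}


def wrap_with_lora(base_model):
    """Attach trainable LoRA adapters to an already loaded causal language model."""
    # Imported lazily so the settings can be inspected without installing peft.
    from peft import LoraConfig, get_peft_model

    peft_model = get_peft_model(base_model, LoraConfig(**LORA_KWARGS))
    # Reports how many parameters are trainable versus frozen.
    peft_model.print_trainable_parameters()
    return peft_model
```

When the base model is loaded in 8-bit, PEFT's `prepare_model_for_kbit_training` is usually applied before wrapping; whether the original script did so is not known.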
Evaluation Results
[Include any evaluation results, perplexity scores, or benchmark performances here]
Acknowledgements
- This project is part of the TF07 Course offered by ProtonX.
- We thank the creators of the original Llama-2-7b-chat-hf model and the Hugging Face team for their tools and resources.
- We thank VietnamAIHub/Vietnamese_LLMs for the translated dataset.