Model Card for Model ID
This is a fine-tuned model for rewriting questions or statements, focused specifically on Traditional Chinese.
In this version, we have adjusted how the model calculates its loss.
(The original training process, i.e. the SFTTrainer class from trl, calculates cross-entropy (CE) over the whole prompt template.)
To prevent the model from copying the original sentence, the total loss is made up of three parts:
- Context Loss (from the beginning to <rephrased>)
- Answer Loss (from <rephrased> to </rephrased>)
- Variety Loss (VTLoss): calculates the IoU of the original tokenized sentence and the rewritten tokenized sentence, to encourage the model to generate text that is as diverse as possible.
Note that the answer loss is given a larger weight than the context loss, since the answer is the more important part.
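The three-part loss described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the training code used for this model: the function names, the weight values, and the plain weighted sum are hypothetical, and the card does not say how the IoU term is made usable for gradient-based training (a set-based IoU like the one below is not differentiable on its own).

```python
def token_iou(original_ids, rewritten_ids):
    """Intersection-over-union of two token id sequences, treated as sets."""
    original, rewritten = set(original_ids), set(rewritten_ids)
    union = original | rewritten
    if not union:
        return 0.0  # both sequences empty: define the overlap as zero
    return len(original & rewritten) / len(union)


def total_loss(context_loss, answer_loss, iou,
               w_context=0.5, w_answer=1.0, w_variety=1.0):
    """Weighted sum of the three parts.

    The weights are hypothetical; the card only states that the answer
    loss is weighted more heavily than the context loss.
    """
    return w_context * context_loss + w_answer * answer_loss + w_variety * iou


# A rewrite sharing 2 of 4 distinct tokens with the original sentence.
overlap = token_iou([1, 2, 3], [2, 3, 4])
loss = total_loss(context_loss=2.0, answer_loss=1.0, iou=overlap)
```

A higher IoU means the rewrite reuses more of the original tokens, so adding it to the loss penalizes copying.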
Model Details
The prompt template should be used as follows:
<task>
你是一名熱於助人的AI小幫手,請將敘述語句或者問句變得更加通順與簡潔。
</task>
原始句子:
<origin>
{before}
</origin>
修改後:
<rephrased>
{after}
</rephrased>
Note that {before} and {after} are the original question/statement and the rewritten question/statement, respectively.
This model is a trial version and is not the best rewriting tool compared with many open-source LLMs.
We will continue to improve it.
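The prompt template above can be filled in programmatically. The sketch below is a minimal example with some assumptions: the helper names are hypothetical, line breaks are placed exactly as the template is shown in this card, and at inference time the prompt is assumed to stop right after the opening <rephrased> tag so that the model generates the rewrite itself.

```python
# Part of the template up to and including the opening <rephrased> tag.
PROMPT_PREFIX = (
    "<task>\n"
    "你是一名熱於助人的AI小幫手,請將敘述語句或者問句變得更加通順與簡潔。\n"
    "</task>\n"
    "原始句子:\n"
    "<origin>\n"
    "{before}\n"
    "</origin>\n"
    "修改後:\n"
    "<rephrased>\n"
)
# Remainder of the template, used only when the rewrite is already known.
PROMPT_SUFFIX = "{after}\n</rephrased>"


def build_prompt(before, after=None):
    """Fill the template; omit `after` to get an inference-time prompt."""
    prompt = PROMPT_PREFIX.format(before=before)
    if after is not None:
        prompt += PROMPT_SUFFIX.format(after=after)
    return prompt


def extract_rewrite(generated_text):
    """Return the text between the last <rephrased> and </rephrased>."""
    body = generated_text.rsplit("<rephrased>", 1)[-1]
    return body.split("</rephrased>", 1)[0].strip()
```

For example, `build_prompt("原始問句")` gives a prompt ending in the opening <rephrased> tag, and `extract_rewrite` can then be applied to the decoded model output to recover only the rewritten sentence.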
Model Description
This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
- Developed by: [--]
- Funded by [optional]: [--]
- Shared by [optional]: [--]
- Model type: [--]
- Language(s) (NLP): [Traditional Chinese]
- License: [--]
- Finetuned from model [optional]: [Taiwan LLM base v2.0]
Training Details
Training Data
Generated by GPT-4o and refined with manual human feedback.
A custom Traditional Chinese benchmark dataset, with rewritten answers produced by Gemini.
The evaluation task is also assigned to GPT-4o, using custom rubrics.
[More Information Needed]
Training Procedure
Training Hyperparameters
- Training regime: [QLoRA]
More Information [optional]
[--]
Model Card Authors [optional]
[--]
Model Card Contact
[--]