Update README.md
README.md CHANGED
@@ -3,7 +3,10 @@
 <a href="" target="_blank"><img src="https://github.com/zjunlp/CaMA/blob/main/assets/logo.jpg?raw=true" alt="ZJU-CaMA" style="width: 30%; min-width: 30px; display: block; margin: auto;"></a>
 </p>
 
+
 > This is the result of the weight difference between `Llama 13B` and `CaMA-13B`. You can click [here](https://github.com/zjunlp/cama) to learn more.
+
+
 # CaMA: A Chinese-English Bilingual LLaMA Model
 
 With the birth of ChatGPT, artificial intelligence has also entered the "iPhone moment," where various large language models (LLMs) have sprung up like mushrooms. The wave of these large models has quickly swept through artificial intelligence fields beyond natural language processing. However, training such a model requires extremely high hardware costs, and open-source language models are scarce due to various reasons, making Chinese language models even more scarce. It wasn't until the open-sourcing of LLaMA that a variety of language models based on LLaMA started to emerge. This project is also based on the LLaMA model. To further enhance Chinese language capabilities without compromising its original language distribution, we first <b>(1) perform additional pre-training on LLaMA (13B) using Chinese corpora, aiming to improve the model's Chinese comprehension and knowledge base while preserving its original English and code abilities to the greatest extent possible;</b> then, <b>(2) we fine-tune the model from the first step using an instruction dataset to enhance the language model's understanding of human instructions.</b>
@@ -193,7 +196,7 @@ Our pre-trained model has demonstrated certain abilities in instruction followin
 The effectiveness of information extraction is illustrated in the following figure. We tested different instructions for different tasks as well as the same instructions for the same task, and achieved good results for all of them.
 
 <p align="center" width="100%">
-<a href="" target="_blank"><img src="https://github.com/zjunlp/CaMA/blob/main/assets/ie-case.jpg" alt="IE" style="width: 60%; min-width: 60px; display: block; margin: auto;"></a>
+<a href="" target="_blank"><img src="https://github.com/zjunlp/CaMA/blob/main/assets/ie-case.jpg?raw=true" alt="IE" style="width: 60%; min-width: 60px; display: block; margin: auto;"></a>
 </p>
 
 
@@ -463,7 +466,7 @@ We offer two methods: the first one is **command-line interaction**, and the sec
 ```
 Here is a screenshot of the web-based interaction:
 <p align="center" width="100%">
-<a href="" target="_blank"><img src="https://github.com/zjunlp/CaMA/blob/main/assets/finetune_web.jpg" alt="finetune-web" style="width: 100%; min-width: 100px; display: block; margin: auto;"></a>
+<a href="" target="_blank"><img src="https://github.com/zjunlp/CaMA/blob/main/assets/finetune_web.jpg?raw=true" alt="finetune-web" style="width: 100%; min-width: 100px; display: block; margin: auto;"></a>
 </p>
 
 **3. Usage of Instruction tuning Model**
@@ -476,7 +479,7 @@ python examples/generate_lora_web.py --base_model ./CaMA --lora_weights ./LoRA
 
 Here is a screenshot of the web-based interaction:
 <p align="center" width="100%">
-<a href="" target="_blank"><img src="https://github.com/zjunlp/CaMA/blob/main/assets/lora_web.png" alt="finetune-web" style="width: 100%; min-width: 100px; display: block; margin: auto;"></a>
+<a href="" target="_blank"><img src="https://github.com/zjunlp/CaMA/blob/main/assets/lora_web.png?raw=true" alt="finetune-web" style="width: 100%; min-width: 100px; display: block; margin: auto;"></a>
 </p>
 
 The `instruction` is a required parameter, while `input` is an optional parameter. For general tasks (such as the examples provided in section `1.3`), you can directly enter the input in the `instruction` field. For information extraction tasks (as shown in the example in section `1.2`), please enter the instruction in the `instruction` field and the sentence to be extracted in the `input` field. We provide an information extraction prompt in section `2.5`.
@@ -499,7 +502,7 @@ For information extraction tasks such as named entity recognition (NER), event e
 >
 > (2) Instruction tuning stage using LoRA. This stage enables the model to understand human instructions and generate appropriate responses.
 
-
+
 
 <h3 id="3-1">3.1 Dataset Construction (Pretraining)</h3>
 
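Most of the image links touched by this commit differ only by a `?raw=true` suffix, which asks GitHub to serve the image file itself rather than the HTML blob page. A minimal Python sketch of that rewrite follows. The helper name `add_raw_flag`, the suffix list, and the URL pattern are assumptions made for illustration; they are not part of the CaMA repository, and the commit may well have been edited by hand.

```python
import re

# Assumed set of image extensions worth rewriting (illustrative, not exhaustive).
IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif", ".svg")

# Rough pattern for a GitHub URL embedded in Markdown or HTML: it stops at
# whitespace, a double quote, angle brackets, or a closing parenthesis.
URL_PATTERN = re.compile(r'https://github\.com/[^\s"<>)]+')


def add_raw_flag(text: str) -> str:
    """Append ?raw=true to GitHub blob URLs that end in an image extension."""

    def fix(match: re.Match) -> str:
        url = match.group(0)
        # A URL that already carries a query string does not end with an
        # image suffix, so it is left unchanged and the rewrite is idempotent.
        if "/blob/" in url and url.lower().endswith(IMAGE_SUFFIXES):
            return url + "?raw=true"
        return url

    return URL_PATTERN.sub(fix, text)


before = '<img src="https://github.com/zjunlp/CaMA/blob/main/assets/ie-case.jpg" alt="IE">'
after = add_raw_flag(before)
print(after)
```

Because links that already end in `?raw=true` are skipped, running the helper over a README that has been partly fixed (as with `logo.jpg` here) should leave the existing links alone.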