How to use ricecake/Orca-2-13B-Pygmalion-LoRA with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="ricecake/Orca-2-13B-Pygmalion-LoRA")
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("ricecake/Orca-2-13B-Pygmalion-LoRA")
model = AutoModelForCausalLM.from_pretrained("ricecake/Orca-2-13B-Pygmalion-LoRA")
How to use ricecake/Orca-2-13B-Pygmalion-LoRA with vLLM:
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "ricecake/Orca-2-13B-Pygmalion-LoRA"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "ricecake/Orca-2-13B-Pygmalion-LoRA",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
# Alternatively, run the model with Docker:
docker model run hf.co/ricecake/Orca-2-13B-Pygmalion-LoRA
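The curl call above can also be made from Python. The following is a minimal sketch using only the standard library; it assumes the vLLM server from the previous step is listening on `localhost:8000`, and the helper names `build_request` and `complete` are hypothetical, chosen for illustration only.

```python
import json
import urllib.request

# Values copied from the curl example; adjust if your server differs.
VLLM_URL = "http://localhost:8000/v1/completions"
MODEL_ID = "ricecake/Orca-2-13B-Pygmalion-LoRA"


def build_request(prompt, max_tokens=512, temperature=0.5, url=VLLM_URL):
    """Build the same JSON POST request that the curl command sends."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the text of the first completion."""
    with urllib.request.urlopen(build_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    # OpenAI-compatible completion responses carry the text in choices[i].text.
    return body["choices"][0]["text"]


# Usage (requires the server to be running):
#   print(complete("Once upon a time,"))
```

Passing `url="http://localhost:30000/v1/completions"` should target an SGLang server started on port 30000 in the same way, since both expose an OpenAI-compatible endpoint.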
How to use ricecake/Orca-2-13B-Pygmalion-LoRA with SGLang:
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
--model-path "ricecake/Orca-2-13B-Pygmalion-LoRA" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "ricecake/Orca-2-13B-Pygmalion-LoRA",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
# Alternatively, start the SGLang server with Docker:
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "ricecake/Orca-2-13B-Pygmalion-LoRA" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "ricecake/Orca-2-13B-Pygmalion-LoRA",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
How to use ricecake/Orca-2-13B-Pygmalion-LoRA with Docker Model Runner:
docker model run hf.co/ricecake/Orca-2-13B-Pygmalion-LoRA
Note: This model's responses were too short, so I retrained it. See the retrained version: https://huggingface.co/ricecake/Orca-2-13B-Pyg-and-Bluemoon
This LoRA adapter is a fine-tuned version of microsoft/Orca-2-13b on the PygmalionAI/PIPPA dataset. It achieves the following results on the evaluation set:
More information needed
The following hyperparameters were used during training:
Training results:
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| No log | 0.0 | 1 | 3.2585 |
| 1.9811 | 0.05 | 536 | 2.0113 |
| 1.9507 | 0.1 | 1072 | 1.9877 |
| 1.9576 | 0.15 | 1608 | 1.9766 |
| 1.9308 | 0.2 | 2144 | 1.9671 |
| 1.9193 | 0.25 | 2680 | 1.9597 |
| 1.8522 | 0.3 | 3216 | 1.9530 |
| 1.895 | 0.35 | 3752 | 1.9483 |
| 1.869 | 0.4 | 4288 | 1.9432 |
| 1.8664 | 0.45 | 4824 | 1.9383 |
| 1.8661 | 0.5 | 5360 | 1.9347 |
| 1.8576 | 0.55 | 5896 | 1.9337 |
| 1.8573 | 0.6 | 6432 | 1.9286 |
| 1.8665 | 0.65 | 6968 | 1.9280 |
| 1.8429 | 0.7 | 7504 | 1.9243 |
| 1.8621 | 0.75 | 8040 | 1.9221 |
| 1.8074 | 0.8 | 8576 | 1.9209 |
| 1.8199 | 0.85 | 9112 | 1.9202 |
| 1.8733 | 0.9 | 9648 | 1.9193 |
| 1.8387 | 0.95 | 10184 | 1.9190 |
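To put the validation losses in the table into more familiar terms, they can be converted to perplexity. This is a rough sketch that assumes the reported loss is a mean per-token cross-entropy in nats (the card does not state this explicitly); the steps-per-epoch figure is likewise only an estimate, since the Epoch column is rounded.

```python
import math

# (epoch, step, validation loss) rows copied from the table above.
first_eval = (0.0, 1, 3.2585)
last_eval = (0.95, 10184, 1.9190)


def perplexity(loss):
    """Return exp(loss); valid if loss is mean cross-entropy in nats per token."""
    return math.exp(loss)


# Evaluations are logged every 536 steps, i.e. every 0.05 epoch in the table.
steps_between_evals = 1072 - 536
steps_per_epoch = steps_between_evals / 0.05

print(f"perplexity at step {first_eval[1]}: {perplexity(first_eval[2]):.2f}")
print(f"perplexity at step {last_eval[1]}: {perplexity(last_eval[2]):.2f}")
print(f"approximate logged steps per epoch: {steps_per_epoch:.0f}")
```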
Base model: microsoft/Orca-2-13b
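Because this repository is a LoRA adapter on top of microsoft/Orca-2-13b, one possible way to load it is to load the base model first and attach the adapter with the PEFT library. The sketch below assumes `transformers` and `peft` are installed and that the repository stores its weights in PEFT's adapter format; `load_adapter_model` is a hypothetical helper name, and loading a 13B model needs substantial memory.

```python
BASE_ID = "microsoft/Orca-2-13b"
ADAPTER_ID = "ricecake/Orca-2-13B-Pygmalion-LoRA"


def load_adapter_model(base_id=BASE_ID, adapter_id=ADAPTER_ID):
    """Load the base model, then attach the LoRA adapter weights with PEFT."""
    # Imported inside the function so the heavy dependencies are only
    # needed when the model is actually loaded.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Tokenizer is taken from the adapter repo, mirroring the
    # Transformers snippet on this page.
    tokenizer = AutoTokenizer.from_pretrained(adapter_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    model = PeftModel.from_pretrained(base_model, adapter_id)
    return tokenizer, model


# Usage (downloads both the base model and the adapter):
#   tokenizer, model = load_adapter_model()
```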