Instructions for using ondevicellm/tinyllama_mole_dpo_ep3 with libraries and local apps.

Libraries

How to use ondevicellm/tinyllama_mole_dpo_ep3 with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="ondevicellm/tinyllama_mole_dpo_ep3", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("ondevicellm/tinyllama_mole_dpo_ep3", trust_remote_code=True, dtype="auto")
```

Local Apps

vLLM

How to use ondevicellm/tinyllama_mole_dpo_ep3 with vLLM:

Install from pip and serve model

```shell
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "ondevicellm/tinyllama_mole_dpo_ep3"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "ondevicellm/tinyllama_mole_dpo_ep3",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
```
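The same request can also be sent from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `send_chat_request` are hypothetical, and it assumes a server is already listening at the base URL used in the curl call above.

```python
import json
import urllib.request


def build_chat_request(base_url, model, prompt):
    """Build a POST request for an OpenAI-compatible /v1/chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the decoded JSON response (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


request = build_chat_request(
    "http://localhost:8000",
    "ondevicellm/tinyllama_mole_dpo_ep3",
    "What is the capital of France?",
)
# Uncomment once the server is running:
# reply = send_chat_request(request)
# print(reply["choices"][0]["message"]["content"])
```

Building the request separately from sending it keeps the payload easy to inspect. To target a server on a different port, pass a different base URL, for example `http://localhost:30000`.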

Use Docker

```shell
docker model run hf.co/ondevicellm/tinyllama_mole_dpo_ep3
```

SGLang

How to use ondevicellm/tinyllama_mole_dpo_ep3 with SGLang:

Install from pip and serve model

```shell
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "ondevicellm/tinyllama_mole_dpo_ep3" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "ondevicellm/tinyllama_mole_dpo_ep3",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
```

Use Docker images

```shell
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "ondevicellm/tinyllama_mole_dpo_ep3" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "ondevicellm/tinyllama_mole_dpo_ep3",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
```

Docker Model Runner
How to use ondevicellm/tinyllama_mole_dpo_ep3 with Docker Model Runner:
```
docker model run hf.co/ondevicellm/tinyllama_mole_dpo_ep3
```

tinyllama_mole_dpo_ep3

This model is a fine-tuned version of ondevicellm/tinyllama_mole_sft_ultrachat_ep3 on the HuggingFaceH4/ultrafeedback_binarized dataset. It achieves the following results on the evaluation set:

Loss: 0.6285
Rewards/chosen: -0.3050
Rewards/rejected: -0.5353
Rewards/accuracies: 0.6806
Rewards/margins: 0.2302
Logps/rejected: -354.2071
Logps/chosen: -373.1399
Logits/rejected: -1.6731
Logits/chosen: -1.8041
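
As a quick consistency check, the reported reward margin should equal the chosen reward minus the rejected reward, up to rounding. A small sketch, assuming the usual DPO logging convention in which the margin is the mean of the per-example difference (chosen minus rejected):

```python
# Evaluation metrics as reported above.
rewards_chosen = -0.3050
rewards_rejected = -0.5353
reported_margin = 0.2302

# Margin recomputed from the rounded means; a small drift is expected
# because each reported value was rounded independently.
recomputed_margin = rewards_chosen - rewards_rejected
print(f"recomputed margin: {recomputed_margin:.4f}")
print(f"difference from reported: {abs(recomputed_margin - reported_margin):.4f}")
```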

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-07
train_batch_size: 8
eval_batch_size: 8
seed: 42
distributed_type: multi-GPU
num_devices: 4
gradient_accumulation_steps: 2
total_train_batch_size: 64
total_eval_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 100
num_epochs: 1
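
The total batch sizes in the list follow from the per-device values. A short sketch of the arithmetic, assuming `train_batch_size` and `eval_batch_size` are per-device values (which is what the reported totals imply):

```python
# Values taken from the hyperparameter list above.
train_batch_size = 8          # per device (assumed)
eval_batch_size = 8           # per device (assumed)
num_devices = 4
gradient_accumulation_steps = 2

# One optimizer step sees every device's batch, accumulated over several passes.
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
# Evaluation does not accumulate gradients, so only the device count multiplies.
total_eval_batch_size = eval_batch_size * num_devices

print(total_train_batch_size)  # 64
print(total_eval_batch_size)   # 32
```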

Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.6896 | 0.1 | 100 | 0.6899 | 0.0064 | -0.0013 | 0.6448 | 0.0076 | -300.8089 | -342.0017 | -1.7574 | -1.8918 |
| 0.6762 | 0.21 | 200 | 0.6756 | -0.0293 | -0.0716 | 0.6627 | 0.0423 | -307.8423 | -345.5688 | -1.7501 | -1.8839 |
| 0.6499 | 0.31 | 300 | 0.6587 | -0.0875 | -0.1813 | 0.6687 | 0.0938 | -318.8118 | -351.3895 | -1.7358 | -1.8688 |
| 0.6374 | 0.42 | 400 | 0.6451 | -0.1726 | -0.3218 | 0.6746 | 0.1493 | -332.8632 | -359.8953 | -1.7164 | -1.8482 |
| 0.6348 | 0.52 | 500 | 0.6377 | -0.2696 | -0.4550 | 0.6647 | 0.1854 | -346.1808 | -369.6013 | -1.6884 | -1.8208 |
| 0.6308 | 0.63 | 600 | 0.6333 | -0.2783 | -0.4815 | 0.6726 | 0.2032 | -348.8291 | -370.4673 | -1.6965 | -1.8269 |
| 0.62 | 0.73 | 700 | 0.6312 | -0.2323 | -0.4505 | 0.6806 | 0.2182 | -345.7306 | -365.8656 | -1.6841 | -1.8149 |
| 0.6055 | 0.84 | 800 | 0.6287 | -0.2877 | -0.5169 | 0.6865 | 0.2292 | -352.3697 | -371.4099 | -1.6793 | -1.8099 |
| 0.6357 | 0.94 | 900 | 0.6285 | -0.3050 | -0.5353 | 0.6806 | 0.2302 | -354.2071 | -373.1399 | -1.6731 | -1.8041 |
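
To read the table programmatically, the sketch below picks out the evaluation step with the lowest validation loss and the one with the highest reward accuracy. The tuple layout is an illustrative choice; only three columns are copied from the table above.

```python
# (step, validation loss, reward accuracy) copied from the table above.
rows = [
    (100, 0.6899, 0.6448),
    (200, 0.6756, 0.6627),
    (300, 0.6587, 0.6687),
    (400, 0.6451, 0.6746),
    (500, 0.6377, 0.6647),
    (600, 0.6333, 0.6726),
    (700, 0.6312, 0.6806),
    (800, 0.6287, 0.6865),
    (900, 0.6285, 0.6806),
]

# Lowest validation loss and highest reward accuracy need not coincide.
best_loss_step = min(rows, key=lambda row: row[1])[0]
best_accuracy_step = max(rows, key=lambda row: row[2])[0]

print(best_loss_step)      # 900
print(best_accuracy_step)  # 800
```

The final step (900) has the lowest validation loss and matches the evaluation results reported at the top of the card, while reward accuracy peaks slightly earlier, at step 800.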

Framework versions

Transformers 4.37.0
PyTorch 2.1.2+cu118
Datasets 2.16.1
Tokenizers 0.15.0

Safetensors

Model size: 1B params
Tensor type: BF16

Model tree for ondevicellm/tinyllama_mole_dpo_ep3

Base model: ondevicellm/tinyllama_mole_sft_ultrachat_ep3
Finetuned (1): this model
