Instructions for using fenguhao/Llama-3-Base-8B with libraries and local apps.

Libraries

How to use fenguhao/Llama-3-Base-8B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="fenguhao/Llama-3-Base-8B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Print the result so it is visible when run as a script
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("fenguhao/Llama-3-Base-8B")
model = AutoModelForCausalLM.from_pretrained("fenguhao/Llama-3-Base-8B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use fenguhao/Llama-3-Base-8B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "fenguhao/Llama-3-Base-8B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "fenguhao/Llama-3-Base-8B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
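
The same request can also be sent from Python. The sketch below uses only the standard library and mirrors the curl call above. The helper names `build_chat_request` and `ask` are hypothetical, and it assumes the vLLM server started in the previous step is listening on localhost:8000 and replies in the OpenAI chat-completions format.

```python
import json
import urllib.request

MODEL_ID = "fenguhao/Llama-3-Base-8B"


def build_chat_request(prompt, base_url="http://localhost:8000"):
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, base_url="http://localhost:8000"):
    """Send the request and return the text of the first choice.

    Assumption: the response body follows the OpenAI chat-completions shape.
    """
    with urllib.request.urlopen(build_chat_request(prompt, base_url)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the server running, `ask("What is the capital of France?")` should return the reply text. Passing `base_url="http://localhost:30000"` points the same helper at a server on a different port.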

Use Docker

docker model run hf.co/fenguhao/Llama-3-Base-8B

SGLang

How to use fenguhao/Llama-3-Base-8B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "fenguhao/Llama-3-Base-8B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "fenguhao/Llama-3-Base-8B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "fenguhao/Llama-3-Base-8B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "fenguhao/Llama-3-Base-8B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use fenguhao/Llama-3-Base-8B with Docker Model Runner:
```
docker model run hf.co/fenguhao/Llama-3-Base-8B
```

Llama-3-Base-8B / README.md

fenguhao

End of training

cb85392 verified almost 2 years ago


	---
	base_model: princeton-nlp/Llama-3-Base-8B-SFT
	tags:
	- alignment-handbook
	- generated_from_trainer
	- trl
	- dpo
	datasets:
	- HuggingFaceH4/ultrafeedback_binarized
	model-index:
	- name: Llama-3-Base-8B
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# Llama-3-Base-8B

	This model is a fine-tuned version of [princeton-nlp/Llama-3-Base-8B-SFT](https://huggingface.co/princeton-nlp/Llama-3-Base-8B-SFT) on the HuggingFaceH4/ultrafeedback_binarized dataset.
	It achieves the following results on the evaluation set:
	- Loss: 0.6285
	- Rewards/chosen: 0.5979
	- Rewards/rejected: 0.1801
	- Rewards/accuracies: 0.6620
	- Rewards/margins: 0.4178
	- Logps/rejected: -2212.5046
	- Logps/chosen: -2612.9824
	- Logits/rejected: -1.3033
	- Logits/chosen: -1.3358
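
These metrics are related to one another: the reward margin is the chosen reward minus the rejected reward, which can be checked against the figures above. The sketch below also evaluates the standard sigmoid DPO loss at zero margin as a reference point. That this run used the sigmoid loss is an assumption, since the card does not state the loss type, and `sigmoid_dpo_loss` is an illustrative helper rather than the trainer's code.

```python
import math

# Evaluation metrics copied from the list above.
rewards_chosen = 0.5979
rewards_rejected = 0.1801
reported_margin = 0.4178
reported_loss = 0.6285

# The margin is the gap between the chosen and rejected rewards.
margin = rewards_chosen - rewards_rejected
assert round(margin, 4) == reported_margin


def sigmoid_dpo_loss(reward_margin):
    """Per-pair loss -log(sigmoid(margin)), written in a numerically stable form.

    Assumption: the standard sigmoid DPO loss; the card does not name the loss.
    """
    return math.log1p(math.exp(-abs(reward_margin))) + max(-reward_margin, 0.0)


# With no preference signal (margin 0) the loss equals ln 2, so a reported
# loss below that value means chosen responses are favored on average.
baseline = sigmoid_dpo_loss(0.0)
print(f"margin={margin:.4f} baseline={baseline:.4f} reported={reported_loss:.4f}")
```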

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 1e-06
	- train_batch_size: 2
	- eval_batch_size: 4
	- seed: 42
	- distributed_type: multi-GPU
	- num_devices: 4
	- gradient_accumulation_steps: 2
	- total_train_batch_size: 16
	- total_eval_batch_size: 16
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: cosine
	- lr_scheduler_warmup_ratio: 0.1
	- num_epochs: 1
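
The effective batch sizes follow from the per-device values, and the learning-rate curve can be sketched from the listed scheduler settings. The code below is an illustration only: `lr_at` is a hypothetical helper that reimplements linear warmup followed by cosine decay to zero, and `TOTAL_STEPS = 3800` is an assumed round figure, not a value stated in the card.

```python
import math

# Values from the hyperparameter list above.
train_batch_size = 2  # per device
eval_batch_size = 4   # per device
num_devices = 4
gradient_accumulation_steps = 2
learning_rate = 1e-06
warmup_ratio = 0.1

# Effective batch sizes; gradient accumulation applies to training only.
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices
assert total_train_batch_size == 16
assert total_eval_batch_size == 16

TOTAL_STEPS = 3800  # assumption for illustration, not taken from the card


def lr_at(step, total_steps=TOTAL_STEPS):
    """Linear warmup to the peak rate, then cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return learning_rate * 0.5 * (1.0 + math.cos(math.pi * progress))


# Sample the curve at the start, mid-warmup, peak, mid-decay, and end.
for step in (0, 190, 380, 2090, 3800):
    print(step, f"{lr_at(step):.2e}")
```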

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
	|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
	| 0.6694 | 0.03 | 100 | 0.6733 | 0.4668 | 0.3687 | 0.5500 | 0.0980 | -2193.6436 | -2626.0984 | -1.2047 | -1.2463 |
	| 0.6496 | 0.05 | 200 | 0.6497 | 0.8935 | 0.6578 | 0.6040 | 0.2357 | -2164.7385 | -2583.4270 | -1.1621 | -1.2030 |
	| 0.6358 | 0.08 | 300 | 0.6672 | 0.6703 | 0.4436 | 0.5900 | 0.2266 | -2186.1528 | -2605.7471 | -1.2202 | -1.2617 |
	| 0.6783 | 0.1 | 400 | 0.7144 | 0.2834 | 0.0925 | 0.5680 | 0.1909 | -2221.2676 | -2644.4390 | -1.3598 | -1.4017 |
	| 0.751 | 0.13 | 500 | 0.6889 | 1.3453 | 0.9758 | 0.6020 | 0.3696 | -2132.9402 | -2538.2405 | -1.4750 | -1.5419 |
	| 0.6921 | 0.16 | 600 | 0.6644 | 0.8464 | 0.5451 | 0.6220 | 0.3014 | -2176.0090 | -2588.1318 | -1.2841 | -1.3381 |
	| 0.6437 | 0.18 | 700 | 0.6724 | 0.8250 | 0.4796 | 0.6420 | 0.3454 | -2182.5566 | -2590.2764 | -1.4526 | -1.4817 |
	| 0.8109 | 0.21 | 800 | 0.6655 | 1.1490 | 0.7473 | 0.6380 | 0.4017 | -2155.7832 | -2557.8708 | -1.5267 | -1.5761 |
	| 0.6725 | 0.24 | 900 | 0.6836 | 1.4258 | 0.9989 | 0.6160 | 0.4269 | -2130.6240 | -2530.1914 | -1.4486 | -1.4910 |
	| 0.7027 | 0.26 | 1000 | 0.6690 | 0.8152 | 0.4729 | 0.6260 | 0.3424 | -2183.2278 | -2591.2505 | -1.5095 | -1.5565 |
	| 0.6421 | 0.29 | 1100 | 0.6513 | 0.5281 | 0.1941 | 0.6640 | 0.3340 | -2211.1040 | -2619.9661 | -1.5382 | -1.5785 |
	| 0.6217 | 0.31 | 1200 | 0.6436 | 0.7372 | 0.3396 | 0.6460 | 0.3976 | -2196.5581 | -2599.0544 | -1.6345 | -1.6765 |
	| 0.7365 | 0.34 | 1300 | 0.6400 | 0.9183 | 0.5227 | 0.6240 | 0.3956 | -2178.2437 | -2580.9446 | -1.5597 | -1.6009 |
	| 0.7057 | 0.37 | 1400 | 0.6468 | 0.9514 | 0.5619 | 0.6140 | 0.3895 | -2174.3254 | -2577.6377 | -1.6716 | -1.7117 |
	| 0.6396 | 0.39 | 1500 | 0.6498 | 0.9546 | 0.5405 | 0.6400 | 0.4141 | -2176.4675 | -2577.3193 | -1.6244 | -1.6600 |
	| 0.5835 | 0.42 | 1600 | 0.6488 | 0.9504 | 0.5356 | 0.6480 | 0.4148 | -2176.9568 | -2577.7402 | -1.6255 | -1.6706 |
	| 0.629 | 0.44 | 1700 | 0.6501 | 1.2484 | 0.8056 | 0.6100 | 0.4428 | -2149.9568 | -2547.9316 | -1.5737 | -1.6192 |
	| 0.6495 | 0.47 | 1800 | 0.6440 | 1.2029 | 0.7629 | 0.6280 | 0.4400 | -2154.2307 | -2552.4846 | -1.4589 | -1.4973 |
	| 0.6465 | 0.5 | 1900 | 0.6641 | 0.2111 | -0.0941 | 0.6280 | 0.3052 | -2239.9255 | -2651.6641 | -1.4961 | -1.5323 |
	| 0.6866 | 0.52 | 2000 | 0.6480 | 0.5747 | 0.1977 | 0.6600 | 0.3770 | -2210.75 | -2615.3054 | -1.4509 | -1.4934 |
	| 0.6441 | 0.55 | 2100 | 0.6358 | 0.8809 | 0.4502 | 0.6480 | 0.4307 | -2185.4985 | -2584.6841 | -1.4418 | -1.4842 |
	| 0.6752 | 0.58 | 2200 | 0.6346 | 0.9311 | 0.5075 | 0.6560 | 0.4236 | -2179.7668 | -2579.6636 | -1.3193 | -1.3656 |
	| 0.5646 | 0.6 | 2300 | 0.6396 | 0.6599 | 0.2912 | 0.6480 | 0.3686 | -2201.3948 | -2606.7883 | -1.2832 | -1.3116 |
	| 0.6519 | 0.63 | 2400 | 0.6451 | 0.4237 | 0.0937 | 0.6400 | 0.3300 | -2221.1460 | -2630.4050 | -1.4460 | -1.4777 |
	| 0.6292 | 0.65 | 2500 | 0.6313 | 0.8682 | 0.4231 | 0.6460 | 0.4452 | -2188.2095 | -2585.9512 | -1.4040 | -1.4397 |
	| 0.5985 | 0.68 | 2600 | 0.6274 | 0.8396 | 0.3650 | 0.6640 | 0.4746 | -2194.0144 | -2588.8174 | -1.3580 | -1.3860 |
	| 0.6323 | 0.71 | 2700 | 0.6328 | 0.6585 | 0.2012 | 0.6640 | 0.4573 | -2210.3958 | -2606.9260 | -1.2622 | -1.2938 |
	| 0.6174 | 0.73 | 2800 | 0.6305 | 0.8505 | 0.3762 | 0.6580 | 0.4744 | -2192.8989 | -2587.7209 | -1.3312 | -1.3635 |
	| 0.5972 | 0.76 | 2900 | 0.6310 | 0.6521 | 0.2290 | 0.6600 | 0.4231 | -2207.6130 | -2607.5659 | -1.3492 | -1.3840 |
	| 0.6645 | 0.79 | 3000 | 0.6291 | 0.7035 | 0.2579 | 0.6520 | 0.4456 | -2204.7251 | -2602.4238 | -1.3330 | -1.3678 |
	| 0.5786 | 0.81 | 3100 | 0.6310 | 0.5452 | 0.1222 | 0.6580 | 0.4230 | -2218.2944 | -2618.2534 | -1.3173 | -1.3498 |
	| 0.604 | 0.84 | 3200 | 0.6375 | 0.3327 | -0.0527 | 0.6540 | 0.3854 | -2235.7852 | -2639.5032 | -1.3444 | -1.3760 |
	| 0.6704 | 0.86 | 3300 | 0.6269 | 0.7327 | 0.2896 | 0.6540 | 0.4431 | -2201.5579 | -2599.5049 | -1.3241 | -1.3585 |
	| 0.6365 | 0.89 | 3400 | 0.6271 | 0.6900 | 0.2577 | 0.6560 | 0.4323 | -2204.7437 | -2603.7739 | -1.3038 | -1.3371 |
	| 0.6621 | 0.92 | 3500 | 0.6279 | 0.6303 | 0.2073 | 0.6580 | 0.4230 | -2209.7827 | -2609.7432 | -1.2991 | -1.3321 |
	| 0.6597 | 0.94 | 3600 | 0.6294 | 0.5540 | 0.1441 | 0.6580 | 0.4099 | -2216.1082 | -2617.3774 | -1.3028 | -1.3348 |
	| 0.671 | 0.97 | 3700 | 0.6285 | 0.5945 | 0.1774 | 0.6600 | 0.4171 | -2212.7783 | -2613.3303 | -1.3033 | -1.3358 |
	| 0.6328 | 0.99 | 3800 | 0.6283 | 0.5985 | 0.1803 | 0.6580 | 0.4182 | -2212.4902 | -2612.9258 | -1.3032 | -1.3356 |
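
One way to read this log is to select the evaluation step with the lowest validation loss. The sketch below does so over a hand-copied subset of the table (the last six evaluations), so its answer only covers those rows. `EvalRow` and `best_by_validation_loss` are illustrative names, not part of any library.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    step: int
    validation_loss: float
    reward_accuracy: float
    reward_margin: float


# Last six evaluation rows, copied from the table above.
ROWS = [
    EvalRow(3300, 0.6269, 0.6540, 0.4431),
    EvalRow(3400, 0.6271, 0.6560, 0.4323),
    EvalRow(3500, 0.6279, 0.6580, 0.4230),
    EvalRow(3600, 0.6294, 0.6580, 0.4099),
    EvalRow(3700, 0.6285, 0.6600, 0.4171),
    EvalRow(3800, 0.6283, 0.6580, 0.4182),
]


def best_by_validation_loss(rows):
    """Return the row with the smallest validation loss."""
    return min(rows, key=lambda row: row.validation_loss)


best = best_by_validation_loss(ROWS)
print(best.step, best.validation_loss)
```

Selecting on validation loss is only one possible criterion. Reward accuracy or reward margin could be used instead by changing the `key` function.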


	### Framework versions

	- Transformers 4.36.2
	- Pytorch 2.1.2
	- Datasets 2.14.6
	- Tokenizers 0.15.2