Instructions to use Pection/llama3-finetune with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use Pection/llama3-finetune with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-generation", model="Pection/llama3-finetune")

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("Pection/llama3-finetune")
    model = AutoModelForCausalLM.from_pretrained("Pection/llama3-finetune")
- Inference
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use Pection/llama3-finetune with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "Pection/llama3-finetune"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "Pection/llama3-finetune",
            "prompt": "Once upon a time,",
            "max_tokens": 512,
            "temperature": 0.5
        }'
Use Docker
docker model run hf.co/Pection/llama3-finetune
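The curl call above can also be issued from Python. The following is a minimal sketch using only the standard library, assuming the vLLM server from the previous step is running on localhost:8000. The helper names `build_completion_request` and `complete` are hypothetical and not part of vLLM; the payload fields mirror the curl example.

```python
import json
import urllib.request


def build_completion_request(prompt, base_url="http://localhost:8000",
                             model="Pection/llama3-finetune",
                             max_tokens=512, temperature=0.5):
    """Build (without sending) a POST request for the /v1/completions endpoint.

    Hypothetical helper: the defaults copy the values from the curl example.
    """
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request to a running server and return the decoded JSON reply."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `complete("Once upon a time,")` should return the parsed JSON response. Passing a different `base_url` points the same request at another OpenAI-compatible server.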
- SGLang
How to use Pection/llama3-finetune with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
        --model-path "Pection/llama3-finetune" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "Pection/llama3-finetune",
            "prompt": "Once upon a time,",
            "max_tokens": 512,
            "temperature": 0.5
        }'
Use Docker images
    docker run --gpus all \
        --shm-size 32g \
        -p 30000:30000 \
        -v ~/.cache/huggingface:/root/.cache/huggingface \
        --env "HF_TOKEN=<secret>" \
        --ipc=host \
        lmsysorg/sglang:latest \
        python3 -m sglang.launch_server \
            --model-path "Pection/llama3-finetune" \
            --host 0.0.0.0 \
            --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "Pection/llama3-finetune",
            "prompt": "Once upon a time,",
            "max_tokens": 512,
            "temperature": 0.5
        }'
- Docker Model Runner
How to use Pection/llama3-finetune with Docker Model Runner:
docker model run hf.co/Pection/llama3-finetune
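The curl calls in the vLLM and SGLang sections are described as OpenAI-compatible, so their JSON replies can be read the same way. The sketch below assumes the reply follows the OpenAI completions schema, with generated text under `choices[i].text`. The function name `extract_texts` is hypothetical, and the sample reply is made up for illustration; it is not real output from this model.

```python
def extract_texts(response):
    """Return the generated text of every choice in a completions reply.

    Assumes the OpenAI completions schema: {"choices": [{"text": ...}, ...]}.
    """
    return [choice["text"] for choice in response.get("choices", [])]


# Illustrative reply only (hand-written, not produced by the model).
sample_response = {
    "object": "text_completion",
    "model": "Pection/llama3-finetune",
    "choices": [
        {"index": 0, "text": " there was a small village.", "finish_reason": "stop"},
    ],
}

for text in extract_texts(sample_response):
    print(text)
```

A reply without a `choices` key yields an empty list rather than raising, which keeps the caller simple when the server returns an error object instead.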
Commit History
Update config.json 040728d verified
Update README.md bf315e2 verified
Update config.json ac5be09 verified
Update config.json 8546a04 verified
Update README.md b0cff90 verified
Update README.md ae0b463 verified
Update README.md 95cdb69 verified
Update README.md f57ef81 verified
Update README.md c882bc2 verified
Update README.md 6a1452a verified
Update README.md 54d383b verified
Update README.md d33e1d8 verified
Add: json 3f7a57f
Naphat committed on
Del: json f259c6e
Naphat committed on
Add: config.json b26d2a0
Naphat committed on
Delete config.json 892ee32
Naphat committed on
Update: git attributes d2fe434
Naphat committed on
Update: config.json 2902b77
Naphat committed on
Update README.md f3856fc verified
Update README.md 917ed89 verified
Update README.md e96efe7 verified
Update README.md 52602de verified
Update README.md 8242bc6 verified
Update README.md 4359a3c verified
Update: config file a480b5f
Naphat committed on
Configure Git LFS for large files d4561ad
Naphat committed on
Update: config file 755aa74
Naphat committed on
Add model files using Git LFS ec94218
Naphat committed on
README.md 5db139f
Naphat committed on