Text Generation
Transformers
Safetensors
llama
model-collaboration
instruction-following
conversational
Instructions to use bunsenfeng/PFA_switcher_1 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use bunsenfeng/PFA_switcher_1 with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-generation", model="bunsenfeng/PFA_switcher_1")
    messages = [
        {"role": "user", "content": "Who are you?"},
    ]
    pipe(messages)

    # Load model directly
    from transformers import AutoModel

    model = AutoModel.from_pretrained("bunsenfeng/PFA_switcher_1", dtype="auto")
- Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- vLLM
How to use bunsenfeng/PFA_switcher_1 with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "bunsenfeng/PFA_switcher_1"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "bunsenfeng/PFA_switcher_1",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
Use Docker
    docker model run hf.co/bunsenfeng/PFA_switcher_1
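For callers who would rather not shell out to curl, the same OpenAI-compatible request can be made from Python. The following is a minimal sketch using only the standard library. It assumes the server started by `vllm serve` is listening on `localhost:8000` as in the curl example; the helper names `build_chat_request` and `ask` are hypothetical and not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "bunsenfeng/PFA_switcher_1"
# Assumption: host and port match the curl example for the vLLM server.
BASE_URL = "http://localhost:8000/v1"


def build_chat_request(prompt: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, base_url: str = BASE_URL, timeout: float = 60.0) -> str:
    """Send the request and return the text of the first returned choice."""
    request = build_chat_request(prompt, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # OpenAI-compatible responses carry the reply in choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With the server running, `ask("What is the capital of France?")` would return the model's reply as a string. Passing `base_url="http://localhost:30000/v1"` should target a server on port 30000 instead, since the SGLang example exposes the same endpoint path.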
- SGLang
How to use bunsenfeng/PFA_switcher_1 with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
        --model-path "bunsenfeng/PFA_switcher_1" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "bunsenfeng/PFA_switcher_1",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
Use Docker images
    docker run --gpus all \
        --shm-size 32g \
        -p 30000:30000 \
        -v ~/.cache/huggingface:/root/.cache/huggingface \
        --env "HF_TOKEN=<secret>" \
        --ipc=host \
        lmsysorg/sglang:latest \
        python3 -m sglang.launch_server \
            --model-path "bunsenfeng/PFA_switcher_1" \
            --host 0.0.0.0 \
            --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "bunsenfeng/PFA_switcher_1",
            "messages": [
                {
                    "role": "user",
                    "content": "What is the capital of France?"
                }
            ]
        }'
- Docker Model Runner
How to use bunsenfeng/PFA_switcher_1 with Docker Model Runner:
docker model run hf.co/bunsenfeng/PFA_switcher_1
Improve model card for Switch Generation model with paper, GitHub links, usage, and metadata
#1
by nielsr HF Staff - opened
This PR significantly improves the model card for the "Switch Generation" model.
Key updates include:
- Comprehensive Description: The boilerplate text has been replaced with a detailed summary derived from the paper's abstract, explaining the novel "Switch Generation" concept.
- Metadata Enrichment:
  - The `pipeline_tag: text-generation` has been added for better discoverability on the Hugging Face Hub.
  - Relevant tags such as `llama`, `model-collaboration`, and `instruction-following` have been included.
  - The `base_model` has been explicitly listed (`allenai/Llama-3.1-Tulu-3-8B`).
  - The license is set to `other` as no explicit license was found in the source materials.
- Linked Resources: Direct links to the academic paper (Don't Throw Away Your Pretrained Model) and the associated GitHub repository (https://github.com/BunsenFeng/switch_generation) have been added.
- Getting Started Guide: A "How to Get Started" section, including code snippets for environment setup and inference, has been extracted directly from the GitHub README.
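On the Hub, metadata like this lives in a YAML front-matter block at the top of the model card's README.md. As a rough illustration, the sketch below renders only the four fields named in this PR. The helper `render_front_matter` is hypothetical, and the key order and any further fields in the merged card are not shown in this discussion, so they are assumptions here.

```python
def render_front_matter(metadata: dict) -> str:
    """Render a flat dict (string or list-of-string values) as YAML front matter."""
    lines = ["---"]
    for key, value in metadata.items():
        if isinstance(value, list):
            # Lists become YAML block sequences, one "- item" per entry.
            lines.append(f"{key}:")
            lines.extend(f"- {item}" for item in value)
        else:
            lines.append(f"{key}: {value}")
    lines.append("---")
    return "\n".join(lines)


# Values taken from the PR description; nothing else is assumed about the card.
card_metadata = {
    "pipeline_tag": "text-generation",
    "tags": ["llama", "model-collaboration", "instruction-following"],
    "base_model": "allenai/Llama-3.1-Tulu-3-8B",
    "license": "other",
}

front_matter = render_front_matter(card_metadata)
print(front_matter)
```

The printed block opens and closes with `---`, which is how the Hub tells card metadata apart from the Markdown body that follows.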
These changes make the model card much more informative and user-friendly for researchers and practitioners.
bunsenfeng changed pull request status to merged