Instructions to use programasweights/paw-4b-gpt2 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use programasweights/paw-4b-gpt2 with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-generation", model="programasweights/paw-4b-gpt2")

    # Load model directly
    from transformers import AutoModel

    model = AutoModel.from_pretrained("programasweights/paw-4b-gpt2", dtype="auto")
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use programasweights/paw-4b-gpt2 with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "programasweights/paw-4b-gpt2"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "programasweights/paw-4b-gpt2",
            "prompt": "Once upon a time,",
            "max_tokens": 512,
            "temperature": 0.5
        }'
Use Docker
    docker model run hf.co/programasweights/paw-4b-gpt2
- SGLang
How to use programasweights/paw-4b-gpt2 with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
        --model-path "programasweights/paw-4b-gpt2" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "programasweights/paw-4b-gpt2",
            "prompt": "Once upon a time,",
            "max_tokens": 512,
            "temperature": 0.5
        }'
Use Docker images
    docker run --gpus all \
        --shm-size 32g \
        -p 30000:30000 \
        -v ~/.cache/huggingface:/root/.cache/huggingface \
        --env "HF_TOKEN=<secret>" \
        --ipc=host \
        lmsysorg/sglang:latest \
        python3 -m sglang.launch_server \
            --model-path "programasweights/paw-4b-gpt2" \
            --host 0.0.0.0 \
            --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "programasweights/paw-4b-gpt2",
            "prompt": "Once upon a time,",
            "max_tokens": 512,
            "temperature": 0.5
        }'
- Docker Model Runner
How to use programasweights/paw-4b-gpt2 with Docker Model Runner:
docker model run hf.co/programasweights/paw-4b-gpt2
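
The curl calls in the vLLM and SGLang sections can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_completion_request` and `complete` are hypothetical, and the response parsing assumes the server returns the usual OpenAI-style completions body with a `choices` list.

```python
import json
import urllib.request

MODEL_ID = "programasweights/paw-4b-gpt2"


def build_completion_request(base_url, prompt, max_tokens=512, temperature=0.5):
    """Build (but do not send) a POST request for the /v1/completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(base_url, prompt, **kwargs):
    """Send the request and return the text of the first completion.

    Assumes an OpenAI-compatible response: {"choices": [{"text": ...}, ...]}.
    """
    request = build_completion_request(base_url, prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With a server running, `complete("http://localhost:8000", "Once upon a time,")` targets the vLLM example; replace the port with 30000 for the SGLang example.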
{
  "add_prefix_space": false,
  "backend": "tokenizers",
  "bos_token": null,
  "clean_up_tokenization_spaces": false,
  "eos_token": "<|im_end|>",
  "errors": "replace",
  "extra_special_tokens": {
    "<prefix_1>": "<prefix_1>",
    "<prefix_2>": "<prefix_2>",
    "<prefix_3>": "<prefix_3>",
    "<prefix_4>": "<prefix_4>",
    "<prefix_5>": "<prefix_5>",
    "<prefix_6>": "<prefix_6>",
    "<prefix_7>": "<prefix_7>",
    "<prefix_8>": "<prefix_8>",
    "<prefix_9>": "<prefix_9>",
    "<prefix_10>": "<prefix_10>",
    "<prefix_11>": "<prefix_11>",
    "<prefix_12>": "<prefix_12>",
    "<prefix_13>": "<prefix_13>",
    "<prefix_14>": "<prefix_14>",
    "<prefix_15>": "<prefix_15>",
    "<prefix_16>": "<prefix_16>",
    "<prefix_17>": "<prefix_17>",
    "<prefix_18>": "<prefix_18>",
    "<prefix_19>": "<prefix_19>",
    "<prefix_20>": "<prefix_20>",
    "<prefix_21>": "<prefix_21>",
    "<prefix_22>": "<prefix_22>",
    "<prefix_23>": "<prefix_23>",
    "<prefix_24>": "<prefix_24>",
    "<prefix_25>": "<prefix_25>",
    "<prefix_26>": "<prefix_26>",
    "<prefix_27>": "<prefix_27>",
    "<prefix_28>": "<prefix_28>",
    "<prefix_29>": "<prefix_29>",
    "<prefix_30>": "<prefix_30>",
    "<prefix_31>": "<prefix_31>",
    "<prefix_32>": "<prefix_32>",
    "<prefix_33>": "<prefix_33>",
    "<prefix_34>": "<prefix_34>",
    "<prefix_35>": "<prefix_35>",
    "<prefix_36>": "<prefix_36>",
    "<prefix_37>": "<prefix_37>",
    "<prefix_38>": "<prefix_38>",
    "<prefix_39>": "<prefix_39>",
    "<prefix_40>": "<prefix_40>",
    "<prefix_41>": "<prefix_41>",
    "<prefix_42>": "<prefix_42>",
    "<prefix_43>": "<prefix_43>",
    "<prefix_44>": "<prefix_44>",
    "<prefix_45>": "<prefix_45>",
    "<prefix_46>": "<prefix_46>",
    "<prefix_47>": "<prefix_47>",
    "<prefix_48>": "<prefix_48>",
    "<prefix_49>": "<prefix_49>",
    "<prefix_50>": "<prefix_50>",
    "<prefix_51>": "<prefix_51>",
    "<prefix_52>": "<prefix_52>",
    "<prefix_53>": "<prefix_53>",
    "<prefix_54>": "<prefix_54>",
    "<prefix_55>": "<prefix_55>",
    "<prefix_56>": "<prefix_56>",
    "<prefix_57>": "<prefix_57>",
    "<prefix_58>": "<prefix_58>",
    "<prefix_59>": "<prefix_59>",
    "<prefix_60>": "<prefix_60>",
    "<prefix_61>": "<prefix_61>",
    "<prefix_62>": "<prefix_62>",
    "<prefix_63>": "<prefix_63>",
    "<prefix_64>": "<prefix_64>"
  },
  "is_local": true,
  "model_max_length": 1010000,
  "pad_token": "<|endoftext|>",
  "split_special_tokens": false,
  "tokenizer_class": "Qwen2Tokenizer",
  "unk_token": null
}
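
The `extra_special_tokens` map in this config follows a regular pattern: 64 entries named `<prefix_1>` through `<prefix_64>`, each mapping to itself. As a minimal sketch, that pattern can be generated and checked against a parsed config. The helper names `expected_prefix_tokens` and `missing_prefix_tokens` are hypothetical, and the `sample_config` dictionary is an abbreviated stand-in for the full file, not a copy of it.

```python
def expected_prefix_tokens(count=64):
    """Return the {name: name} mapping for <prefix_1> .. <prefix_count>."""
    return {f"<prefix_{i}>": f"<prefix_{i}>" for i in range(1, count + 1)}


def missing_prefix_tokens(config, count=64):
    """List prefix tokens that are absent from (or mismatched in) a parsed config."""
    declared = config.get("extra_special_tokens", {})
    return [
        name
        for name in expected_prefix_tokens(count)
        if declared.get(name) != name
    ]


# Abbreviated stand-in for the parsed tokenizer config (illustration only).
sample_config = {
    "eos_token": "<|im_end|>",
    "pad_token": "<|endoftext|>",
    "tokenizer_class": "Qwen2Tokenizer",
    "extra_special_tokens": expected_prefix_tokens(),
}

print(missing_prefix_tokens(sample_config))
```

To run the same check against the real file, parse it with `json.load` and pass the resulting dictionary to `missing_prefix_tokens`.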