Text Generation
Transformers
TensorBoard
Safetensors
Hindi
English
gpt2
text2text-generation
text-generation-inference
Instructions to use kvrma/finetunemodel with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use kvrma/finetunemodel with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-generation", model="kvrma/finetunemodel")

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("kvrma/finetunemodel")
    model = AutoModelForCausalLM.from_pretrained("kvrma/finetunemodel")

- Notebooks
- Google Colab
- Kaggle
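In a notebook such as Google Colab or Kaggle, the Transformers pipeline snippet above only loads the model; it does not yet produce text. The following is a minimal sketch of one way to generate a continuation, assuming `transformers` and a backend such as PyTorch are installed. The helper names `strip_prompt` and `generate` are hypothetical, and the sampling values (`max_new_tokens=50`, `temperature=0.5`) are illustrative assumptions rather than settings recommended for this model.

```python
def strip_prompt(prompt: str, generated_text: str) -> str:
    """Return only the continuation when the output begins with the prompt."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):]
    return generated_text


def generate(prompt: str, max_new_tokens: int = 50) -> str:
    """Generate a continuation for `prompt` (hypothetical helper, not part of the model repo)."""
    # Imported lazily so strip_prompt can be used without transformers installed.
    from transformers import pipeline

    pipe = pipeline("text-generation", model="kvrma/finetunemodel")
    outputs = pipe(
        prompt,
        max_new_tokens=max_new_tokens,  # assumed value, adjust as needed
        do_sample=True,
        temperature=0.5,                # assumed value, adjust as needed
    )
    # The pipeline returns a list of dicts, each with a "generated_text" key
    # that by default contains the prompt followed by the continuation.
    return strip_prompt(prompt, outputs[0]["generated_text"])
```

Calling `generate("Once upon a time,")` downloads the model weights on first use, so the first call may take a while.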
- Local Apps
- vLLM
How to use kvrma/finetunemodel with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "kvrma/finetunemodel"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "kvrma/finetunemodel",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
      }'

Use Docker

    docker model run hf.co/kvrma/finetunemodel
- SGLang
How to use kvrma/finetunemodel with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
      --model-path "kvrma/finetunemodel" \
      --host 0.0.0.0 \
      --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "kvrma/finetunemodel",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
      }'

Use Docker images

    docker run --gpus all \
      --shm-size 32g \
      -p 30000:30000 \
      -v ~/.cache/huggingface:/root/.cache/huggingface \
      --env "HF_TOKEN=<secret>" \
      --ipc=host \
      lmsysorg/sglang:latest \
      python3 -m sglang.launch_server \
        --model-path "kvrma/finetunemodel" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "kvrma/finetunemodel",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
      }'

- Docker Model Runner
How to use kvrma/finetunemodel with Docker Model Runner:
    docker model run hf.co/kvrma/finetunemodel
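The curl calls shown for vLLM and SGLang both target an OpenAI-compatible `/v1/completions` endpoint, so the same request can be sent from Python. The sketch below uses only the standard library and mirrors the JSON body of the curl examples. The function names `build_payload` and `complete` are hypothetical, and the default base URL assumes the vLLM server on port 8000; pass `base_url="http://localhost:30000"` for the SGLang server started as above.

```python
import json
import urllib.request


def build_payload(prompt, max_tokens=512, temperature=0.5, model="kvrma/finetunemodel"):
    """Build the same JSON body that the curl examples send."""
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def complete(prompt, base_url="http://localhost:8000", **kwargs):
    """POST a completion request to a running server and return the generated text."""
    body = json.dumps(build_payload(prompt, **kwargs)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        result = json.load(response)
    # OpenAI-style completion responses carry the text in choices[0]["text"].
    return result["choices"][0]["text"]
```

With a server running, `complete("Once upon a time,")` returns the generated continuation as a string; without one, the call raises `urllib.error.URLError`.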
Commit History
Update README.md 0150156 verified
Update README.md bde2384 verified
mahek jasani committed on
Update README.md 1b605f0 verified
mahek jasani committed on
Update README.md 2d49fe3 verified
mahek jasani committed on
Update README.md 8754eb3 verified
mahek jasani committed on
Update README.md f88d3c2 verified
mahek jasani committed on
Update README.md 79a56cf verified
mahek jasani committed on
Upload 11 files 214ffe4 verified
mahek jasani committed on
Upload 4 files 16a82f9 verified
mahek jasani committed on
Create README.md 1aef375 verified
mahek jasani committed on
Upload 4 files 5042aa2 verified
mahek jasani committed on
initial commit 64b9748 verified
mahek jasani committed on