Instructions to use HuggingFaceM4/VLM_WebSight_finetuned with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use HuggingFaceM4/VLM_WebSight_finetuned with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="HuggingFaceM4/VLM_WebSight_finetuned", trust_remote_code=True)

# Load model directly
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("HuggingFaceM4/VLM_WebSight_finetuned", trust_remote_code=True, dtype="auto")
- Notebooks
- Google Colab
- Kaggle
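Because the Transformers snippet above passes `trust_remote_code=True`, the modeling code stored in the repository runs locally, and the commit history shows that this code changed many times. A minimal sketch of pinning the load to one revision follows. It assumes the `revision` argument of `from_pretrained` (a branch name, tag, or commit id); the helper names `build_load_kwargs` and `load_model` are hypothetical and not part of the model card.

```python
from typing import Optional

REPO_ID = "HuggingFaceM4/VLM_WebSight_finetuned"


def build_load_kwargs(revision: Optional[str] = None) -> dict:
    """Return keyword arguments for from_pretrained (hypothetical helper)."""
    # Same arguments as the snippet on the model page.
    kwargs = {"trust_remote_code": True, "dtype": "auto"}
    if revision is not None:
        # Assumption: pin to a branch name, tag, or commit id so that the
        # remote code being executed does not change between runs.
        kwargs["revision"] = revision
    return kwargs


def load_model(revision: Optional[str] = None):
    """Load the model, optionally pinned to one revision (hypothetical helper)."""
    # Imported lazily so build_load_kwargs works without transformers installed.
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(REPO_ID, **build_load_kwargs(revision))
```

Calling `load_model()` without arguments behaves like the original snippet; passing a revision string is optional.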
- Local Apps
- vLLM
How to use HuggingFaceM4/VLM_WebSight_finetuned with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "HuggingFaceM4/VLM_WebSight_finetuned"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "HuggingFaceM4/VLM_WebSight_finetuned",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
Use Docker
docker model run hf.co/HuggingFaceM4/VLM_WebSight_finetuned
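The curl call above can also be made from Python. The following is a minimal sketch using only the standard library. It assumes a server is already running at the given base URL and that the response follows the OpenAI-style completions shape (`choices[0].text`); the function names `build_completion_request` and `complete` are hypothetical.

```python
import json
import urllib.request

MODEL_ID = "HuggingFaceM4/VLM_WebSight_finetuned"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build the same POST request as the curl example (hypothetical helper)."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the generated text."""
    with urllib.request.urlopen(build_completion_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    # Assumption: OpenAI-style responses carry the text under choices[0].text.
    return body["choices"][0]["text"]
```

For the SGLang server described on this page, the same helper should work with `base_url="http://localhost:30000"`.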
- SGLang
How to use HuggingFaceM4/VLM_WebSight_finetuned with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "HuggingFaceM4/VLM_WebSight_finetuned" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "HuggingFaceM4/VLM_WebSight_finetuned",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
Use Docker images
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "HuggingFaceM4/VLM_WebSight_finetuned" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "HuggingFaceM4/VLM_WebSight_finetuned",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
- Docker Model Runner
How to use HuggingFaceM4/VLM_WebSight_finetuned with Docker Model Runner:
docker model run hf.co/HuggingFaceM4/VLM_WebSight_finetuned
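When the SGLang server from the earlier section is started from a script rather than a shell, its command line can be assembled as an argument list. This is a minimal sketch that uses only the flags shown on this page (`--model-path`, `--host`, `--port`). The helper name `sglang_launch_argv` is hypothetical, and `sys.executable` stands in for `python3` as an assumption about the environment.

```python
import shlex
import sys

MODEL_ID = "HuggingFaceM4/VLM_WebSight_finetuned"


def sglang_launch_argv(model_path=MODEL_ID, host="0.0.0.0", port=30000):
    """Return the launch command as an argv list (hypothetical helper)."""
    return [
        sys.executable, "-m", "sglang.launch_server",
        "--model-path", model_path,
        "--host", host,
        "--port", str(port),  # argv entries must be strings
    ]


if __name__ == "__main__":
    # Print the shell-quoted command; pass the list to subprocess.Popen to run it.
    print(shlex.join(sglang_launch_argv()))
```

Keeping the command as a list avoids shell-quoting problems if the model path or host is replaced with a value that contains spaces.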
Commit History
lets try throught he config 1b2ecd5
let it figure out automatically 9baf545
perhaps with this syntax a263d03
trying that 0e28336
perhaps it's the model type b7b417b
fix 0bfa212
fix 719f253
typo 7aa0923
trying the autoconfig 1f31f46
nope, wasn't that 8d33f67
not sure what that syntax is d8ae54e
autoimageprocesor c8747a5
auto map to preprocessor_config.json a8fbbf3
add auto_map b0eec2b
cleaning b2fcf7c
try the remote f91fa43
autoprocessor class db0b5ed
git add image processing 9da4b20
change type 8e32f98
model weights 3f133cb
configs a9211f5
modeling 9505bbc
initial commit a1abacc
Victor Sanh commited on