Upload processor_config.json with huggingface_hub
processor_config.json (ADDED, +7 -0)
@@ -0,0 +1,7 @@
+{
+  "image_token": "<image>",
+  "num_additional_image_tokens": 0,
+  "patch_size": 64,
+  "processor_class": "LlavaProcessor",
+  "vision_feature_select_strategy": null
+}
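The sketch below illustrates how the fields in the added file might be read and used. It parses the same JSON with the standard library and estimates how many `<image>` placeholder tokens one image would expand to. The counting rule is an assumption based on how `LlavaProcessor` in transformers is understood to use `patch_size`, `num_additional_image_tokens`, and `vision_feature_select_strategy`; the helper names `load_processor_config` and `estimate_image_tokens` and the 1024x1024 input size are hypothetical and are not part of this commit.

```python
import json

# Contents of processor_config.json as added in this commit.
CONFIG_TEXT = """
{
  "image_token": "<image>",
  "num_additional_image_tokens": 0,
  "patch_size": 64,
  "processor_class": "LlavaProcessor",
  "vision_feature_select_strategy": null
}
"""


def load_processor_config(text):
    """Parse the config and check that the keys from the commit are present.

    Hypothetical helper for illustration only.
    """
    config = json.loads(text)
    required = (
        "image_token",
        "num_additional_image_tokens",
        "patch_size",
        "processor_class",
        "vision_feature_select_strategy",
    )
    missing = [key for key in required if key not in config]
    if missing:
        raise KeyError(f"missing keys in processor config: {missing}")
    return config


def estimate_image_tokens(config, height, width):
    """Estimate the number of image tokens for one image.

    Assumed rule (not confirmed by this commit): one token per
    patch_size x patch_size patch, plus any additional tokens, minus one
    when the feature-select strategy is "default".
    """
    patch = config["patch_size"]
    count = (height // patch) * (width // patch)
    count += config["num_additional_image_tokens"]
    if config["vision_feature_select_strategy"] == "default":
        count -= 1
    return count


config = load_processor_config(CONFIG_TEXT)
# The 1024x1024 input size is an assumption chosen for illustration.
print(estimate_image_tokens(config, 1024, 1024))
```

Note that JSON `null` becomes Python `None`, so the `"default"` branch is not taken for this config and no token is subtracted.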