Instructions to use adept/fuyu-8b with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use adept/fuyu-8b with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="adept/fuyu-8b")
```

```python
# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("adept/fuyu-8b")
model = AutoModelForImageTextToText.from_pretrained("adept/fuyu-8b")
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use adept/fuyu-8b with vLLM:
Install from pip and serve the model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "adept/fuyu-8b"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "adept/fuyu-8b",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker
```shell
docker model run hf.co/adept/fuyu-8b
```
- SGLang
How to use adept/fuyu-8b with SGLang:
Install from pip and serve the model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "adept/fuyu-8b" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "adept/fuyu-8b",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "adept/fuyu-8b" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "adept/fuyu-8b",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
- Docker Model Runner
How to use adept/fuyu-8b with Docker Model Runner:
```shell
docker model run hf.co/adept/fuyu-8b
```
How does the Fuyu model get images?
The question above, because from what I'm seeing, you take an image, split it into rows of patches, and give that to the model, and supposedly there is no real architectural difference from Persimmon-8b. How are the images going in? From what I can tell, you're not making image embeddings, so how is the model understanding images?
Hi @VatsaDev, I'm not sure I understand your question exactly, but the model does have a vision layer. It is simply linear, but it does create an embedding vector of the required dimension from each patch. Then, as you said, the embeddings are combined with the text embeddings from the prompt tokens and fed into a Persimmon-8b-like architecture.
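A rough sketch of that flow, with made-up dimensions (the 4096 hidden size and plain concatenation are simplifying assumptions here; the real code scatters the image embeddings at image-token positions, as the linked modeling code shows):

```python
import torch
import torch.nn as nn

hidden = 4096                  # assumed decoder hidden size
num_patches, num_text = 4, 7   # toy sequence lengths

# Stand-ins for what the real model produces:
text_embeds = torch.randn(1, num_text, hidden)          # from the token embedding table
patch_vecs = torch.randn(1, num_patches, 3 * 30 * 30)   # flattened 30x30 RGB patches

# The "vision layer": a single Linear projecting each patch to the hidden size.
vision_proj = nn.Linear(3 * 30 * 30, hidden)
image_embeds = vision_proj(patch_vecs)                  # (1, 4, 4096)

# Combined sequence fed to the Persimmon-8b-style decoder
# (concatenation here is a simplification of the real scatter).
inputs_embeds = torch.cat([image_embeds, text_embeds], dim=1)
print(inputs_embeds.shape)  # torch.Size([1, 11, 4096])
```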
I recommend inspecting the modeling code here to get a better sense of what the model is doing: https://github.com/huggingface/transformers/blob/9beb2737d758160e845b66742a0c01201e38007f/src/transformers/models/fuyu/modeling_fuyu.py#L154C1-L158C10
OK, so your vision layer is turning images into embeddings through an nn.Linear class?
Did you really have to train it, or does image-to-embedding just work?
Also, I'm sorry if this is too much, but I'm new to PyTorch and still learning it. Could you give me a code example of image -> embedding -> image?
The linear layer has to be trained.
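For the requested image -> embedding -> image example, here is a minimal PyTorch sketch. It mirrors the idea (30x30 patches projected by a Linear), but the dimensions are assumed, the weights are randomly initialised rather than trained, and the reverse direction is purely illustrative: Fuyu has no embedding-to-image decoder, so the reconstruction below is noise until both Linears are trained.

```python
import torch
import torch.nn as nn

# Toy image: batch of 1, RGB, 60x60 -> 2x2 = 4 non-overlapping 30x30 patches.
image = torch.rand(1, 3, 60, 60)

patch_dim = 3 * 30 * 30   # 2700 raw values per flattened patch
hidden = 4096             # assumed model dimension

# image -> embedding: cut into patches, flatten, project with a Linear.
unfold = nn.Unfold(kernel_size=30, stride=30)
patches = unfold(image).transpose(1, 2)      # (1, 4, 2700): one row per patch
to_embedding = nn.Linear(patch_dim, hidden)  # untrained here; Fuyu's is trained
embeddings = to_embedding(patches)           # (1, 4, 4096)

# embedding -> image: project back and re-assemble the patches.
to_patches = nn.Linear(hidden, patch_dim)
fold = nn.Fold(output_size=(60, 60), kernel_size=30, stride=30)
reconstructed = fold(to_patches(embeddings).transpose(1, 2))  # (1, 3, 60, 60)

print(embeddings.shape, reconstructed.shape)
```

With random weights the embeddings carry no meaning; training is what makes the projection useful to the language model, which is why the answer above says the linear layer has to be trained.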