- Libraries
- Transformers
How to use zmli/G-Substrate-Qwen3-VL-2B with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="zmli/G-Substrate-Qwen3-VL-2B")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("zmli/G-Substrate-Qwen3-VL-2B")
model = AutoModelForImageTextToText.from_pretrained("zmli/G-Substrate-Qwen3-VL-2B")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use zmli/G-Substrate-Qwen3-VL-2B with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "zmli/G-Substrate-Qwen3-VL-2B"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "zmli/G-Substrate-Qwen3-VL-2B",
    "messages": [
      {
        "role": "user",
        "content": [
          { "type": "text", "text": "Describe this image in one sentence." },
          { "type": "image_url", "image_url": { "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg" } }
        ]
      }
    ]
  }'
Use Docker
docker model run hf.co/zmli/G-Substrate-Qwen3-VL-2B
- SGLang
How to use zmli/G-Substrate-Qwen3-VL-2B with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "zmli/G-Substrate-Qwen3-VL-2B" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "zmli/G-Substrate-Qwen3-VL-2B",
    "messages": [
      {
        "role": "user",
        "content": [
          { "type": "text", "text": "Describe this image in one sentence." },
          { "type": "image_url", "image_url": { "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg" } }
        ]
      }
    ]
  }'
Use Docker images
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "zmli/G-Substrate-Qwen3-VL-2B" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "zmli/G-Substrate-Qwen3-VL-2B",
    "messages": [
      {
        "role": "user",
        "content": [
          { "type": "text", "text": "Describe this image in one sentence." },
          { "type": "image_url", "image_url": { "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg" } }
        ]
      }
    ]
  }'
- Docker Model Runner
How to use zmli/G-Substrate-Qwen3-VL-2B with Docker Model Runner:
docker model run hf.co/zmli/G-Substrate-Qwen3-VL-2B
G-Substrate (Qwen3-VL-2B)
Multi-task fine-tuned model from the paper "Graph is a Substrate Across Data Modalities" (ICML 2026).
Model Description
This model is fine-tuned from Qwen3-VL-2B-Instruct using the G-Substrate framework. G-Substrate treats graph structure as a persistent structural substrate that accumulates knowledge across heterogeneous data modalities and tasks. It employs a unified structural schema for compatibility and an interleaved role-based training strategy.
Training Details
- Base model: Qwen3-VL-2B-Instruct
- Training method: Full fine-tuning (SFT) with multi-task learning
- Training data: Scene graphs, event graphs, molecular graphs, graph algorithmic tasks, and interleaved role-based data
- Epochs: 2
- Learning rate: 8e-6 (cosine schedule, warmup 10%)
- Batch size: 1 per device, 32 gradient accumulation steps
- GPUs: 2x NVIDIA A100 (DeepSpeed ZeRO-3)
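The hyperparameters above imply an effective batch size and a learning-rate curve, which can be sketched as follows. This is a minimal illustration, not the authors' training code: it assumes linear warmup over the first 10% of optimizer steps followed by cosine decay to zero, and `TOTAL_STEPS = 1000` is a hypothetical value, since the dataset size is not stated here.

```python
import math

# Values taken from the training details above.
PER_DEVICE_BATCH = 1
GRAD_ACCUM_STEPS = 32
NUM_GPUS = 2
PEAK_LR = 8e-6
WARMUP_RATIO = 0.10


def effective_batch_size() -> int:
    """Samples consumed per optimizer step across all GPUs."""
    return PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * NUM_GPUS


def lr_at(step: int, total_steps: int) -> float:
    """Learning rate at an optimizer step (assumed: linear warmup, cosine decay to 0)."""
    warmup_steps = int(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


TOTAL_STEPS = 1000  # hypothetical; the real value depends on the dataset size

print("effective batch size:", effective_batch_size())
for step in (0, 50, 100, 550, 1000):
    print(step, lr_at(step, TOTAL_STEPS))
```

Under these assumptions each optimizer step sees 1 × 32 × 2 = 64 samples, and the learning rate peaks at 8e-6 once warmup ends.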
Supported Tasks
| Task | Domain | Input | Output |
|---|---|---|---|
| Scene Graph Generation | Visual | Image | Graph triplets |
| Event Relation Extraction | Text | Document + events | Event relation graph |
| Molecular Graph Description | Scientific | SMILES + graph | Natural language description |
| Graph Algorithmic Reasoning | Algorithmic | Graph description | Algorithm answer |
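Several of the tasks above take or produce graphs. As a rough illustration of what "graph triplets" means, the sketch below stores a scene graph as (subject, predicate, object) tuples and groups them into an adjacency map. The example triplets and the `to_adjacency` helper are hypothetical; the model's actual output format is not specified in this card.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Triplet = Tuple[str, str, str]  # (subject, predicate, object)


def to_adjacency(triplets: List[Triplet]) -> Dict[str, List[Tuple[str, str]]]:
    """Group triplets by subject: node -> list of (predicate, object) edges."""
    adjacency: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for subject, predicate, obj in triplets:
        adjacency[subject].append((predicate, obj))
    return dict(adjacency)


# Hypothetical scene graph for an image of a person riding a horse on a beach.
example = [
    ("person", "riding", "horse"),
    ("horse", "standing on", "beach"),
    ("person", "wearing", "hat"),
]
graph = to_adjacency(example)
print(graph["person"])
```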
Usage
# Qwen3-VL is a vision-language model, so load it through the image-text-to-text classes
from transformers import AutoModelForImageTextToText, AutoProcessor
model = AutoModelForImageTextToText.from_pretrained("zmli/G-Substrate-Qwen3-VL-2B")
processor = AutoProcessor.from_pretrained("zmli/G-Substrate-Qwen3-VL-2B")
For batch inference with vLLM, see the G-Substrate repository.
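After loading, generation follows the usual Transformers chat-template flow. The sketch below applies it to the graph algorithmic reasoning task under some assumptions: `build_graph_messages` and its prompt wording are hypothetical and not necessarily the prompt format used in training, and `answer` needs `transformers` and `torch` installed and downloads the model weights when called.

```python
from typing import Dict, List, Tuple

MODEL_ID = "zmli/G-Substrate-Qwen3-VL-2B"


def build_graph_messages(edges: List[Tuple[int, int]], question: str) -> List[Dict]:
    """Serialize an undirected edge list into a single-turn chat (hypothetical prompt format)."""
    edge_text = ", ".join(f"({u}, {v})" for u, v in edges)
    prompt = f"The undirected graph has edges: {edge_text}. {question}"
    return [{"role": "user", "content": [{"type": "text", "text": prompt}]}]


def answer(messages: List[Dict], max_new_tokens: int = 64) -> str:
    """Run one generation pass; heavy imports are local so the helper above stays lightweight."""
    from transformers import AutoModelForImageTextToText, AutoProcessor

    processor = AutoProcessor.from_pretrained(MODEL_ID)
    model = AutoModelForImageTextToText.from_pretrained(MODEL_ID)
    inputs = processor.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the newly generated text is decoded.
    new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
    return processor.decode(new_tokens, skip_special_tokens=True)


messages = build_graph_messages(
    [(0, 1), (1, 2), (3, 4)],
    "Is there a path between node 0 and node 2?",
)
print(messages[0]["content"][0]["text"])
# To generate a reply (this downloads the weights): print(answer(messages))
```

Image inputs for scene graph generation can be passed the same way by adding an `{"type": "image", "url": ...}` entry to the message content.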
Results
| CT | CD | SP | BM | BLEU-4 | ROUGE-L | PCIs R@50 | MA-S F1 | MA-T F1 | MA-C F1 | HiE F1 |
|---|---|---|---|---|---|---|---|---|---|---|
| 98.41 | 96.97 | 48.59 | 94.54 | 51.53 | 68.47 | 25.38 | 52.20 | 42.68 | 40.91 | 25.15 |
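Several columns above are F1 scores. As a reminder of how such a score is commonly computed for relation extraction, the sketch below calculates exact-match F1 between a predicted and a gold set of relation tuples. It is a generic illustration with made-up tuples; the paper's evaluation protocol may differ.

```python
from typing import Set, Tuple

Relation = Tuple[str, str, str]  # (head event, relation type, tail event)


def f1_score(predicted: Set[Relation], gold: Set[Relation]) -> float:
    """Exact-match F1 between predicted and gold relation sets."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Made-up relations for illustration only.
gold = {("e1", "before", "e2"), ("e2", "causes", "e3"), ("e1", "subevent", "e4")}
predicted = {("e1", "before", "e2"), ("e2", "causes", "e4")}
print(round(100 * f1_score(predicted, gold), 2))
```

Here one of two predictions is correct (precision 1/2) and one of three gold relations is recovered (recall 1/3), giving an F1 of 0.4, or 40 on the percentage scale used in the table.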
Citation
@inproceedings{li2026gsubstrate,
title={Graph is a Substrate Across Data Modalities},
author={Li, Ziming and Wu, Xiaoming and Wang, Zehong and Li, Jiazheng and Tian, Yijun and Bi, Jinhe and Ma, Yunpu and Ye, Yanfang and Zhang, Chuxu},
booktitle={ICML},
year={2026}
}
Paper: arXiv:2601.22384