# Inference

Two runtime tracks are provided:

- `full_precision/`: single-image inference, multi-turn chat, and FastAPI service
- `int4_quantized/`: single-image inference, multi-turn chat, and FastAPI service for the INT4 path

Model weights directory:

- `./checkpoints`