---
language:
  - en
pipeline_tag: text-generation
tags:
  - gguf
  - llama.cpp
  - lmstudio
  - traffic-signal-control
  - simulation
license: cc-by-nc-4.0
---

# DeepSignal-4B-V1 (GGUF)

This repository provides a GGUF model file for local inference (e.g., `llama.cpp` / LM Studio). It is intended for traffic-signal-control analysis and related text-generation workflows.
For details, check our repository at [`AIMSLaboratory/DeepSignal`](https://github.com/AIMSLaboratory/DeepSignal).


## Files

- `DeepSignal-4B_V1.F16.gguf`
- `config.json`

## Quickstart (llama.cpp)

```bash
llama-cli -m DeepSignal-4B_V1.F16.gguf -p "You are a traffic management expert. You can use your traffic knowledge to solve the traffic signal control task.
Based on the given traffic {scene} and {state}, predict the next signal phase and its duration.
You must answer directly, the format must be: next signal phase: {number}, duration: {seconds} seconds
where the number is the phase index (starting from 0) and the seconds is the duration (usually between 20-90 seconds)."
```

*Replace `{scene}` with a description of the intersection (total number of phases, which lanes/directions each phase controls, the current phase ID, etc.) and `{state}` with the current traffic state (number of queuing vehicles per lane, vehicle throughput per lane during the current phase, etc.).*
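
As a rough illustration of the substitution described above, the following sketch fills the two placeholders and checks a reply against the requested answer format. The helper names `build_prompt` and `parse_reply`, and the example scene/state wording, are hypothetical and not part of this repository; the model may expect a different level of detail in practice.

```bash
#!/usr/bin/env bash
# Prompt template copied from the Quickstart command. {number} and {seconds}
# describe the expected answer and are intentionally left untouched.
TEMPLATE='You are a traffic management expert. You can use your traffic knowledge to solve the traffic signal control task.
Based on the given traffic {scene} and {state}, predict the next signal phase and its duration.
You must answer directly, the format must be: next signal phase: {number}, duration: {seconds} seconds
where the number is the phase index (starting from 0) and the seconds is the duration (usually between 20-90 seconds).'

# build_prompt SCENE STATE (hypothetical helper)
# Prints the template with {scene} and {state} replaced by the two arguments.
build_prompt() {
  local prompt=$TEMPLATE
  prompt=${prompt//'{scene}'/"$1"}
  prompt=${prompt//'{state}'/"$2"}
  printf '%s\n' "$prompt"
}

# parse_reply REPLY (hypothetical helper)
# Prints "<phase> <seconds>" if REPLY contains the requested answer format,
# otherwise returns a non-zero status.
parse_reply() {
  local re='next signal phase: ([0-9]+), duration: ([0-9]+) seconds'
  if [[ "$1" =~ $re ]]; then
    printf '%s %s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
  else
    return 1
  fi
}

# Illustrative inputs only; the exact wording is an assumption.
SCENE="scene: 4 phases; phase 0 controls north-south through lanes, phase 1 north-south left turns, phase 2 east-west through lanes, phase 3 east-west left turns; current phase is 0"
STATE="state: queuing vehicles per lane N=6 S=4 E=12 W=9; vehicles passed per lane during the current phase N=8 S=7 E=0 W=0"

# Possible use with the Quickstart command (not executed here):
# llama-cli -m DeepSignal-4B_V1.F16.gguf -p "$(build_prompt "$SCENE" "$STATE")"
```

Checking the reply with `parse_reply` before acting on it is one way to detect answers that do not follow the requested format.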

## Evaluation (Traffic Simulation)

### Performance Metrics Comparison by Model *

| Model | Avg Saturation | Avg Cumulative Queue Length (veh⋅min) | Avg Throughput (veh/5min) | Avg Response Time (s) |
|:---:|:---:|:---:|:---:|:---:|
| [`GPT-OSS-20B (thinking)`](https://huggingface.co/openai/gpt-oss-20b) | 0.380 | 14.088 | 77.910 | 6.768 |
| **DeepSignal-4B (Ours)** | 0.422 | 15.703 | **79.883** | 2.131 |
| [`Qwen3-30B-A3B`](https://huggingface.co/Qwen/Qwen3-VL-30B-A3B-Instruct) | 0.431 | 17.046 | 79.059 | 2.727 |
| [`Qwen3-4B`](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) | 0.466 | 57.699 | 75.712 | 1.994 |
| Max Pressure | 0.465 | 23.022 | 77.236 | ** |
| [`LightGPT-8B-Llama3`](https://huggingface.co/lightgpt/LightGPT-8B-Llama3) | 0.523 | 54.384 | 75.512 | 3.025*** |

`*`: Each simulation scenario runs for 60 minutes. We discard the first **5 minutes** as warm-up, then compute metrics over the next **20 minutes** (minute 5 to 25). We cap the evaluation window because, when an LLM controls signal timing for only a single intersection, spillback from neighboring intersections may occur after ~20+ minutes and destabilize the scenario. All evaluations are conducted on a **Mac Studio M3 Ultra**.  
`**`: Max Pressure is a fixed signal-timing optimization algorithm (not an LLM), so we omit its Avg Response Time; this metric is only defined for LLM-based signal-timing optimization.  
`***`: For LightGPT-8B-Llama3, Avg Response Time is computed using only the successful responses.
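
The evaluation window from the first note can be sketched as a small filter. This assumes a hypothetical log with one `minute value` pair per line (the actual simulation output format is not documented here) and treats the window as the half-open interval from minute 5 up to, but not including, minute 25, which is also an assumption.

```bash
#!/usr/bin/env bash
# window_mean START END (hypothetical helper)
# Reads "minute value" pairs from stdin and prints the mean of the values
# whose minute lies in [START, END). Samples outside the window, such as the
# warm-up period, are ignored.
window_mean() {
  awk -v start="$1" -v end="$2" '
    $1 >= start && $1 < end { sum += $2; n++ }
    END {
      if (n > 0) printf "%.3f\n", sum / n
      else print "nan"
    }
  '
}

# Example with made-up samples: only minutes 5, 10 and 24 fall in the window.
printf '0 100\n5 10\n10 20\n24 30\n25 999\n' | window_mean 5 25
```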

## License
This project is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
Commercial use is strictly prohibited.