Create README.md
README.md
ADDED
@@ -0,0 +1,165 @@
| 1 |
+
---
license: apache-2.0
language:
- en
library_name: peft
base_model: Qwen/Qwen3-VL-4B-Instruct
pipeline_tag: image-text-to-text
tags:
- vision-language
- autonomous-driving
- faithfulness
- critic
- lora
- grpo-reward
- waypoint-prediction
---
|
| 17 |
+
|
| 18 |
+
# FaithfulnessCritic
|
| 19 |
+
|
| 20 |
+
LoRA adapters over **Qwen3-VL-4B-Instruct** that score whether a vision-language driving planner's **reasoning (R)**, **meta-action (A)**, and **24-step waypoint plan (W)** are mutually self-consistent given the camera scene.
|
| 21 |
+
|
| 22 |
+
The critic emits a single token directly after a forced `<verdict>` prefix; the score `P(CONSISTENT) ∈ (0,1)` is recovered by softmaxing the logits over the two single-token verdict words `CONSISTENT` and `INCONSISTENT`. The model is intended as a frozen reward signal during GRPO planner training and as a faithfulness-auditing tool offline.
|
| 23 |
+
|
| 24 |
+
## Variants
|
| 25 |
+
|
| 26 |
+
The repo contains four adapter checkpoints under separate subfolders. They differ in (i) which **input class** the critic sees and (ii) which **counterfactual augmentation** strategies were used to construct the negative training examples.
|
| 27 |
+
|
| 28 |
+
| Subfolder | Input class | Negative strategies | Notes |
|---|---|---|---|
| `GB-S12` | BEV plot + speed profile | S1, S2 | Lighter: no scene-description corruption. |
| `GB-S123` | BEV plot + speed profile | S1, S2, S3 | All three failure modes. |
| `GP-S12` | Forward camera overlay + speed | S1, S2 | First-person view; uses calibration parquets. |
| `GP-S123` | Forward camera overlay + speed | S1, S2, S3 | All three failure modes. |
|
| 34 |
+
|
| 35 |
+
Where:
|
| 36 |
+
- **GB** = Gemini-curated dataset, **B**EV input.
- **GP** = Gemini-curated dataset, first-**P**erson input.
- **S1**: waypoint substitution. `W` is replaced with geometrically incompatible donor waypoints.
- **S2**: move-justification substitution. Only `R.move_justification` is swapped from a donor.
- **S3**: scene-description substitution. `R.scene` is swapped from a different scene.
|
| 41 |
+
|
| 42 |
+
Validation sets always include all three strategies in equal proportions, regardless of training mix, so the variants are directly comparable on the same benchmark.
|
| 43 |
+
|
| 44 |
+
## Quick start
|
| 45 |
+
|
| 46 |
+
Each subfolder is a standalone PEFT adapter. Load it on top of the base VLM:
|
| 47 |
+
|
| 48 |
+
```python
import torch
from peft import PeftModel
from transformers import AutoModelForImageTextToText, AutoProcessor

BASE = "Qwen/Qwen3-VL-4B-Instruct"
ADAPTER = "mjf-su/FaithfulnessCritic"
SUBFOLDER = "GB-S12"  # or GB-S123, GP-S12, GP-S123

processor = AutoProcessor.from_pretrained(BASE, trust_remote_code=True)
processor.tokenizer.padding_side = "left"

base = AutoModelForImageTextToText.from_pretrained(
    BASE, dtype=torch.bfloat16, trust_remote_code=True,
)
model = PeftModel.from_pretrained(base, ADAPTER, subfolder=SUBFOLDER)
model.eval().to("cuda")

# Build the chat-template prompt with image(s) + text and append "<verdict>"
# at the end so the next-token logits are over CONSISTENT / INCONSISTENT.
# See `critic_rewards.py:CriticRewardBase._build_prompt` for the full template
# and `_score_logit_mode` for the scoring path used to produce P(CONSISTENT).
```
|
| 71 |
+
|
| 72 |
+
The reference end-to-end pipeline lives at https://github.com/mjf-su/fms4navigation under `critic_library/Gemini_samples/{BEV,fPOV}/`.
|
| 73 |
+
|
| 74 |
+
## Inputs
|
| 75 |
+
|
| 76 |
+
A single input tuple `(Image, R, A, W)`, i.e. one rendered image plus the planner's `(R, A, W)` triplet:
|
| 77 |
+
- **Image**: forward-facing camera frame of the driving scene.
  - `GB-*` adapters consume a BEV trajectory plot + a speed-vs-time strip rendered purely from `W`.
  - `GP-*` adapters consume the camera frame with `W` projected as a teal polyline (full calibration + egomotion required) plus the same speed strip.
- **R**: `<think>{ "scene": ..., "move_justification": ... }</think>`.
- **A**: `<action> Longitudinal: <label> | Lateral: <label> </action>` from the canonical 7-longitudinal × 11-lateral vocabulary.
- **W**: 24 lines of `<wp>[x, y, θ]</wp>`, vehicle-relative, 0.25 s spacing, 6 s horizon.
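
As a rough illustration of the text side of these inputs, the sketch below renders `A` and `W` in the tag layouts listed above. The helper names, the number formatting (two or three decimals), the action labels, and the choice to place the first waypoint at 0.25 s are assumptions made for the example; the reference repository's `_build_prompt` defines the actual template.

```python
def format_action(longitudinal: str, lateral: str) -> str:
    """Render the meta-action A in the <action> layout described above."""
    return f"<action> Longitudinal: {longitudinal} | Lateral: {lateral} </action>"


def format_waypoints(waypoints, n_steps: int = 24) -> str:
    """Render W as one <wp>[x, y, theta]</wp> line per waypoint.

    The decimal precision is an assumption; the source does not specify it.
    """
    if len(waypoints) != n_steps:
        raise ValueError(f"expected {n_steps} waypoints, got {len(waypoints)}")
    return "\n".join(
        f"<wp>[{x:.2f}, {y:.2f}, {theta:.3f}]</wp>" for x, y, theta in waypoints
    )


def waypoint_times(n_steps: int = 24, dt: float = 0.25):
    """Timestamps for the plan, assuming the first waypoint sits at t = dt."""
    return [dt * (i + 1) for i in range(n_steps)]


# Hypothetical plan: drive straight ahead at a constant 5 m/s.
times = waypoint_times()
demo_plan = [(5.0 * t, 0.0, 0.0) for t in times]
demo_w = format_waypoints(demo_plan)
# "keep speed" / "keep lane" are placeholder labels, not the canonical vocabulary.
demo_a = format_action("keep speed", "keep lane")
```

With 24 steps at 0.25 s spacing, the last timestamp lands on the stated 6 s horizon.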
|
| 83 |
+
|
| 84 |
+
## Output
|
| 85 |
+
|
| 86 |
+
The critic emits a single token after a forced `<verdict>` prefix. Two scoring paths are supported:
|
| 87 |
+
|
| 88 |
+
| Mode | What it does | Range |
|---|---|---|
| `logit` (default) | Softmax over the two single-token verdict ids at the prompt's last position. | `P(CONSISTENT) ∈ (0,1)` |
| `generate` | Greedy-decode 8 tokens, regex-parse `CONSISTENT` / `INCONSISTENT`. | `{0.0, 0.5, 1.0}` |
|
| 92 |
+
|
| 93 |
+
Use `logit` mode for reward signals (smooth) and `generate` mode for human-readable verdicts.
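
The two scoring paths can be sketched in plain Python as follows. This is a minimal illustration, not the repository's `_score_logit_mode`: the function names are hypothetical, the two logits are assumed to have been read from the model's last-position output at the `CONSISTENT` and `INCONSISTENT` token ids, and mapping an unparseable generation to `0.5` is an assumption about what the middle value in `{0.0, 0.5, 1.0}` represents.

```python
import math
import re


def p_consistent(logit_consistent: float, logit_inconsistent: float) -> float:
    """Two-way softmax over the verdict logits.

    softmax([a, b])[0] equals sigmoid(a - b); the branch keeps exp() from
    overflowing for large logit gaps.
    """
    diff = logit_consistent - logit_inconsistent
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


# Word boundaries matter: "INCONSISTENT" contains "CONSISTENT" as a substring,
# so a naive substring test would score every verdict as CONSISTENT.
_VERDICT_RE = re.compile(r"\b(INCONSISTENT|CONSISTENT)\b")


def parse_verdict(generated_text: str) -> float:
    """Map a greedy-decoded string to a score in {0.0, 0.5, 1.0}."""
    match = _VERDICT_RE.search(generated_text)
    if match is None:
        return 0.5  # assumption: neutral score when no verdict can be parsed
    return 1.0 if match.group(1) == "CONSISTENT" else 0.0
```

With a loaded model, the two logits would typically be taken from the final position of the output logits for a prompt ending in `<verdict>`, indexed by the tokenizer ids of the two verdict words.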
|
| 94 |
+
|
| 95 |
+
## Training
|
| 96 |
+
|
| 97 |
+
- **Base**: Qwen/Qwen3-VL-4B-Instruct (frozen).
|
| 98 |
+
- **Adaptation**: LoRA (`r=256`, `lr=1e-4`).
|
| 99 |
+
- **Loss**: standard SFT next-token cross-entropy, supervising only the `CONSISTENT` / `INCONSISTENT` verdict token.
|
| 100 |
+
- **Positives**: ground-truth `(R, A, W)` triplets from a Gemini-curated subset of [PhysicalAI-Reason-US](https://huggingface.co/datasets/mjf-su/PhysicalAI-Reason-US).
|
| 101 |
+
- **Negatives**: counterfactual triplets built per strategy; donor eligibility requires both action axes to differ, different `scene_id`, same train/val split.
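
The donor-eligibility rule for negatives can be written as a small predicate. The sketch below is an assumption-laden illustration: the `Sample` record and its field names are hypothetical, and the action labels in the usage lines are placeholders rather than entries from the canonical vocabulary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """Hypothetical record holding only the fields the rule needs."""
    scene_id: str
    split: str          # "train" or "val"
    longitudinal: str   # longitudinal meta-action label
    lateral: str        # lateral meta-action label


def is_eligible_donor(target: Sample, donor: Sample) -> bool:
    """A donor must differ on both action axes, come from another scene,
    and belong to the same train/val split as the target."""
    return (
        donor.longitudinal != target.longitudinal
        and donor.lateral != target.lateral
        and donor.scene_id != target.scene_id
        and donor.split == target.split
    )


target = Sample("scene_a", "train", "accelerate", "keep lane")
good_donor = Sample("scene_b", "train", "brake", "turn left")
same_lateral = Sample("scene_c", "train", "brake", "keep lane")
other_split = Sample("scene_d", "val", "brake", "turn left")
```

Requiring both axes to differ makes the swapped-in content clearly incompatible with the target's action, and keeping donors within the same split avoids leaking validation scenes into training negatives.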
|
| 102 |
+
|
| 103 |
+
## Evaluation
|
| 104 |
+
|
| 105 |
+
Each variant scored 125 randomly drawn (`seed=42`) planner outputs from two driving VLM planners, with `gemini-3-pro-preview` (few-shot, system-prompt + 6 worked examples) used as the LLM judge. Per-axis verdicts are aggregated to a single `overall ∈ {CONSISTENT, INCONSISTENT, AMBIGUOUS}`. **Agreement = accuracy treating Gemini's `overall` as ground truth**, computed on the subset where both Gemini and the critic returned a non-null verdict (Gemini parse failures and `AMBIGUOUS` are skipped).
|
| 106 |
+
|
| 107 |
+
```
Planner         Critic    Agreement   P       R       F1      μP|C    μP|IC
───────────────────────────────────────────────────────────────────────────
MetaAction-1e   GB-S12    0.764       0.763   0.750   0.756   0.750   0.222
MetaAction-1e   GB-S123   0.724       0.732   0.683   0.707   0.683   0.238
MetaAction-1e   GP-S12    0.732       0.729   0.717   0.723   0.717   0.254
MetaAction-1e   GP-S123   0.732       0.737   0.700   0.718   0.700   0.238
ADEnReward      GB-S12    0.694       0.672   0.717   0.694   0.717   0.328
ADEnReward      GB-S123   0.653       0.644   0.633   0.639   0.633   0.328
ADEnReward      GP-S12    0.734       0.714   0.750   0.732   0.750   0.281
ADEnReward      GP-S123   0.694       0.696   0.650   0.672   0.650   0.266
```
|
| 119 |
+
|
| 120 |
+
- **P / R / F1** treat `CONSISTENT` as the positive class.
|
| 121 |
+
- **μP\|C**: mean critic `P(CONSISTENT)` on Gemini-CONSISTENT records (higher is better).
- **μP\|IC**: mean critic `P(CONSISTENT)` on Gemini-INCONSISTENT records (lower is better). The spread `μP|C − μP|IC` is ≈ 0.45–0.53 for MetaAction-1e and ≈ 0.31–0.47 for ADEnReward, so the critic separates the two classes on average despite a non-trivial decision-boundary error rate.
|
| 123 |
+
|
| 124 |
+
Best per planner: `GB-S12` for MetaAction-1e (0.764), `GP-S12` for ADEnReward (0.734). Adding S3 (scene-description corruption) to the training mix did not improve agreement on either planner in this benchmark.
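
The metrics in the table can be reproduced from per-record judge verdicts and critic scores along the following lines. This is a sketch under stated assumptions: the function name and input layout are hypothetical, and thresholding `P(CONSISTENT)` at `0.5` to obtain the critic's verdict is an assumption, not something the evaluation description states.

```python
def agreement_metrics(judge, critic, threshold: float = 0.5):
    """Compare critic scores against judge verdicts.

    judge:  list of "CONSISTENT", "INCONSISTENT", "AMBIGUOUS", or None
    critic: list of P(CONSISTENT) floats, or None on critic failure
    Records with an AMBIGUOUS/None judge verdict or a None critic score
    are skipped. CONSISTENT is the positive class.
    """
    tp = fp = fn = tn = 0
    p_on_c, p_on_ic = [], []
    for verdict, p in zip(judge, critic):
        if verdict not in ("CONSISTENT", "INCONSISTENT") or p is None:
            continue
        predicted_consistent = p >= threshold  # assumed decision rule
        if verdict == "CONSISTENT":
            p_on_c.append(p)
            tp += predicted_consistent
            fn += not predicted_consistent
        else:
            p_on_ic.append(p)
            fp += predicted_consistent
            tn += not predicted_consistent

    def ratio(num, den):
        return num / den if den else float("nan")

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "agreement": ratio(tp + tn, tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "mean_p_c": ratio(sum(p_on_c), len(p_on_c)),
        "mean_p_ic": ratio(sum(p_on_ic), len(p_on_ic)),
    }


# Toy data, not taken from the benchmark.
toy_judge = ["CONSISTENT", "CONSISTENT", "INCONSISTENT", "INCONSISTENT",
             "AMBIGUOUS", None, "CONSISTENT"]
toy_critic = [0.9, 0.4, 0.2, 0.7, 0.9, 0.1, None]
toy_metrics = agreement_metrics(toy_judge, toy_critic)
```

The spread quoted above is `mean_p_c - mean_p_ic` for a given planner and critic pair.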
|
| 125 |
+
|
| 126 |
+
## Intended use
|
| 127 |
+
|
| 128 |
+
- Frozen reward model in GRPO/PPO planner fine-tuning where faithfulness of the (R, A, W) chain matters.
|
| 129 |
+
- Offline auditing of candidate planner outputs.
|
| 130 |
+
- Counterfactual-failure-mode analysis when paired with the variant ablation (S12 vs S123).
|
| 131 |
+
|
| 132 |
+
## Out-of-scope use
|
| 133 |
+
|
| 134 |
+
- The critic is **not** a safety verifier. A `CONSISTENT` verdict means R/A/W are mutually self-consistent and consistent with the scene; it does **not** mean the trajectory is collision-free, comfortable, or legally compliant.
|
| 135 |
+
- The critic was trained on a US-centric driving dataset; performance on non-US driving cultures, weather conditions, or sensor configurations not present in the training set is unverified.
|
| 136 |
+
- Single-camera, single-frame input only: no temporal stack, no surround views.
|
| 137 |
+
|
| 138 |
+
## Limitations
|
| 139 |
+
|
| 140 |
+
- Greedy decoding only in `generate` mode; the reward signal is best read via `logit` mode.
|
| 141 |
+
- The critic occasionally produces `null` (parse / render failure) when calibration parquets or camera frames are missing; see `n_critic_failure` in the eval summaries.
|
| 142 |
+
- Like the judge it's evaluated against, the critic can be confidently wrong on edge cases involving rare action combinations (lane-change-during-pull-over, etc.).
|
| 143 |
+
|
| 144 |
+
## Files
|
| 145 |
+
|
| 146 |
+
```
mjf-su/FaithfulnessCritic/
├── GB-S12/      adapter_config.json + adapter_model.safetensors
├── GB-S123/     ...
├── GP-S12/      ...
└── GP-S123/     ...
```
|
| 153 |
+
|
| 154 |
+
## Citation
|
| 155 |
+
|
| 156 |
+
If you use this model, please cite it as below, along with the upstream dataset and base model:
|
| 157 |
+
|
| 158 |
+
```bibtex
@misc{foutter_faithfulnesscritic_2026,
  title        = {FaithfulnessCritic: counterfactual-trained R/A/W consistency critics for vision-based driving planners},
  author       = {Foutter, Matthew and Cercola, Marco and Gammelli, Daniele},
  year         = {2026},
  howpublished = {\url{https://huggingface.co/mjf-su/FaithfulnessCritic}},
}
```
|