# Examples

PixDLM evaluates reasoning segmentation samples in the DRSeg format.

Minimal sample fields:

```json
{
  "id": "example_frame_0001",
  "width": 1024,
  "height": 1024,
  "metadata": {
    "time_of_day": "day",
    "location": "urban_road",
    "altitude": "60m",
    "camera_angle": "90deg"
  },
  "ann_list": [
    {
      "bbox": [100.0, 120.0, 80.0, 60.0],
      "segmentation": [[100.0, 120.0, 180.0, 120.0, 180.0, 180.0, 100.0, 180.0]],
      "area": 4800.0,
      "category_name": "car"
    }
  ],
  "questions": [
    "Which vehicle is closest to the intersection and may affect traffic flow?"
  ],
  "answers": [
    "<think>The target vehicle is positioned nearest to the intersection and aligned with the traffic lane.</think> <answer>The vehicle closest to the intersection is the target.</answer>"
  ],
  "reasoning_types": ["spatial"]
}
```
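A sample like the one above can be sanity-checked before evaluation. The following is a minimal sketch, not part of PixDLM: the function names are hypothetical, and it assumes that `bbox` is `[x, y, width, height]`, that each `segmentation` entry is a flat `[x0, y0, x1, y1, ...]` polygon, and that `area` is the polygon area. These assumptions agree with the values shown (the polygon spans 80 × 60 = 4800), but the source does not state them.

```python
import re

def polygon_area(flat_xy):
    """Shoelace area of a polygon given as [x0, y0, x1, y1, ...]."""
    xs, ys = flat_xy[0::2], flat_xy[1::2]
    n = len(xs)
    twice_area = sum(
        xs[i] * ys[(i + 1) % n] - xs[(i + 1) % n] * ys[i] for i in range(n)
    )
    return abs(twice_area) / 2.0

def annotation_is_consistent(ann, tol=1e-6):
    """Check one ann_list entry: polygon inside bbox, polygon area matches 'area'."""
    x, y, w, h = ann["bbox"]  # assumed layout: [x, y, width, height]
    polygons = ann["segmentation"]
    # Summing areas assumes the polygons are disjoint and contain no holes.
    poly_area = sum(polygon_area(p) for p in polygons)
    inside = all(
        x - tol <= px <= x + w + tol and y - tol <= py <= y + h + tol
        for p in polygons
        for px, py in zip(p[0::2], p[1::2])
    )
    return inside and abs(poly_area - ann["area"]) <= tol

def split_answer(text):
    """Split '<think>...</think> <answer>...</answer>' into (reasoning, answer)."""
    match = re.search(
        r"<think>(.*?)</think>\s*<answer>(.*?)</answer>", text, re.DOTALL
    )
    if match is None:
        # No tags found: treat the whole string as the answer.
        return None, text.strip()
    return match.group(1).strip(), match.group(2).strip()
```

Applied to the sample above, `annotation_is_consistent(sample["ann_list"][0])` would hold under these assumptions, and `split_answer(sample["answers"][0])` separates the reasoning text from the final answer.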

Evaluation outputs are written to:

```text
outputs/<exp_name>/<dataset_name>/with_cot/
```

Files written per sample:

- `*_input.jpg`
- `*_pred_mask.png`
- `*_gt_mask.png`
- `*_overlay_pred_red_gt_green.jpg`
- `*_result.json`
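
To check that an evaluation run produced a complete set of files, the output directory can be scanned and grouped by sample. This is a rough sketch under stated assumptions: the helper names are hypothetical, the `*` in each pattern is taken to be a per-sample prefix, and the contents of `*_result.json` are not inspected because the source does not describe them.

```python
from collections import defaultdict
from pathlib import Path

# Suffixes copied from the list above; "*" is assumed to be a per-sample prefix.
SUFFIXES = (
    "_input.jpg",
    "_pred_mask.png",
    "_gt_mask.png",
    "_overlay_pred_red_gt_green.jpg",
    "_result.json",
)

def output_dir(exp_name, dataset_name, root="outputs"):
    """Build outputs/<exp_name>/<dataset_name>/with_cot/ as a Path."""
    return Path(root) / exp_name / dataset_name / "with_cot"

def group_by_sample(directory):
    """Map each sample prefix to the {suffix: path} files found for it."""
    groups = defaultdict(dict)
    for path in sorted(Path(directory).iterdir()):
        for suffix in SUFFIXES:
            if path.name.endswith(suffix):
                prefix = path.name[: -len(suffix)]
                groups[prefix][suffix] = path
                break
    return dict(groups)

def incomplete_samples(groups):
    """Return {prefix: [missing suffixes]} for samples lacking any expected file."""
    return {
        prefix: [s for s in SUFFIXES if s not in files]
        for prefix, files in groups.items()
        if len(files) < len(SUFFIXES)
    }
```

For example, `incomplete_samples(group_by_sample(output_dir("my_exp", "my_dataset")))` returns an empty dict when every sample has all five files; `my_exp` and `my_dataset` are placeholder names.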