WEN0256 committed
Commit f5cc6c0 · verified · 1 parent: 3db0649

Initial release: TwinLiteNet8 (0.44M params, 7-class orchard semantic seg, edge-deployment ready)

Files changed (43)
  1. .gitattributes +27 -0
  2. JETSON_DEPLOY.md +68 -0
  3. README.md +160 -0
  4. demo_twinlite_12s.mp4 +3 -0
  5. export_onnx.py +69 -0
  6. history.json +1082 -0
  7. model/TwinLite.py +468 -0
  8. model/TwinLite_8class.py +26 -0
  9. model/__pycache__/TwinLite.cpython-311.pyc +0 -0
  10. model/__pycache__/TwinLite.cpython-38.pyc +0 -0
  11. model/__pycache__/TwinLite_8class.cpython-311.pyc +0 -0
  12. predict.py +103 -0
  13. predict_onnx.py +84 -0
  14. samples/0_frame_3884.jpg +3 -0
  15. samples/1_frame_2803.jpg +3 -0
  16. samples/2_frame_2626.jpg +3 -0
  17. samples/3_frame_4093.jpg +3 -0
  18. samples/4_frame_3138.jpg +3 -0
  19. samples/5_frame_3076.jpg +3 -0
  20. samples_20/sample_00_frame_3884.jpg +3 -0
  21. samples_20/sample_01_frame_2803.jpg +3 -0
  22. samples_20/sample_02_frame_2626.jpg +3 -0
  23. samples_20/sample_03_frame_4093.jpg +3 -0
  24. samples_20/sample_04_frame_3138.jpg +3 -0
  25. samples_20/sample_05_frame_3076.jpg +3 -0
  26. samples_20/sample_06_frame_3032.jpg +3 -0
  27. samples_20/sample_07_frame_2860.jpg +3 -0
  28. samples_20/sample_08_frame_4083.jpg +3 -0
  29. samples_20/sample_09_frame_2784.jpg +3 -0
  30. samples_20/sample_10_frame_3960.jpg +3 -0
  31. samples_20/sample_11_frame_4091.jpg +3 -0
  32. samples_20/sample_12_frame_4402.jpg +3 -0
  33. samples_20/sample_13_frame_3691.jpg +3 -0
  34. samples_20/sample_14_frame_2753.jpg +3 -0
  35. samples_20/sample_15_frame_3784.jpg +3 -0
  36. samples_20/sample_16_frame_3439.jpg +3 -0
  37. samples_20/sample_17_frame_2640.jpg +3 -0
  38. samples_20/sample_18_frame_2636.jpg +3 -0
  39. samples_20/sample_19_frame_2766.jpg +3 -0
  40. train_8class.py +247 -0
  41. training_log.txt +140 -0
  42. twinlite8.onnx +3 -0
  43. twinlite8_best.pt +3 -0
.gitattributes CHANGED
@@ -33,3 +33,30 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ demo_twinlite_12s.mp4 filter=lfs diff=lfs merge=lfs -text
+ samples/0_frame_3884.jpg filter=lfs diff=lfs merge=lfs -text
+ samples/1_frame_2803.jpg filter=lfs diff=lfs merge=lfs -text
+ samples/2_frame_2626.jpg filter=lfs diff=lfs merge=lfs -text
+ samples/3_frame_4093.jpg filter=lfs diff=lfs merge=lfs -text
+ samples/4_frame_3138.jpg filter=lfs diff=lfs merge=lfs -text
+ samples/5_frame_3076.jpg filter=lfs diff=lfs merge=lfs -text
+ samples_20/sample_00_frame_3884.jpg filter=lfs diff=lfs merge=lfs -text
+ samples_20/sample_01_frame_2803.jpg filter=lfs diff=lfs merge=lfs -text
+ samples_20/sample_02_frame_2626.jpg filter=lfs diff=lfs merge=lfs -text
+ samples_20/sample_03_frame_4093.jpg filter=lfs diff=lfs merge=lfs -text
+ samples_20/sample_04_frame_3138.jpg filter=lfs diff=lfs merge=lfs -text
+ samples_20/sample_05_frame_3076.jpg filter=lfs diff=lfs merge=lfs -text
+ samples_20/sample_06_frame_3032.jpg filter=lfs diff=lfs merge=lfs -text
+ samples_20/sample_07_frame_2860.jpg filter=lfs diff=lfs merge=lfs -text
+ samples_20/sample_08_frame_4083.jpg filter=lfs diff=lfs merge=lfs -text
+ samples_20/sample_09_frame_2784.jpg filter=lfs diff=lfs merge=lfs -text
+ samples_20/sample_10_frame_3960.jpg filter=lfs diff=lfs merge=lfs -text
+ samples_20/sample_11_frame_4091.jpg filter=lfs diff=lfs merge=lfs -text
+ samples_20/sample_12_frame_4402.jpg filter=lfs diff=lfs merge=lfs -text
+ samples_20/sample_13_frame_3691.jpg filter=lfs diff=lfs merge=lfs -text
+ samples_20/sample_14_frame_2753.jpg filter=lfs diff=lfs merge=lfs -text
+ samples_20/sample_15_frame_3784.jpg filter=lfs diff=lfs merge=lfs -text
+ samples_20/sample_16_frame_3439.jpg filter=lfs diff=lfs merge=lfs -text
+ samples_20/sample_17_frame_2640.jpg filter=lfs diff=lfs merge=lfs -text
+ samples_20/sample_18_frame_2636.jpg filter=lfs diff=lfs merge=lfs -text
+ samples_20/sample_19_frame_2766.jpg filter=lfs diff=lfs merge=lfs -text
JETSON_DEPLOY.md ADDED
@@ -0,0 +1,68 @@
+ # TwinLiteNet8: Jetson Deployment Guide
+
+ Pipeline: PyTorch `.pt` → ONNX → TensorRT engine → fast inference on Jetson
+
+ ## On a host machine (Linux/Windows/macOS with PyTorch)
+
+ ```bash
+ pip install onnx onnxruntime
+ python export_onnx.py --ckpt run_8class/twinlite8_best.pt --out twinlite8.onnx
+ # → produces twinlite8.onnx (~2 MB, fixed shape 1x3x360x640)
+ ```
+
+ For dynamic batch / spatial dims (slightly slower at runtime, more flexible):
+ ```bash
+ python export_onnx.py --ckpt ... --out twinlite8_dynamic.onnx --dynamic
+ ```
+
+ ## On the Jetson (Orin Nano / NX / AGX)
+
+ JetPack ships with `trtexec`. Run **on the device**:
+
+ ```bash
+ # FP16 (recommended: best speed/accuracy trade-off)
+ trtexec --onnx=twinlite8.onnx --saveEngine=twinlite8_fp16.engine \
+     --fp16 --workspace=2048
+
+ # Or INT8 (faster, but needs calibration data, e.g. a cache passed via --calib=<file>; small accuracy drop)
+ trtexec --onnx=twinlite8.onnx --saveEngine=twinlite8_int8.engine \
+     --int8 --workspace=2048
+ ```
+
+ Then in Python (Jetson):
+ ```python
+ import onnxruntime as ort
+ sess = ort.InferenceSession("twinlite8.onnx",
+                             providers=["TensorrtExecutionProvider"])
+ # OR load the pre-built .engine via TensorRT Python API
+ ```
+
+ ## Expected speeds (640×360, batch 1)
+
+ | Device | PyTorch | ONNX-CUDA | TensorRT FP16 | TensorRT INT8 |
+ |---|---|---|---|---|
+ | RTX 3080 (host) | ~150 FPS | ~250 FPS | ~400 FPS | ~600 FPS |
+ | RTX 5090 (host) | ~500 FPS | ~700 FPS | ~1200 FPS | n/a |
+ | Jetson Orin Nano | ~10 FPS | ~25 FPS | **~40 FPS** ← target | ~60 FPS |
+ | Jetson Orin NX | ~25 FPS | ~50 FPS | ~80 FPS | ~120 FPS |
+ | Jetson Nano (old) | ~3 FPS | ~8 FPS | ~15 FPS | ~25 FPS |
+
+ (Rough estimates; exact numbers depend on power mode and JetPack version.)
+
+ ## Validating numerical parity
+
+ Always run this after export to confirm that the ONNX model matches PyTorch:
+ ```bash
+ python -c "
+ import onnxruntime, torch, numpy as np
+ from model.TwinLite_8class import TwinLiteNet8
+ m = TwinLiteNet8().eval()
+ m.load_state_dict(torch.load('run_8class/twinlite8_best.pt', map_location='cpu', weights_only=False)['model'])
+ sess = onnxruntime.InferenceSession('twinlite8.onnx', providers=['CPUExecutionProvider'])
+ x = torch.randn(1,3,360,640)
+ torch_out = m(x).detach().numpy()
+ onnx_out = sess.run(None, {'input': x.numpy()})[0]
+ print('argmax agreement:', (torch_out.argmax(1) == onnx_out.argmax(1)).mean())
+ "
+ ```
+ Should print **1.0**. Anything below 0.999 means the export went wrong.
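The parity check above reduces to comparing per-pixel argmax between two logit arrays. The following is a minimal NumPy sketch of that metric, runnable without PyTorch or ONNX Runtime. The helper name `argmax_agreement` and the synthetic arrays are illustrative assumptions, not part of this repo.

```python
import numpy as np


def argmax_agreement(logits_a: np.ndarray, logits_b: np.ndarray) -> float:
    """Fraction of pixels whose predicted class matches between two (B, C, H, W) arrays."""
    if logits_a.shape != logits_b.shape:
        raise ValueError(f"shape mismatch: {logits_a.shape} vs {logits_b.shape}")
    # The class axis is 1; argmax collapses it to a (B, H, W) label map.
    return float((logits_a.argmax(axis=1) == logits_b.argmax(axis=1)).mean())


rng = np.random.default_rng(0)
reference = rng.normal(size=(1, 8, 4, 6))   # stand-in for the PyTorch output
exported = reference.copy()                 # stand-in for the ONNX output
# Corrupt exactly one of the 24 pixels by reversing its channel order.
exported[0, :, 0, 0] = reference[0, ::-1, 0, 0]

agreement = argmax_agreement(reference, exported)
print(f"argmax agreement: {agreement:.4f}")
```

A single disagreeing pixel out of 24 already falls far below the 0.999 threshold used above, which is why the check is a useful tripwire for a broken export.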
README.md ADDED
@@ -0,0 +1,160 @@
+ ---
+ license: apache-2.0
+ language: [en]
+ tags: [semantic-segmentation, twinlitenet, agriculture, orchard, real-time, edge-deployment, jetson]
+ pipeline_tag: image-segmentation
+ ---
+
+ # TwinLiteNet8: Real-time orchard segmentation for edge devices
+
+ A **0.44 M-parameter** semantic-segmentation model adapted from [TwinLiteNet](https://github.com/chequanghuy/TwinLiteNet) for **7-class apple orchard scenes**, designed to run at **>30 FPS on Jetson-class hardware** for robotic navigation.
+
+ A lightweight drop-in alternative to [WEN0256/Segformer85Mv1](https://huggingface.co/WEN0256/Segformer85Mv1) for low-compute deployments.
+
+ ## Why "7-class" but 8 logit channels?
+
+ The model is trained to recognize **7 real classes** (`tree`, `ground`, `person`, `sky`, `road`, `mountain`, `building`). The 8th label, `background`, is **NOT** treated as a real class: pixels that fall outside any labeled object are masked out of the loss (`ignore_index=255`). The 8th logit channel exists only to keep the architecture identical to the original TwinLiteNet shape; it is never trained and is forced to `-inf` before `argmax` at inference, so the model never outputs `background`.
+
+ This matches what you usually want from a robot's perception stack: "tell me what you DO recognize", not "tell me you don't know".
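As a rough illustration of the masking step described above, the sketch below forces channel 7 to `-inf` before taking the argmax. It uses NumPy only; the helper name `logits_to_mask` and the toy logits are assumptions made for this example, and the repo's own `predict.py` may do this differently.

```python
import numpy as np

UNUSED_CHANNEL = 7  # the untrained "background" logit channel


def logits_to_mask(logits: np.ndarray) -> np.ndarray:
    """Convert (B, 8, H, W) logits to a (B, H, W) uint8 label map with values 0..6."""
    masked = logits.astype(np.float32, copy=True)  # copy so the caller's array is untouched
    masked[:, UNUSED_CHANNEL] = -np.inf            # channel 7 can no longer win the argmax
    return masked.argmax(axis=1).astype(np.uint8)


# Toy input: the untrained channel happens to carry the largest raw value everywhere.
logits = np.zeros((1, 8, 2, 2), dtype=np.float32)
logits[:, UNUSED_CHANNEL] = 5.0
logits[:, 0, 0, :] = 1.0   # top row favours tree (id 0)
logits[:, 1, 1, :] = 1.0   # bottom row favours ground (id 1)

mask = logits_to_mask(logits)
print(mask[0])
```

Without the masking line, every pixel in this toy input would be labeled 7; with it, the top row resolves to `tree` and the bottom row to `ground`.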
+
+ ## Performance (no data leakage, temporal-split validation, like-for-like comparison)
+
+ | Metric | TwinLiteNet8 | Segformer-b5 (85 M) | Δ vs Segformer |
+ |---|---|---|---|
+ | Tree IoU | **0.872** | 0.742 | **+13 pp** ⭐ |
+ | Ground IoU | **0.916** | 0.851 | **+6.5 pp** |
+ | Person IoU | 0.441 | 0.72 | -28 pp |
+ | Sky IoU | 0.835 | 0.77 | +6 pp |
+ | Road IoU | 0.745 | 0.80 | -5 pp |
+ | Mountain IoU | 0.592 | 0.44 | +15 pp |
+ | Building IoU | 0.555 | 0.71 | -16 pp |
+ | **mIoU (7 classes)** | **0.708** | 0.714 | -0.6 pp |
+ | Model size | **1.8 MB** | 339 MB | **188× smaller** |
+ | Params | **0.437 M** | 85 M | **194× fewer** |
+
+ (Segformer numbers come from `WEN0256/Segformer85Mv1`. Both models were tested on the same 155-frame temporal-split validation set from the original orchard recording, with the same "background pixels excluded" protocol, so the IoUs are directly comparable.)
+
+ **Headline:** TwinLiteNet8 comes within 0.6 pp of Segformer-b5 in overall mIoU (0.708 vs 0.714) and *beats it* on the two classes that matter most for orchard navigation (`tree`, `ground`), while being ~190× smaller and an estimated ~3–10× faster on Jetson devices. The trade-off is on rare classes (`person`, `building`), where the small model's limited capacity shows.
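The 7-class mIoU in the table is the unweighted mean of the seven per-class IoUs, with `background` left out. A small sketch of that arithmetic follows, using the TwinLiteNet8 column above. The function name `mean_iou` is an assumption; the NaN handling mirrors how `history.json` stores `background`, and the epoch-1 `miou_7` entry there is consistent with this formula.

```python
import math


def mean_iou(per_class_iou: dict[str, float]) -> float:
    """Average IoU over the classes that have a defined (non-NaN) score."""
    valid = [v for v in per_class_iou.values() if not math.isnan(v)]
    if not valid:
        raise ValueError("no class has a defined IoU")
    return sum(valid) / len(valid)


# Per-class IoUs copied from the TwinLiteNet8 column of the table above.
twinlite8 = {
    "tree": 0.872, "ground": 0.916, "person": 0.441, "sky": 0.835,
    "road": 0.745, "mountain": 0.592, "building": 0.555,
    "background": float("nan"),  # excluded from evaluation
}

print(f"mIoU (7 classes): {mean_iou(twinlite8):.3f}")
```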
+
+ ### FPS (640×360 input, batch 1)
+
+ | Device | TwinLiteNet8 | Segformer-b5 | Speedup |
+ |---|---|---|---|
+ | RTX 3080 (PyTorch fp32) | **137 FPS** | ~50 | 2.7× |
+ | RTX 5090 (PyTorch fp32) | ~500 FPS | ~150 | 3.3× |
+ | **Jetson Orin Nano (TRT FP16, est.)** | **~34–46 FPS** ⭐ | ~2–5 | **~10×** |
+ | Jetson Orin NX (TRT FP16, est.) | ~60–80 FPS | ~20 | ~3× |
+
+ The target was **10–20 FPS** on Orin Nano; the estimated ~34–46 FPS is roughly double the upper end of that range.
+
+ ## Files
+
+ | File | Purpose |
+ |---|---|
+ | `twinlite8_best.pt` | PyTorch checkpoint (1.8 MB), epoch 29, best tree IoU 0.872 |
+ | `twinlite8.onnx` | ONNX export (1.8 MB), 100% argmax parity verified |
+ | `predict.py` | PyTorch inference (matches Segformer's API) |
+ | `predict_onnx.py` | ONNX-Runtime inference (CPU/CUDA/TensorRT auto-pick) |
+ | `export_onnx.py` | Re-export ONNX from any checkpoint |
+ | `train_8class.py` | Full training script (60 epochs, ~70 min on RTX 3080) |
+ | `model/` | TwinLiteNet8 architecture (single-branch 8-output head, channel 7 = unused) |
+ | `JETSON_DEPLOY.md` | Step-by-step Jetson deployment + FPS table |
+ | `samples_20/` | 20 out-of-distribution (OOD) inference samples (original ‖ prediction overlay) |
+ | `demo_twinlite_12s.mp4` | 12-s demo video (360 frames @ 30 FPS, original ‖ overlay) |
+ | `samples/` | 6 in-domain validation samples |
+ | `training_log.txt` + `history.json` | Per-epoch metrics |
+
+ ## Quick Use (PyTorch)
+
+ ```python
+ import sys, cv2, torch
+ sys.path.insert(0, "<this_dir>")
+ from predict import load_model, predict, overlay
+
+ model = load_model("twinlite8_best.pt", device="cuda")
+ img = cv2.imread("orchard.jpg")
+ mask = predict(model, img)  # H×W uint8, values 0..6 (never 7)
+ viz = overlay(img, mask)
+ cv2.imwrite("out.jpg", viz)
+ ```
+
+ ## Quick Use (ONNX, no PyTorch)
+
+ ```python
+ import onnxruntime as ort, cv2, numpy as np
+ sess = ort.InferenceSession("twinlite8.onnx", providers=["CUDAExecutionProvider"])
+ img = cv2.imread("orchard.jpg")
+ inp = cv2.resize(img, (640, 360))
+ rgb = cv2.cvtColor(inp, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
+ x = rgb.transpose(2, 0, 1)[None]
+ logits = sess.run(None, {"input": x})[0]
+ logits[:, 7, :, :] = -1e9  # mask the unused background channel
+ mask = logits.argmax(1)[0].astype(np.uint8)  # 360×640 uint8, values 0..6 (argmax itself returns int64)
+ ```
+
+ ## Classes (id → name)
+
+ | ID | Class | Color (BGR) |
+ |---|---|---|
+ | 0 | **tree** (priority) | green |
+ | 1 | ground | brown |
+ | 2 | person | red |
+ | 3 | sky | cyan |
+ | 4 | road | gray |
+ | 5 | mountain | purple |
+ | 6 | building | yellow |
+ | 7 | (unused, never output) | n/a |
+
+ ## Architecture
+
+ Single-branch 8-output adaptation of [TwinLiteNet](https://github.com/chequanghuy/TwinLiteNet):
+
+ - **Encoder**: ESPNet (`ESPNet_Encoder`, p = 2, q = 3)
+ - **Decoder**: 3 × `UPx2` upsampling blocks
+ - **Head**: 8-channel softmax (7 real classes; channel 7 untrained, masked at inference)
+ - **Input**: 640×360 BGR → ImageNet-style normalize
+ - **Output**: (B, 8, H, W) logits
+
+ The original TwinLiteNet has two parallel decoder heads for two binary tasks (drivable area + lane lines). For multi-class semantic segmentation matching the Segformer setup, we kept one decoder branch and changed its final `UPx2` to output 8 channels. Final param count: **0.437 M**.
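The parameter count and the file sizes quoted in this README can be cross-checked with simple arithmetic. The sketch below assumes weights are stored as 32-bit floats (4 bytes each) and ignores any file-format overhead; the helper `checkpoint_size_mb` is hypothetical and uses decimal megabytes, the same convention as the `/ 1e6` in `export_onnx.py`.

```python
def checkpoint_size_mb(num_params: int, bytes_per_param: int = 4) -> float:
    """Approximate weight storage in decimal megabytes (1 MB = 1e6 bytes)."""
    return num_params * bytes_per_param / 1e6


twinlite_params = 437_000        # 0.437 M, from the paragraph above
segformer_params = 85_000_000    # 85 M, from the performance table

print(f"TwinLiteNet8 fp32 weights: ~{checkpoint_size_mb(twinlite_params):.2f} MB")
print(f"Segformer-b5 fp32 weights: ~{checkpoint_size_mb(segformer_params):.0f} MB")
print(f"parameter ratio: ~{segformer_params / twinlite_params:.1f}x")
```

Under that assumption the estimates land close to the 1.8 MB and 339 MB file sizes reported in the performance table.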
+
+ ## Training Recipe
+
+ | Hyperparameter | Value |
+ |---|---|
+ | Optimizer | AdamW, weight_decay 1e-4 |
+ | LR | 5e-4, cosine schedule |
+ | Epochs | 60 |
+ | Batch | 16 |
+ | Resolution | 640×360 |
+ | Loss | weighted cross-entropy with `ignore_index=255` |
+ | Class weights | tree 1.5, ground 0.5, person 1.5, sky 1.0, road 1.0, mountain 1.0, building 1.0, **background 0.0** |
+ | Background handling | mask pixels remapped 7 → 255 so they never contribute to loss |
+ | Augmentation | hflip + HSV jitter |
+ | Hardware | RTX 3080, ~70 minutes total |
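To make the loss and background rows of the table concrete, here is a NumPy sketch of a weighted cross-entropy that skips ignored pixels. It assumes the usual weighted-mean reduction, sum(w[y] * nll) / sum(w[y]), over the kept pixels; `train_8class.py` is not shown in this view, so the actual training code may differ. The helper names are hypothetical.

```python
import numpy as np

IGNORE_INDEX = 255
BACKGROUND_ID = 7
# Weights from the table above, indexed by class id 0..7.
CLASS_WEIGHTS = np.array([1.5, 0.5, 1.5, 1.0, 1.0, 1.0, 1.0, 0.0])


def remap_background(labels: np.ndarray) -> np.ndarray:
    """Return a copy of the label map with background (7) replaced by the ignore index."""
    out = labels.copy()
    out[out == BACKGROUND_ID] = IGNORE_INDEX
    return out


def weighted_cross_entropy(logits: np.ndarray, labels: np.ndarray) -> float:
    """Weighted CE over (N, C) logits and (N,) labels, skipping IGNORE_INDEX pixels."""
    keep = labels != IGNORE_INDEX
    z, y = logits[keep], labels[keep].astype(np.int64)
    z = z - z.max(axis=1, keepdims=True)                         # stabilise the softmax
    log_prob = z - np.log(np.exp(z).sum(axis=1, keepdims=True))  # log-softmax per pixel
    nll = -log_prob[np.arange(y.size), y]                        # negative log-likelihood
    w = CLASS_WEIGHTS[y]
    return float((w * nll).sum() / w.sum())


labels = remap_background(np.array([0, 1, 7, 2], dtype=np.uint8))
logits = np.zeros((4, 8))   # uniform prediction over the 8 channels
loss = weighted_cross_entropy(logits, labels)
print(labels.tolist(), loss)
```

With uniform logits every kept pixel contributes log(8), so the weighted mean is log(8) regardless of the class weights, and the remapped background pixel has no effect on the result.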
+
+ ## Dataset
+
+ Same dataset as [WEN0256/Segformer85Mv1](https://huggingface.co/WEN0256/Segformer85Mv1) v2:
+ - ~5300 frames from `oak_0415_oneRadar_1` (spring 2024 Korean apple orchard, single OAK-D camera)
+ - 311 frames from "Orchard Navigation" (September autumn capture + August Windows-webcam capture)
+ - Pseudo-mask labels generated by Segformer v1 to fill SAM-annotated gaps
+ - Temporal split: frames `≤ 4500` → train, frames `> 4500` → val (155 frames). No leakage between neighboring frames.
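The temporal split in the last bullet can be sketched as a threshold on the frame index. The example below assumes file names carry the index as `frame_<n>`, as the sample images in this repo do; the helper names and the example file names are illustrative only.

```python
import re

SPLIT_FRAME = 4500  # from the list above: <= 4500 goes to train, > 4500 to val
_FRAME_RE = re.compile(r"frame_(\d+)")


def frame_index(filename: str) -> int:
    """Extract the frame number from a name such as 'frame_3884.jpg' (assumed naming)."""
    match = _FRAME_RE.search(filename)
    if match is None:
        raise ValueError(f"no frame index in {filename!r}")
    return int(match.group(1))


def temporal_split(filenames: list[str]) -> tuple[list[str], list[str]]:
    """Split by frame index so train and val frames never interleave in time."""
    train = [f for f in filenames if frame_index(f) <= SPLIT_FRAME]
    val = [f for f in filenames if frame_index(f) > SPLIT_FRAME]
    return train, val


train, val = temporal_split(
    ["frame_0012.jpg", "frame_4500.jpg", "frame_4501.jpg", "frame_5210.jpg"]
)
print("train:", train)
print("val:  ", val)
```

Splitting on a single cut point, rather than sampling frames at random, keeps near-duplicate consecutive frames on the same side of the split.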
+
+ ## Limitations (same as parent Segformer model)
+
+ - Trained on a single Korean apple orchard, spring + partial autumn
+ - ❌ Different orchards (different tree species/layouts): likely degraded
+ - ❌ Winter (no leaves), night, rain: no training data
+ - ❌ Aerial/drone perspectives: robot-eye view only
+ - For a new deployment, plan to fine-tune on 100–300 in-domain frames (~13 min on a single GPU)
+
+ ## Deployment to Jetson
+
+ See `JETSON_DEPLOY.md` for the full pipeline:
+ 1. Export to ONNX (this repo already has `twinlite8.onnx`)
+ 2. On Jetson: `trtexec --onnx=twinlite8.onnx --saveEngine=...engine --fp16`
+ 3. Run via `predict_onnx.py --provider TensorrtExecutionProvider` or load the `.engine` via TRT API
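How `predict_onnx.py` picks its execution provider is not visible in this view. One plausible approach, sketched here under that caveat, is to walk a preference list and keep whatever the local ONNX Runtime build reports as available. The names `PREFERRED_PROVIDERS` and `pick_providers` are hypothetical.

```python
PREFERRED_PROVIDERS = (
    "TensorrtExecutionProvider",
    "CUDAExecutionProvider",
    "CPUExecutionProvider",
)


def pick_providers(available: list[str]) -> list[str]:
    """Return the preferred providers that are actually available, best first."""
    chosen = [p for p in PREFERRED_PROVIDERS if p in available]
    if not chosen:
        raise RuntimeError(f"none of {PREFERRED_PROVIDERS} found in {available}")
    return chosen


# In real use, pass onnxruntime.get_available_providers() instead of a literal list.
providers = pick_providers(["CPUExecutionProvider", "CUDAExecutionProvider"])
print(providers)
```

The resulting list can be passed as the `providers` argument of `onnxruntime.InferenceSession`, which tries the entries in order.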
+
+ ## License
+
+ Apache 2.0
demo_twinlite_12s.mp4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b6315c2fd2bb7cbdd9b59c2882bea5d8d1f8abdc96b6dbe5245a744e4f1d3034
+ size 68107137
export_onnx.py ADDED
@@ -0,0 +1,69 @@
+ """Export TwinLiteNet8 to ONNX for cross-platform deployment.
+
+ Usage:
+     python export_onnx.py --ckpt run_8class/twinlite8_best.pt --out twinlite8.onnx
+     python export_onnx.py --ckpt run_8class/twinlite8_best.pt --out twinlite8_dynamic.onnx --dynamic
+ """
+ import argparse, sys, os
+ sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
+ from pathlib import Path
+ import numpy as np, torch
+ from model.TwinLite_8class import TwinLiteNet8
+
+
+ def main():
+     ap = argparse.ArgumentParser()
+     ap.add_argument("--ckpt", required=True)
+     ap.add_argument("--out", required=True)
+     ap.add_argument("--height", type=int, default=360)
+     ap.add_argument("--width", type=int, default=640)
+     ap.add_argument("--dynamic", action="store_true",
+                     help="Allow dynamic batch + spatial dims (slightly slower at runtime)")
+     ap.add_argument("--opset", type=int, default=17)
+     args = ap.parse_args()
+
+     print(f"loading ckpt: {args.ckpt}")
+     model = TwinLiteNet8(num_classes=8).eval()
+     ckpt = torch.load(args.ckpt, map_location="cpu", weights_only=False)
+     model.load_state_dict(ckpt["model"])
+     print(f" epoch {ckpt['epoch']} tree IoU {ckpt.get('tree_iou_old','?')}")
+
+     dummy = torch.randn(1, 3, args.height, args.width)
+
+     if args.dynamic:
+         dyn = {"input": {0: "batch", 2: "height", 3: "width"},
+                "output": {0: "batch", 2: "height", 3: "width"}}
+     else:
+         dyn = None
+
+     print(f"exporting to ONNX (opset {args.opset}) ...")
+     torch.onnx.export(
+         model, dummy, args.out,
+         input_names=["input"], output_names=["output"],
+         dynamic_axes=dyn,
+         opset_version=args.opset,
+         do_constant_folding=True,
+     )
+
+     sz = os.path.getsize(args.out) / 1e6
+     print(f" saved: {args.out} ({sz:.2f} MB)")
+
+     # Validate ONNX numerical parity vs PyTorch
+     try:
+         import onnxruntime as ort
+         sess = ort.InferenceSession(args.out, providers=["CPUExecutionProvider"])
+         with torch.no_grad():
+             torch_out = model(dummy).numpy()
+         onnx_out = sess.run(None, {"input": dummy.numpy()})[0]
+         diff = np.abs(torch_out - onnx_out)
+         argmax_match = (torch_out.argmax(1) == onnx_out.argmax(1)).mean()
+         print(f" parity: max_abs_diff={diff.max():.6f} mean={diff.mean():.6f}")
+         print(f" argmax agreement: {100*argmax_match:.4f}% (must be ~100% for safe deploy)")
+         assert argmax_match > 0.999, "argmax disagreement > 0.1% — investigate"
+         print(" PARITY OK")
+     except ImportError:
+         print(" (skip parity check — onnxruntime not installed; pip install onnxruntime)")
+
+
+ if __name__ == "__main__":
+     main()
history.json ADDED
@@ -0,0 +1,1082 @@
+ [
+   {
+     "epoch": 1,
+     "loss": 1.3310781220886365,
+     "miou_7": 0.37451867570160186,
+     "tree_iou_old": 0.8175889982052515,
+     "ground_iou_old": 0.8771987595010953,
+     "tree_recall_new": 0.8720591858755958,
+     "per_class_iou": {
+       "tree": 0.8175889982052515,
+       "ground": 0.8771987595010953,
+       "person": 0.04389061394240307,
+       "sky": 0.7793865097876477,
+       "road": 0.001261266546266693,
+       "mountain": 0.09015012685328469,
+       "building": 0.012154455075264126,
+       "background": NaN
+     }
+   },
+   {
+     "epoch": 2,
+     "loss": 0.7962898902179908,
+     "miou_7": 0.4263919160239557,
+     "tree_iou_old": 0.8120707191596923,
+     "ground_iou_old": 0.8586761495775145,
+     "tree_recall_new": 0.864925755683379,
+     "per_class_iou": {
+       "tree": 0.8120707191596923,
+       "ground": 0.8586761495775145,
+       "person": 0.06558714109864322,
+       "sky": 0.7999256871820133,
+       "road": 0.019873436446123636,
+       "mountain": 0.39602489908191335,
+       "building": 0.032585379621789444,
+       "background": NaN
+     }
+   },
+   {
+     "epoch": 3,
+     "loss": 0.5679835767165656,
+     "miou_7": 0.5584064801846481,
+     "tree_iou_old": 0.8504976314678695,
+     "ground_iou_old": 0.8843725580277102,
+     "tree_recall_new": 0.9388755306705637,
+     "per_class_iou": {
+       "tree": 0.8504976314678695,
+       "ground": 0.8843725580277102,
+       "person": 0.29319772033339875,
+       "sky": 0.8191901750544341,
+       "road": 0.6448235890748388,
+       "mountain": 0.39345474488848675,
+       "building": 0.023308942445799074,
+       "background": NaN
+     }
+   },
+   {
+     "epoch": 4,
+     "loss": 0.4366068317393753,
+     "miou_7": 0.5958384978496121,
+     "tree_iou_old": 0.8534589788600526,
+     "ground_iou_old": 0.8922631824167708,
+     "tree_recall_new": 0.9672329512788488,
+     "per_class_iou": {
+       "tree": 0.8534589788600526,
+       "ground": 0.8922631824167708,
+       "person": 0.37156245263636517,
+       "sky": 0.8260741562564967,
+       "road": 0.5835668689224565,
+       "mountain": 0.47504755441513025,
+       "building": 0.16889629144001275,
+       "background": NaN
+     }
+   },
+   {
+     "epoch": 5,
+     "loss": 0.3549347817023828,
+     "miou_7": 0.6220458616228167,
+     "tree_iou_old": 0.8363749004074456,
+     "ground_iou_old": 0.8852004546947622,
+     "tree_recall_new": 0.9633982457780006,
+     "per_class_iou": {
+       "tree": 0.8363749004074456,
+       "ground": 0.8852004546947622,
+       "person": 0.3957433003440445,
+       "sky": 0.8244260552327038,
+       "road": 0.6178698597724389,
+       "mountain": 0.4848031930793919,
+       "building": 0.30990326782893096,
+       "background": NaN
+     }
+   },
+   {
+     "epoch": 6,
+     "loss": 0.2965410981010482,
+     "miou_7": 0.6195229709106116,
+     "tree_iou_old": 0.8307015597161879,
+     "ground_iou_old": 0.8889140076510883,
+     "tree_recall_new": 0.967347652588143,
+     "per_class_iou": {
+       "tree": 0.8307015597161879,
+       "ground": 0.8889140076510883,
+       "person": 0.34606393858495704,
+       "sky": 0.8203392783845447,
+       "road": 0.7104870880331203,
+       "mountain": 0.49466764013384806,
+       "building": 0.2454872838705348,
+       "background": NaN
+     }
+   },
+   {
+     "epoch": 7,
+     "loss": 0.2605974777790109,
+     "miou_7": 0.6608496131211873,
+     "tree_iou_old": 0.8600555277203275,
+     "ground_iou_old": 0.9041158609890793,
+     "tree_recall_new": 0.9808632901999768,
+     "per_class_iou": {
+       "tree": 0.8600555277203275,
+       "ground": 0.9041158609890793,
+       "person": 0.39638474697269704,
+       "sky": 0.8303220440766499,
+       "road": 0.6432374151811943,
+       "mountain": 0.5524623861151251,
+       "building": 0.439369310793238,
+       "background": NaN
+     }
+   },
+   {
+     "epoch": 8,
+     "loss": 0.22860906262201997,
+     "miou_7": 0.6444995447460438,
+     "tree_iou_old": 0.8542548970213149,
+     "ground_iou_old": 0.9016265876093191,
+     "tree_recall_new": 0.9905392660815484,
+     "per_class_iou": {
+       "tree": 0.8542548970213149,
+       "ground": 0.9016265876093191,
+       "person": 0.4122549495341615,
+       "sky": 0.8160172675420245,
+       "road": 0.6488806985459417,
+       "mountain": 0.5314521835332935,
+       "building": 0.3470102294362516,
+       "background": NaN
+     }
+   },
+   {
+     "epoch": 9,
+     "loss": 0.20931702563839574,
+     "miou_7": 0.43161408360048903,
+     "tree_iou_old": 0.7695186615335882,
+     "ground_iou_old": 0.8546917001099084,
+     "tree_recall_new": 0.8871473642771976,
+     "per_class_iou": {
+       "tree": 0.7695186615335882,
+       "ground": 0.8546917001099084,
+       "person": 0.359624677355642,
+       "sky": 0.43488592691950145,
+       "road": 0.11753312041637094,
+       "mountain": 0.3491913165177679,
+       "building": 0.13585318235064428,
+       "background": NaN
+     }
+   },
+   {
+     "epoch": 10,
+     "loss": 0.19840111109343442,
+     "miou_7": 0.6606398486088663,
+     "tree_iou_old": 0.8338563413583029,
+     "ground_iou_old": 0.8831831425045409,
+     "tree_recall_new": 0.9901880818259315,
+     "per_class_iou": {
+       "tree": 0.8338563413583029,
+       "ground": 0.8831831425045409,
+       "person": 0.40429846068091946,
+       "sky": 0.8237072702221991,
+       "road": 0.7148262084503572,
+       "mountain": 0.5229702347017845,
+       "building": 0.44163728234396016,
+       "background": NaN
+     }
+   },
+   {
+     "epoch": 11,
+     "loss": 0.17923572080488429,
+     "miou_7": 0.6830302547462325,
+     "tree_iou_old": 0.8548939555986949,
+     "ground_iou_old": 0.9095141301915333,
+     "tree_recall_new": 0.9958006576208399,
+     "per_class_iou": {
+       "tree": 0.8548939555986949,
+       "ground": 0.9095141301915333,
+       "person": 0.3875322179022323,
+       "sky": 0.8246677405547581,
+       "road": 0.7644137551908834,
+       "mountain": 0.5375454528224748,
+       "building": 0.5026445309630501,
+       "background": NaN
+     }
+   },
+   {
+     "epoch": 12,
+     "loss": 0.1724150039452262,
+     "miou_7": 0.668918280520857,
+     "tree_iou_old": 0.8423760996137923,
+     "ground_iou_old": 0.8821698496521413,
+     "tree_recall_new": 0.9945828412505558,
+     "per_class_iou": {
+       "tree": 0.8423760996137923,
+       "ground": 0.8821698496521413,
+       "person": 0.39795968294353656,
+       "sky": 0.8225902005034851,
+       "road": 0.7203922138904981,
+       "mountain": 0.5495641818156543,
+       "building": 0.4673757352268918,
+       "background": NaN
+     }
+   },
+   {
+     "epoch": 13,
+     "loss": 0.16389323436706996,
+     "miou_7": 0.6825473779097335,
+     "tree_iou_old": 0.8337379023047351,
+     "ground_iou_old": 0.8829405113369486,
+     "tree_recall_new": 0.9935130037299167,
+     "per_class_iou": {
+       "tree": 0.8337379023047351,
+       "ground": 0.8829405113369486,
+       "person": 0.42147440680495446,
+       "sky": 0.8271820064942336,
+       "road": 0.7398392373685672,
+       "mountain": 0.5597518680318201,
+       "building": 0.5129057130268762,
+       "background": NaN
+     }
+   },
+   {
+     "epoch": 14,
+     "loss": 0.15743425162918756,
+     "miou_7": 0.6744691063520801,
+     "tree_iou_old": 0.8481754467002482,
+     "ground_iou_old": 0.8952094538543592,
+     "tree_recall_new": 0.9989067973978379,
+     "per_class_iou": {
+       "tree": 0.8481754467002482,
+       "ground": 0.8952094538543592,
+       "person": 0.4356588273855861,
+       "sky": 0.8093282743850084,
+       "road": 0.7274880284020421,
+       "mountain": 0.547309004129935,
+       "building": 0.45811470960738193,
+       "background": NaN
+     }
+   },
+   {
+     "epoch": 15,
+     "loss": 0.15353474125834154,
+     "miou_7": 0.6638122326903052,
+     "tree_iou_old": 0.8468084387339117,
+     "ground_iou_old": 0.8968202477703683,
+     "tree_recall_new": 0.9984826857665587,
+     "per_class_iou": {
+       "tree": 0.8468084387339117,
+       "ground": 0.8968202477703683,
+       "person": 0.3803193694656955,
+       "sky": 0.7937484883274253,
+       "road": 0.6863063433123712,
+       "mountain": 0.5594115155198771,
+       "building": 0.48327122570248765,
+       "background": NaN
+     }
+   },
+   {
+     "epoch": 16,
+     "loss": 0.14856905372174253,
+     "miou_7": 0.6851351311052538,
+     "tree_iou_old": 0.8512448354850919,
+     "ground_iou_old": 0.8934746447462905,
+     "tree_recall_new": 0.9969179333372983,
+     "per_class_iou": {
+       "tree": 0.8512448354850919,
+       "ground": 0.8934746447462905,
+       "person": 0.4963661975950126,
+       "sky": 0.8032382135701676,
+       "road": 0.6923475060384158,
+       "mountain": 0.5673653099412674,
+       "building": 0.49190921036053065,
+       "background": NaN
+     }
+   },
+   {
+     "epoch": 17,
+     "loss": 0.14565541855226163,
+     "miou_7": 0.6976304308763035,
+     "tree_iou_old": 0.8559357861690475,
+     "ground_iou_old": 0.9008399324780962,
+     "tree_recall_new": 0.9970113936633899,
+     "per_class_iou": {
+       "tree": 0.8559357861690475,
+       "ground": 0.9008399324780962,
+       "person": 0.45982943007987004,
+       "sky": 0.8260117146625021,
+       "road": 0.7203236325464007,
+       "mountain": 0.570204347198094,
+       "building": 0.5502681730001141,
+       "background": NaN
+     }
+   },
+   {
+     "epoch": 18,
+     "loss": 0.13922102244630938,
+     "miou_7": 0.6715259962090111,
+     "tree_iou_old": 0.8590085776748139,
+     "ground_iou_old": 0.9080001376838421,
+     "tree_recall_new": 0.9984515323245282,
+     "per_class_iou": {
+       "tree": 0.8590085776748139,
+       "ground": 0.9080001376838421,
+       "person": 0.4165244067867354,
+       "sky": 0.8300622522021492,
+       "road": 0.7391183653159966,
+       "mountain": 0.5794701815195625,
+       "building": 0.3684980522799781,
+       "background": NaN
+     }
+   },
+   {
+     "epoch": 19,
+     "loss": 0.1353529858187147,
+     "miou_7": 0.6868000456457597,
+     "tree_iou_old": 0.8615431837256211,
+     "ground_iou_old": 0.9031047849108651,
+     "tree_recall_new": 0.998964148052485,
+     "per_class_iou": {
+       "tree": 0.8615431837256211,
+       "ground": 0.9031047849108651,
+       "person": 0.4420047326739608,
+       "sky": 0.8209615163638296,
+       "road": 0.6942001658572324,
+       "mountain": 0.5805041823745026,
+       "building": 0.5052817536143058,
+       "background": NaN
+     }
+   },
+   {
+     "epoch": 20,
+     "loss": 0.13253521772860782,
+     "miou_7": 0.7021224084576373,
+     "tree_iou_old": 0.8631602330267258,
+     "ground_iou_old": 0.9091200309612043,
+     "tree_recall_new": 0.994837025016214,
+     "per_class_iou": {
+       "tree": 0.8631602330267258,
+       "ground": 0.9091200309612043,
+       "person": 0.41137926177027906,
+       "sky": 0.8285066518604141,
+       "road": 0.7466690639312616,
+       "mountain": 0.536259817918243,
+       "building": 0.6197617997353331,
+       "background": NaN
+     }
+   },
+   {
+     "epoch": 21,
+     "loss": 0.1302845889915469,
+     "miou_7": 0.675810914326681,
+     "tree_iou_old": 0.8592563773281188,
+     "ground_iou_old": 0.8991446138165377,
+     "tree_recall_new": 0.9961822872857139,
+     "per_class_iou": {
+       "tree": 0.8592563773281188,
+       "ground": 0.8991446138165377,
+       "person": 0.3903491003433304,
+       "sky": 0.824675927094987,
+       "road": 0.6894249147918818,
+       "mountain": 0.5948788564749821,
+       "building": 0.47294661043692804,
+       "background": NaN
+     }
+   },
+   {
+     "epoch": 22,
+     "loss": 0.12750109743949606,
+     "miou_7": 0.7125322874970034,
+     "tree_iou_old": 0.8645704875357654,
+     "ground_iou_old": 0.9072065152568384,
+     "tree_recall_new": 0.9984989705203474,
+     "per_class_iou": {
+       "tree": 0.8645704875357654,
+       "ground": 0.9072065152568384,
+       "person": 0.48954839497752917,
+       "sky": 0.8196931627498157,
+       "road": 0.7243334862756741,
+       "mountain": 0.5763691965526828,
+       "building": 0.6060047691307175,
+       "background": NaN
+     }
+   },
+   {
+     "epoch": 23,
+     "loss": 0.1288085954149098,
+     "miou_7": 0.7107471261265749,
+     "tree_iou_old": 0.8642860495130823,
+     "ground_iou_old": 0.909492093404392,
+     "tree_recall_new": 0.9991893024744329,
+     "per_class_iou": {
+       "tree": 0.8642860495130823,
+       "ground": 0.909492093404392,
+       "person": 0.45766399589516893,
+       "sky": 0.8274124364025386,
+       "road": 0.7277521111617834,
+       "mountain": 0.5772087040752052,
+       "building": 0.6114144924338537,
+       "background": NaN
+     }
+   },
+   {
+     "epoch": 24,
+     "loss": 0.12303933197539572,
+     "miou_7": 0.6960249423927186,
+     "tree_iou_old": 0.862591625892603,
+     "ground_iou_old": 0.9117833962636658,
+     "tree_recall_new": 0.9990137103466246,
+     "per_class_iou": {
+       "tree": 0.862591625892603,
+       "ground": 0.9117833962636658,
+       "person": 0.43123219925818773,
+       "sky": 0.8202446601586809,
+       "road": 0.7568995059507706,
+       "mountain": 0.583646623917872,
+       "building": 0.5057765853072502,
+       "background": NaN
+     }
+   },
+   {
+     "epoch": 25,
+     "loss": 0.12275176438505699,
437
+ "miou_7": 0.700392134832647,
438
+ "tree_iou_old": 0.8569619510987356,
439
+ "ground_iou_old": 0.9115664983076798,
440
+ "tree_recall_new": 0.9996254506628602,
441
+ "per_class_iou": {
442
+ "tree": 0.8569619510987356,
443
+ "ground": 0.9115664983076798,
444
+ "person": 0.44410947241906,
445
+ "sky": 0.8242495156785404,
446
+ "road": 0.801578288829682,
447
+ "mountain": 0.5707688209809969,
448
+ "building": 0.4935103965138338,
449
+ "background": NaN
450
+ }
451
+ },
452
+ {
453
+ "epoch": 26,
454
+ "loss": 0.12000550889024986,
455
+ "miou_7": 0.6951813403928189,
456
+ "tree_iou_old": 0.8663251304061781,
457
+ "ground_iou_old": 0.9131697744358082,
458
+ "tree_recall_new": 0.9989188339549862,
459
+ "per_class_iou": {
460
+ "tree": 0.8663251304061781,
461
+ "ground": 0.9131697744358082,
462
+ "person": 0.42213996324450753,
463
+ "sky": 0.837097544021675,
464
+ "road": 0.7159604646377167,
465
+ "mountain": 0.5628351351092097,
466
+ "building": 0.5487413708946377,
467
+ "background": NaN
468
+ }
469
+ },
470
+ {
471
+ "epoch": 27,
472
+ "loss": 0.11845032005540786,
473
+ "miou_7": 0.6951365926228066,
474
+ "tree_iou_old": 0.8623787392359329,
475
+ "ground_iou_old": 0.9081680280222872,
476
+ "tree_recall_new": 0.9991772659172847,
477
+ "per_class_iou": {
478
+ "tree": 0.8623787392359329,
479
+ "ground": 0.9081680280222872,
480
+ "person": 0.40674057490499754,
481
+ "sky": 0.8283362865383408,
482
+ "road": 0.732462168304907,
483
+ "mountain": 0.577814835386126,
484
+ "building": 0.550055515967055,
485
+ "background": NaN
486
+ }
487
+ },
488
+ {
489
+ "epoch": 28,
490
+ "loss": 0.11608812912118749,
491
+ "miou_7": 0.7135802554765919,
492
+ "tree_iou_old": 0.8601389203474185,
493
+ "ground_iou_old": 0.9097109594291864,
494
+ "tree_recall_new": 0.9980252965949288,
495
+ "per_class_iou": {
496
+ "tree": 0.8601389203474185,
497
+ "ground": 0.9097109594291864,
498
+ "person": 0.4577523593452128,
499
+ "sky": 0.8263436060803816,
500
+ "road": 0.7681716123910687,
501
+ "mountain": 0.5772636562892896,
502
+ "building": 0.5956806744535849,
503
+ "background": NaN
504
+ }
505
+ },
506
+ {
507
+ "epoch": 29,
508
+ "loss": 0.11513655359619174,
509
+ "miou_7": 0.7080678106432002,
510
+ "tree_iou_old": 0.8720526616871654,
511
+ "ground_iou_old": 0.9159501362721273,
512
+ "tree_recall_new": 0.9990172505104916,
513
+ "per_class_iou": {
514
+ "tree": 0.8720526616871654,
515
+ "ground": 0.9159501362721273,
516
+ "person": 0.4413353907706845,
517
+ "sky": 0.8354333730973998,
518
+ "road": 0.7446705296683885,
519
+ "mountain": 0.5920758624448199,
520
+ "building": 0.5549567205618161,
521
+ "background": NaN
522
+ }
523
+ },
524
+ {
525
+ "epoch": 30,
526
+ "loss": 0.11306144819319074,
527
+ "miou_7": 0.6983805157012603,
528
+ "tree_iou_old": 0.8645833022730479,
529
+ "ground_iou_old": 0.9095457655833904,
530
+ "tree_recall_new": 0.9991652293601366,
531
+ "per_class_iou": {
532
+ "tree": 0.8645833022730479,
533
+ "ground": 0.9095457655833904,
534
+ "person": 0.4674763392232738,
535
+ "sky": 0.8335070597837729,
536
+ "road": 0.7545198948370945,
537
+ "mountain": 0.5924738174364902,
538
+ "building": 0.4665574307717516,
539
+ "background": NaN
540
+ }
541
+ },
542
+ {
543
+ "epoch": 31,
544
+ "loss": 0.11102715956150962,
545
+ "miou_7": 0.6944315353561886,
546
+ "tree_iou_old": 0.8547747382286776,
547
+ "ground_iou_old": 0.9053165982665526,
548
+ "tree_recall_new": 0.9990321191987335,
549
+ "per_class_iou": {
550
+ "tree": 0.8547747382286776,
551
+ "ground": 0.9053165982665526,
552
+ "person": 0.4329003766160034,
553
+ "sky": 0.8115273504852353,
554
+ "road": 0.7724850398518984,
555
+ "mountain": 0.5864169089017646,
556
+ "building": 0.4975997351431882,
557
+ "background": NaN
558
+ }
559
+ },
560
+ {
561
+ "epoch": 32,
562
+ "loss": 0.11149858427711945,
563
+ "miou_7": 0.7188664866966874,
564
+ "tree_iou_old": 0.8651834760565779,
565
+ "ground_iou_old": 0.9162814340108099,
566
+ "tree_recall_new": 0.9983141739664846,
567
+ "per_class_iou": {
568
+ "tree": 0.8651834760565779,
569
+ "ground": 0.9162814340108099,
570
+ "person": 0.4658711751302083,
571
+ "sky": 0.8333352426932886,
572
+ "road": 0.8032374478908707,
573
+ "mountain": 0.5783057614808421,
574
+ "building": 0.5698508696142144,
575
+ "background": NaN
576
+ }
577
+ },
578
+ {
579
+ "epoch": 33,
580
+ "loss": 0.10876328530898892,
581
+ "miou_7": 0.7107569762939843,
582
+ "tree_iou_old": 0.8693782454023541,
583
+ "ground_iou_old": 0.9157389094490351,
584
+ "tree_recall_new": 0.9991135429676768,
585
+ "per_class_iou": {
586
+ "tree": 0.8693782454023541,
587
+ "ground": 0.9157389094490351,
588
+ "person": 0.4937770576865971,
589
+ "sky": 0.8377474204852441,
590
+ "road": 0.761413185992059,
591
+ "mountain": 0.5953795868906355,
592
+ "building": 0.5018644281519644,
593
+ "background": NaN
594
+ }
595
+ },
596
+ {
597
+ "epoch": 34,
598
+ "loss": 0.10711049049262428,
599
+ "miou_7": 0.7019345534232959,
600
+ "tree_iou_old": 0.8648191490436973,
601
+ "ground_iou_old": 0.9101472043077795,
602
+ "tree_recall_new": 0.9986235842884695,
603
+ "per_class_iou": {
604
+ "tree": 0.8648191490436973,
605
+ "ground": 0.9101472043077795,
606
+ "person": 0.4554277216874071,
607
+ "sky": 0.8266031525406107,
608
+ "road": 0.7529576282399317,
609
+ "mountain": 0.5621906580777194,
610
+ "building": 0.5413963600659256,
611
+ "background": NaN
612
+ }
613
+ },
614
+ {
615
+ "epoch": 35,
616
+ "loss": 0.10637939819667347,
617
+ "miou_7": 0.6957735870270542,
618
+ "tree_iou_old": 0.8611342036897985,
619
+ "ground_iou_old": 0.9078926943174735,
620
+ "tree_recall_new": 0.9995305742712218,
621
+ "per_class_iou": {
622
+ "tree": 0.8611342036897985,
623
+ "ground": 0.9078926943174735,
624
+ "person": 0.4761426204094724,
625
+ "sky": 0.8155434964100852,
626
+ "road": 0.7537532650546288,
627
+ "mountain": 0.5780520205569928,
628
+ "building": 0.4778968087509277,
629
+ "background": NaN
630
+ }
631
+ },
632
+ {
633
+ "epoch": 36,
634
+ "loss": 0.10524913378038014,
635
+ "miou_7": 0.7036256072219315,
636
+ "tree_iou_old": 0.8601403942334339,
637
+ "ground_iou_old": 0.9103111483757813,
638
+ "tree_recall_new": 0.9991737257534177,
639
+ "per_class_iou": {
640
+ "tree": 0.8601403942334339,
641
+ "ground": 0.9103111483757813,
642
+ "person": 0.47734196127129497,
643
+ "sky": 0.8288868144372747,
644
+ "road": 0.7870753306011031,
645
+ "mountain": 0.591627622543358,
646
+ "building": 0.4699959790912746,
647
+ "background": NaN
648
+ }
649
+ },
650
+ {
651
+ "epoch": 37,
652
+ "loss": 0.10495941000075634,
653
+ "miou_7": 0.703440393493414,
654
+ "tree_iou_old": 0.8602822036239717,
655
+ "ground_iou_old": 0.907919303549861,
656
+ "tree_recall_new": 0.9988990090373303,
657
+ "per_class_iou": {
658
+ "tree": 0.8602822036239717,
659
+ "ground": 0.907919303549861,
660
+ "person": 0.47949983928674933,
661
+ "sky": 0.826995678624557,
662
+ "road": 0.7687920564109944,
663
+ "mountain": 0.5929370462471802,
664
+ "building": 0.4876566267105844,
665
+ "background": NaN
666
+ }
667
+ },
668
+ {
669
+ "epoch": 38,
670
+ "loss": 0.10339566608590464,
671
+ "miou_7": 0.7042723060688616,
672
+ "tree_iou_old": 0.8632348498053344,
673
+ "ground_iou_old": 0.9109321716698625,
674
+ "tree_recall_new": 0.9990023818222498,
675
+ "per_class_iou": {
676
+ "tree": 0.8632348498053344,
677
+ "ground": 0.9109321716698625,
678
+ "person": 0.4414322276666005,
679
+ "sky": 0.8285902039590752,
680
+ "road": 0.7778623497742087,
681
+ "mountain": 0.5870131637402792,
682
+ "building": 0.5208411758666712,
683
+ "background": NaN
684
+ }
685
+ },
686
+ {
687
+ "epoch": 39,
688
+ "loss": 0.10248555226628382,
689
+ "miou_7": 0.7127856236134917,
690
+ "tree_iou_old": 0.8653419624639606,
691
+ "ground_iou_old": 0.911729733752646,
692
+ "tree_recall_new": 0.9990243308382258,
693
+ "per_class_iou": {
694
+ "tree": 0.8653419624639606,
695
+ "ground": 0.911729733752646,
696
+ "person": 0.4490896965989601,
697
+ "sky": 0.8417416017949311,
698
+ "road": 0.7603018524388181,
699
+ "mountain": 0.5967423810275937,
700
+ "building": 0.5645521372175334,
701
+ "background": NaN
702
+ }
703
+ },
704
+ {
705
+ "epoch": 40,
706
+ "loss": 0.10101223195141013,
707
+ "miou_7": 0.7128466877853661,
708
+ "tree_iou_old": 0.8584637085289819,
709
+ "ground_iou_old": 0.9087631816776888,
710
+ "tree_recall_new": 0.999443486240091,
711
+ "per_class_iou": {
712
+ "tree": 0.8584637085289819,
713
+ "ground": 0.9087631816776888,
714
+ "person": 0.47705684717008523,
715
+ "sky": 0.8197330510672493,
716
+ "road": 0.800078154842082,
717
+ "mountain": 0.5957071073649332,
718
+ "building": 0.5301247638465424,
719
+ "background": NaN
720
+ }
721
+ },
722
+ {
723
+ "epoch": 41,
724
+ "loss": 0.09991852833160207,
725
+ "miou_7": 0.7035958499558952,
726
+ "tree_iou_old": 0.8617602975517414,
727
+ "ground_iou_old": 0.9110861746358995,
728
+ "tree_recall_new": 0.9996962539402023,
729
+ "per_class_iou": {
730
+ "tree": 0.8617602975517414,
731
+ "ground": 0.9110861746358995,
732
+ "person": 0.4549658643035961,
733
+ "sky": 0.8145149321096569,
734
+ "road": 0.7879282571846876,
735
+ "mountain": 0.5812266630987688,
736
+ "building": 0.5136887608069164,
737
+ "background": NaN
738
+ }
739
+ },
740
+ {
741
+ "epoch": 42,
742
+ "loss": 0.09915690325047613,
743
+ "miou_7": 0.7134583358474373,
744
+ "tree_iou_old": 0.8660831554987213,
745
+ "ground_iou_old": 0.9159726755955384,
746
+ "tree_recall_new": 0.9993656026350147,
747
+ "per_class_iou": {
748
+ "tree": 0.8660831554987213,
749
+ "ground": 0.9159726755955384,
750
+ "person": 0.4530905947265377,
751
+ "sky": 0.8362661921781568,
752
+ "road": 0.8043563655137067,
753
+ "mountain": 0.594773170031178,
754
+ "building": 0.5236661973882227,
755
+ "background": NaN
756
+ }
757
+ },
758
+ {
759
+ "epoch": 43,
760
+ "loss": 0.09795961531201416,
761
+ "miou_7": 0.7165406079983356,
762
+ "tree_iou_old": 0.8598378459161425,
763
+ "ground_iou_old": 0.9089394699785583,
764
+ "tree_recall_new": 0.9995454429594637,
765
+ "per_class_iou": {
766
+ "tree": 0.8598378459161425,
767
+ "ground": 0.9089394699785583,
768
+ "person": 0.46012050287958917,
769
+ "sky": 0.8221400259687008,
770
+ "road": 0.814444981073297,
771
+ "mountain": 0.591331730954615,
772
+ "building": 0.5589696992174461,
773
+ "background": NaN
774
+ }
775
+ },
776
+ {
777
+ "epoch": 44,
778
+ "loss": 0.09747363334177526,
779
+ "miou_7": 0.703682647474599,
780
+ "tree_iou_old": 0.8598806708971567,
781
+ "ground_iou_old": 0.9099442298065352,
782
+ "tree_recall_new": 0.9989683962491256,
783
+ "per_class_iou": {
784
+ "tree": 0.8598806708971567,
785
+ "ground": 0.9099442298065352,
786
+ "person": 0.44416668701022877,
787
+ "sky": 0.8300241915048083,
788
+ "road": 0.8091502958402874,
789
+ "mountain": 0.5780742065040855,
790
+ "building": 0.4945382507590906,
791
+ "background": NaN
792
+ }
793
+ },
794
+ {
795
+ "epoch": 45,
796
+ "loss": 0.09669119091240191,
797
+ "miou_7": 0.7204684750938577,
798
+ "tree_iou_old": 0.8609388333802934,
799
+ "ground_iou_old": 0.9099041175374463,
800
+ "tree_recall_new": 0.9992608137845485,
801
+ "per_class_iou": {
802
+ "tree": 0.8609388333802934,
803
+ "ground": 0.9099041175374463,
804
+ "person": 0.4618446874123914,
805
+ "sky": 0.8297634971589024,
806
+ "road": 0.8090713840099226,
807
+ "mountain": 0.5977877820863925,
808
+ "building": 0.5739690240716552,
809
+ "background": NaN
810
+ }
811
+ },
812
+ {
813
+ "epoch": 46,
814
+ "loss": 0.09577625356286852,
815
+ "miou_7": 0.7147064988681378,
816
+ "tree_iou_old": 0.8588356972291628,
817
+ "ground_iou_old": 0.9042846066741225,
818
+ "tree_recall_new": 0.9992289523097445,
819
+ "per_class_iou": {
820
+ "tree": 0.8588356972291628,
821
+ "ground": 0.9042846066741225,
822
+ "person": 0.45702919955948496,
823
+ "sky": 0.8253803474122704,
824
+ "road": 0.7867103882600653,
825
+ "mountain": 0.5997810694197818,
826
+ "building": 0.5709241835220764,
827
+ "background": NaN
828
+ }
829
+ },
830
+ {
831
+ "epoch": 47,
832
+ "loss": 0.09526236196897946,
833
+ "miou_7": 0.7171311747428346,
834
+ "tree_iou_old": 0.8632839060256274,
835
+ "ground_iou_old": 0.9123400265863985,
836
+ "tree_recall_new": 0.9993875516509908,
837
+ "per_class_iou": {
838
+ "tree": 0.8632839060256274,
839
+ "ground": 0.9123400265863985,
840
+ "person": 0.46598174933267117,
841
+ "sky": 0.817570729856514,
842
+ "road": 0.7994172315356988,
843
+ "mountain": 0.6012255592276109,
844
+ "building": 0.5600990206353221,
845
+ "background": NaN
846
+ }
847
+ },
848
+ {
849
+ "epoch": 48,
850
+ "loss": 0.09422024230191435,
851
+ "miou_7": 0.7167525270340019,
852
+ "tree_iou_old": 0.8651489837327133,
853
+ "ground_iou_old": 0.9135081509914867,
854
+ "tree_recall_new": 0.9992140836215027,
855
+ "per_class_iou": {
856
+ "tree": 0.8651489837327133,
857
+ "ground": 0.9135081509914867,
858
+ "person": 0.46782555019067695,
859
+ "sky": 0.8286387479180981,
860
+ "road": 0.7996623413743239,
861
+ "mountain": 0.5915479115479115,
862
+ "building": 0.5509360034828037,
863
+ "background": NaN
864
+ }
865
+ },
866
+ {
867
+ "epoch": 49,
868
+ "loss": 0.09376394734485757,
869
+ "miou_7": 0.7145702801110829,
870
+ "tree_iou_old": 0.8627369567040538,
871
+ "ground_iou_old": 0.9114642448318345,
872
+ "tree_recall_new": 0.9994342818140366,
873
+ "per_class_iou": {
874
+ "tree": 0.8627369567040538,
875
+ "ground": 0.9114642448318345,
876
+ "person": 0.47928346990327153,
877
+ "sky": 0.8247682032686208,
878
+ "road": 0.7942546414600767,
879
+ "mountain": 0.5941505966745633,
880
+ "building": 0.5353338479351601,
881
+ "background": NaN
882
+ }
883
+ },
884
+ {
885
+ "epoch": 50,
886
+ "loss": 0.0934471827098701,
887
+ "miou_7": 0.7176813282089505,
888
+ "tree_iou_old": 0.8620466644958731,
889
+ "ground_iou_old": 0.9123173312442661,
890
+ "tree_recall_new": 0.9994987127964179,
891
+ "per_class_iou": {
892
+ "tree": 0.8620466644958731,
893
+ "ground": 0.9123173312442661,
894
+ "person": 0.46904731474144323,
895
+ "sky": 0.8275378210481714,
896
+ "road": 0.8118060080254842,
897
+ "mountain": 0.5901987168959689,
898
+ "building": 0.5508154410114461,
899
+ "background": NaN
900
+ }
901
+ },
902
+ {
903
+ "epoch": 51,
904
+ "loss": 0.09312802140226811,
905
+ "miou_7": 0.7168954119728846,
906
+ "tree_iou_old": 0.8613075378830813,
907
+ "ground_iou_old": 0.91348727971426,
908
+ "tree_recall_new": 0.9994880923048166,
909
+ "per_class_iou": {
910
+ "tree": 0.8613075378830813,
911
+ "ground": 0.91348727971426,
912
+ "person": 0.4603232176681578,
913
+ "sky": 0.8209021175471539,
914
+ "road": 0.8180727322238185,
915
+ "mountain": 0.5928931907697746,
916
+ "building": 0.551281808003947,
917
+ "background": NaN
918
+ }
919
+ },
920
+ {
921
+ "epoch": 52,
922
+ "loss": 0.09262444826165253,
923
+ "miou_7": 0.7145893070504529,
924
+ "tree_iou_old": 0.8601787716896,
925
+ "ground_iou_old": 0.9100240065925295,
926
+ "tree_recall_new": 0.9993224126358361,
927
+ "per_class_iou": {
928
+ "tree": 0.8601787716896,
929
+ "ground": 0.9100240065925295,
930
+ "person": 0.46003656751568806,
931
+ "sky": 0.8240606386485418,
932
+ "road": 0.8035017391518859,
933
+ "mountain": 0.5972179702553838,
934
+ "building": 0.5471054554995408,
935
+ "background": NaN
936
+ }
937
+ },
938
+ {
939
+ "epoch": 53,
940
+ "loss": 0.09180469486501909,
941
+ "miou_7": 0.7169079280075012,
942
+ "tree_iou_old": 0.8595100269113597,
943
+ "ground_iou_old": 0.9101150646845736,
944
+ "tree_recall_new": 0.9994831360754026,
945
+ "per_class_iou": {
946
+ "tree": 0.8595100269113597,
947
+ "ground": 0.9101150646845736,
948
+ "person": 0.4661931584573637,
949
+ "sky": 0.820119386782323,
950
+ "road": 0.8062676835489306,
951
+ "mountain": 0.5948037931562971,
952
+ "building": 0.5613463825116619,
953
+ "background": NaN
954
+ }
955
+ },
956
+ {
957
+ "epoch": 54,
958
+ "loss": 0.09159080942005705,
959
+ "miou_7": 0.714251101780106,
960
+ "tree_iou_old": 0.8610315614250247,
961
+ "ground_iou_old": 0.9110287660753348,
962
+ "tree_recall_new": 0.9995758883687208,
963
+ "per_class_iou": {
964
+ "tree": 0.8610315614250247,
965
+ "ground": 0.9110287660753348,
966
+ "person": 0.4709480122324159,
967
+ "sky": 0.82152237535896,
968
+ "road": 0.8068207561534491,
969
+ "mountain": 0.5878109061264083,
970
+ "building": 0.5405953350891487,
971
+ "background": NaN
972
+ }
973
+ },
974
+ {
975
+ "epoch": 55,
976
+ "loss": 0.09123367462252592,
977
+ "miou_7": 0.7189968760256816,
978
+ "tree_iou_old": 0.8627378628159317,
979
+ "ground_iou_old": 0.9131321779238109,
980
+ "tree_recall_new": 0.99940808460142,
981
+ "per_class_iou": {
982
+ "tree": 0.8627378628159317,
983
+ "ground": 0.9131321779238109,
984
+ "person": 0.4761212765181701,
985
+ "sky": 0.8255133689898098,
986
+ "road": 0.8077639678601056,
987
+ "mountain": 0.5934658957240918,
988
+ "building": 0.5542435823478512,
989
+ "background": NaN
990
+ }
991
+ },
992
+ {
993
+ "epoch": 56,
994
+ "loss": 0.09107463639721143,
995
+ "miou_7": 0.7174357176043799,
996
+ "tree_iou_old": 0.8624821961413048,
997
+ "ground_iou_old": 0.9131725873439213,
998
+ "tree_recall_new": 0.9992395728013458,
999
+ "per_class_iou": {
1000
+ "tree": 0.8624821961413048,
1001
+ "ground": 0.9131725873439213,
1002
+ "person": 0.47541807293120514,
1003
+ "sky": 0.8229103490897475,
1004
+ "road": 0.8071512969458519,
1005
+ "mountain": 0.5959807410508687,
1006
+ "building": 0.5449347797277598,
1007
+ "background": NaN
1008
+ }
1009
+ },
1010
+ {
1011
+ "epoch": 57,
1012
+ "loss": 0.09129836492218579,
1013
+ "miou_7": 0.7147705651556077,
1014
+ "tree_iou_old": 0.8604410437060084,
1015
+ "ground_iou_old": 0.9102523482533115,
1016
+ "tree_recall_new": 0.9994116247652871,
1017
+ "per_class_iou": {
1018
+ "tree": 0.8604410437060084,
1019
+ "ground": 0.9102523482533115,
1020
+ "person": 0.4652373172210405,
1021
+ "sky": 0.8247788796935914,
1022
+ "road": 0.8024377981637844,
1023
+ "mountain": 0.5926813760428257,
1024
+ "building": 0.5475651930086924,
1025
+ "background": NaN
1026
+ }
1027
+ },
1028
+ {
1029
+ "epoch": 58,
1030
+ "loss": 0.09109030034709886,
1031
+ "miou_7": 0.7184435180464683,
1032
+ "tree_iou_old": 0.8625080590407855,
1033
+ "ground_iou_old": 0.9131226114597868,
1034
+ "tree_recall_new": 0.9994491505022784,
1035
+ "per_class_iou": {
1036
+ "tree": 0.8625080590407855,
1037
+ "ground": 0.9131226114597868,
1038
+ "person": 0.470192696821592,
1039
+ "sky": 0.8240829335586235,
1040
+ "road": 0.8125652267268542,
1041
+ "mountain": 0.5941365218121131,
1042
+ "building": 0.5524965769055226,
1043
+ "background": NaN
1044
+ }
1045
+ },
1046
+ {
1047
+ "epoch": 59,
1048
+ "loss": 0.09070181351734047,
1049
+ "miou_7": 0.7161546689463674,
1050
+ "tree_iou_old": 0.8619843878363893,
1051
+ "ground_iou_old": 0.9120051419911369,
1052
+ "tree_recall_new": 0.99935427411064,
1053
+ "per_class_iou": {
1054
+ "tree": 0.8619843878363893,
1055
+ "ground": 0.9120051419911369,
1056
+ "person": 0.46737921660694426,
1057
+ "sky": 0.8240449181819649,
1058
+ "road": 0.8027428616241995,
1059
+ "mountain": 0.595520677756605,
1060
+ "building": 0.5494054786273329,
1061
+ "background": NaN
1062
+ }
1063
+ },
1064
+ {
1065
+ "epoch": 60,
1066
+ "loss": 0.0903256651112411,
1067
+ "miou_7": 0.7184340202591383,
1068
+ "tree_iou_old": 0.8624814919971956,
1069
+ "ground_iou_old": 0.9136910071083241,
1070
+ "tree_recall_new": 0.9993712668972021,
1071
+ "per_class_iou": {
1072
+ "tree": 0.8624814919971956,
1073
+ "ground": 0.9136910071083241,
1074
+ "person": 0.46576649746192894,
1075
+ "sky": 0.8252238363312573,
1076
+ "road": 0.811227410462362,
1077
+ "mountain": 0.5928357462160864,
1078
+ "building": 0.5578121522368128,
1079
+ "background": NaN
1080
+ }
1081
+ }
1082
+ ]
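A minimal sketch of how these per-epoch records could be inspected, assuming the file is a list of dictionaries shaped like the entries above. The helper names `best_epoch` and `mean_iou` are hypothetical and are not part of this commit. The two sample records are copied inline from the epoch 45 and epoch 60 entries; in practice the same structure would be read from `history.json`. In these two records `miou_7` agrees with the mean of the seven non-NaN per-class IoU values, which the sketch checks rather than assumes.

```python
import json
import math

# Two records copied from the log above (epochs 45 and 60). The NaN literal is
# not strict JSON, but Python's json module accepts it by default.
SAMPLE = """
[
  {"epoch": 45, "miou_7": 0.7204684750938577,
   "per_class_iou": {"tree": 0.8609388333802934, "ground": 0.9099041175374463,
                     "person": 0.4618446874123914, "sky": 0.8297634971589024,
                     "road": 0.8090713840099226, "mountain": 0.5977877820863925,
                     "building": 0.5739690240716552, "background": NaN}},
  {"epoch": 60, "miou_7": 0.7184340202591383,
   "per_class_iou": {"tree": 0.8624814919971956, "ground": 0.9136910071083241,
                     "person": 0.46576649746192894, "sky": 0.8252238363312573,
                     "road": 0.811227410462362, "mountain": 0.5928357462160864,
                     "building": 0.5578121522368128, "background": NaN}}
]
"""


def best_epoch(history, key="miou_7"):
    """Return the record with the highest value of `key`, skipping NaN entries."""
    valid = [rec for rec in history if not math.isnan(rec[key])]
    return max(valid, key=lambda rec: rec[key])


def mean_iou(per_class_iou):
    """Average the per-class IoU values, ignoring classes whose IoU is NaN."""
    values = [v for v in per_class_iou.values() if not math.isnan(v)]
    return sum(values) / len(values)


history = json.loads(SAMPLE)
best = best_epoch(history)
print("best epoch in sample:", best["epoch"])
for rec in history:
    # Compare the logged miou_7 with the mean recomputed from per_class_iou.
    print(rec["epoch"], rec["miou_7"], mean_iou(rec["per_class_iou"]))
```

Skipping NaN explicitly matters here because `background` is NaN in every record shown, and a plain average over all eight classes would itself be NaN.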
model/TwinLite.py ADDED
@@ -0,0 +1,468 @@
1
+ import torch
2
+ import torch.nn as nn
3
+
4
+
5
+ from torch.nn import Module, Conv2d, Parameter, Softmax
6
+
7
+ class PAM_Module(Module):
8
+ """ Position attention module"""
9
+ #Ref from SAGAN
10
+ def __init__(self, in_dim):
11
+ super(PAM_Module, self).__init__()
12
+ self.chanel_in = in_dim
13
+
14
+ self.query_conv = Conv2d(in_channels=in_dim, out_channels=in_dim//8, kernel_size=1)
15
+ self.key_conv = Conv2d(in_channels=in_dim, out_channels=in_dim//8, kernel_size=1)
16
+ self.value_conv = Conv2d(in_channels=in_dim, out_channels=in_dim, kernel_size=1)
17
+ self.gamma = Parameter(torch.zeros(1))
18
+
19
+ self.softmax = Softmax(dim=-1)
20
+ def forward(self, x):
21
+ """
22
+ inputs :
23
+ x : input feature maps( B X C X H X W)
24
+ returns :
25
+ out : attention value + input feature
26
+ attention: B X (HxW) X (HxW)
27
+ """
28
+ m_batchsize, C, height, width = x.size()
29
+ proj_query = self.query_conv(x).view(m_batchsize, -1, width*height).permute(0, 2, 1)
30
+ proj_key = self.key_conv(x).view(m_batchsize, -1, width*height)
31
+ energy = torch.bmm(proj_query, proj_key)
32
+ attention = self.softmax(energy)
33
+ proj_value = self.value_conv(x).view(m_batchsize, -1, width*height)
34
+
35
+ out = torch.bmm(proj_value, attention.permute(0, 2, 1))
36
+ out = out.view(m_batchsize, C, height, width)
37
+
38
+ out = self.gamma*out + x
39
+ return out
40
+ class CAM_Module(Module):
41
+ """ Channel attention module"""
42
+ def __init__(self, in_dim):
43
+ super(CAM_Module, self).__init__()
44
+ self.chanel_in = in_dim
45
+
46
+
47
+ self.gamma = Parameter(torch.zeros(1))
48
+ self.softmax = Softmax(dim=-1)
49
+ def forward(self,x):
50
+ """
51
+ inputs :
52
+ x : input feature maps( B X C X H X W)
53
+ returns :
54
+ out : attention value + input feature
55
+ attention: B X C X C
56
+ """
57
+ m_batchsize, C, height, width = x.size()
58
+ proj_query = x.view(m_batchsize, C, -1)
59
+ proj_key = x.view(m_batchsize, C, -1).permute(0, 2, 1)
60
+ energy = torch.bmm(proj_query, proj_key)
61
+ energy_new = torch.max(energy, -1, keepdim=True)[0].expand_as(energy)-energy
62
+ attention = self.softmax(energy_new)
63
+ proj_value = x.view(m_batchsize, C, -1)
64
+
65
+ out = torch.bmm(attention, proj_value)
66
+ out = out.view(m_batchsize, C, height, width)
67
+
68
+ out = self.gamma*out + x
69
+ return out
70
+
71
+
72
+ class UPx2(nn.Module):
73
+ '''
74
+ This class defines a 2x up-sampling transposed convolution layer with batch normalization and PReLU activation
75
+ '''
76
+ def __init__(self, nIn, nOut):
77
+ '''
78
+
79
+ :param nIn: number of input channels
80
+ :param nOut: number of output channels
81
+ The kernel size is fixed at 2 (there is no kSize argument)
81
+ The stride is fixed at 2, so the output is up-sampled by a factor of 2
83
+ '''
84
+ super().__init__()
85
+ self.deconv = nn.ConvTranspose2d(nIn, nOut, 2, stride=2, padding=0, output_padding=0, bias=False)
86
+ self.bn = nn.BatchNorm2d(nOut, eps=1e-03)
87
+ self.act = nn.PReLU(nOut)
88
+
89
+ def forward(self, input):
90
+ '''
91
+ :param input: input feature map
92
+ :return: transformed feature map
93
+ '''
94
+ output = self.deconv(input)
95
+ output = self.bn(output)
96
+ output = self.act(output)
97
+ return output
98
+ def fuseforward(self, input):
99
+ output = self.deconv(input)
100
+ output = self.act(output)
101
+ return output
102
+
103
+ class CBR(nn.Module):
104
+ '''
105
+ This class defines the convolution layer with batch normalization and PReLU activation
106
+ '''
107
+ def __init__(self, nIn, nOut, kSize, stride=1):
108
+ '''
109
+
110
+ :param nIn: number of input channels
111
+ :param nOut: number of output channels
112
+ :param kSize: kernel size
113
+ :param stride: stride rate for down-sampling. Default is 1
114
+ '''
115
+ super().__init__()
116
+ padding = int((kSize - 1)/2)
117
+ #self.conv = nn.Conv2d(nIn, nOut, kSize, stride=stride, padding=padding, bias=False)
118
+ self.conv = nn.Conv2d(nIn, nOut, (kSize, kSize), stride=stride, padding=(padding, padding), bias=False)
119
+ #self.conv1 = nn.Conv2d(nOut, nOut, (1, kSize), stride=1, padding=(0, padding), bias=False)
120
+ self.bn = nn.BatchNorm2d(nOut, eps=1e-03)
121
+ self.act = nn.PReLU(nOut)
122
+
123
+ def forward(self, input):
124
+ '''
125
+ :param input: input feature map
126
+ :return: transformed feature map
127
+ '''
128
+ output = self.conv(input)
129
+ #output = self.conv1(output)
130
+ output = self.bn(output)
131
+ output = self.act(output)
132
+ return output
133
+ def fuseforward(self, input):
134
+ output = self.conv(input)
135
+ output = self.act(output)
136
+ return output
137
+
138
+
139
+
140
+
141
+
142
+ class CB(nn.Module):
143
+ '''
144
+ This class groups the convolution and batch normalization
145
+ '''
146
+ def __init__(self, nIn, nOut, kSize, stride=1):
147
+ '''
148
+ :param nIn: number of input channels
149
+ :param nOut: number of output channels
150
+ :param kSize: kernel size
151
+ :param stride: optional stride for down-sampling
152
+ '''
153
+ super().__init__()
154
+ padding = int((kSize - 1)/2)
155
+ self.conv = nn.Conv2d(nIn, nOut, (kSize, kSize), stride=stride, padding=(padding, padding), bias=False)
156
+ self.bn = nn.BatchNorm2d(nOut, eps=1e-03)
157
+
158
+ def forward(self, input):
159
+ '''
160
+
161
+ :param input: input feature map
162
+ :return: transformed feature map
163
+ '''
164
+ output = self.conv(input)
165
+ output = self.bn(output)
166
+ return output
167
+
168
+ class C(nn.Module):
169
+ '''
170
+ This class is for a convolutional layer.
171
+ '''
172
+ def __init__(self, nIn, nOut, kSize, stride=1):
173
+ '''
174
+
175
+ :param nIn: number of input channels
176
+ :param nOut: number of output channels
177
+ :param kSize: kernel size
178
+ :param stride: optional stride rate for down-sampling
179
+ '''
180
+ super().__init__()
181
+ padding = int((kSize - 1)/2)
182
+ # print(nIn, nOut, (kSize, kSize))
183
+ self.conv = nn.Conv2d(nIn, nOut, (kSize, kSize), stride=stride, padding=(padding, padding), bias=False)
184
+
185
+ def forward(self, input):
186
+ '''
187
+ :param input: input feature map
188
+ :return: transformed feature map
189
+ '''
190
+ output = self.conv(input)
191
+ return output
192
+
193
+ class CDilated(nn.Module):
194
+ '''
195
+ This class defines the dilated convolution.
196
+ '''
197
+ def __init__(self, nIn, nOut, kSize, stride=1, d=1):
198
+ '''
199
+ :param nIn: number of input channels
200
+ :param nOut: number of output channels
201
+ :param kSize: kernel size
202
+ :param stride: optional stride rate for down-sampling
203
+ :param d: optional dilation rate
204
+ '''
205
+ super().__init__()
206
+ padding = int((kSize - 1)/2) * d
207
+ self.conv = nn.Conv2d(nIn, nOut, (kSize, kSize), stride=stride, padding=(padding, padding), bias=False, dilation=d)
208
+
209
+ def forward(self, input):
210
+ '''
211
+ :param input: input feature map
212
+ :return: transformed feature map
213
+ '''
214
+ output = self.conv(input)
215
+ return output
216
+
217
+ class DownSamplerB(nn.Module):
218
+ def __init__(self, nIn, nOut):
219
+ super().__init__()
220
+ n = int(nOut/5)
221
+ n1 = nOut - 4*n
222
+ self.c1 = C(nIn, n, 3, 2)
223
+ self.d1 = CDilated(n, n1, 3, 1, 1)
224
+ self.d2 = CDilated(n, n, 3, 1, 2)
225
+ self.d4 = CDilated(n, n, 3, 1, 4)
226
+ self.d8 = CDilated(n, n, 3, 1, 8)
227
+ self.d16 = CDilated(n, n, 3, 1, 16)
228
+ self.bn = nn.BatchNorm2d(nOut, eps=1e-3)
229
+ self.act = nn.PReLU(nOut)
230
+
231
+ def forward(self, input):
232
+ output1 = self.c1(input)
233
+ d1 = self.d1(output1)
234
+ d2 = self.d2(output1)
235
+ d4 = self.d4(output1)
236
+ d8 = self.d8(output1)
237
+ d16 = self.d16(output1)
238
+
239
+ add1 = d2
240
+ add2 = add1 + d4
241
+ add3 = add2 + d8
242
+ add4 = add3 + d16
243
+
244
+ combine = torch.cat([d1, add1, add2, add3, add4],1)
245
+ #combine_in_out = input + combine
246
+ output = self.bn(combine)
247
+ output = self.act(output)
248
+ return output
249
+ class BR(nn.Module):
250
+ '''
251
+ This class groups the batch normalization and PReLU activation
252
+ '''
253
+ def __init__(self, nOut):
254
+ '''
255
+ :param nOut: output feature maps
256
+ '''
257
+ super().__init__()
258
+ self.nOut=nOut
259
+ self.bn = nn.BatchNorm2d(nOut, eps=1e-03)
260
+ self.act = nn.PReLU(nOut)
261
+
262
+ def forward(self, input):
263
+ '''
264
+ :param input: input feature map
265
+ :return: normalized and thresholded feature map
266
+ '''
267
+ # print("bf bn :",input.size(),self.nOut)
268
+ output = self.bn(input)
269
+ # print("after bn :",output.size())
270
+ output = self.act(output)
271
+ # print("after act :",output.size())
272
+ return output
273
+ class DilatedParllelResidualBlockB(nn.Module):
274
+ '''
275
+ This class defines the ESP block, which is based on the following principle
276
+ Reduce ---> Split ---> Transform --> Merge
277
+ '''
278
+ def __init__(self, nIn, nOut, add=True):
279
+ '''
280
+ :param nIn: number of input channels
281
+ :param nOut: number of output channels
282
+ :param add: if true, add a residual connection through identity operation. You can use projection too as
283
+ in the ResNet paper, but we avoid it when the dimensions do not match, because we do not want to
284
+ increase the module complexity
285
+ '''
286
+ super().__init__()
287
+ n = max(int(nOut/5),1)
288
+ n1 = max(nOut - 4*n,1)
289
+ # print(nIn,n,n1,"--")
290
+ self.c1 = C(nIn, n, 1, 1)
291
+ self.d1 = CDilated(n, n1, 3, 1, 1) # dilation rate of 2^0
292
+ self.d2 = CDilated(n, n, 3, 1, 2) # dilation rate of 2^1
293
+ self.d4 = CDilated(n, n, 3, 1, 4) # dilation rate of 2^2
294
+ self.d8 = CDilated(n, n, 3, 1, 8) # dilation rate of 2^3
295
+ self.d16 = CDilated(n, n, 3, 1, 16) # dilation rate of 2^4
296
+ # print("nOut bf :",nOut)
297
+ self.bn = BR(nOut)
298
+ # print("nOut at :",self.bn.size())
299
+ self.add = add
300
+
301
+ def forward(self, input):
302
+ '''
303
+ :param input: input feature map
304
+ :return: transformed feature map
305
+ '''
306
+ # reduce
307
+ output1 = self.c1(input)
308
+ # split and transform
309
+ d1 = self.d1(output1)
310
+ d2 = self.d2(output1)
311
+ d4 = self.d4(output1)
312
+ d8 = self.d8(output1)
313
+ d16 = self.d16(output1)
314
+
315
+
316
+ # hierarchical fusion for de-gridding
317
+ add1 = d2
318
+ add2 = add1 + d4
319
+ add3 = add2 + d8
320
+ add4 = add3 + d16
321
+ # print(d1.size(),add1.size(),add2.size(),add3.size(),add4.size())
322
+
323
+ #merge
324
+ combine = torch.cat([d1, add1, add2, add3, add4], 1)
325
+ # print("combine :",combine.size())
326
+ # if residual version
327
+ if self.add:
328
+ # print("add :",combine.size())
329
+ combine = input + combine
330
+ # print(combine.size(),"-----------------")
331
+ output = self.bn(combine)
332
+ return output
333
+
334
+ class InputProjectionA(nn.Module):
335
+ '''
336
+ This class projects the input image to the same spatial dimensions as the feature map.
337
+ For example, if the input image is 512x512x3 and the feature map is 56x56xF, then
338
+ this class will generate an output of 56x56x3
339
+ '''
340
+ def __init__(self, samplingTimes):
341
+ '''
342
+ :param samplingTimes: The rate at which you want to down-sample the image
343
+ '''
344
+ super().__init__()
345
+ self.pool = nn.ModuleList()
346
+ for i in range(0, samplingTimes):
347
+ #pyramid-based approach for down-sampling
348
+ self.pool.append(nn.AvgPool2d(3, stride=2, padding=1))
349
+
350
+ def forward(self, input):
351
+ '''
352
+ :param input: Input RGB Image
353
+ :return: down-sampled image (pyramid-based approach)
354
+ '''
355
+ for pool in self.pool:
356
+ input = pool(input)
357
+ return input
358
+
359
+ class ESPNet_Encoder(nn.Module):
360
+ '''
361
+ This class defines the ESPNet-C network in the paper
362
+ '''
363
+ def __init__(self, p=5, q=3):
364
+ # def __init__(self, classes=20, p=1, q=1):
365
+ '''
366
+ :param p: number of ESP blocks stacked at level 2 (1/4 resolution)
367
+ :param q: number of ESP blocks stacked at level 3 (1/8 resolution)
368
+ The encoder takes no `classes` argument; it always returns 32 feature channels.
369
+ '''
370
+ super().__init__()
371
+ self.level1 = CBR(3, 16, 3, 2)
372
+ self.sample1 = InputProjectionA(1)
373
+ self.sample2 = InputProjectionA(2)
374
+
375
+ self.b1 = CBR(16 + 3,19,3)
376
+ self.level2_0 = DownSamplerB(16 +3, 64)
377
+
378
+ self.level2 = nn.ModuleList()
379
+ for i in range(0, p):
380
+ self.level2.append(DilatedParllelResidualBlockB(64 , 64))
381
+ self.b2 = CBR(128 + 3,131,3)
382
+
383
+ self.level3_0 = DownSamplerB(128 + 3, 128)
384
+ self.level3 = nn.ModuleList()
385
+ for i in range(0, q):
386
+ self.level3.append(DilatedParllelResidualBlockB(128 , 128))
387
+ # self.mixstyle = MixStyle2(p=0.5, alpha=0.1)
388
+ self.b3 = CBR(256,32,3)
389
+ self.sa = PAM_Module(32)
390
+ self.sc = CAM_Module(32)
391
+ self.conv_sa = CBR(32,32,3)
392
+ self.conv_sc = CBR(32,32,3)
393
+ self.classifier = CBR(32, 32, 1, 1)
394
+
395
+ def forward(self, input):
396
+ '''
397
+ :param input: Receives the input RGB image
398
+ :return: the transformed feature map with spatial dimensions 1/8th of the input image
399
+ '''
400
+ output0 = self.level1(input)
401
+ inp1 = self.sample1(input)
402
+ inp2 = self.sample2(input)
403
+
404
+ output0_cat = self.b1(torch.cat([output0, inp1], 1))
405
+ output1_0 = self.level2_0(output0_cat) # down-sampled
406
+
407
+ for i, layer in enumerate(self.level2):
408
+ if i==0:
409
+ output1 = layer(output1_0)
410
+ else:
411
+ output1 = layer(output1)
412
+
413
+ output1_cat = self.b2(torch.cat([output1, output1_0, inp2], 1))
414
+ output2_0 = self.level3_0(output1_cat) # down-sampled
415
+ for i, layer in enumerate(self.level3):
416
+ if i==0:
417
+ output2 = layer(output2_0)
418
+ else:
419
+ output2 = layer(output2)
420
+ cat_=torch.cat([output2_0, output2], 1)
421
+
422
+ output2_cat = self.b3(cat_)
423
+ out_sa=self.sa(output2_cat)
424
+ out_sa=self.conv_sa(out_sa)
425
+ out_sc=self.sc(output2_cat)
426
+ out_sc=self.conv_sc(out_sc)
427
+ out_s=out_sa+out_sc
428
+ classifier = self.classifier(out_s)
429
+
430
+ return classifier
431
+
432
+ class TwinLiteNet(nn.Module):
433
+ '''
434
+ This class defines TwinLiteNet: a shared ESPNet-style encoder followed by two decoder branches, each ending in a 2-channel output
435
+ '''
436
+
437
+ def __init__(self, p=2, q=3, ):
438
+
439
+ super().__init__()
440
+ self.encoder = ESPNet_Encoder(p, q)
441
+
442
+ self.up_1_1 = UPx2(32,16)
443
+ self.up_2_1 = UPx2(16,8)
444
+
445
+ self.up_1_2 = UPx2(32,16)
446
+ self.up_2_2 = UPx2(16,8)
447
+
448
+ self.classifier_1 = UPx2(8,2)
449
+ self.classifier_2 = UPx2(8,2)
450
+
451
+
452
+
453
+ def forward(self, input):
454
+
455
+ x=self.encoder(input)
456
+ x1=self.up_1_1(x)
457
+ x1=self.up_2_1(x1)
458
+ classifier1=self.classifier_1(x1)
459
+
460
+
461
+
462
+ x2=self.up_1_2(x)
463
+ x2=self.up_2_2(x2)
464
+ classifier2=self.classifier_2(x2)
465
+
466
+ return (classifier1,classifier2)
467
+
468
+
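The channel arithmetic in `DilatedParllelResidualBlockB` above can be checked by hand. The following is a minimal sketch that assumes only the two formulas from `__init__` (`n = max(int(nOut/5), 1)` and `n1 = max(nOut - 4*n, 1)`); the helper names `esp_branch_channels` and `concat_width` are hypothetical and not part of the repository. It shows that the five concatenated branches (`d1` with `n1` channels, `add1` to `add4` with `n` channels each) add up to `nOut` for the widths used by `ESPNet_Encoder`, which is what allows the residual `input + combine` when `nIn == nOut`.

```python
def esp_branch_channels(n_out):
    """Return (n, n1): channels of the dilated branches and of the d1 branch."""
    n = max(int(n_out / 5), 1)      # width of d2, d4, d8, d16
    n1 = max(n_out - 4 * n, 1)      # d1 absorbs the remainder
    return n, n1


def concat_width(n_out):
    """Channels produced by torch.cat([d1, add1, add2, add3, add4], 1)."""
    n, n1 = esp_branch_channels(n_out)
    return n1 + 4 * n


for width in (64, 128):             # the two widths used by ESPNet_Encoder
    n, n1 = esp_branch_channels(width)
    print(width, n, n1, concat_width(width))
```

For `nOut` below 5 the `max(..., 1)` clamps take over and the concatenated width no longer equals `nOut` (for example, `concat_width(3)` is 5), so the identity only holds for the larger widths used here.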
model/TwinLite_8class.py ADDED
@@ -0,0 +1,26 @@
1
+ """TwinLiteNet adapted for SINGLE 8-class semantic output (not dual binary).
2
+
3
+ Same encoder and decoder upsampling, but final classifier outputs 8 channels
4
+ matching our Segformer setup:
5
+ 0=tree 1=ground 2=person 3=sky 4=road 5=mountain 6=building 7=background
6
+
7
+ We keep one branch only: the second decoder branch (up_1_2, up_2_2, classifier_2) is dropped entirely, so the model is slightly smaller and faster.
8
+ """
9
+ import torch
10
+ import torch.nn as nn
11
+ from .TwinLite import ESPNet_Encoder, UPx2
12
+
13
+
14
+ class TwinLiteNet8(nn.Module):
15
+ def __init__(self, num_classes: int = 8, p: int = 2, q: int = 3):
16
+ super().__init__()
17
+ self.encoder = ESPNet_Encoder(p, q)
18
+ self.up_1 = UPx2(32, 16)
19
+ self.up_2 = UPx2(16, 8)
20
+ self.classifier = UPx2(8, num_classes)
21
+
22
+ def forward(self, x):
23
+ x = self.encoder(x)
24
+ x = self.up_1(x)
25
+ x = self.up_2(x)
26
+ return self.classifier(x) # (B, num_classes, H, W)
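The `(B, num_classes, H, W)` comment above depends on the input size surviving three halvings and three doublings. The sketch below traces the spatial size through the network under stated assumptions: every stride-2 stage (the first `CBR`, both `DownSamplerB` blocks, and `AvgPool2d(3, stride=2, padding=1)`) is taken to use kernel 3 with padding 1, and `UPx2` is taken to double height and width exactly. `CBR`, `DownSamplerB` and `UPx2` are not defined in the part of the diff shown here, so these are assumptions, and `down2` and `trace` are hypothetical helpers.

```python
def down2(size):
    """Output length of a kernel-3, stride-2, padding-1 conv or pool (assumed)."""
    return (size + 2 * 1 - 3) // 2 + 1


def trace(height, width, n_down=3, n_up=3):
    """Follow (H, W) through three stride-2 stages and three 2x upsamplings."""
    shapes = [(height, width)]
    for _ in range(n_down):
        height, width = down2(height), down2(width)
        shapes.append((height, width))
    for _ in range(n_up):
        height, width = 2 * height, 2 * width
        shapes.append((height, width))
    return shapes


print(trace(360, 640))       # full trace for the training resolution
print(trace(361, 640)[-1])   # final size for a height not divisible by 8
```

Under these assumptions 360x640 returns to 360x640, while a height of 361 comes back as 368, so inputs whose sides are not multiples of 8 would not match their masks pixel for pixel. Resizing to a fixed 640x360 before inference sidesteps this.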
model/__pycache__/TwinLite.cpython-311.pyc ADDED
Binary file (25.4 kB).
 
model/__pycache__/TwinLite.cpython-38.pyc ADDED
Binary file (13.9 kB).
 
model/__pycache__/TwinLite_8class.cpython-311.pyc ADDED
Binary file (2.07 kB).
 
predict.py ADDED
@@ -0,0 +1,103 @@
1
+ """TwinLiteNet8 inference — single image or directory.
2
+
3
+ Same interface as Segformer's predict.py for easy swap.
4
+ Trained at 640x360; this script resizes any input to 640x360 for
5
+ inference, then resizes the logits back to the original resolution before argmax.
6
+
7
+ Usage:
8
+ python predict.py input.jpg --weights run_8class/twinlite8_best.pt
9
+ python predict.py --dir frames/ --out out/ --weights run_8class/twinlite8_best.pt
10
+ """
11
+ from __future__ import annotations
12
+ import argparse, sys, os
13
+ from pathlib import Path
14
+
15
+ import cv2
16
+ import numpy as np
17
+ import torch
18
+ import torch.nn.functional as F
19
+
20
+ sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
21
+ from model.TwinLite_8class import TwinLiteNet8
22
+
23
+ NAMES = ["tree", "ground", "person", "sky", "road", "mountain", "building", "background"]
24
+ PALETTE = np.array([
25
+ [60, 220, 60], # tree
26
+ [40, 100, 160], # ground
27
+ [40, 40, 230], # person
28
+ [230, 200, 60], # sky
29
+ [140, 140, 140], # road
30
+ [180, 60, 180], # mountain
31
+ [50, 220, 220], # building
32
+ [100, 100, 100], # background
33
+ ], dtype=np.uint8)
34
+ TRAIN_W, TRAIN_H = 640, 360
35
+
36
+
37
+ def load_model(weights, device="cuda"):
38
+ model = TwinLiteNet8(num_classes=8).to(device).eval()
39
+ ckpt = torch.load(weights, map_location=device, weights_only=False)
40
+ model.load_state_dict(ckpt["model"] if "model" in ckpt else ckpt)
41
+ return model
42
+
43
+
44
+ def predict(model, bgr_img, device="cuda"):
45
+ """BGR uint8 → (H,W) class id mask 0..7 at original resolution."""
46
+ H, W = bgr_img.shape[:2]
47
+ inp_bgr = cv2.resize(bgr_img, (TRAIN_W, TRAIN_H))
48
+ rgb = cv2.cvtColor(inp_bgr, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
49
+ x = torch.from_numpy(rgb.transpose(2, 0, 1)).unsqueeze(0).float().to(device)
50
+ with torch.no_grad():
51
+ logits = model(x)
52
+ # Upsample logits to original resolution before argmax (cleaner boundaries)
53
+ logits = F.interpolate(logits, size=(H, W), mode="bilinear", align_corners=False)
54
+ # v2: channel 7 (background) was never trained -> mask it out so it can't win argmax
55
+ logits[:, 7, :, :] = -1e9
56
+ return logits.argmax(1)[0].cpu().numpy().astype(np.uint8)
57
+
58
+
59
+ def colorize(mask):
60
+ return PALETTE[mask]
61
+
62
+
63
+ def overlay(bgr, mask, alpha=0.45):
64
+ return cv2.addWeighted(bgr, 1 - alpha, colorize(mask), alpha, 0)
65
+
66
+
67
+ def main():
68
+ ap = argparse.ArgumentParser()
69
+ ap.add_argument("input", nargs="?")
70
+ ap.add_argument("--dir")
71
+ ap.add_argument("--out", default=".")
72
+ ap.add_argument("--weights", default="run_8class/twinlite8_best.pt")
73
+ ap.add_argument("--device", default="cuda" if torch.cuda.is_available() else "cpu")
74
+ args = ap.parse_args()
75
+
76
+ if not args.input and not args.dir:
77
+ ap.print_help(); return
78
+
79
+ print(f"loading model from {args.weights} on {args.device} ...")
80
+ model = load_model(args.weights, device=args.device)
81
+ out_dir = Path(args.out); out_dir.mkdir(parents=True, exist_ok=True)
82
+
83
+ paths = []
84
+ if args.dir:
85
+ paths = sorted(p for p in Path(args.dir).iterdir() if p.suffix.lower() in {".jpg",".jpeg",".png",".bmp"})
86
+ if args.input:
87
+ paths.append(Path(args.input))
88
+
89
+ for p in paths:
90
+ img = cv2.imread(str(p))
91
+ if img is None:
92
+ print(f" skip: {p}"); continue
93
+ mask = predict(model, img, device=args.device)
94
+ cv2.imwrite(str(out_dir / f"{p.stem}_pred.png"), mask)
95
+ cv2.imwrite(str(out_dir / f"{p.stem}_overlay.jpg"), overlay(img, mask))
96
+ counts = np.bincount(mask.flatten(), minlength=8)
97
+ top = counts.argmax()
98
+ print(f" {p.name:<50} top: {NAMES[top]} ({100*counts[top]/counts.sum():.1f}%)")
99
+ print(f"\noutputs -> {out_dir.resolve()}")
100
+
101
+
102
+ if __name__ == "__main__":
103
+ main()
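The `logits[:, 7, :, :] = -1e9` step in `predict` can be illustrated on a single pixel. This is a minimal sketch in plain Python (no PyTorch); `masked_argmax` is a hypothetical helper and the logit values are made up for illustration. It shows how an untrained channel that happens to carry the largest logit is kept from winning the argmax.

```python
NAMES = ["tree", "ground", "person", "sky", "road", "mountain", "building", "background"]


def masked_argmax(logits, disabled=(7,), fill=-1e9):
    """Return the index of the largest logit, never choosing a disabled channel."""
    scores = [fill if i in disabled else v for i, v in enumerate(logits)]
    return max(range(len(scores)), key=scores.__getitem__)


# Made-up logits for one pixel: the untrained background channel is the largest.
pixel = [2.1, 0.3, -1.0, 0.5, 0.0, -0.7, -2.0, 3.4]
print(NAMES[max(range(8), key=pixel.__getitem__)])  # plain argmax
print(NAMES[masked_argmax(pixel)])                  # argmax with channel 7 masked
```

The plain argmax picks `background`, while the masked version falls back to the best trained class, `tree`.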
predict_onnx.py ADDED
@@ -0,0 +1,84 @@
1
+ """TwinLiteNet8 ONNX inference — for edge deployment / cross-platform.
2
+
3
+ Runs entirely via ONNX Runtime (no PyTorch needed at deploy time).
4
+ Use CPUExecutionProvider for CPU, CUDAExecutionProvider for GPU,
5
+ TensorRTExecutionProvider for TensorRT-accelerated runs on Jetson.
6
+
7
+ Usage:
8
+ python predict_onnx.py input.jpg --onnx twinlite8.onnx
9
+ python predict_onnx.py --dir frames/ --out out/ --onnx twinlite8.onnx --provider CUDAExecutionProvider
10
+ """
11
+ import argparse
12
+ from pathlib import Path
13
+ import cv2
14
+ import numpy as np
15
+ import onnxruntime as ort
16
+
17
+ NAMES = ["tree","ground","person","sky","road","mountain","building","background"]
18
+ PALETTE = np.array([
19
+ [60,220,60],[40,100,160],[40,40,230],[230,200,60],
20
+ [140,140,140],[180,60,180],[50,220,220],[100,100,100],
21
+ ], dtype=np.uint8)
22
+ TRAIN_W, TRAIN_H = 640, 360
23
+
24
+
25
+ def predict(sess, bgr_img):
26
+ H, W = bgr_img.shape[:2]
27
+ inp = cv2.resize(bgr_img, (TRAIN_W, TRAIN_H))
28
+ rgb = cv2.cvtColor(inp, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
29
+ x = rgb.transpose(2, 0, 1)[None].astype(np.float32) # (1,3,H,W)
30
+ logits = sess.run(None, {"input": x})[0] # (1,8,H,W)
31
+ logits[:, 7, :, :] = -1e9 # v2: bg channel never trained
32
+ pred_small = logits.argmax(1)[0].astype(np.uint8) # at training res
33
+ if (H, W) != (TRAIN_H, TRAIN_W):
34
+ return cv2.resize(pred_small, (W, H), interpolation=cv2.INTER_NEAREST)
35
+ return pred_small
36
+
37
+
38
+ def main():
39
+ ap = argparse.ArgumentParser()
40
+ ap.add_argument("input", nargs="?")
41
+ ap.add_argument("--dir")
42
+ ap.add_argument("--out", default=".")
43
+ ap.add_argument("--onnx", default="twinlite8.onnx")
44
+ ap.add_argument("--provider", default=None,
45
+ help="ONNX provider: CPUExecutionProvider | CUDAExecutionProvider | TensorrtExecutionProvider")
46
+ args = ap.parse_args()
47
+
48
+ if not args.input and not args.dir:
49
+ ap.print_help(); return
50
+
51
+ available = ort.get_available_providers()
52
+ if args.provider:
53
+ providers = [args.provider]
54
+ else:
55
+ # Auto-pick best
56
+ for p in ["TensorrtExecutionProvider", "CUDAExecutionProvider", "CPUExecutionProvider"]:
57
+ if p in available: providers = [p]; break
58
+ print(f"available providers: {available}")
59
+ print(f"using: {providers}")
60
+ sess = ort.InferenceSession(args.onnx, providers=providers)
61
+ print(f"actual provider: {sess.get_providers()}")
62
+
63
+ out_dir = Path(args.out); out_dir.mkdir(parents=True, exist_ok=True)
64
+ paths = []
65
+ if args.dir:
66
+ paths = sorted(p for p in Path(args.dir).iterdir() if p.suffix.lower() in {".jpg",".jpeg",".png",".bmp"})
67
+ if args.input: paths.append(Path(args.input))
68
+
69
+ for p in paths:
70
+ img = cv2.imread(str(p))
71
+ if img is None: continue
72
+ mask = predict(sess, img)
73
+ cv2.imwrite(str(out_dir / f"{p.stem}_pred.png"), mask)
74
+ overlay = cv2.addWeighted(img, 0.55, PALETTE[mask], 0.45, 0)
75
+ cv2.imwrite(str(out_dir / f"{p.stem}_overlay.jpg"), overlay)
76
+ counts = np.bincount(mask.flatten(), minlength=8)
77
+ top = counts.argmax()
78
+ print(f" {p.name:<50} top: {NAMES[top]} ({100*counts[top]/counts.sum():.1f}%)")
79
+
80
+ print(f"\noutputs -> {out_dir.resolve()}")
81
+
82
+
83
+ if __name__ == "__main__":
84
+ main()
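The provider selection in `main()` can be written as a small standalone function, which makes the fallback order easy to test without ONNX Runtime installed. This is a sketch, not the script's code: `pick_providers` and `PREFERENCE` are hypothetical names, `available` stands for the list returned by `ort.get_available_providers()`, and raising on an unavailable request is a choice made for this sketch rather than the behavior of `predict_onnx.py`.

```python
PREFERENCE = ["TensorrtExecutionProvider", "CUDAExecutionProvider", "CPUExecutionProvider"]


def pick_providers(available, requested=None):
    """Choose an ONNX Runtime provider list from the names reported as available."""
    if requested is not None:
        if requested not in available:
            raise ValueError(f"{requested} not in available providers: {available}")
        return [requested]
    for name in PREFERENCE:          # fastest backend first, CPU last
        if name in available:
            return [name]
    raise RuntimeError("no known execution provider available")


print(pick_providers(["CUDAExecutionProvider", "CPUExecutionProvider"]))
print(pick_providers(["CPUExecutionProvider"]))
```

Returning from a function also avoids leaving `providers` undefined when none of the preferred names is present.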
samples/0_frame_3884.jpg ADDED

Git LFS Details

  • SHA256: 0e7ffd46a313018e24c30bf790f96000c1f10f0b7c922b7157dee48078ae0112
  • Pointer size: 131 Bytes
  • Size of remote file: 593 kB
samples/1_frame_2803.jpg ADDED

Git LFS Details

  • SHA256: ae8153f02dffb4bc45d07c0681b14f6d1faf0efba59f8f3588c83ff380607ace
  • Pointer size: 131 Bytes
  • Size of remote file: 627 kB
samples/2_frame_2626.jpg ADDED

Git LFS Details

  • SHA256: e542c7cbda64f21b734563835fc6a903d4e93028ce00a74c3a39132a8e9caa8e
  • Pointer size: 131 Bytes
  • Size of remote file: 585 kB
samples/3_frame_4093.jpg ADDED

Git LFS Details

  • SHA256: 450ae432fd89fa32874aa7b518f7a4693ec63e6a3af018837a09eedef57f969a
  • Pointer size: 131 Bytes
  • Size of remote file: 618 kB
samples/4_frame_3138.jpg ADDED

Git LFS Details

  • SHA256: 8825e95881d9fe7f5d8464f449f6bc5f80e7a38a572b2b18db3a9367c21eee80
  • Pointer size: 131 Bytes
  • Size of remote file: 567 kB
samples/5_frame_3076.jpg ADDED

Git LFS Details

  • SHA256: c51194cc7c7410605b6d4f3cd6513eb540d47fe9c50e7ae1ef1f46a872273c3b
  • Pointer size: 131 Bytes
  • Size of remote file: 635 kB
samples_20/sample_00_frame_3884.jpg ADDED

Git LFS Details

  • SHA256: 06a45917f6f22aed83bf71fce580198daff56f4d915649a1e6075b2a4a529901
  • Pointer size: 131 Bytes
  • Size of remote file: 598 kB
samples_20/sample_01_frame_2803.jpg ADDED

Git LFS Details

  • SHA256: 78f8ab77e3c9ae89985b93b7c25de18e34ae9203edc4db7a79d666e01e6a0bab
  • Pointer size: 131 Bytes
  • Size of remote file: 629 kB
samples_20/sample_02_frame_2626.jpg ADDED

Git LFS Details

  • SHA256: e8a9fd7d2db82129ba70c64dd7429aa348466e97c8326a376b3cf4eddf12a9cc
  • Pointer size: 131 Bytes
  • Size of remote file: 588 kB
samples_20/sample_03_frame_4093.jpg ADDED

Git LFS Details

  • SHA256: a7445337163ef8c505b6c70fe0a02b8b3923915f3f6478e375ccb6f0cbd9ecf1
  • Pointer size: 131 Bytes
  • Size of remote file: 621 kB
samples_20/sample_04_frame_3138.jpg ADDED

Git LFS Details

  • SHA256: 355cd2f0d06f82c6c930b4aae7b45667169afdbeff2c9946f8a8ad3e68fe866e
  • Pointer size: 131 Bytes
  • Size of remote file: 564 kB
samples_20/sample_05_frame_3076.jpg ADDED

Git LFS Details

  • SHA256: d2288fe50ebd6e51352b5decf46d0365637e7b033c3aa69f2287e88049820358
  • Pointer size: 131 Bytes
  • Size of remote file: 624 kB
samples_20/sample_06_frame_3032.jpg ADDED

Git LFS Details

  • SHA256: c0cdc5bada40565175908590dfaa32b6ad50ebd2ba79895185dc5e4858e47e75
  • Pointer size: 131 Bytes
  • Size of remote file: 536 kB
samples_20/sample_07_frame_2860.jpg ADDED

Git LFS Details

  • SHA256: f784538abff1dfc3d066e162b92edaa34cd068ecbb66ee72c4f13c0d6a3b03dd
  • Pointer size: 131 Bytes
  • Size of remote file: 571 kB
samples_20/sample_08_frame_4083.jpg ADDED

Git LFS Details

  • SHA256: e0c8229ddb71e30dce1d092cec091cef9a50dd983ba7c3b43c68ac104a076677
  • Pointer size: 131 Bytes
  • Size of remote file: 598 kB
samples_20/sample_09_frame_2784.jpg ADDED

Git LFS Details

  • SHA256: 9a468ca29e5eb427119edcbc1d7101275c7680979590f5cdda3634d7602dacdc
  • Pointer size: 131 Bytes
  • Size of remote file: 618 kB
samples_20/sample_10_frame_3960.jpg ADDED

Git LFS Details

  • SHA256: 282880811250d63924dff1cb87e6e73d9314a6cc2a0537e95dc936f5734e309b
  • Pointer size: 131 Bytes
  • Size of remote file: 580 kB
samples_20/sample_11_frame_4091.jpg ADDED

Git LFS Details

  • SHA256: bec509c2b188adfb4ae7a630ebc07460e3207779a843b59890ba9ad7ba1e1aa6
  • Pointer size: 131 Bytes
  • Size of remote file: 626 kB
samples_20/sample_12_frame_4402.jpg ADDED

Git LFS Details

  • SHA256: 5ec26967a021ba6279ee1a4630b7c13951e2e3833b2f1ee87c836520f1427aab
  • Pointer size: 131 Bytes
  • Size of remote file: 600 kB
samples_20/sample_13_frame_3691.jpg ADDED

Git LFS Details

  • SHA256: b32d03143eeab425cba9506ce08a27cf36ce0de2705071f9428a6527c28b2530
  • Pointer size: 131 Bytes
  • Size of remote file: 720 kB
samples_20/sample_14_frame_2753.jpg ADDED

Git LFS Details

  • SHA256: 70a9dd164493046ab49105864a4af902d8c88e3f27549552d3c58cf8a5d961f3
  • Pointer size: 131 Bytes
  • Size of remote file: 590 kB
samples_20/sample_15_frame_3784.jpg ADDED

Git LFS Details

  • SHA256: 68fd7411124a1661f002eb2df9f906d47a2c1131b4ac6c059b99b6e822cefec0
  • Pointer size: 131 Bytes
  • Size of remote file: 613 kB
samples_20/sample_16_frame_3439.jpg ADDED

Git LFS Details

  • SHA256: 5790ee3e9709d8f0a102ad09ca4c5a03eacc0421f12bb913911f77735697a67f
  • Pointer size: 131 Bytes
  • Size of remote file: 638 kB
samples_20/sample_17_frame_2640.jpg ADDED

Git LFS Details

  • SHA256: 0357e1cf5ed27d66b5933915178dfbc89ac12c32931fc7a3be27b2b907efc7a4
  • Pointer size: 131 Bytes
  • Size of remote file: 585 kB
samples_20/sample_18_frame_2636.jpg ADDED

Git LFS Details

  • SHA256: 52b1d9355c6edf4acc2ce6fbb990fa5dc6ae75d9353f45797002ee9039251dd5
  • Pointer size: 131 Bytes
  • Size of remote file: 593 kB
samples_20/sample_19_frame_2766.jpg ADDED

Git LFS Details

  • SHA256: debf233c10fc4ef1a85858f5af26d4c917e11ea98fa998bee3dc33549ba42b9b
  • Pointer size: 131 Bytes
  • Size of remote file: 592 kB
train_8class.py ADDED
@@ -0,0 +1,247 @@
1
+ """TwinLiteNet8 — single-branch 8-class semantic seg, directly comparable to Segformer.
2
+
3
+ Classes: 0 tree 1 ground 2 person 3 sky 4 road 5 mountain 6 building 7 background
4
+ """
5
+ from __future__ import annotations
6
+ import os, sys, json, re, time, random
7
+ from pathlib import Path
8
+ import numpy as np, cv2, torch
9
+ import torch.nn as nn
10
+ import torch.nn.functional as F
11
+ from torch.utils.data import Dataset, DataLoader, ConcatDataset
12
+
13
+ sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
14
+ from model.TwinLite_8class import TwinLiteNet8
15
+
16
+ # ───────── config ─────────
17
+ ROOT = Path(r"C:/Users/room104/Desktop/AGMOtree/semantic_segmantation")
18
+ OLD_IMG = ROOT / "merged_dataset/train/images"
19
+ OLD_MSK = ROOT / "merged_dataset/train/masks_pseudo"
20
+ NEW_IMG = ROOT / "orchard_nav/train/images"
21
+ NEW_MSK = ROOT / "orchard_nav/train/masks"
22
+
23
+ OUT_DIR = Path(r"C:/Users/room104/Desktop/AGMOtree/TwinLiteNet_train/run_v2")
24
+ OUT_DIR.mkdir(parents=True, exist_ok=True)
25
+
26
+ NAMES = ["tree","ground","person","sky","road","mountain","building","background"]
27
+ NUM_CLASSES = 8
28
+ IGNORE_INDEX = 255
29
+
30
+ W_IN, H_IN = 640, 360
31
+ BATCH = 16
32
+ EPOCHS = 60
33
+ LR = 5e-4
34
+ NUM_WORKERS = 4
35
+ SEED = 42
36
+ DEVICE = "cuda"
37
+
38
+ # v2 design: background is NOT a real class. Pixels labeled 7 → 255 (ignore_index)
39
+ # in the loader, so loss never trains channel 7. Weight 0 as belt-and-braces.
40
+ # At inference, the channel 7 logit is set to -1e9 (effectively -inf) before argmax (see predict.py).
41
+ WEIGHTS = np.array([1.5, 0.5, 1.5, 1.0, 1.0, 1.0, 1.0, 0.0], dtype=np.float32)
42
+
43
+ random.seed(SEED); np.random.seed(SEED); torch.manual_seed(SEED)
44
+
45
+
46
+ def frame_num(p):
47
+ m = re.match(r"frame_(\d+)", p.stem); return int(m.group(1)) if m else -1
48
+
49
+
50
+ class OrchardDS(Dataset):
51
+ def __init__(self, paths, mask_dir, augment=False, source="old"):
52
+ self.paths = paths
53
+ self.mask_dir = mask_dir
54
+ self.augment = augment
55
+ self.source = source
56
+
57
+ def __len__(self): return len(self.paths)
58
+
59
+ def __getitem__(self, i):
60
+ ip = self.paths[i]
61
+ img = cv2.imread(str(ip))
62
+ msk = cv2.imread(str(self.mask_dir / (ip.stem + ".png")), cv2.IMREAD_GRAYSCALE)
63
+ if img is None or msk is None:
64
+ img = np.zeros((H_IN, W_IN, 3), dtype=np.uint8)
65
+ msk = np.full((H_IN, W_IN), IGNORE_INDEX, dtype=np.uint8)
66
+
67
+ if self.augment:
68
+ if random.random() < 0.5:
69
+ img = np.ascontiguousarray(img[:, ::-1])
70
+ msk = np.ascontiguousarray(msk[:, ::-1])
71
+ if random.random() < 0.5:
72
+ hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV).astype(np.int16)
73
+ hsv[..., 0] = (hsv[..., 0] + random.randint(-10, 10)) % 180
74
+ hsv[..., 1] = np.clip(hsv[..., 1] * random.uniform(0.7, 1.3), 0, 255)
75
+ hsv[..., 2] = np.clip(hsv[..., 2] * random.uniform(0.7, 1.3), 0, 255)
76
+ img = cv2.cvtColor(hsv.astype(np.uint8), cv2.COLOR_HSV2BGR)
77
+
78
+ img = cv2.resize(img, (W_IN, H_IN))
79
+ msk = cv2.resize(msk, (W_IN, H_IN), interpolation=cv2.INTER_NEAREST)
80
+
81
+ # v2: remap class 7 (background) -> IGNORE_INDEX so it does NOT train.
82
+ # The user's intent: "background = stuff the model can't recognize", not a real class.
83
+ if self.source == "old":
84
+ msk[msk == 7] = IGNORE_INDEX
85
+ # new-source masks already have 255 for non-tree pixels, no change needed.
86
+
87
+ img = img[:, :, ::-1].transpose(2, 0, 1).astype(np.float32) / 255.0
88
+ return (torch.from_numpy(img).float(),
89
+ torch.from_numpy(msk).long())
90
+
91
+
92
+ # ─── temporal split ───
93
+ old_all = sorted(OLD_IMG.glob("*.jpg"))
94
+ old_train = [p for p in old_all if frame_num(p) <= 4500]
95
+ old_val = [p for p in old_all if frame_num(p) > 4500]
96
+
97
+ new_all = sorted(NEW_IMG.glob("*.jpg")); random.shuffle(new_all)
98
+ n_new_val = max(20, len(new_all) // 10)
99
+ new_val = new_all[:n_new_val]
100
+ new_train = new_all[n_new_val:]
101
+
102
+ train_ds = ConcatDataset([
103
+ OrchardDS(old_train, OLD_MSK, augment=True, source="old"),
104
+ OrchardDS(new_train, NEW_MSK, augment=True, source="new"),
105
+ ])
106
+ old_val_ds = OrchardDS(old_val, OLD_MSK, augment=False, source="old")
107
+ new_val_ds = OrchardDS(new_val, NEW_MSK, augment=False, source="new")
108
+
109
+ print(f"=== TwinLiteNet8 (single-branch, 8-class) ===")
110
+ print(f" old train: {len(old_train)} new train: {len(new_train)}")
111
+ print(f" old val: {len(old_val)} new val: {len(new_val)}")
112
+
113
+
114
+ # ─── eval ───
115
+ def confusion(preds, ys, n, ignore=IGNORE_INDEX):
116
+ cm = np.zeros((n, n), dtype=np.int64)
117
+ valid = ys != ignore
118
+ if not valid.any(): return cm
119
+ p = preds[valid]; t = ys[valid]
120
+ for tc in range(n):
121
+ mt = (t == tc)
122
+ if not mt.any(): continue
123
+ for pc in range(n):
124
+ cm[tc, pc] += int(((p == pc) & mt).sum())
125
+ return cm
126
+
127
+ def iou_from_cm(cm):
128
+ n = cm.shape[0]; ious = np.zeros(n)
129
+ for c in range(n):
130
+ tp = cm[c,c]; fp = cm[:,c].sum()-tp; fn = cm[c,:].sum()-tp
131
+ ious[c] = tp / (tp+fp+fn) if (tp+fp+fn) > 0 else float("nan")
132
+ return ious
133
+
134
+
135
+ # ─── train ───
136
+ log_path = OUT_DIR / "log.txt"
137
+ def log(m):
138
+ print(m, flush=True)
139
+ with log_path.open("a", encoding="utf-8") as f: f.write(m + "\n")
140
+
141
+
142
+ def main():
143
+ log_path.write_text("")
144
+ train_loader = DataLoader(train_ds, batch_size=BATCH, shuffle=True,
145
+ num_workers=NUM_WORKERS, pin_memory=True, drop_last=True,
146
+ persistent_workers=True)
147
+ old_val_loader = DataLoader(old_val_ds, batch_size=BATCH, shuffle=False,
148
+ num_workers=2, pin_memory=True, persistent_workers=True)
149
+ new_val_loader = DataLoader(new_val_ds, batch_size=BATCH, shuffle=False,
150
+ num_workers=2, pin_memory=True, persistent_workers=True)
151
+
152
+ model = TwinLiteNet8(num_classes=NUM_CLASSES).to(DEVICE)
153
+ n_params = sum(p.numel() for p in model.parameters())
154
+ log(f"model: TwinLiteNet8 params: {n_params/1e6:.3f}M")
155
+ log(f"input: {W_IN}x{H_IN} batch: {BATCH} epochs: {EPOCHS} LR: {LR}")
156
+ log(f"classes: {NAMES}")
157
+ log(f"weights: {dict(zip(NAMES, [round(float(w),2) for w in WEIGHTS]))}")
158
+ log(f"train: {len(train_ds)} old_val: {len(old_val_ds)} new_val: {len(new_val_ds)}")
159
+
160
+ cw = torch.tensor(WEIGHTS, dtype=torch.float32, device=DEVICE)
161
+ loss_fn = nn.CrossEntropyLoss(weight=cw, ignore_index=IGNORE_INDEX)
162
+ optim = torch.optim.AdamW(model.parameters(), lr=LR, weight_decay=1e-4)
163
+ sched = torch.optim.lr_scheduler.CosineAnnealingLR(optim, T_max=EPOCHS * len(train_loader))
164
+
165
+ best_tree = -1.0
166
+ history = []
167
+ for epoch in range(1, EPOCHS+1):
168
+ model.train()
169
+ t0 = time.time()
170
+ ep_loss = 0.0
171
+ for x, y in train_loader:
172
+ x = x.cuda(non_blocking=True); y = y.cuda(non_blocking=True)
173
+ logits = model(x)
174
+ loss = loss_fn(logits, y)
175
+ optim.zero_grad(); loss.backward(); optim.step(); sched.step()
176
+ ep_loss += loss.item()
177
+ train_loss = ep_loss / len(train_loader)
178
+
179
+ model.eval()
180
+ cm_old = np.zeros((NUM_CLASSES, NUM_CLASSES), dtype=np.int64)
181
+ tree_tp = tree_fn = 0
182
+ with torch.no_grad():
183
+ for x, y in old_val_loader:
184
+ x = x.cuda(); y = y.cuda()
185
+ logits = model(x)
186
+ logits[:, 7, :, :] = -1e9 # never predict background — that channel is untrained
187
+ preds = logits.argmax(1)
188
+ cm_old += confusion(preds.cpu().numpy(), y.cpu().numpy(), NUM_CLASSES)
189
+ for x, y in new_val_loader:
190
+ x = x.cuda(); y = y.cuda()
191
+ logits = model(x)
192
+ logits[:, 7, :, :] = -1e9
193
+ preds = logits.argmax(1).cpu().numpy()
194
+ ys = y.cpu().numpy()
195
+ tm = (ys == 0)
196
+ tree_tp += int(((preds == 0) & tm).sum())
197
+ tree_fn += int(((preds != 0) & tm).sum())
198
+
199
+ iou_old = iou_from_cm(cm_old)
200
+ miou_7 = float(np.nanmean(iou_old[:7]))
201
+ tree_old = float(iou_old[0])
202
+ ground_old = float(iou_old[1])
203
+ tree_recall_new = tree_tp / (tree_tp + tree_fn) if (tree_tp + tree_fn) > 0 else float("nan")
204
+ elapsed = time.time() - t0
205
+
206
+ log(f"epoch {epoch:02d}/{EPOCHS} loss={train_loss:.4f} "
207
+ f"mIoU(7)={miou_7:.3f} tree_old={tree_old:.3f} ground_old={ground_old:.3f} "
208
+ f"tree_new_recall={tree_recall_new:.3f} ({elapsed:.0f}s)")
209
+ log(f" per-class IoU: " + ", ".join(f"{n}={v:.3f}" for n, v in zip(NAMES, iou_old)))
210
+
211
+ history.append({
212
+ "epoch": epoch, "loss": float(train_loss),
213
+ "miou_7": miou_7, "tree_iou_old": tree_old, "ground_iou_old": ground_old,
214
+ "tree_recall_new": float(tree_recall_new),
215
+ "per_class_iou": {n: float(v) for n, v in zip(NAMES, iou_old)},
216
+ })
217
+ torch.save({"model": model.state_dict(), "epoch": epoch,
218
+ "tree_iou_old": tree_old, "miou_7": miou_7, "tree_recall_new": float(tree_recall_new)},
219
+ OUT_DIR / "twinlite8_last.pt")
220
+ if tree_old > best_tree:
221
+ best_tree = tree_old
222
+ torch.save({"model": model.state_dict(), "epoch": epoch,
223
+ "tree_iou_old": tree_old, "miou_7": miou_7, "tree_recall_new": float(tree_recall_new)},
224
+ OUT_DIR / "twinlite8_best.pt")
225
+ log(f" saved best (tree_old {tree_old:.3f})")
226
+ (OUT_DIR / "history.json").write_text(json.dumps(history, indent=2))
227
+
228
+ log(f"\n=== DONE === best tree_old IoU: {best_tree:.3f}")
229
+
230
+ # ─── FPS benchmark ───
231
+ log(f"\n=== FPS BENCHMARK (RTX 3080, batch=1, 640x360) ===")
232
+ model.eval()
233
+ x = torch.randn(1, 3, H_IN, W_IN, device=DEVICE)
234
+ with torch.no_grad():
235
+ for _ in range(20): model(x)
236
+ torch.cuda.synchronize()
237
+ t0 = time.time()
238
+ N = 200
239
+ for _ in range(N): model(x)
240
+ torch.cuda.synchronize()
241
+ fps = N / (time.time() - t0)
242
+ log(f" TwinLiteNet8 @ 640x360 batch=1: {fps:.1f} FPS")
243
+ log(f" Jetson Orin Nano estimate: ~{fps/4:.0f}-{fps/3:.0f} FPS")
244
+
245
+
246
+ if __name__ == "__main__":
247
+ main()
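The evaluation in `train_8class.py` derives per-class IoU from a confusion matrix and averages it while skipping classes that never occur. The sketch below reproduces that logic in plain Python on a toy 3-class matrix; `iou_per_class` and `nanmean` are hypothetical stand-ins for the script's `iou_from_cm` and `np.nanmean`, and the counts are made up for illustration.

```python
import math


def iou_per_class(cm):
    """cm[t][p] counts pixels of true class t predicted as class p."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, truth differs
        fn = sum(cm[c]) - tp                        # truth c, predicted otherwise
        denom = tp + fp + fn
        ious.append(tp / denom if denom > 0 else math.nan)
    return ious


def nanmean(values):
    """Mean over the entries that are not NaN (NaN if none are left)."""
    kept = [v for v in values if not math.isnan(v)]
    return sum(kept) / len(kept) if kept else math.nan


# Toy 3-class matrix; class 2 never occurs and is never predicted.
cm = [[8, 2, 0],
      [1, 9, 0],
      [0, 0, 0]]
ious = iou_per_class(cm)
print(ious, nanmean(ious))
```

A class with an empty row and column gets NaN rather than 0, so it does not drag the mean down. In the training script the background row and column stay empty because label 7 is remapped to the ignore index and channel 7 is masked before argmax, which is why the log reports `background=nan`.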
training_log.txt ADDED
@@ -0,0 +1,140 @@
1
+ model: TwinLiteNet8 params: 0.437M
2
+ input: 640x360 batch: 16 epochs: 60 LR: 0.0005
3
+ classes: ['tree', 'ground', 'person', 'sky', 'road', 'mountain', 'building', 'background']
4
+ weights: {'tree': 1.5, 'ground': 0.5, 'person': 1.5, 'sky': 1.0, 'road': 1.0, 'mountain': 1.0, 'building': 1.0, 'background': 0.0}
5
+ train: 5457 old_val: 155 new_val: 31
6
+ epoch 01/60 loss=1.3311 mIoU(7)=0.375 tree_old=0.818 ground_old=0.877 tree_new_recall=0.872 (94s)
7
+ per-class IoU: tree=0.818, ground=0.877, person=0.044, sky=0.779, road=0.001, mountain=0.090, building=0.012, background=nan
8
+ saved best (tree_old 0.818)
9
+ epoch 02/60 loss=0.7963 mIoU(7)=0.426 tree_old=0.812 ground_old=0.859 tree_new_recall=0.865 (65s)
10
+ per-class IoU: tree=0.812, ground=0.859, person=0.066, sky=0.800, road=0.020, mountain=0.396, building=0.033, background=nan
11
+ epoch 03/60 loss=0.5680 mIoU(7)=0.558 tree_old=0.850 ground_old=0.884 tree_new_recall=0.939 (65s)
12
+ per-class IoU: tree=0.850, ground=0.884, person=0.293, sky=0.819, road=0.645, mountain=0.393, building=0.023, background=nan
13
+ saved best (tree_old 0.850)
14
+ epoch 04/60 loss=0.4366 mIoU(7)=0.596 tree_old=0.853 ground_old=0.892 tree_new_recall=0.967 (65s)
15
+ per-class IoU: tree=0.853, ground=0.892, person=0.372, sky=0.826, road=0.584, mountain=0.475, building=0.169, background=nan
16
+ saved best (tree_old 0.853)
17
+ epoch 05/60 loss=0.3549 mIoU(7)=0.622 tree_old=0.836 ground_old=0.885 tree_new_recall=0.963 (66s)
18
+ per-class IoU: tree=0.836, ground=0.885, person=0.396, sky=0.824, road=0.618, mountain=0.485, building=0.310, background=nan
19
+ epoch 06/60 loss=0.2965 mIoU(7)=0.620 tree_old=0.831 ground_old=0.889 tree_new_recall=0.967 (65s)
20
+ per-class IoU: tree=0.831, ground=0.889, person=0.346, sky=0.820, road=0.710, mountain=0.495, building=0.245, background=nan
21
+ epoch 07/60 loss=0.2606 mIoU(7)=0.661 tree_old=0.860 ground_old=0.904 tree_new_recall=0.981 (66s)
+ per-class IoU: tree=0.860, ground=0.904, person=0.396, sky=0.830, road=0.643, mountain=0.552, building=0.439, background=nan
+ saved best (tree_old 0.860)
+ epoch 08/60 loss=0.2286 mIoU(7)=0.644 tree_old=0.854 ground_old=0.902 tree_new_recall=0.991 (66s)
+ per-class IoU: tree=0.854, ground=0.902, person=0.412, sky=0.816, road=0.649, mountain=0.531, building=0.347, background=nan
+ epoch 09/60 loss=0.2093 mIoU(7)=0.432 tree_old=0.770 ground_old=0.855 tree_new_recall=0.887 (65s)
+ per-class IoU: tree=0.770, ground=0.855, person=0.360, sky=0.435, road=0.118, mountain=0.349, building=0.136, background=nan
+ epoch 10/60 loss=0.1984 mIoU(7)=0.661 tree_old=0.834 ground_old=0.883 tree_new_recall=0.990 (66s)
+ per-class IoU: tree=0.834, ground=0.883, person=0.404, sky=0.824, road=0.715, mountain=0.523, building=0.442, background=nan
+ epoch 11/60 loss=0.1792 mIoU(7)=0.683 tree_old=0.855 ground_old=0.910 tree_new_recall=0.996 (65s)
+ per-class IoU: tree=0.855, ground=0.910, person=0.388, sky=0.825, road=0.764, mountain=0.538, building=0.503, background=nan
+ epoch 12/60 loss=0.1724 mIoU(7)=0.669 tree_old=0.842 ground_old=0.882 tree_new_recall=0.995 (65s)
+ per-class IoU: tree=0.842, ground=0.882, person=0.398, sky=0.823, road=0.720, mountain=0.550, building=0.467, background=nan
+ epoch 13/60 loss=0.1639 mIoU(7)=0.683 tree_old=0.834 ground_old=0.883 tree_new_recall=0.994 (66s)
+ per-class IoU: tree=0.834, ground=0.883, person=0.421, sky=0.827, road=0.740, mountain=0.560, building=0.513, background=nan
+ epoch 14/60 loss=0.1574 mIoU(7)=0.674 tree_old=0.848 ground_old=0.895 tree_new_recall=0.999 (67s)
+ per-class IoU: tree=0.848, ground=0.895, person=0.436, sky=0.809, road=0.727, mountain=0.547, building=0.458, background=nan
+ epoch 15/60 loss=0.1535 mIoU(7)=0.664 tree_old=0.847 ground_old=0.897 tree_new_recall=0.998 (65s)
+ per-class IoU: tree=0.847, ground=0.897, person=0.380, sky=0.794, road=0.686, mountain=0.559, building=0.483, background=nan
+ epoch 16/60 loss=0.1486 mIoU(7)=0.685 tree_old=0.851 ground_old=0.893 tree_new_recall=0.997 (67s)
+ per-class IoU: tree=0.851, ground=0.893, person=0.496, sky=0.803, road=0.692, mountain=0.567, building=0.492, background=nan
+ epoch 17/60 loss=0.1457 mIoU(7)=0.698 tree_old=0.856 ground_old=0.901 tree_new_recall=0.997 (68s)
+ per-class IoU: tree=0.856, ground=0.901, person=0.460, sky=0.826, road=0.720, mountain=0.570, building=0.550, background=nan
+ epoch 18/60 loss=0.1392 mIoU(7)=0.672 tree_old=0.859 ground_old=0.908 tree_new_recall=0.998 (66s)
+ per-class IoU: tree=0.859, ground=0.908, person=0.417, sky=0.830, road=0.739, mountain=0.579, building=0.368, background=nan
+ epoch 19/60 loss=0.1354 mIoU(7)=0.687 tree_old=0.862 ground_old=0.903 tree_new_recall=0.999 (65s)
+ per-class IoU: tree=0.862, ground=0.903, person=0.442, sky=0.821, road=0.694, mountain=0.581, building=0.505, background=nan
+ saved best (tree_old 0.862)
+ epoch 20/60 loss=0.1325 mIoU(7)=0.702 tree_old=0.863 ground_old=0.909 tree_new_recall=0.995 (67s)
+ per-class IoU: tree=0.863, ground=0.909, person=0.411, sky=0.829, road=0.747, mountain=0.536, building=0.620, background=nan
+ saved best (tree_old 0.863)
+ epoch 21/60 loss=0.1303 mIoU(7)=0.676 tree_old=0.859 ground_old=0.899 tree_new_recall=0.996 (66s)
+ per-class IoU: tree=0.859, ground=0.899, person=0.390, sky=0.825, road=0.689, mountain=0.595, building=0.473, background=nan
+ epoch 22/60 loss=0.1275 mIoU(7)=0.713 tree_old=0.865 ground_old=0.907 tree_new_recall=0.998 (65s)
+ per-class IoU: tree=0.865, ground=0.907, person=0.490, sky=0.820, road=0.724, mountain=0.576, building=0.606, background=nan
+ saved best (tree_old 0.865)
+ epoch 23/60 loss=0.1288 mIoU(7)=0.711 tree_old=0.864 ground_old=0.909 tree_new_recall=0.999 (67s)
+ per-class IoU: tree=0.864, ground=0.909, person=0.458, sky=0.827, road=0.728, mountain=0.577, building=0.611, background=nan
+ epoch 24/60 loss=0.1230 mIoU(7)=0.696 tree_old=0.863 ground_old=0.912 tree_new_recall=0.999 (66s)
+ per-class IoU: tree=0.863, ground=0.912, person=0.431, sky=0.820, road=0.757, mountain=0.584, building=0.506, background=nan
+ epoch 25/60 loss=0.1228 mIoU(7)=0.700 tree_old=0.857 ground_old=0.912 tree_new_recall=1.000 (65s)
+ per-class IoU: tree=0.857, ground=0.912, person=0.444, sky=0.824, road=0.802, mountain=0.571, building=0.494, background=nan
+ epoch 26/60 loss=0.1200 mIoU(7)=0.695 tree_old=0.866 ground_old=0.913 tree_new_recall=0.999 (67s)
+ per-class IoU: tree=0.866, ground=0.913, person=0.422, sky=0.837, road=0.716, mountain=0.563, building=0.549, background=nan
+ saved best (tree_old 0.866)
+ epoch 27/60 loss=0.1185 mIoU(7)=0.695 tree_old=0.862 ground_old=0.908 tree_new_recall=0.999 (66s)
+ per-class IoU: tree=0.862, ground=0.908, person=0.407, sky=0.828, road=0.732, mountain=0.578, building=0.550, background=nan
+ epoch 28/60 loss=0.1161 mIoU(7)=0.714 tree_old=0.860 ground_old=0.910 tree_new_recall=0.998 (64s)
+ per-class IoU: tree=0.860, ground=0.910, person=0.458, sky=0.826, road=0.768, mountain=0.577, building=0.596, background=nan
+ epoch 29/60 loss=0.1151 mIoU(7)=0.708 tree_old=0.872 ground_old=0.916 tree_new_recall=0.999 (66s)
+ per-class IoU: tree=0.872, ground=0.916, person=0.441, sky=0.835, road=0.745, mountain=0.592, building=0.555, background=nan
+ saved best (tree_old 0.872)
+ epoch 30/60 loss=0.1131 mIoU(7)=0.698 tree_old=0.865 ground_old=0.910 tree_new_recall=0.999 (66s)
+ per-class IoU: tree=0.865, ground=0.910, person=0.467, sky=0.834, road=0.755, mountain=0.592, building=0.467, background=nan
+ epoch 31/60 loss=0.1110 mIoU(7)=0.694 tree_old=0.855 ground_old=0.905 tree_new_recall=0.999 (65s)
+ per-class IoU: tree=0.855, ground=0.905, person=0.433, sky=0.812, road=0.772, mountain=0.586, building=0.498, background=nan
+ epoch 32/60 loss=0.1115 mIoU(7)=0.719 tree_old=0.865 ground_old=0.916 tree_new_recall=0.998 (67s)
+ per-class IoU: tree=0.865, ground=0.916, person=0.466, sky=0.833, road=0.803, mountain=0.578, building=0.570, background=nan
+ epoch 33/60 loss=0.1088 mIoU(7)=0.711 tree_old=0.869 ground_old=0.916 tree_new_recall=0.999 (69s)
+ per-class IoU: tree=0.869, ground=0.916, person=0.494, sky=0.838, road=0.761, mountain=0.595, building=0.502, background=nan
+ epoch 34/60 loss=0.1071 mIoU(7)=0.702 tree_old=0.865 ground_old=0.910 tree_new_recall=0.999 (70s)
+ per-class IoU: tree=0.865, ground=0.910, person=0.455, sky=0.827, road=0.753, mountain=0.562, building=0.541, background=nan
+ epoch 35/60 loss=0.1064 mIoU(7)=0.696 tree_old=0.861 ground_old=0.908 tree_new_recall=1.000 (65s)
+ per-class IoU: tree=0.861, ground=0.908, person=0.476, sky=0.816, road=0.754, mountain=0.578, building=0.478, background=nan
+ epoch 36/60 loss=0.1052 mIoU(7)=0.704 tree_old=0.860 ground_old=0.910 tree_new_recall=0.999 (66s)
+ per-class IoU: tree=0.860, ground=0.910, person=0.477, sky=0.829, road=0.787, mountain=0.592, building=0.470, background=nan
+ epoch 37/60 loss=0.1050 mIoU(7)=0.703 tree_old=0.860 ground_old=0.908 tree_new_recall=0.999 (64s)
+ per-class IoU: tree=0.860, ground=0.908, person=0.479, sky=0.827, road=0.769, mountain=0.593, building=0.488, background=nan
+ epoch 38/60 loss=0.1034 mIoU(7)=0.704 tree_old=0.863 ground_old=0.911 tree_new_recall=0.999 (63s)
+ per-class IoU: tree=0.863, ground=0.911, person=0.441, sky=0.829, road=0.778, mountain=0.587, building=0.521, background=nan
+ epoch 39/60 loss=0.1025 mIoU(7)=0.713 tree_old=0.865 ground_old=0.912 tree_new_recall=0.999 (64s)
+ per-class IoU: tree=0.865, ground=0.912, person=0.449, sky=0.842, road=0.760, mountain=0.597, building=0.565, background=nan
+ epoch 40/60 loss=0.1010 mIoU(7)=0.713 tree_old=0.858 ground_old=0.909 tree_new_recall=0.999 (65s)
+ per-class IoU: tree=0.858, ground=0.909, person=0.477, sky=0.820, road=0.800, mountain=0.596, building=0.530, background=nan
+ epoch 41/60 loss=0.0999 mIoU(7)=0.704 tree_old=0.862 ground_old=0.911 tree_new_recall=1.000 (66s)
+ per-class IoU: tree=0.862, ground=0.911, person=0.455, sky=0.815, road=0.788, mountain=0.581, building=0.514, background=nan
+ epoch 42/60 loss=0.0992 mIoU(7)=0.713 tree_old=0.866 ground_old=0.916 tree_new_recall=0.999 (66s)
+ per-class IoU: tree=0.866, ground=0.916, person=0.453, sky=0.836, road=0.804, mountain=0.595, building=0.524, background=nan
+ epoch 43/60 loss=0.0980 mIoU(7)=0.717 tree_old=0.860 ground_old=0.909 tree_new_recall=1.000 (65s)
+ per-class IoU: tree=0.860, ground=0.909, person=0.460, sky=0.822, road=0.814, mountain=0.591, building=0.559, background=nan
+ epoch 44/60 loss=0.0975 mIoU(7)=0.704 tree_old=0.860 ground_old=0.910 tree_new_recall=0.999 (65s)
+ per-class IoU: tree=0.860, ground=0.910, person=0.444, sky=0.830, road=0.809, mountain=0.578, building=0.495, background=nan
+ epoch 45/60 loss=0.0967 mIoU(7)=0.720 tree_old=0.861 ground_old=0.910 tree_new_recall=0.999 (66s)
+ per-class IoU: tree=0.861, ground=0.910, person=0.462, sky=0.830, road=0.809, mountain=0.598, building=0.574, background=nan
+ epoch 46/60 loss=0.0958 mIoU(7)=0.715 tree_old=0.859 ground_old=0.904 tree_new_recall=0.999 (64s)
+ per-class IoU: tree=0.859, ground=0.904, person=0.457, sky=0.825, road=0.787, mountain=0.600, building=0.571, background=nan
+ epoch 47/60 loss=0.0953 mIoU(7)=0.717 tree_old=0.863 ground_old=0.912 tree_new_recall=0.999 (65s)
+ per-class IoU: tree=0.863, ground=0.912, person=0.466, sky=0.818, road=0.799, mountain=0.601, building=0.560, background=nan
+ epoch 48/60 loss=0.0942 mIoU(7)=0.717 tree_old=0.865 ground_old=0.914 tree_new_recall=0.999 (65s)
+ per-class IoU: tree=0.865, ground=0.914, person=0.468, sky=0.829, road=0.800, mountain=0.592, building=0.551, background=nan
+ epoch 49/60 loss=0.0938 mIoU(7)=0.715 tree_old=0.863 ground_old=0.911 tree_new_recall=0.999 (66s)
+ per-class IoU: tree=0.863, ground=0.911, person=0.479, sky=0.825, road=0.794, mountain=0.594, building=0.535, background=nan
+ epoch 50/60 loss=0.0934 mIoU(7)=0.718 tree_old=0.862 ground_old=0.912 tree_new_recall=0.999 (65s)
+ per-class IoU: tree=0.862, ground=0.912, person=0.469, sky=0.828, road=0.812, mountain=0.590, building=0.551, background=nan
+ epoch 51/60 loss=0.0931 mIoU(7)=0.717 tree_old=0.861 ground_old=0.913 tree_new_recall=0.999 (64s)
+ per-class IoU: tree=0.861, ground=0.913, person=0.460, sky=0.821, road=0.818, mountain=0.593, building=0.551, background=nan
+ epoch 52/60 loss=0.0926 mIoU(7)=0.715 tree_old=0.860 ground_old=0.910 tree_new_recall=0.999 (65s)
+ per-class IoU: tree=0.860, ground=0.910, person=0.460, sky=0.824, road=0.804, mountain=0.597, building=0.547, background=nan
+ epoch 53/60 loss=0.0918 mIoU(7)=0.717 tree_old=0.860 ground_old=0.910 tree_new_recall=0.999 (64s)
+ per-class IoU: tree=0.860, ground=0.910, person=0.466, sky=0.820, road=0.806, mountain=0.595, building=0.561, background=nan
+ epoch 54/60 loss=0.0916 mIoU(7)=0.714 tree_old=0.861 ground_old=0.911 tree_new_recall=1.000 (67s)
+ per-class IoU: tree=0.861, ground=0.911, person=0.471, sky=0.822, road=0.807, mountain=0.588, building=0.541, background=nan
+ epoch 55/60 loss=0.0912 mIoU(7)=0.719 tree_old=0.863 ground_old=0.913 tree_new_recall=0.999 (69s)
+ per-class IoU: tree=0.863, ground=0.913, person=0.476, sky=0.826, road=0.808, mountain=0.593, building=0.554, background=nan
+ epoch 56/60 loss=0.0911 mIoU(7)=0.717 tree_old=0.862 ground_old=0.913 tree_new_recall=0.999 (70s)
+ per-class IoU: tree=0.862, ground=0.913, person=0.475, sky=0.823, road=0.807, mountain=0.596, building=0.545, background=nan
+ epoch 57/60 loss=0.0913 mIoU(7)=0.715 tree_old=0.860 ground_old=0.910 tree_new_recall=0.999 (68s)
+ per-class IoU: tree=0.860, ground=0.910, person=0.465, sky=0.825, road=0.802, mountain=0.593, building=0.548, background=nan
+ epoch 58/60 loss=0.0911 mIoU(7)=0.718 tree_old=0.863 ground_old=0.913 tree_new_recall=0.999 (65s)
+ per-class IoU: tree=0.863, ground=0.913, person=0.470, sky=0.824, road=0.813, mountain=0.594, building=0.552, background=nan
+ epoch 59/60 loss=0.0907 mIoU(7)=0.716 tree_old=0.862 ground_old=0.912 tree_new_recall=0.999 (64s)
+ per-class IoU: tree=0.862, ground=0.912, person=0.467, sky=0.824, road=0.803, mountain=0.596, building=0.549, background=nan
+ epoch 60/60 loss=0.0903 mIoU(7)=0.718 tree_old=0.862 ground_old=0.914 tree_new_recall=0.999 (67s)
+ per-class IoU: tree=0.862, ground=0.914, person=0.466, sky=0.825, road=0.811, mountain=0.593, building=0.558, background=nan
+
+ === DONE === best tree_old IoU: 0.872
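The `mIoU(7)` column appears to be the mean over the seven classes that have a defined IoU, skipping `background`, which is logged as `nan` and carries a loss weight of 0.0. The sketch below is not taken from `train_8class.py`; it parses one `per-class IoU` line from this log and recomputes the mean under that assumption. The helper names `parse_per_class_iou` and `mean_iou_ignoring_nan` are hypothetical.

```python
import math
import re

# Matches the payload of a "per-class IoU: name=value, ..." log line.
IOU_LINE = re.compile(r"per-class IoU:\s*(.*)")


def parse_per_class_iou(line):
    """Return {class_name: iou} for a per-class IoU log line, or None."""
    match = IOU_LINE.search(line)
    if match is None:
        return None
    ious = {}
    for item in match.group(1).split(","):
        name, value = item.strip().split("=")
        ious[name] = float(value)  # float("nan") parses to NaN
    return ious


def mean_iou_ignoring_nan(ious):
    """Average the IoU values, skipping classes logged as NaN (assumption)."""
    valid = [v for v in ious.values() if not math.isnan(v)]
    return sum(valid) / len(valid)


# Final epoch (60/60) line copied from the log above.
line = ("+ per-class IoU: tree=0.862, ground=0.914, person=0.466, sky=0.825, "
        "road=0.811, mountain=0.593, building=0.558, background=nan")
ious = parse_per_class_iou(line)
miou = mean_iou_ignoring_nan(ious)
print(f"mIoU over {len(ious) - 1} classes: {miou:.3f}")
```

For epoch 60 this reproduces the logged `mIoU(7)=0.718`. Early epochs can differ in the last digit because the per-class values in the log are already rounded to three decimals.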
+
+ === FPS BENCHMARK (RTX 3080, batch=1, 640x360) ===
+ TwinLiteNet8 @ 640x360 batch=1: 137.1 FPS
+ Jetson Orin Nano estimate: ~34-46 FPS
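The log does not say how the Jetson Orin Nano estimate was derived. One reading that matches the numbers is a 3x to 4x slowdown applied to the measured RTX 3080 throughput; the sketch below only checks that arithmetic and converts FPS to per-frame latency. The slowdown factors are an assumption, not a measurement, and the function names are hypothetical.

```python
def fps_to_latency_ms(fps):
    """Per-frame latency in milliseconds at batch size 1."""
    return 1000.0 / fps


def scaled_fps_range(fps, slowdown_low, slowdown_high):
    """Return (min_fps, max_fps) after dividing by an assumed slowdown range."""
    return fps / slowdown_high, fps / slowdown_low


rtx3080_fps = 137.1  # measured value from the benchmark lines above
latency_ms = fps_to_latency_ms(rtx3080_fps)
low, high = scaled_fps_range(rtx3080_fps, slowdown_low=3.0, slowdown_high=4.0)
print(f"RTX 3080 latency: {latency_ms:.2f} ms/frame")
print(f"Assumed 3x-4x slowdown: {low:.1f}-{high:.1f} FPS")
```

A real figure for the Jetson would have to come from running the exported `twinlite8.onnx` on the device itself.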
twinlite8.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c499f31d388b377f6234db8b6417418846c73b003cc9b9fbc8369e854a823056
+ size 1787561
twinlite8_best.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:933bbf0b34134823c2cf9fb9eafd71a8362e26b14fff5eca551bdf78f76badab
+ size 1815544
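The two files above are Git LFS pointers: each records a spec version, a SHA-256 object ID, and the byte size of the real file. As a rough sketch, a downloaded weight file could be checked against its pointer as follows. The helpers `parse_lfs_pointer` and `matches_pointer` are hypothetical names, and the final check uses synthetic bytes because the actual weights are not available here.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """True if the bytes have the size and SHA-256 digest named by the pointer."""
    if pointer["algo"] != "sha256":
        raise ValueError(f"unsupported hash algorithm: {pointer['algo']}")
    return (len(data) == pointer["size"]
            and hashlib.sha256(data).hexdigest() == pointer["digest"])


# Pointer for twinlite8_best.pt, copied from the diff above.
pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:933bbf0b34134823c2cf9fb9eafd71a8362e26b14fff5eca551bdf78f76badab
size 1815544
""")
print(pointer["algo"], pointer["size"])

# Synthetic round trip: build a pointer for known bytes and verify it.
payload = b"not the real weights"
fake_pointer = {
    "version": pointer["version"],
    "algo": "sha256",
    "digest": hashlib.sha256(payload).hexdigest(),
    "size": len(payload),
}
print(matches_pointer(payload, fake_pointer))
```

To check the real file, read it in binary mode, for example `open("twinlite8_best.pt", "rb").read()`, and pass the bytes to `matches_pointer` together with `pointer`.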