WEN0256 committed about 1 month ago

Commit f5cc6c0 (verified)

1 Parent(s): 3db0649

Initial release: TwinLiteNet8 (0.44M params, 7-class orchard semantic seg, edge-deployment ready)

Files changed (43)

.gitattributes +27 -0
JETSON_DEPLOY.md +68 -0
README.md +160 -0
demo_twinlite_12s.mp4 +3 -0
export_onnx.py +69 -0
history.json +1082 -0
model/TwinLite.py +468 -0
model/TwinLite_8class.py +26 -0
model/__pycache__/TwinLite.cpython-311.pyc +0 -0
model/__pycache__/TwinLite.cpython-38.pyc +0 -0
model/__pycache__/TwinLite_8class.cpython-311.pyc +0 -0
predict.py +103 -0
predict_onnx.py +84 -0
samples/0_frame_3884.jpg +3 -0
samples/1_frame_2803.jpg +3 -0
samples/2_frame_2626.jpg +3 -0
samples/3_frame_4093.jpg +3 -0
samples/4_frame_3138.jpg +3 -0
samples/5_frame_3076.jpg +3 -0
samples_20/sample_00_frame_3884.jpg +3 -0
samples_20/sample_01_frame_2803.jpg +3 -0
samples_20/sample_02_frame_2626.jpg +3 -0
samples_20/sample_03_frame_4093.jpg +3 -0
samples_20/sample_04_frame_3138.jpg +3 -0
samples_20/sample_05_frame_3076.jpg +3 -0
samples_20/sample_06_frame_3032.jpg +3 -0
samples_20/sample_07_frame_2860.jpg +3 -0
samples_20/sample_08_frame_4083.jpg +3 -0
samples_20/sample_09_frame_2784.jpg +3 -0
samples_20/sample_10_frame_3960.jpg +3 -0
samples_20/sample_11_frame_4091.jpg +3 -0
samples_20/sample_12_frame_4402.jpg +3 -0
samples_20/sample_13_frame_3691.jpg +3 -0
samples_20/sample_14_frame_2753.jpg +3 -0
samples_20/sample_15_frame_3784.jpg +3 -0
samples_20/sample_16_frame_3439.jpg +3 -0
samples_20/sample_17_frame_2640.jpg +3 -0
samples_20/sample_18_frame_2636.jpg +3 -0
samples_20/sample_19_frame_2766.jpg +3 -0
train_8class.py +247 -0
training_log.txt +140 -0
twinlite8.onnx +3 -0
twinlite8_best.pt +3 -0

.gitattributes CHANGED Viewed

@@ -33,3 +33,30 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+demo_twinlite_12s.mp4 filter=lfs diff=lfs merge=lfs -text
+samples/0_frame_3884.jpg filter=lfs diff=lfs merge=lfs -text
+samples/1_frame_2803.jpg filter=lfs diff=lfs merge=lfs -text
+samples/2_frame_2626.jpg filter=lfs diff=lfs merge=lfs -text
+samples/3_frame_4093.jpg filter=lfs diff=lfs merge=lfs -text
+samples/4_frame_3138.jpg filter=lfs diff=lfs merge=lfs -text
+samples/5_frame_3076.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_00_frame_3884.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_01_frame_2803.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_02_frame_2626.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_03_frame_4093.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_04_frame_3138.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_05_frame_3076.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_06_frame_3032.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_07_frame_2860.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_08_frame_4083.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_09_frame_2784.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_10_frame_3960.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_11_frame_4091.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_12_frame_4402.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_13_frame_3691.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_14_frame_2753.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_15_frame_3784.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_16_frame_3439.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_17_frame_2640.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_18_frame_2636.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_19_frame_2766.jpg filter=lfs diff=lfs merge=lfs -text

JETSON_DEPLOY.md ADDED Viewed

	@@ -0,0 +1,68 @@

+# TwinLiteNet8 — Jetson Deployment Guide
+Pipeline:  PyTorch `.pt`  →  ONNX  →  TensorRT engine  →  fast inference on Jetson
+## On a host machine (Linux/Win/Mac with PyTorch)
+```bash
+pip install onnx onnxruntime
+python export_onnx.py --ckpt run_8class/twinlite8_best.pt --out twinlite8.onnx
+# → produces twinlite8.onnx (~2 MB, fixed shape 1x3x360x640)
+```
+For dynamic batch / spatial dims (slightly slower at runtime, more flexible):
+```bash
+python export_onnx.py --ckpt ... --out twinlite8_dynamic.onnx --dynamic
+```
+## On the Jetson (Orin Nano / NX / AGX)
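+JetPack ships with `trtexec` (usually under `/usr/src/tensorrt/bin/`, which may not be on `PATH`). Build the engine **on the device**, because a TensorRT engine is tied to the GPU and TensorRT version it was built with: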
+```bash
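+# FP16 (recommended: best speed/accuracy trade-off). TensorRT 8.4+ deprecates --workspace in favor of --memPoolSize=workspace:2048
+trtexec --onnx=twinlite8.onnx --saveEngine=twinlite8_fp16.engine \
+        --fp16 --workspace=2048
+# Or INT8 (faster, small accuracy drop). Needs a calibration cache passed with --calib=<file>; without one the engine is only useful for timing
+trtexec --onnx=twinlite8.onnx --saveEngine=twinlite8_int8.engine \
+        --int8 --workspace=2048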
+```
+Then in Python (Jetson):
+```python
+import onnxruntime as ort
+sess = ort.InferenceSession("twinlite8.onnx",
+                            providers=["TensorrtExecutionProvider", "CUDAExecutionProvider", "CPUExecutionProvider"])
+# OR load the pre-built .engine via TensorRT Python API
+```
+## Expected speeds (640×360, batch 1)
+| Device | PyTorch | ONNX-CUDA | TensorRT FP16 | TensorRT INT8 |
+|---|---|---|---|---|
+| RTX 3080 (host) | ~150 FPS | ~250 FPS | ~400 FPS | ~600 FPS |
+| RTX 5090 (host) | ~500 FPS | ~700 FPS | ~1200 FPS | — |
+| Jetson Orin Nano | ~10 FPS | ~25 FPS | **~40 FPS** ← target | ~60 FPS |
+| Jetson Orin NX | ~25 FPS | ~50 FPS | ~80 FPS | ~120 FPS |
+| Jetson Nano (old) | ~3 FPS | ~8 FPS | ~15 FPS | ~25 FPS |
+(rough estimates; exact numbers depend on power mode + JetPack version)
+## Validating numerical parity
+Always run after export to confirm ONNX matches PyTorch:
+```bash
+python -c "
+import onnxruntime, torch
+from model.TwinLite_8class import TwinLiteNet8
+m = TwinLiteNet8(num_classes=8).eval()
+m.load_state_dict(torch.load('run_8class/twinlite8_best.pt', map_location='cpu', weights_only=False)['model'])
+sess = onnxruntime.InferenceSession('twinlite8.onnx', providers=['CPUExecutionProvider'])
+x = torch.randn(1,3,360,640)
+torch_out = m(x).detach().numpy()
+onnx_out  = sess.run(None, {'input': x.numpy()})[0]
+print('argmax agreement:', (torch_out.argmax(1) == onnx_out.argmax(1)).mean())
+"
+```
+Should print **1.0**. Anything <0.999 means the export went wrong.
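The FPS table in this guide is labeled as rough estimates, so it is worth measuring on the actual device. The following is a minimal timing sketch, not code from this commit: `measure_fps` and its parameters are hypothetical names, and the no-op workload only stands in for a real inference call.

```python
import time

def measure_fps(run_once, warmup=10, iters=100):
    """Return the average number of `run_once` calls per second.

    `warmup` calls are executed first and not timed, so one-off costs
    (engine build, memory allocation, caches) do not skew the result.
    """
    for _ in range(warmup):
        run_once()
    start = time.perf_counter()
    for _ in range(iters):
        run_once()
    elapsed = time.perf_counter() - start
    # Guard against a zero interval for extremely cheap workloads.
    return iters / max(elapsed, 1e-12)

# Stand-in workload so the sketch runs without a model or GPU.
calls = []
fps = measure_fps(lambda: calls.append(None), warmup=5, iters=50)
print(f"{fps:.0f} calls/s over {len(calls)} total calls")
```

To time the ONNX model, `run_once` could be something like `lambda: sess.run(None, {"input": x})`, with `sess` and `x` built as in the snippets above. When timing PyTorch on a GPU, call `torch.cuda.synchronize()` inside `run_once`, since CUDA kernels launch asynchronously and the loop would otherwise measure only launch overhead.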

README.md ADDED Viewed

	@@ -0,0 +1,160 @@

+---
+license: apache-2.0
+language: [en]
+tags: [semantic-segmentation, twinlitenet, agriculture, orchard, real-time, edge-deployment, jetson]
+pipeline_tag: image-segmentation
+---
+# TwinLiteNet8 — Real-time orchard segmentation for edge devices
+A **0.44 M-parameter** semantic-segmentation model adapted from [TwinLiteNet](https://github.com/chequanghuy/TwinLiteNet) for **7-class apple orchard scenes**, designed to run **>30 FPS on Jetson-class hardware** for robotic navigation.
+Drop-in lightweight alternative to [WEN0256/Segformer85Mv1](https://huggingface.co/WEN0256/Segformer85Mv1) for low-compute deployments.
+## Why "7-class" but 8 logit channels?
+The model is trained to recognize **7 real classes** (`tree`, `ground`, `person`, `sky`, `road`, `mountain`, `building`). The 8th label `background` is **NOT** treated as a real class — pixels that fall outside any labeled object are simply masked out of the loss (`ignore_index=255`). The 8th logit channel exists only to keep the architecture identical to the original TwinLiteNet shape; it is never trained and is forced to `-inf` before `argmax` at inference, so the model never outputs `background`.
+This matches what you usually want from a robot's perception stack: "tell me what you DO recognize", not "tell me you don't know".
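The masking rule described above can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the code in `predict.py` (which is not shown here); `logits_to_mask` is a hypothetical helper name.

```python
import numpy as np

NUM_CHANNELS = 8     # logit channels produced by the head
UNUSED_CHANNEL = 7   # the untrained "background" channel

def logits_to_mask(logits: np.ndarray) -> np.ndarray:
    """Convert (B, 8, H, W) logits into a (B, H, W) uint8 mask with values 0..6."""
    if logits.ndim != 4 or logits.shape[1] != NUM_CHANNELS:
        raise ValueError(f"expected (B, {NUM_CHANNELS}, H, W), got {logits.shape}")
    masked = logits.astype(np.float32, copy=True)   # leave the caller's array untouched
    masked[:, UNUSED_CHANNEL] = -np.inf             # channel 7 can never win the argmax
    return masked.argmax(axis=1).astype(np.uint8)

# Even when channel 7 has the largest raw logit, it is never selected.
demo = np.zeros((1, NUM_CHANNELS, 2, 2), dtype=np.float32)
demo[0, UNUSED_CHANNEL] = 10.0   # untrained channel happens to be largest
demo[0, 3] = 1.0                 # "sky" is the best real class
demo_mask = logits_to_mask(demo)
print(demo_mask)
```

Because channel 7 receives no gradient during training, its logits are arbitrary, which is why the mask has to be applied before `argmax` rather than filtered out afterwards.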
+## Performance (no data leakage, temporal split val, fair apples-to-apples)
+| Metric | TwinLiteNet8 | Segformer-b5 (85 M) | Δ vs Segformer |
+|---|---|---|---|
+| Tree IoU | **0.872** | 0.742 | **+13 pp** ⭐ |
+| Ground IoU | **0.916** | 0.851 | **+6.5 pp** |
+| Person IoU | 0.441 | 0.72 | -28 pp |
+| Sky IoU | 0.835 | 0.77 | +6.5 pp |
+| Road IoU | 0.745 | 0.80 | -5.5 pp |
+| Mountain IoU | 0.592 | 0.44 | +15 pp |
+| Building IoU | 0.555 | 0.71 | -16 pp |
+| **mIoU (7 classes)** | **0.708** | 0.714 | -0.6 pp |
+| Model size | **1.8 MB** | 339 MB | **188× smaller** |
+| Params | **0.437 M** | 85 M | **194× fewer** |
+(Segformer numbers come from `WEN0256/Segformer85Mv1`. Both models tested on the same 155-frame temporal-split val from the original orchard recording, with the same "background pixels excluded" protocol so the IoUs are directly comparable.)
+**Headline:** TwinLiteNet8 comes within 0.6 pp of Segformer-b5 in overall mIoU (0.708 vs 0.714) and *beats it* on the two classes that matter most for orchard navigation (`tree`, `ground`), while being ~200× smaller and an estimated ~10× faster on edge devices. The trade-off is on rare classes (`person`, `building`), where the small model's limited capacity shows.
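The "background pixels excluded" protocol can be illustrated as follows. This is a sketch under assumptions, not the evaluation code from this commit: it scores a single image, whereas the real evaluation may accumulate intersections and unions over the whole validation set, and `per_class_iou` is a hypothetical name.

```python
import numpy as np

IGNORE_INDEX = 255
CLASS_NAMES = ["tree", "ground", "person", "sky", "road", "mountain", "building"]

def per_class_iou(pred, target, num_classes=7, ignore_index=IGNORE_INDEX):
    """IoU per class, skipping pixels whose ground truth is `ignore_index`.

    Classes absent from both prediction and ground truth get NaN so that
    they do not drag the mean down.
    """
    valid = target != ignore_index
    pred, target = pred[valid], target[valid]
    ious = np.full(num_classes, np.nan)
    for c in range(num_classes):
        p, t = pred == c, target == c
        union = np.logical_or(p, t).sum()
        if union > 0:
            ious[c] = np.logical_and(p, t).sum() / union
    return ious

pred = np.array([[0, 0, 1], [1, 2, 0]], dtype=np.uint8)
target = np.array([[0, 1, 1], [1, 255, 255]], dtype=np.uint8)   # two ignored pixels
ious = per_class_iou(pred, target)
miou = float(np.nanmean(ious))   # mean over classes that actually occur
for name, value in zip(CLASS_NAMES, ious):
    print(f"{name:9s} {value:.3f}")
print(f"mIoU      {miou:.3f}")
```

Predictions on ignored pixels (the `2` and the last `0` in `pred`) neither help nor hurt any class, which is what makes the two models' scores comparable under this protocol.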
+### FPS (640×360 input, batch 1)
+| Device | TwinLiteNet8 | Segformer-b5 | Speedup |
+|---|---|---|---|
+| RTX 3080 (PyTorch fp32) | **137 FPS** | ~50 | 2.7× |
+| RTX 5090 (PyTorch fp32) | ~500 FPS | ~150 | 3.3× |
+| **Jetson Orin Nano (TRT FP16, est)** | **~34–46 FPS** ⭐ | ~2–5 | **~10×** |
+| Jetson Orin NX (TRT FP16, est) | ~60–80 FPS | ~20 | ~3× |
+The target was **10–20 FPS** on Orin Nano; the estimated ~34–46 FPS is roughly double the top of that range.
+## Files
+| File | Purpose |
+|---|---|
+| `twinlite8_best.pt` | PyTorch checkpoint (1.8 MB), epoch 29, best tree IoU 0.872 |
+| `twinlite8.onnx` | ONNX export (1.8 MB), 100% argmax parity verified |
+| `predict.py` | PyTorch inference (matches Segformer's API) |
+| `predict_onnx.py` | ONNX-Runtime inference (CPU/CUDA/TensorRT auto-pick) |
+| `export_onnx.py` | Re-export ONNX from any checkpoint |
+| `train_8class.py` | Full training script (60 epochs, ~70 min on RTX 3080) |
+| `model/` | TwinLiteNet8 architecture (single-branch 8-output head, channel 7 = unused) |
+| `JETSON_DEPLOY.md` | Step-by-step Jetson deployment + FPS table |
+| `samples_20/` | 20 OOD inference samples (original ‖ prediction overlay) |
+| `demo_twinlite_12s.mp4` | 12-s demo video (360 frames @ 30 FPS, original ‖ overlay) |
+| `samples/` | 6 in-domain validation samples |
+| `training_log.txt` + `history.json` | Per-epoch metrics |
+## Quick Use (PyTorch)
+```python
+import sys, cv2, torch
+sys.path.insert(0, "<this_dir>")
+from predict import load_model, predict, overlay
+model = load_model("twinlite8_best.pt", device="cuda")
+img = cv2.imread("orchard.jpg")
+mask = predict(model, img)            # H×W uint8, values 0..6 (never 7)
+viz = overlay(img, mask)
+cv2.imwrite("out.jpg", viz)
+```
+## Quick Use (ONNX, no PyTorch)
+```python
+import onnxruntime as ort, cv2, numpy as np
+sess = ort.InferenceSession("twinlite8.onnx", providers=["CUDAExecutionProvider"])
+img = cv2.imread("orchard.jpg")
+inp = cv2.resize(img, (640, 360))
+rgb = cv2.cvtColor(inp, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
+x = rgb.transpose(2, 0, 1)[None]
+logits = sess.run(None, {"input": x})[0]
+logits[:, 7, :, :] = -1e9              # mask the unused background channel
+mask = logits.argmax(1)[0].astype(np.uint8)   # 360×640 uint8, values 0..6
+```
+## Classes (id → name)
+| ID | Class | Color (BGR) |
+|---|---|---|
+| 0 | **tree** (priority) | green |
+| 1 | ground | brown |
+| 2 | person | red |
+| 3 | sky | cyan |
+| 4 | road | gray |
+| 5 | mountain | purple |
+| 6 | building | yellow |
+| 7 | (unused — never output) | — |
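The class table maps naturally onto a palette lookup. The sketch below is not the `overlay` function from `predict.py`: the exact BGR values are guesses at the named colors, and `colorize` and `blend` are hypothetical helper names.

```python
import numpy as np

# BGR triplets, indexed by class ID. Values are assumed approximations
# of the color names in the table, not taken from the repository.
PALETTE_BGR = np.array([
    [0, 255, 0],      # 0 tree: green
    [42, 42, 165],    # 1 ground: brown
    [0, 0, 255],      # 2 person: red
    [255, 255, 0],    # 3 sky: cyan
    [128, 128, 128],  # 4 road: gray
    [128, 0, 128],    # 5 mountain: purple
    [0, 255, 255],    # 6 building: yellow
], dtype=np.uint8)

def colorize(mask: np.ndarray) -> np.ndarray:
    """Map an (H, W) mask with values 0..6 to an (H, W, 3) BGR image."""
    return PALETTE_BGR[mask]

def blend(image_bgr: np.ndarray, mask: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Alpha-blend the colorized mask over a BGR image of the same size."""
    color = colorize(mask).astype(np.float32)
    out = (1.0 - alpha) * image_bgr.astype(np.float32) + alpha * color
    return np.clip(out.round(), 0, 255).astype(np.uint8)

image = np.zeros((2, 2, 3), dtype=np.uint8)        # stand-in for a camera frame
mask = np.array([[0, 3], [6, 2]], dtype=np.uint8)  # tree, sky, building, person
viz = blend(image, mask, alpha=1.0)                # alpha=1.0 shows the pure palette
```

Indexing the palette with the mask avoids a per-class loop. A mask containing `7` would raise an `IndexError` here, which is a cheap way to catch a missing channel-7 mask upstream.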
+## Architecture
+Single-branch 8-output adaptation of [TwinLiteNet](https://github.com/chequanghuy/TwinLiteNet):
+- **Encoder**: ESPNet (`ESPNet_Encoder`, p = 2 q = 3)
+- **Decoder**: 3 × `UPx2` upsampling blocks
+- **Head**: 8-channel softmax (7 real classes; channel 7 untrained, masked at inference)
+- **Input**: 640×360 BGR → ImageNet-style normalize
+- **Output**: (B, 8, H, W) logits
+The original TwinLiteNet has two parallel decoder heads for two binary tasks (drivable area + lane lines). For multi-class semantic seg matching the Segformer setup, we kept one decoder branch and changed its final `UPx2` to output 8 channels. Final param count: **0.437 M**.
+## Training Recipe
+| Hyperparameter | Value |
+|---|---|
+| Optimizer | AdamW, weight_decay 1e-4 |
+| LR | 5e-4, cosine schedule |
+| Epochs | 60 |
+| Batch | 16 |
+| Resolution | 640×360 |
+| Loss | weighted cross-entropy with `ignore_index=255` |
+| Class weights | tree 1.5, ground 0.5, person 1.5, sky 1.0, road 1.0, mountain 1.0, building 1.0, **background 0.0** |
+| Background handling | mask pixels remapped 7 → 255 so they never contribute to loss |
+| Augmentation | hflip + HSV jitter |
+| Hardware | RTX 3080, ~70 minutes total |
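The loss and background-handling rows can be illustrated with a small NumPy re-implementation. This is a sketch of weighted cross-entropy with an ignore index, using the weighted-mean normalization that PyTorch's `CrossEntropyLoss` applies by default; it is not the code in `train_8class.py`, and `remap_background` and `weighted_ce` are hypothetical names.

```python
import numpy as np

IGNORE_INDEX = 255
# tree, ground, person, sky, road, mountain, building, background
CLASS_WEIGHTS = np.array([1.5, 0.5, 1.5, 1.0, 1.0, 1.0, 1.0, 0.0])

def remap_background(labels: np.ndarray, background_id: int = 7) -> np.ndarray:
    """Return a copy of `labels` with background pixels set to IGNORE_INDEX."""
    out = labels.copy()
    out[out == background_id] = IGNORE_INDEX
    return out

def weighted_ce(logits, labels, weights=CLASS_WEIGHTS, ignore_index=IGNORE_INDEX):
    """Weighted cross-entropy over flattened pixels.

    logits: (N, C) float array, labels: (N,) integer array.
    Returns sum(w_i * nll_i) / sum(w_i) over the non-ignored pixels.
    """
    valid = labels != ignore_index
    if not valid.any():
        return 0.0   # sketch-only convention for an all-ignored batch
    z = logits[valid].astype(np.float64)
    y = labels[valid].astype(np.int64)
    z = z - z.max(axis=1, keepdims=True)                      # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    nll = -log_probs[np.arange(len(y)), y]
    w = weights[y]
    return float((w * nll).sum() / w.sum())

labels = remap_background(np.array([0, 1, 7], dtype=np.uint8))
logits = np.zeros((3, 8))            # uniform prediction over 8 channels
loss = weighted_ce(logits, labels)   # background pixel contributes nothing
```

For `(B, C, H, W)` logits, the arrays would first be reshaped to `(B*H*W, C)` and `(B*H*W,)`. With background remapped to 255, the `0.0` weight on class 7 is redundant but harmless, since no remaining label ever indexes it.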
+## Dataset
+Same dataset as [WEN0256/Segformer85Mv1](https://huggingface.co/WEN0256/Segformer85Mv1) v2:
+- ~5300 frames from `oak_0415_oneRadar_1` (spring 2024 Korean apple orchard, single OAK-D camera)
+- 311 frames from "Orchard Navigation" (Sep autumn capture + Aug Windows-webcam capture)
+- Pseudo-mask labels generated by Segformer v1 to fill SAM-annotated gaps
+- Temporal split: frames `≤ 4500` → train, frames `> 4500` → val (155 frames). No neighbor leakage.
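The temporal split above can be sketched as follows. Consecutive video frames are nearly identical, so a random split would put near-duplicates of training frames into validation; splitting on the frame index avoids that. The filename pattern `frame_<N>` is an assumption based on the sample names in this commit, and `frame_index` and `temporal_split` are hypothetical helpers, not code from `train_8class.py`.

```python
import re

SPLIT_FRAME = 4500                      # frames <= 4500 go to train, the rest to val
_FRAME_RE = re.compile(r"frame_(\d+)")  # assumed naming convention

def frame_index(name: str) -> int:
    """Extract the integer frame number from a name like 'frame_3884.jpg'."""
    match = _FRAME_RE.search(name)
    if match is None:
        raise ValueError(f"no frame index in {name!r}")
    return int(match.group(1))

def temporal_split(names, split_frame=SPLIT_FRAME):
    """Split file names into (train, val) lists by frame index, keeping order."""
    train = [n for n in names if frame_index(n) <= split_frame]
    val = [n for n in names if frame_index(n) > split_frame]
    return train, val

names = ["frame_4499.jpg", "frame_4500.jpg", "frame_4501.jpg", "frame_0012.jpg"]
train_names, val_names = temporal_split(names)
print(len(train_names), "train /", len(val_names), "val")
```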
+## Limitations (same as parent Segformer model)
+- Trained on a single Korean apple orchard, spring + partial autumn
+- ❌ Different orchards (different tree species/layouts) — likely degraded
+- ❌ Winter (no leaves), night, rain — no training data
+- ❌ Aerial/drone perspectives — robot-eye view only
+- For a new deployment, plan to fine-tune on 100–300 in-domain frames (~13 min on a single GPU)
+## Deployment to Jetson
+See `JETSON_DEPLOY.md` for the full pipeline:
+1. Export to ONNX (this repo already has `twinlite8.onnx`)
+2. On Jetson: `trtexec --onnx=twinlite8.onnx --saveEngine=...engine --fp16`
+3. Run via `predict_onnx.py --provider TensorrtExecutionProvider` or load the `.engine` via TRT API
+## License
+Apache 2.0

demo_twinlite_12s.mp4 ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:b6315c2fd2bb7cbdd9b59c2882bea5d8d1f8abdc96b6dbe5245a744e4f1d3034
+size 68107137

export_onnx.py ADDED Viewed

	@@ -0,0 +1,69 @@

+"""Export TwinLiteNet8 to ONNX for cross-platform deployment.
+Usage:
+    python export_onnx.py --ckpt run_8class/twinlite8_best.pt --out twinlite8.onnx
+    python export_onnx.py --ckpt run_8class/twinlite8_best.pt --out twinlite8_dynamic.onnx --dynamic
+"""
+import argparse, sys, os
+sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
+from pathlib import Path
+import numpy as np, torch
+from model.TwinLite_8class import TwinLiteNet8
+def main():
+    ap = argparse.ArgumentParser()
+    ap.add_argument("--ckpt", required=True)
+    ap.add_argument("--out", required=True)
+    ap.add_argument("--height", type=int, default=360)
+    ap.add_argument("--width",  type=int, default=640)
+    ap.add_argument("--dynamic", action="store_true",
+                    help="Allow dynamic batch + spatial dims (slightly slower at runtime)")
+    ap.add_argument("--opset", type=int, default=17)
+    args = ap.parse_args()
+    print(f"loading ckpt: {args.ckpt}")
+    model = TwinLiteNet8(num_classes=8).eval()
+    ckpt = torch.load(args.ckpt, map_location="cpu", weights_only=False)
+    model.load_state_dict(ckpt["model"])
+    print(f"  epoch {ckpt['epoch']}  tree IoU {ckpt.get('tree_iou_old','?')}")
+    dummy = torch.randn(1, 3, args.height, args.width)
+    if args.dynamic:
+        dyn = {"input": {0: "batch", 2: "height", 3: "width"},
+               "output": {0: "batch", 2: "height", 3: "width"}}
+    else:
+        dyn = None
+    print(f"exporting to ONNX (opset {args.opset}) ...")
+    torch.onnx.export(
+        model, dummy, args.out,
+        input_names=["input"], output_names=["output"],
+        dynamic_axes=dyn,
+        opset_version=args.opset,
+        do_constant_folding=True,
+    )
+    sz = os.path.getsize(args.out) / 1e6
+    print(f"  saved: {args.out}  ({sz:.2f} MB)")
+    # Validate ONNX numerical parity vs PyTorch
+    try:
+        import onnxruntime as ort
+        sess = ort.InferenceSession(args.out, providers=["CPUExecutionProvider"])
+        with torch.no_grad():
+            torch_out = model(dummy).numpy()
+        onnx_out = sess.run(None, {"input": dummy.numpy()})[0]
+        diff = np.abs(torch_out - onnx_out)
+        argmax_match = (torch_out.argmax(1) == onnx_out.argmax(1)).mean()
+        print(f"  parity: max_abs_diff={diff.max():.6f}  mean={diff.mean():.6f}")
+        print(f"  argmax agreement: {100*argmax_match:.4f}%  (must be ~100% for safe deploy)")
+        assert argmax_match > 0.999, "argmax disagreement > 0.1% — investigate"
+        print("  PARITY OK")
+    except ImportError:
+        print("  (skip parity check — onnxruntime not installed; pip install onnxruntime)")
+if __name__ == "__main__":
+    main()

history.json ADDED Viewed

	@@ -0,0 +1,1082 @@

+[
+  {
+    "epoch": 1,
+    "loss": 1.3310781220886365,
+    "miou_7": 0.37451867570160186,
+    "tree_iou_old": 0.8175889982052515,
+    "ground_iou_old": 0.8771987595010953,
+    "tree_recall_new": 0.8720591858755958,
+    "per_class_iou": {
+      "tree": 0.8175889982052515,
+      "ground": 0.8771987595010953,
+      "person": 0.04389061394240307,
+      "sky": 0.7793865097876477,
+      "road": 0.001261266546266693,
+      "mountain": 0.09015012685328469,
+      "building": 0.012154455075264126,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 2,
+    "loss": 0.7962898902179908,
+    "miou_7": 0.4263919160239557,
+    "tree_iou_old": 0.8120707191596923,
+    "ground_iou_old": 0.8586761495775145,
+    "tree_recall_new": 0.864925755683379,
+    "per_class_iou": {
+      "tree": 0.8120707191596923,
+      "ground": 0.8586761495775145,
+      "person": 0.06558714109864322,
+      "sky": 0.7999256871820133,
+      "road": 0.019873436446123636,
+      "mountain": 0.39602489908191335,
+      "building": 0.032585379621789444,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 3,
+    "loss": 0.5679835767165656,
+    "miou_7": 0.5584064801846481,
+    "tree_iou_old": 0.8504976314678695,
+    "ground_iou_old": 0.8843725580277102,
+    "tree_recall_new": 0.9388755306705637,
+    "per_class_iou": {
+      "tree": 0.8504976314678695,
+      "ground": 0.8843725580277102,
+      "person": 0.29319772033339875,
+      "sky": 0.8191901750544341,
+      "road": 0.6448235890748388,
+      "mountain": 0.39345474488848675,
+      "building": 0.023308942445799074,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 4,
+    "loss": 0.4366068317393753,
+    "miou_7": 0.5958384978496121,
+    "tree_iou_old": 0.8534589788600526,
+    "ground_iou_old": 0.8922631824167708,
+    "tree_recall_new": 0.9672329512788488,
+    "per_class_iou": {
+      "tree": 0.8534589788600526,
+      "ground": 0.8922631824167708,
+      "person": 0.37156245263636517,
+      "sky": 0.8260741562564967,
+      "road": 0.5835668689224565,
+      "mountain": 0.47504755441513025,
+      "building": 0.16889629144001275,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 5,
+    "loss": 0.3549347817023828,
+    "miou_7": 0.6220458616228167,
+    "tree_iou_old": 0.8363749004074456,
+    "ground_iou_old": 0.8852004546947622,
+    "tree_recall_new": 0.9633982457780006,
+    "per_class_iou": {
+      "tree": 0.8363749004074456,
+      "ground": 0.8852004546947622,
+      "person": 0.3957433003440445,
+      "sky": 0.8244260552327038,
+      "road": 0.6178698597724389,
+      "mountain": 0.4848031930793919,
+      "building": 0.30990326782893096,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 6,
+    "loss": 0.2965410981010482,
+    "miou_7": 0.6195229709106116,
+    "tree_iou_old": 0.8307015597161879,
+    "ground_iou_old": 0.8889140076510883,
+    "tree_recall_new": 0.967347652588143,
+    "per_class_iou": {
+      "tree": 0.8307015597161879,
+      "ground": 0.8889140076510883,
+      "person": 0.34606393858495704,
+      "sky": 0.8203392783845447,
+      "road": 0.7104870880331203,
+      "mountain": 0.49466764013384806,
+      "building": 0.2454872838705348,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 7,
+    "loss": 0.2605974777790109,
+    "miou_7": 0.6608496131211873,
+    "tree_iou_old": 0.8600555277203275,
+    "ground_iou_old": 0.9041158609890793,
+    "tree_recall_new": 0.9808632901999768,
+    "per_class_iou": {
+      "tree": 0.8600555277203275,
+      "ground": 0.9041158609890793,
+      "person": 0.39638474697269704,
+      "sky": 0.8303220440766499,
+      "road": 0.6432374151811943,
+      "mountain": 0.5524623861151251,
+      "building": 0.439369310793238,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 8,
+    "loss": 0.22860906262201997,
+    "miou_7": 0.6444995447460438,
+    "tree_iou_old": 0.8542548970213149,
+    "ground_iou_old": 0.9016265876093191,
+    "tree_recall_new": 0.9905392660815484,
+    "per_class_iou": {
+      "tree": 0.8542548970213149,
+      "ground": 0.9016265876093191,
+      "person": 0.4122549495341615,
+      "sky": 0.8160172675420245,
+      "road": 0.6488806985459417,
+      "mountain": 0.5314521835332935,
+      "building": 0.3470102294362516,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 9,
+    "loss": 0.20931702563839574,
+    "miou_7": 0.43161408360048903,
+    "tree_iou_old": 0.7695186615335882,
+    "ground_iou_old": 0.8546917001099084,
+    "tree_recall_new": 0.8871473642771976,
+    "per_class_iou": {
+      "tree": 0.7695186615335882,
+      "ground": 0.8546917001099084,
+      "person": 0.359624677355642,
+      "sky": 0.43488592691950145,
+      "road": 0.11753312041637094,
+      "mountain": 0.3491913165177679,
+      "building": 0.13585318235064428,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 10,
+    "loss": 0.19840111109343442,
+    "miou_7": 0.6606398486088663,
+    "tree_iou_old": 0.8338563413583029,
+    "ground_iou_old": 0.8831831425045409,
+    "tree_recall_new": 0.9901880818259315,
+    "per_class_iou": {
+      "tree": 0.8338563413583029,
+      "ground": 0.8831831425045409,
+      "person": 0.40429846068091946,
+      "sky": 0.8237072702221991,
+      "road": 0.7148262084503572,
+      "mountain": 0.5229702347017845,
+      "building": 0.44163728234396016,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 11,
+    "loss": 0.17923572080488429,
+    "miou_7": 0.6830302547462325,
+    "tree_iou_old": 0.8548939555986949,
+    "ground_iou_old": 0.9095141301915333,
+    "tree_recall_new": 0.9958006576208399,
+    "per_class_iou": {
+      "tree": 0.8548939555986949,
+      "ground": 0.9095141301915333,
+      "person": 0.3875322179022323,
+      "sky": 0.8246677405547581,
+      "road": 0.7644137551908834,
+      "mountain": 0.5375454528224748,
+      "building": 0.5026445309630501,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 12,
+    "loss": 0.1724150039452262,
+    "miou_7": 0.668918280520857,
+    "tree_iou_old": 0.8423760996137923,
+    "ground_iou_old": 0.8821698496521413,
+    "tree_recall_new": 0.9945828412505558,
+    "per_class_iou": {
+      "tree": 0.8423760996137923,
+      "ground": 0.8821698496521413,
+      "person": 0.39795968294353656,
+      "sky": 0.8225902005034851,
+      "road": 0.7203922138904981,
+      "mountain": 0.5495641818156543,
+      "building": 0.4673757352268918,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 13,
+    "loss": 0.16389323436706996,
+    "miou_7": 0.6825473779097335,
+    "tree_iou_old": 0.8337379023047351,
+    "ground_iou_old": 0.8829405113369486,
+    "tree_recall_new": 0.9935130037299167,
+    "per_class_iou": {
+      "tree": 0.8337379023047351,
+      "ground": 0.8829405113369486,
+      "person": 0.42147440680495446,
+      "sky": 0.8271820064942336,
+      "road": 0.7398392373685672,
+      "mountain": 0.5597518680318201,
+      "building": 0.5129057130268762,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 14,
+    "loss": 0.15743425162918756,
+    "miou_7": 0.6744691063520801,
+    "tree_iou_old": 0.8481754467002482,
+    "ground_iou_old": 0.8952094538543592,
+    "tree_recall_new": 0.9989067973978379,
+    "per_class_iou": {
+      "tree": 0.8481754467002482,
+      "ground": 0.8952094538543592,
+      "person": 0.4356588273855861,
+      "sky": 0.8093282743850084,
+      "road": 0.7274880284020421,
+      "mountain": 0.547309004129935,
+      "building": 0.45811470960738193,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 15,
+    "loss": 0.15353474125834154,
+    "miou_7": 0.6638122326903052,
+    "tree_iou_old": 0.8468084387339117,
+    "ground_iou_old": 0.8968202477703683,
+    "tree_recall_new": 0.9984826857665587,
+    "per_class_iou": {
+      "tree": 0.8468084387339117,
+      "ground": 0.8968202477703683,
+      "person": 0.3803193694656955,
+      "sky": 0.7937484883274253,
+      "road": 0.6863063433123712,
+      "mountain": 0.5594115155198771,
+      "building": 0.48327122570248765,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 16,
+    "loss": 0.14856905372174253,
+    "miou_7": 0.6851351311052538,
+    "tree_iou_old": 0.8512448354850919,
+    "ground_iou_old": 0.8934746447462905,
+    "tree_recall_new": 0.9969179333372983,
+    "per_class_iou": {
+      "tree": 0.8512448354850919,
+      "ground": 0.8934746447462905,
+      "person": 0.4963661975950126,
+      "sky": 0.8032382135701676,
+      "road": 0.6923475060384158,
+      "mountain": 0.5673653099412674,
+      "building": 0.49190921036053065,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 17,
+    "loss": 0.14565541855226163,
+    "miou_7": 0.6976304308763035,
+    "tree_iou_old": 0.8559357861690475,
+    "ground_iou_old": 0.9008399324780962,
+    "tree_recall_new": 0.9970113936633899,
+    "per_class_iou": {
+      "tree": 0.8559357861690475,
+      "ground": 0.9008399324780962,
+      "person": 0.45982943007987004,
+      "sky": 0.8260117146625021,
+      "road": 0.7203236325464007,
+      "mountain": 0.570204347198094,
+      "building": 0.5502681730001141,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 18,
+    "loss": 0.13922102244630938,
+    "miou_7": 0.6715259962090111,
+    "tree_iou_old": 0.8590085776748139,
+    "ground_iou_old": 0.9080001376838421,
+    "tree_recall_new": 0.9984515323245282,
+    "per_class_iou": {
+      "tree": 0.8590085776748139,
+      "ground": 0.9080001376838421,
+      "person": 0.4165244067867354,
+      "sky": 0.8300622522021492,
+      "road": 0.7391183653159966,
+      "mountain": 0.5794701815195625,
+      "building": 0.3684980522799781,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 19,
+    "loss": 0.1353529858187147,
+    "miou_7": 0.6868000456457597,
+    "tree_iou_old": 0.8615431837256211,
+    "ground_iou_old": 0.9031047849108651,
+    "tree_recall_new": 0.998964148052485,
+    "per_class_iou": {
+      "tree": 0.8615431837256211,
+      "ground": 0.9031047849108651,
+      "person": 0.4420047326739608,
+      "sky": 0.8209615163638296,
+      "road": 0.6942001658572324,
+      "mountain": 0.5805041823745026,
+      "building": 0.5052817536143058,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 20,
+    "loss": 0.13253521772860782,
+    "miou_7": 0.7021224084576373,
+    "tree_iou_old": 0.8631602330267258,
+    "ground_iou_old": 0.9091200309612043,
+    "tree_recall_new": 0.994837025016214,
+    "per_class_iou": {
+      "tree": 0.8631602330267258,
+      "ground": 0.9091200309612043,
+      "person": 0.41137926177027906,
+      "sky": 0.8285066518604141,
+      "road": 0.7466690639312616,
+      "mountain": 0.536259817918243,
+      "building": 0.6197617997353331,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 21,
+    "loss": 0.1302845889915469,
+    "miou_7": 0.675810914326681,
+    "tree_iou_old": 0.8592563773281188,
+    "ground_iou_old": 0.8991446138165377,
+    "tree_recall_new": 0.9961822872857139,
+    "per_class_iou": {
+      "tree": 0.8592563773281188,
+      "ground": 0.8991446138165377,
+      "person": 0.3903491003433304,
+      "sky": 0.824675927094987,
+      "road": 0.6894249147918818,
+      "mountain": 0.5948788564749821,
+      "building": 0.47294661043692804,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 22,
+    "loss": 0.12750109743949606,
+    "miou_7": 0.7125322874970034,
+    "tree_iou_old": 0.8645704875357654,
+    "ground_iou_old": 0.9072065152568384,
+    "tree_recall_new": 0.9984989705203474,
+    "per_class_iou": {
+      "tree": 0.8645704875357654,
+      "ground": 0.9072065152568384,
+      "person": 0.48954839497752917,
+      "sky": 0.8196931627498157,
+      "road": 0.7243334862756741,
+      "mountain": 0.5763691965526828,
+      "building": 0.6060047691307175,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 23,
+    "loss": 0.1288085954149098,
+    "miou_7": 0.7107471261265749,
+    "tree_iou_old": 0.8642860495130823,
+    "ground_iou_old": 0.909492093404392,
+    "tree_recall_new": 0.9991893024744329,
+    "per_class_iou": {
+      "tree": 0.8642860495130823,
+      "ground": 0.909492093404392,
+      "person": 0.45766399589516893,
+      "sky": 0.8274124364025386,
+      "road": 0.7277521111617834,
+      "mountain": 0.5772087040752052,
+      "building": 0.6114144924338537,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 24,
+    "loss": 0.12303933197539572,
+    "miou_7": 0.6960249423927186,
+    "tree_iou_old": 0.862591625892603,
+    "ground_iou_old": 0.9117833962636658,
+    "tree_recall_new": 0.9990137103466246,
+    "per_class_iou": {
+      "tree": 0.862591625892603,
+      "ground": 0.9117833962636658,
+      "person": 0.43123219925818773,
+      "sky": 0.8202446601586809,
+      "road": 0.7568995059507706,
+      "mountain": 0.583646623917872,
+      "building": 0.5057765853072502,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 25,
+    "loss": 0.12275176438505699,
+    "miou_7": 0.700392134832647,
+    "tree_iou_old": 0.8569619510987356,
+    "ground_iou_old": 0.9115664983076798,
+    "tree_recall_new": 0.9996254506628602,
+    "per_class_iou": {
+      "tree": 0.8569619510987356,
+      "ground": 0.9115664983076798,
+      "person": 0.44410947241906,
+      "sky": 0.8242495156785404,
+      "road": 0.801578288829682,
+      "mountain": 0.5707688209809969,
+      "building": 0.4935103965138338,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 26,
+    "loss": 0.12000550889024986,
+    "miou_7": 0.6951813403928189,
+    "tree_iou_old": 0.8663251304061781,
+    "ground_iou_old": 0.9131697744358082,
+    "tree_recall_new": 0.9989188339549862,
+    "per_class_iou": {
+      "tree": 0.8663251304061781,
+      "ground": 0.9131697744358082,
+      "person": 0.42213996324450753,
+      "sky": 0.837097544021675,
+      "road": 0.7159604646377167,
+      "mountain": 0.5628351351092097,
+      "building": 0.5487413708946377,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 27,
+    "loss": 0.11845032005540786,
+    "miou_7": 0.6951365926228066,
+    "tree_iou_old": 0.8623787392359329,
+    "ground_iou_old": 0.9081680280222872,
+    "tree_recall_new": 0.9991772659172847,
+    "per_class_iou": {
+      "tree": 0.8623787392359329,
+      "ground": 0.9081680280222872,
+      "person": 0.40674057490499754,
+      "sky": 0.8283362865383408,
+      "road": 0.732462168304907,
+      "mountain": 0.577814835386126,
+      "building": 0.550055515967055,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 28,
+    "loss": 0.11608812912118749,
+    "miou_7": 0.7135802554765919,
+    "tree_iou_old": 0.8601389203474185,
+    "ground_iou_old": 0.9097109594291864,
+    "tree_recall_new": 0.9980252965949288,
+    "per_class_iou": {
+      "tree": 0.8601389203474185,
+      "ground": 0.9097109594291864,
+      "person": 0.4577523593452128,
+      "sky": 0.8263436060803816,
+      "road": 0.7681716123910687,
+      "mountain": 0.5772636562892896,
+      "building": 0.5956806744535849,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 29,
+    "loss": 0.11513655359619174,
+    "miou_7": 0.7080678106432002,
+    "tree_iou_old": 0.8720526616871654,
+    "ground_iou_old": 0.9159501362721273,
+    "tree_recall_new": 0.9990172505104916,
+    "per_class_iou": {
+      "tree": 0.8720526616871654,
+      "ground": 0.9159501362721273,
+      "person": 0.4413353907706845,
+      "sky": 0.8354333730973998,
+      "road": 0.7446705296683885,
+      "mountain": 0.5920758624448199,
+      "building": 0.5549567205618161,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 30,
+    "loss": 0.11306144819319074,
+    "miou_7": 0.6983805157012603,
+    "tree_iou_old": 0.8645833022730479,
+    "ground_iou_old": 0.9095457655833904,
+    "tree_recall_new": 0.9991652293601366,
+    "per_class_iou": {
+      "tree": 0.8645833022730479,
+      "ground": 0.9095457655833904,
+      "person": 0.4674763392232738,
+      "sky": 0.8335070597837729,
+      "road": 0.7545198948370945,
+      "mountain": 0.5924738174364902,
+      "building": 0.4665574307717516,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 31,
+    "loss": 0.11102715956150962,
+    "miou_7": 0.6944315353561886,
+    "tree_iou_old": 0.8547747382286776,
+    "ground_iou_old": 0.9053165982665526,
+    "tree_recall_new": 0.9990321191987335,
+    "per_class_iou": {
+      "tree": 0.8547747382286776,
+      "ground": 0.9053165982665526,
+      "person": 0.4329003766160034,
+      "sky": 0.8115273504852353,
+      "road": 0.7724850398518984,
+      "mountain": 0.5864169089017646,
+      "building": 0.4975997351431882,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 32,
+    "loss": 0.11149858427711945,
+    "miou_7": 0.7188664866966874,
+    "tree_iou_old": 0.8651834760565779,
+    "ground_iou_old": 0.9162814340108099,
+    "tree_recall_new": 0.9983141739664846,
+    "per_class_iou": {
+      "tree": 0.8651834760565779,
+      "ground": 0.9162814340108099,
+      "person": 0.4658711751302083,
+      "sky": 0.8333352426932886,
+      "road": 0.8032374478908707,
+      "mountain": 0.5783057614808421,
+      "building": 0.5698508696142144,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 33,
+    "loss": 0.10876328530898892,
+    "miou_7": 0.7107569762939843,
+    "tree_iou_old": 0.8693782454023541,
+    "ground_iou_old": 0.9157389094490351,
+    "tree_recall_new": 0.9991135429676768,
+    "per_class_iou": {
+      "tree": 0.8693782454023541,
+      "ground": 0.9157389094490351,
+      "person": 0.4937770576865971,
+      "sky": 0.8377474204852441,
+      "road": 0.761413185992059,
+      "mountain": 0.5953795868906355,
+      "building": 0.5018644281519644,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 34,
+    "loss": 0.10711049049262428,
+    "miou_7": 0.7019345534232959,
+    "tree_iou_old": 0.8648191490436973,
+    "ground_iou_old": 0.9101472043077795,
+    "tree_recall_new": 0.9986235842884695,
+    "per_class_iou": {
+      "tree": 0.8648191490436973,
+      "ground": 0.9101472043077795,
+      "person": 0.4554277216874071,
+      "sky": 0.8266031525406107,
+      "road": 0.7529576282399317,
+      "mountain": 0.5621906580777194,
+      "building": 0.5413963600659256,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 35,
+    "loss": 0.10637939819667347,
+    "miou_7": 0.6957735870270542,
+    "tree_iou_old": 0.8611342036897985,
+    "ground_iou_old": 0.9078926943174735,
+    "tree_recall_new": 0.9995305742712218,
+    "per_class_iou": {
+      "tree": 0.8611342036897985,
+      "ground": 0.9078926943174735,
+      "person": 0.4761426204094724,
+      "sky": 0.8155434964100852,
+      "road": 0.7537532650546288,
+      "mountain": 0.5780520205569928,
+      "building": 0.4778968087509277,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 36,
+    "loss": 0.10524913378038014,
+    "miou_7": 0.7036256072219315,
+    "tree_iou_old": 0.8601403942334339,
+    "ground_iou_old": 0.9103111483757813,
+    "tree_recall_new": 0.9991737257534177,
+    "per_class_iou": {
+      "tree": 0.8601403942334339,
+      "ground": 0.9103111483757813,
+      "person": 0.47734196127129497,
+      "sky": 0.8288868144372747,
+      "road": 0.7870753306011031,
+      "mountain": 0.591627622543358,
+      "building": 0.4699959790912746,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 37,
+    "loss": 0.10495941000075634,
+    "miou_7": 0.703440393493414,
+    "tree_iou_old": 0.8602822036239717,
+    "ground_iou_old": 0.907919303549861,
+    "tree_recall_new": 0.9988990090373303,
+    "per_class_iou": {
+      "tree": 0.8602822036239717,
+      "ground": 0.907919303549861,
+      "person": 0.47949983928674933,
+      "sky": 0.826995678624557,
+      "road": 0.7687920564109944,
+      "mountain": 0.5929370462471802,
+      "building": 0.4876566267105844,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 38,
+    "loss": 0.10339566608590464,
+    "miou_7": 0.7042723060688616,
+    "tree_iou_old": 0.8632348498053344,
+    "ground_iou_old": 0.9109321716698625,
+    "tree_recall_new": 0.9990023818222498,
+    "per_class_iou": {
+      "tree": 0.8632348498053344,
+      "ground": 0.9109321716698625,
+      "person": 0.4414322276666005,
+      "sky": 0.8285902039590752,
+      "road": 0.7778623497742087,
+      "mountain": 0.5870131637402792,
+      "building": 0.5208411758666712,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 39,
+    "loss": 0.10248555226628382,
+    "miou_7": 0.7127856236134917,
+    "tree_iou_old": 0.8653419624639606,
+    "ground_iou_old": 0.911729733752646,
+    "tree_recall_new": 0.9990243308382258,
+    "per_class_iou": {
+      "tree": 0.8653419624639606,
+      "ground": 0.911729733752646,
+      "person": 0.4490896965989601,
+      "sky": 0.8417416017949311,
+      "road": 0.7603018524388181,
+      "mountain": 0.5967423810275937,
+      "building": 0.5645521372175334,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 40,
+    "loss": 0.10101223195141013,
+    "miou_7": 0.7128466877853661,
+    "tree_iou_old": 0.8584637085289819,
+    "ground_iou_old": 0.9087631816776888,
+    "tree_recall_new": 0.999443486240091,
+    "per_class_iou": {
+      "tree": 0.8584637085289819,
+      "ground": 0.9087631816776888,
+      "person": 0.47705684717008523,
+      "sky": 0.8197330510672493,
+      "road": 0.800078154842082,
+      "mountain": 0.5957071073649332,
+      "building": 0.5301247638465424,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 41,
+    "loss": 0.09991852833160207,
+    "miou_7": 0.7035958499558952,
+    "tree_iou_old": 0.8617602975517414,
+    "ground_iou_old": 0.9110861746358995,
+    "tree_recall_new": 0.9996962539402023,
+    "per_class_iou": {
+      "tree": 0.8617602975517414,
+      "ground": 0.9110861746358995,
+      "person": 0.4549658643035961,
+      "sky": 0.8145149321096569,
+      "road": 0.7879282571846876,
+      "mountain": 0.5812266630987688,
+      "building": 0.5136887608069164,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 42,
+    "loss": 0.09915690325047613,
+    "miou_7": 0.7134583358474373,
+    "tree_iou_old": 0.8660831554987213,
+    "ground_iou_old": 0.9159726755955384,
+    "tree_recall_new": 0.9993656026350147,
+    "per_class_iou": {
+      "tree": 0.8660831554987213,
+      "ground": 0.9159726755955384,
+      "person": 0.4530905947265377,
+      "sky": 0.8362661921781568,
+      "road": 0.8043563655137067,
+      "mountain": 0.594773170031178,
+      "building": 0.5236661973882227,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 43,
+    "loss": 0.09795961531201416,
+    "miou_7": 0.7165406079983356,
+    "tree_iou_old": 0.8598378459161425,
+    "ground_iou_old": 0.9089394699785583,
+    "tree_recall_new": 0.9995454429594637,
+    "per_class_iou": {
+      "tree": 0.8598378459161425,
+      "ground": 0.9089394699785583,
+      "person": 0.46012050287958917,
+      "sky": 0.8221400259687008,
+      "road": 0.814444981073297,
+      "mountain": 0.591331730954615,
+      "building": 0.5589696992174461,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 44,
+    "loss": 0.09747363334177526,
+    "miou_7": 0.703682647474599,
+    "tree_iou_old": 0.8598806708971567,
+    "ground_iou_old": 0.9099442298065352,
+    "tree_recall_new": 0.9989683962491256,
+    "per_class_iou": {
+      "tree": 0.8598806708971567,
+      "ground": 0.9099442298065352,
+      "person": 0.44416668701022877,
+      "sky": 0.8300241915048083,
+      "road": 0.8091502958402874,
+      "mountain": 0.5780742065040855,
+      "building": 0.4945382507590906,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 45,
+    "loss": 0.09669119091240191,
+    "miou_7": 0.7204684750938577,
+    "tree_iou_old": 0.8609388333802934,
+    "ground_iou_old": 0.9099041175374463,
+    "tree_recall_new": 0.9992608137845485,
+    "per_class_iou": {
+      "tree": 0.8609388333802934,
+      "ground": 0.9099041175374463,
+      "person": 0.4618446874123914,
+      "sky": 0.8297634971589024,
+      "road": 0.8090713840099226,
+      "mountain": 0.5977877820863925,
+      "building": 0.5739690240716552,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 46,
+    "loss": 0.09577625356286852,
+    "miou_7": 0.7147064988681378,
+    "tree_iou_old": 0.8588356972291628,
+    "ground_iou_old": 0.9042846066741225,
+    "tree_recall_new": 0.9992289523097445,
+    "per_class_iou": {
+      "tree": 0.8588356972291628,
+      "ground": 0.9042846066741225,
+      "person": 0.45702919955948496,
+      "sky": 0.8253803474122704,
+      "road": 0.7867103882600653,
+      "mountain": 0.5997810694197818,
+      "building": 0.5709241835220764,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 47,
+    "loss": 0.09526236196897946,
+    "miou_7": 0.7171311747428346,
+    "tree_iou_old": 0.8632839060256274,
+    "ground_iou_old": 0.9123400265863985,
+    "tree_recall_new": 0.9993875516509908,
+    "per_class_iou": {
+      "tree": 0.8632839060256274,
+      "ground": 0.9123400265863985,
+      "person": 0.46598174933267117,
+      "sky": 0.817570729856514,
+      "road": 0.7994172315356988,
+      "mountain": 0.6012255592276109,
+      "building": 0.5600990206353221,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 48,
+    "loss": 0.09422024230191435,
+    "miou_7": 0.7167525270340019,
+    "tree_iou_old": 0.8651489837327133,
+    "ground_iou_old": 0.9135081509914867,
+    "tree_recall_new": 0.9992140836215027,
+    "per_class_iou": {
+      "tree": 0.8651489837327133,
+      "ground": 0.9135081509914867,
+      "person": 0.46782555019067695,
+      "sky": 0.8286387479180981,
+      "road": 0.7996623413743239,
+      "mountain": 0.5915479115479115,
+      "building": 0.5509360034828037,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 49,
+    "loss": 0.09376394734485757,
+    "miou_7": 0.7145702801110829,
+    "tree_iou_old": 0.8627369567040538,
+    "ground_iou_old": 0.9114642448318345,
+    "tree_recall_new": 0.9994342818140366,
+    "per_class_iou": {
+      "tree": 0.8627369567040538,
+      "ground": 0.9114642448318345,
+      "person": 0.47928346990327153,
+      "sky": 0.8247682032686208,
+      "road": 0.7942546414600767,
+      "mountain": 0.5941505966745633,
+      "building": 0.5353338479351601,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 50,
+    "loss": 0.0934471827098701,
+    "miou_7": 0.7176813282089505,
+    "tree_iou_old": 0.8620466644958731,
+    "ground_iou_old": 0.9123173312442661,
+    "tree_recall_new": 0.9994987127964179,
+    "per_class_iou": {
+      "tree": 0.8620466644958731,
+      "ground": 0.9123173312442661,
+      "person": 0.46904731474144323,
+      "sky": 0.8275378210481714,
+      "road": 0.8118060080254842,
+      "mountain": 0.5901987168959689,
+      "building": 0.5508154410114461,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 51,
+    "loss": 0.09312802140226811,
+    "miou_7": 0.7168954119728846,
+    "tree_iou_old": 0.8613075378830813,
+    "ground_iou_old": 0.91348727971426,
+    "tree_recall_new": 0.9994880923048166,
+    "per_class_iou": {
+      "tree": 0.8613075378830813,
+      "ground": 0.91348727971426,
+      "person": 0.4603232176681578,
+      "sky": 0.8209021175471539,
+      "road": 0.8180727322238185,
+      "mountain": 0.5928931907697746,
+      "building": 0.551281808003947,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 52,
+    "loss": 0.09262444826165253,
+    "miou_7": 0.7145893070504529,
+    "tree_iou_old": 0.8601787716896,
+    "ground_iou_old": 0.9100240065925295,
+    "tree_recall_new": 0.9993224126358361,
+    "per_class_iou": {
+      "tree": 0.8601787716896,
+      "ground": 0.9100240065925295,
+      "person": 0.46003656751568806,
+      "sky": 0.8240606386485418,
+      "road": 0.8035017391518859,
+      "mountain": 0.5972179702553838,
+      "building": 0.5471054554995408,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 53,
+    "loss": 0.09180469486501909,
+    "miou_7": 0.7169079280075012,
+    "tree_iou_old": 0.8595100269113597,
+    "ground_iou_old": 0.9101150646845736,
+    "tree_recall_new": 0.9994831360754026,
+    "per_class_iou": {
+      "tree": 0.8595100269113597,
+      "ground": 0.9101150646845736,
+      "person": 0.4661931584573637,
+      "sky": 0.820119386782323,
+      "road": 0.8062676835489306,
+      "mountain": 0.5948037931562971,
+      "building": 0.5613463825116619,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 54,
+    "loss": 0.09159080942005705,
+    "miou_7": 0.714251101780106,
+    "tree_iou_old": 0.8610315614250247,
+    "ground_iou_old": 0.9110287660753348,
+    "tree_recall_new": 0.9995758883687208,
+    "per_class_iou": {
+      "tree": 0.8610315614250247,
+      "ground": 0.9110287660753348,
+      "person": 0.4709480122324159,
+      "sky": 0.82152237535896,
+      "road": 0.8068207561534491,
+      "mountain": 0.5878109061264083,
+      "building": 0.5405953350891487,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 55,
+    "loss": 0.09123367462252592,
+    "miou_7": 0.7189968760256816,
+    "tree_iou_old": 0.8627378628159317,
+    "ground_iou_old": 0.9131321779238109,
+    "tree_recall_new": 0.99940808460142,
+    "per_class_iou": {
+      "tree": 0.8627378628159317,
+      "ground": 0.9131321779238109,
+      "person": 0.4761212765181701,
+      "sky": 0.8255133689898098,
+      "road": 0.8077639678601056,
+      "mountain": 0.5934658957240918,
+      "building": 0.5542435823478512,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 56,
+    "loss": 0.09107463639721143,
+    "miou_7": 0.7174357176043799,
+    "tree_iou_old": 0.8624821961413048,
+    "ground_iou_old": 0.9131725873439213,
+    "tree_recall_new": 0.9992395728013458,
+    "per_class_iou": {
+      "tree": 0.8624821961413048,
+      "ground": 0.9131725873439213,
+      "person": 0.47541807293120514,
+      "sky": 0.8229103490897475,
+      "road": 0.8071512969458519,
+      "mountain": 0.5959807410508687,
+      "building": 0.5449347797277598,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 57,
+    "loss": 0.09129836492218579,
+    "miou_7": 0.7147705651556077,
+    "tree_iou_old": 0.8604410437060084,
+    "ground_iou_old": 0.9102523482533115,
+    "tree_recall_new": 0.9994116247652871,
+    "per_class_iou": {
+      "tree": 0.8604410437060084,
+      "ground": 0.9102523482533115,
+      "person": 0.4652373172210405,
+      "sky": 0.8247788796935914,
+      "road": 0.8024377981637844,
+      "mountain": 0.5926813760428257,
+      "building": 0.5475651930086924,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 58,
+    "loss": 0.09109030034709886,
+    "miou_7": 0.7184435180464683,
+    "tree_iou_old": 0.8625080590407855,
+    "ground_iou_old": 0.9131226114597868,
+    "tree_recall_new": 0.9994491505022784,
+    "per_class_iou": {
+      "tree": 0.8625080590407855,
+      "ground": 0.9131226114597868,
+      "person": 0.470192696821592,
+      "sky": 0.8240829335586235,
+      "road": 0.8125652267268542,
+      "mountain": 0.5941365218121131,
+      "building": 0.5524965769055226,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 59,
+    "loss": 0.09070181351734047,
+    "miou_7": 0.7161546689463674,
+    "tree_iou_old": 0.8619843878363893,
+    "ground_iou_old": 0.9120051419911369,
+    "tree_recall_new": 0.99935427411064,
+    "per_class_iou": {
+      "tree": 0.8619843878363893,
+      "ground": 0.9120051419911369,
+      "person": 0.46737921660694426,
+      "sky": 0.8240449181819649,
+      "road": 0.8027428616241995,
+      "mountain": 0.595520677756605,
+      "building": 0.5494054786273329,
+      "background": NaN
+    }
+  },
+  {
+    "epoch": 60,
+    "loss": 0.0903256651112411,
+    "miou_7": 0.7184340202591383,
+    "tree_iou_old": 0.8624814919971956,
+    "ground_iou_old": 0.9136910071083241,
+    "tree_recall_new": 0.9993712668972021,
+    "per_class_iou": {
+      "tree": 0.8624814919971956,
+      "ground": 0.9136910071083241,
+      "person": 0.46576649746192894,
+      "sky": 0.8252238363312573,
+      "road": 0.811227410462362,
+      "mountain": 0.5928357462160864,
+      "building": 0.5578121522368128,
+      "background": NaN
+    }
+  }
+]
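
The `miou_7` field in each record appears to be the plain mean of the seven non-background entries in `per_class_iou`; the `background` entry is `NaN` and is skipped. The sketch below checks that relationship on the epoch-60 record shown above. It is a minimal illustration under stated assumptions, not code from this repository: the helper name `mean_iou_ignoring_nan` is hypothetical, and the parsing relies on Python's `json` module accepting the non-standard `NaN` literal by default.

```python
import json
import math

# Epoch-60 record copied from history.json. NaN is not strict JSON,
# but Python's json module parses it by default.
record_text = """
{
  "epoch": 60,
  "miou_7": 0.7184340202591383,
  "per_class_iou": {
    "tree": 0.8624814919971956,
    "ground": 0.9136910071083241,
    "person": 0.46576649746192894,
    "sky": 0.8252238363312573,
    "road": 0.811227410462362,
    "mountain": 0.5928357462160864,
    "building": 0.5578121522368128,
    "background": NaN
  }
}
"""


def mean_iou_ignoring_nan(per_class_iou):
    """Average the per-class IoU values, skipping NaN entries (hypothetical helper)."""
    values = [v for v in per_class_iou.values() if not math.isnan(v)]
    return sum(values) / len(values)


record = json.loads(record_text)
recomputed = mean_iou_ignoring_nan(record["per_class_iou"])
print(f"epoch {record['epoch']}: logged={record['miou_7']:.6f} recomputed={recomputed:.6f}")
```

Strict JSON parsers in other languages may reject the `NaN` literal, so the file may need a tolerant reader outside Python.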

model/TwinLite.py ADDED

	@@ -0,0 +1,468 @@

+import torch
+import torch.nn as nn
+from torch.nn import Module, Conv2d, Parameter, Softmax
+class PAM_Module(Module):
+    """ Position attention module"""
+    #Ref from SAGAN
+    def __init__(self, in_dim):
+        super(PAM_Module, self).__init__()
+        self.chanel_in = in_dim
+        self.query_conv = Conv2d(in_channels=in_dim, out_channels=in_dim//8, kernel_size=1)
+        self.key_conv = Conv2d(in_channels=in_dim, out_channels=in_dim//8, kernel_size=1)
+        self.value_conv = Conv2d(in_channels=in_dim, out_channels=in_dim, kernel_size=1)
+        self.gamma = Parameter(torch.zeros(1))
+        self.softmax = Softmax(dim=-1)
+    def forward(self, x):
+        """
+            inputs :
+                x : input feature maps( B X C X H X W)
+            returns :
+                out : attention value + input feature
+                attention: B X (HxW) X (HxW)
+        """
+        m_batchsize, C, height, width = x.size()
+        proj_query = self.query_conv(x).view(m_batchsize, -1, width*height).permute(0, 2, 1)
+        proj_key = self.key_conv(x).view(m_batchsize, -1, width*height)
+        energy = torch.bmm(proj_query, proj_key)
+        attention = self.softmax(energy)
+        proj_value = self.value_conv(x).view(m_batchsize, -1, width*height)
+        out = torch.bmm(proj_value, attention.permute(0, 2, 1))
+        out = out.view(m_batchsize, C, height, width)
+        out = self.gamma*out + x
+        return out
+class CAM_Module(Module):
+    """ Channel attention module"""
+    def __init__(self, in_dim):
+        super(CAM_Module, self).__init__()
+        self.chanel_in = in_dim
+        self.gamma = Parameter(torch.zeros(1))
+        self.softmax  = Softmax(dim=-1)
+    def forward(self,x):
+        """
+            inputs :
+                x : input feature maps( B X C X H X W)
+            returns :
+                out : attention value + input feature
+                attention: B X C X C
+        """
+        m_batchsize, C, height, width = x.size()
+        proj_query = x.view(m_batchsize, C, -1)
+        proj_key = x.view(m_batchsize, C, -1).permute(0, 2, 1)
+        energy = torch.bmm(proj_query, proj_key)
+        energy_new = torch.max(energy, -1, keepdim=True)[0].expand_as(energy)-energy
+        attention = self.softmax(energy_new)
+        proj_value = x.view(m_batchsize, C, -1)
+        out = torch.bmm(attention, proj_value)
+        out = out.view(m_batchsize, C, height, width)
+        out = self.gamma*out + x
+        return out
+class UPx2(nn.Module):
+    '''
+    This class defines a 2x up-sampling layer: transposed convolution followed by batch normalization and PReLU activation
+    '''
+    def __init__(self, nIn, nOut):
+        '''
+        :param nIn: number of input channels
+        :param nOut: number of output channels
+        '''
+        super().__init__()
+        self.deconv = nn.ConvTranspose2d(nIn, nOut, 2, stride=2, padding=0, output_padding=0, bias=False)
+        self.bn = nn.BatchNorm2d(nOut, eps=1e-03)
+        self.act = nn.PReLU(nOut)
+    def forward(self, input):
+        '''
+        :param input: input feature map
+        :return: transformed feature map
+        '''
+        output = self.deconv(input)
+        output = self.bn(output)
+        output = self.act(output)
+        return output
+    def fuseforward(self, input):
+        output = self.deconv(input)
+        output = self.act(output)
+        return output
+class CBR(nn.Module):
+    '''
+    This class defines the convolution layer with batch normalization and PReLU activation
+    '''
+    def __init__(self, nIn, nOut, kSize, stride=1):
+        '''
+        :param nIn: number of input channels
+        :param nOut: number of output channels
+        :param kSize: kernel size
+        :param stride: stride rate for down-sampling. Default is 1
+        '''
+        super().__init__()
+        padding = int((kSize - 1)/2)
+        #self.conv = nn.Conv2d(nIn, nOut, kSize, stride=stride, padding=padding, bias=False)
+        self.conv = nn.Conv2d(nIn, nOut, (kSize, kSize), stride=stride, padding=(padding, padding), bias=False)
+        #self.conv1 = nn.Conv2d(nOut, nOut, (1, kSize), stride=1, padding=(0, padding), bias=False)
+        self.bn = nn.BatchNorm2d(nOut, eps=1e-03)
+        self.act = nn.PReLU(nOut)
+    def forward(self, input):
+        '''
+        :param input: input feature map
+        :return: transformed feature map
+        '''
+        output = self.conv(input)
+        #output = self.conv1(output)
+        output = self.bn(output)
+        output = self.act(output)
+        return output
+    def fuseforward(self, input):
+        output = self.conv(input)
+        output = self.act(output)
+        return output
+class CB(nn.Module):
+    '''
+       This class groups the convolution and batch normalization
+    '''
+    def __init__(self, nIn, nOut, kSize, stride=1):
+        '''
+        :param nIn: number of input channels
+        :param nOut: number of output channels
+        :param kSize: kernel size
+        :param stride: optional stride for down-sampling
+        '''
+        super().__init__()
+        padding = int((kSize - 1)/2)
+        self.conv = nn.Conv2d(nIn, nOut, (kSize, kSize), stride=stride, padding=(padding, padding), bias=False)
+        self.bn = nn.BatchNorm2d(nOut, eps=1e-03)
+    def forward(self, input):
+        '''
+        :param input: input feature map
+        :return: transformed feature map
+        '''
+        output = self.conv(input)
+        output = self.bn(output)
+        return output
+class C(nn.Module):
+    '''
+    This class is for a convolutional layer.
+    '''
+    def __init__(self, nIn, nOut, kSize, stride=1):
+        '''
+        :param nIn: number of input channels
+        :param nOut: number of output channels
+        :param kSize: kernel size
+        :param stride: optional stride rate for down-sampling
+        '''
+        super().__init__()
+        padding = int((kSize - 1)/2)
+        # print(nIn, nOut, (kSize, kSize))
+        self.conv = nn.Conv2d(nIn, nOut, (kSize, kSize), stride=stride, padding=(padding, padding), bias=False)
+    def forward(self, input):
+        '''
+        :param input: input feature map
+        :return: transformed feature map
+        '''
+        output = self.conv(input)
+        return output
+class CDilated(nn.Module):
+    '''
+    This class defines the dilated convolution.
+    '''
+    def __init__(self, nIn, nOut, kSize, stride=1, d=1):
+        '''
+        :param nIn: number of input channels
+        :param nOut: number of output channels
+        :param kSize: kernel size
+        :param stride: optional stride rate for down-sampling
+        :param d: optional dilation rate
+        '''
+        super().__init__()
+        padding = int((kSize - 1)/2) * d
+        self.conv = nn.Conv2d(nIn, nOut, (kSize, kSize), stride=stride, padding=(padding, padding), bias=False, dilation=d)
+    def forward(self, input):
+        '''
+        :param input: input feature map
+        :return: transformed feature map
+        '''
+        output = self.conv(input)
+        return output
+class DownSamplerB(nn.Module):
+    def __init__(self, nIn, nOut):
+        super().__init__()
+        n = int(nOut/5)
+        n1 = nOut - 4*n
+        self.c1 = C(nIn, n, 3, 2)
+        self.d1 = CDilated(n, n1, 3, 1, 1)
+        self.d2 = CDilated(n, n, 3, 1, 2)
+        self.d4 = CDilated(n, n, 3, 1, 4)
+        self.d8 = CDilated(n, n, 3, 1, 8)
+        self.d16 = CDilated(n, n, 3, 1, 16)
+        self.bn = nn.BatchNorm2d(nOut, eps=1e-3)
+        self.act = nn.PReLU(nOut)
+    def forward(self, input):
+        output1 = self.c1(input)
+        d1 = self.d1(output1)
+        d2 = self.d2(output1)
+        d4 = self.d4(output1)
+        d8 = self.d8(output1)
+        d16 = self.d16(output1)
+        add1 = d2
+        add2 = add1 + d4
+        add3 = add2 + d8
+        add4 = add3 + d16
+        combine = torch.cat([d1, add1, add2, add3, add4],1)
+        #combine_in_out = input + combine
+        output = self.bn(combine)
+        output = self.act(output)
+        return output
+class BR(nn.Module):
+    '''
+        This class groups the batch normalization and PReLU activation
+    '''
+    def __init__(self, nOut):
+        '''
+        :param nOut: output feature maps
+        '''
+        super().__init__()
+        self.nOut=nOut
+        self.bn = nn.BatchNorm2d(nOut, eps=1e-03)
+        self.act = nn.PReLU(nOut)
+    def forward(self, input):
+        '''
+        :param input: input feature map
+        :return: normalized and thresholded feature map
+        '''
+        # print("bf bn :",input.size(),self.nOut)
+        output = self.bn(input)
+        # print("after bn :",output.size())
+        output = self.act(output)
+        # print("after act :",output.size())
+        return output
+class DilatedParllelResidualBlockB(nn.Module):
+    '''
+    This class defines the ESP block, which is based on the following principle
+        Reduce ---> Split ---> Transform --> Merge
+    '''
+    def __init__(self, nIn, nOut, add=True):
+        '''
+        :param nIn: number of input channels
+        :param nOut: number of output channels
+        :param add: if true, add a residual connection through an identity operation. A projection could be used
+                instead, as in the ResNet paper, but we avoid it when the dimensions differ because we do not want to
+                increase the module complexity
+        '''
+        super().__init__()
+        n = max(int(nOut/5),1)
+        n1 = max(nOut - 4*n,1)
+        # print(nIn,n,n1,"--")
+        self.c1 = C(nIn, n, 1, 1)
+        self.d1 = CDilated(n, n1, 3, 1, 1) # dilation rate of 2^0
+        self.d2 = CDilated(n, n, 3, 1, 2) # dilation rate of 2^1
+        self.d4 = CDilated(n, n, 3, 1, 4) # dilation rate of 2^2
+        self.d8 = CDilated(n, n, 3, 1, 8) # dilation rate of 2^3
+        self.d16 = CDilated(n, n, 3, 1, 16) # dilation rate of 2^4
+        # print("nOut bf :",nOut)
+        self.bn = BR(nOut)
+        # print("nOut at :",self.bn.size())
+        self.add = add
+    def forward(self, input):
+        '''
+        :param input: input feature map
+        :return: transformed feature map
+        '''
+        # reduce
+        output1 = self.c1(input)
+        # split and transform
+        d1 = self.d1(output1)
+        d2 = self.d2(output1)
+        d4 = self.d4(output1)
+        d8 = self.d8(output1)
+        d16 = self.d16(output1)
+        # hierarchical fusion for de-gridding
+        add1 = d2
+        add2 = add1 + d4
+        add3 = add2 + d8
+        add4 = add3 + d16
+        # print(d1.size(),add1.size(),add2.size(),add3.size(),add4.size())
+        #merge
+        combine = torch.cat([d1, add1, add2, add3, add4], 1)
+        # print("combine :",combine.size())
+        # if residual version
+        if self.add:
+            # print("add :",combine.size())
+            combine = input + combine
+        # print(combine.size(),"-----------------")
+        output = self.bn(combine)
+        return output
+class InputProjectionA(nn.Module):
+    '''
+    This class projects the input image to the same spatial dimensions as the feature map.
+    Each pooling step halves the spatial size. For example, if the input image is 512x512x3 and the feature map
+    is 64x64xF (samplingTimes=3), then this class will generate an output of 64x64x3
+    '''
+    def __init__(self, samplingTimes):
+        '''
+        :param samplingTimes: number of times the image is down-sampled by a factor of 2
+        '''
+        super().__init__()
+        self.pool = nn.ModuleList()
+        for i in range(0, samplingTimes):
+            #pyramid-based approach for down-sampling
+            self.pool.append(nn.AvgPool2d(3, stride=2, padding=1))
+    def forward(self, input):
+        '''
+        :param input: Input RGB Image
+        :return: down-sampled image (pyramid-based approach)
+        '''
+        for pool in self.pool:
+            input = pool(input)
+        return input
+class ESPNet_Encoder(nn.Module):
+    '''
+    This class defines the ESPNet-C network in the paper
+    '''
+    def __init__(self, p=5, q=3):
+    # def __init__(self, classes=20, p=1, q=1):
+        '''
+        :param p: number of ESP blocks at level 2 (depth multiplier)
+        :param q: number of ESP blocks at level 3 (depth multiplier)
+        '''
+        super().__init__()
+        self.level1 = CBR(3, 16, 3, 2)
+        self.sample1 = InputProjectionA(1)
+        self.sample2 = InputProjectionA(2)
+        self.b1 = CBR(16 + 3,19,3)
+        self.level2_0 = DownSamplerB(16 +3, 64)
+        self.level2 = nn.ModuleList()
+        for i in range(0, p):
+            self.level2.append(DilatedParllelResidualBlockB(64 , 64))
+        self.b2 = CBR(128 + 3,131,3)
+        self.level3_0 = DownSamplerB(128 + 3, 128)
+        self.level3 = nn.ModuleList()
+        for i in range(0, q):
+            self.level3.append(DilatedParllelResidualBlockB(128 , 128))
+        # self.mixstyle = MixStyle2(p=0.5, alpha=0.1)
+        self.b3 = CBR(256,32,3)
+        self.sa = PAM_Module(32)
+        self.sc = CAM_Module(32)
+        self.conv_sa = CBR(32,32,3)
+        self.conv_sc = CBR(32,32,3)
+        self.classifier = CBR(32, 32, 1, 1)
+    def forward(self, input):
+        '''
+        :param input: Receives the input RGB image
+        :return: the transformed feature map with spatial dimensions 1/8th of the input image
+        '''
+        output0 = self.level1(input)
+        inp1 = self.sample1(input)
+        inp2 = self.sample2(input)
+        output0_cat = self.b1(torch.cat([output0, inp1], 1))
+        output1_0 = self.level2_0(output0_cat) # down-sampled
+        for i, layer in enumerate(self.level2):
+            if i==0:
+                output1 = layer(output1_0)
+            else:
+                output1 = layer(output1)
+        output1_cat = self.b2(torch.cat([output1,  output1_0, inp2], 1))
+        output2_0 = self.level3_0(output1_cat) # down-sampled
+        for i, layer in enumerate(self.level3):
+            if i==0:
+                output2 = layer(output2_0)
+            else:
+                output2 = layer(output2)
+        cat_=torch.cat([output2_0, output2], 1)
+        output2_cat = self.b3(cat_)
+        out_sa=self.sa(output2_cat)
+        out_sa=self.conv_sa(out_sa)
+        out_sc=self.sc(output2_cat)
+        out_sc=self.conv_sc(out_sc)
+        out_s=out_sa+out_sc
+        classifier = self.classifier(out_s)
+        return classifier
+class TwinLiteNet(nn.Module):
+    '''
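+    This class defines the TwinLiteNet network: a shared ESPNet encoder followed by two parallel decoder branches, each ending in a 2-channel classifier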
+    '''
+    def __init__(self, p=2, q=3):
+        super().__init__()
+        self.encoder = ESPNet_Encoder(p, q)
+        self.up_1_1 = UPx2(32,16)
+        self.up_2_1 = UPx2(16,8)
+        self.up_1_2 = UPx2(32,16)
+        self.up_2_2 = UPx2(16,8)
+        self.classifier_1 = UPx2(8,2)
+        self.classifier_2 = UPx2(8,2)
+    def forward(self, input):
+        x=self.encoder(input)
+        x1=self.up_1_1(x)
+        x1=self.up_2_1(x1)
+        classifier1=self.classifier_1(x1)
+        x2=self.up_1_2(x)
+        x2=self.up_2_2(x2)
+        classifier2=self.classifier_2(x2)
+        return (classifier1,classifier2)
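The channel counts passed to the constructors above follow from the `torch.cat` calls in `ESPNet_Encoder.forward`. As a rough sanity check (not part of the commit), the bookkeeping can be reproduced in plain Python. The 16-channel width of `level1` is inferred from `DownSamplerB(16 +3, 64)`, and `cat_width` is a hypothetical helper introduced only for this sketch.

```python
def cat_width(*widths):
    """Channel count after concatenating feature maps along dim 1."""
    return sum(widths)

RGB = 3  # inp1 and inp2 are assumed to be downsampled copies of the RGB input

# cat([output0, inp1]) -> input of level2_0 = DownSamplerB(16 + 3, 64)
stage1 = cat_width(16, RGB)
# cat([output1, output1_0, inp2]) -> input of b2 = CBR(128 + 3, 131, 3)
stage2 = cat_width(64, 64, RGB)
# cat([output2_0, output2]) -> input of b3 = CBR(256, 32, 3)
stage3 = cat_width(128, 128)

print(stage1, stage2, stage3)  # 19 131 256
```

If `p`, `q`, or any block width is changed, these three sums are the values the following layer's input width has to match.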

model/TwinLite_8class.py ADDED Viewed

	@@ -0,0 +1,26 @@

+"""TwinLiteNet adapted for SINGLE 8-class semantic output (not dual binary).
+Same encoder and decoder upsampling, but final classifier outputs 8 channels
+matching our Segformer setup:
+  0=tree  1=ground  2=person  3=sky  4=road  5=mountain  6=building  7=background
+We keep one branch only — drops classifier_2 entirely → slightly faster + smaller.
+"""
+import torch
+import torch.nn as nn
+from .TwinLite import ESPNet_Encoder, UPx2
+class TwinLiteNet8(nn.Module):
+    def __init__(self, num_classes: int = 8, p: int = 2, q: int = 3):
+        super().__init__()
+        self.encoder = ESPNet_Encoder(p, q)
+        self.up_1 = UPx2(32, 16)
+        self.up_2 = UPx2(16, 8)
+        self.classifier = UPx2(8, num_classes)
+    def forward(self, x):
+        x = self.encoder(x)
+        x = self.up_1(x)
+        x = self.up_2(x)
+        return self.classifier(x)   # (B, num_classes, H, W)
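The `(B, num_classes, H, W)` comment holds only when the input height and width survive three halvings and three doublings unchanged. The sketch below traces spatial sizes under two assumptions that are not visible in this part of the diff: each encoder downsampling stage behaves like a stride-2, kernel-3, padding-1 operation, and each `UPx2` exactly doubles height and width. `down2` and `trace_shapes` are hypothetical helpers.

```python
def down2(n):
    """Output length of a stride-2, kernel-3, padding-1 conv or pool (assumed)."""
    return (n - 1) // 2 + 1

def trace_shapes(h, w):
    """Return (stage, height, width) for the encoder and the three UPx2 layers."""
    stages = [("input", h, w)]
    for name in ("level1", "level2_0", "level3_0"):   # encoder: 1/2, 1/4, 1/8
        h, w = down2(h), down2(w)
        stages.append((name, h, w))
    for name in ("up_1", "up_2", "classifier"):       # decoder: x2, x2, x2
        h, w = 2 * h, 2 * w
        stages.append((name, h, w))
    return stages

for stage in trace_shapes(360, 640):
    print(stage)
```

Under these assumptions a 360x640 input shrinks to 45x80 at the encoder output and returns to 360x640, while a size such as 361x641 would come back as 368x648. That is one reason to resize inputs to a fixed resolution, as the inference scripts in this commit do.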

model/__pycache__/TwinLite.cpython-311.pyc ADDED Viewed

Binary file (25.4 kB). View file

model/__pycache__/TwinLite.cpython-38.pyc ADDED Viewed

Binary file (13.9 kB). View file

model/__pycache__/TwinLite_8class.cpython-311.pyc ADDED Viewed

Binary file (2.07 kB). View file

predict.py ADDED Viewed

	@@ -0,0 +1,103 @@

+"""TwinLiteNet8 inference — single image or directory.
+Same interface as Segformer's predict.py for easy swap.
+Trained at 640x360; this script resizes every input to 640x360 for
+inference, then upsamples the logits back to the original resolution before argmax.
+Usage:
+    python predict.py input.jpg                  --weights run_8class/twinlite8_best.pt
+    python predict.py --dir frames/ --out out/   --weights run_8class/twinlite8_best.pt
+"""
+from __future__ import annotations
+import argparse, sys, os
+from pathlib import Path
+import cv2
+import numpy as np
+import torch
+import torch.nn.functional as F
+sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
+from model.TwinLite_8class import TwinLiteNet8
+NAMES = ["tree", "ground", "person", "sky", "road", "mountain", "building", "background"]
+PALETTE = np.array([
+    [60, 220, 60],    # tree
+    [40, 100, 160],   # ground
+    [40,  40, 230],   # person
+    [230, 200, 60],   # sky
+    [140, 140, 140],  # road
+    [180,  60, 180],  # mountain
+    [50, 220, 220],   # building
+    [100, 100, 100],  # background
+], dtype=np.uint8)
+TRAIN_W, TRAIN_H = 640, 360
+def load_model(weights, device="cuda"):
+    model = TwinLiteNet8(num_classes=8).to(device).eval()
+    ckpt = torch.load(weights, map_location=device, weights_only=False)
+    model.load_state_dict(ckpt["model"] if "model" in ckpt else ckpt)
+    return model
+def predict(model, bgr_img, device="cuda"):
+    """BGR uint8 → (H,W) class id mask 0..7 at original resolution."""
+    H, W = bgr_img.shape[:2]
+    inp_bgr = cv2.resize(bgr_img, (TRAIN_W, TRAIN_H))
+    rgb = cv2.cvtColor(inp_bgr, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
+    x = torch.from_numpy(rgb.transpose(2, 0, 1)).unsqueeze(0).float().to(device)
+    with torch.no_grad():
+        logits = model(x)
+        # Upsample logits to original resolution before argmax (cleaner boundaries)
+        logits = F.interpolate(logits, size=(H, W), mode="bilinear", align_corners=False)
+        # v2: channel 7 (background) was never trained -> mask it out so it can't win argmax
+        logits[:, 7, :, :] = -1e9
+    return logits.argmax(1)[0].cpu().numpy().astype(np.uint8)
+def colorize(mask):
+    return PALETTE[mask]
+def overlay(bgr, mask, alpha=0.45):
+    return cv2.addWeighted(bgr, 1 - alpha, colorize(mask), alpha, 0)
+def main():
+    ap = argparse.ArgumentParser()
+    ap.add_argument("input", nargs="?")
+    ap.add_argument("--dir")
+    ap.add_argument("--out", default=".")
+    ap.add_argument("--weights", default="run_8class/twinlite8_best.pt")
+    ap.add_argument("--device", default="cuda" if torch.cuda.is_available() else "cpu")
+    args = ap.parse_args()
+    if not args.input and not args.dir:
+        ap.print_help(); return
+    print(f"loading model from {args.weights} on {args.device} ...")
+    model = load_model(args.weights, device=args.device)
+    out_dir = Path(args.out); out_dir.mkdir(parents=True, exist_ok=True)
+    paths = []
+    if args.dir:
+        paths = sorted(p for p in Path(args.dir).iterdir() if p.suffix.lower() in {".jpg",".jpeg",".png",".bmp"})
+    if args.input:
+        paths.append(Path(args.input))
+    for p in paths:
+        img = cv2.imread(str(p))
+        if img is None:
+            print(f"  skip: {p}"); continue
+        mask = predict(model, img, device=args.device)
+        cv2.imwrite(str(out_dir / f"{p.stem}_pred.png"), mask)
+        cv2.imwrite(str(out_dir / f"{p.stem}_overlay.jpg"), overlay(img, mask))
+        counts = np.bincount(mask.flatten(), minlength=8)
+        top = counts.argmax()
+        print(f"  {p.name:<50}  top: {NAMES[top]} ({100*counts[top]/counts.sum():.1f}%)")
+    print(f"\noutputs -> {out_dir.resolve()}")
+if __name__ == "__main__":
+    main()
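`predict()` overwrites the channel 7 logit with a large negative number so that the untrained background channel can never win the argmax. A minimal per-pixel illustration in plain Python, with made-up logits and a hypothetical `masked_argmax` helper, might look like this:

```python
def masked_argmax(pixel_logits, banned=(7,)):
    """Index of the largest logit, skipping the banned channel ids."""
    best_idx, best_val = -1, float("-inf")
    for idx, val in enumerate(pixel_logits):
        if idx in banned:
            continue  # same effect as setting this logit to -1e9
        if val > best_val:
            best_idx, best_val = idx, val
    return best_idx

# Made-up logits for one pixel; channel 7 happens to be the largest.
logits = [0.2, 1.1, -0.3, 0.9, 0.0, -1.2, 0.4, 5.0]

print(max(range(len(logits)), key=logits.__getitem__))  # 7 (unmasked)
print(masked_argmax(logits))                            # 1 (ground)
```

Because channel 7 receives no gradient during training, its raw value is arbitrary, which is why it is excluded rather than trusted.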

predict_onnx.py ADDED Viewed

	@@ -0,0 +1,84 @@

+"""TwinLiteNet8 ONNX inference — for edge deployment / cross-platform.
+Runs entirely via ONNX Runtime (no PyTorch needed at deploy time).
+Use CPUExecutionProvider for CPU, CUDAExecutionProvider for GPU,
+TensorrtExecutionProvider for TensorRT-accelerated runs on Jetson.
+Usage:
+    python predict_onnx.py input.jpg --onnx twinlite8.onnx
+    python predict_onnx.py --dir frames/ --out out/ --onnx twinlite8.onnx --provider CUDAExecutionProvider
+"""
+import argparse
+from pathlib import Path
+import cv2
+import numpy as np
+import onnxruntime as ort
+NAMES = ["tree","ground","person","sky","road","mountain","building","background"]
+PALETTE = np.array([
+    [60,220,60],[40,100,160],[40,40,230],[230,200,60],
+    [140,140,140],[180,60,180],[50,220,220],[100,100,100],
+], dtype=np.uint8)
+TRAIN_W, TRAIN_H = 640, 360
+def predict(sess, bgr_img):
+    H, W = bgr_img.shape[:2]
+    inp = cv2.resize(bgr_img, (TRAIN_W, TRAIN_H))
+    rgb = cv2.cvtColor(inp, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
+    x = rgb.transpose(2, 0, 1)[None].astype(np.float32)            # (1,3,H,W)
+    logits = sess.run(None, {"input": x})[0]                       # (1,8,H,W)
+    logits[:, 7, :, :] = -1e9                                       # v2: bg channel never trained
+    pred_small = logits.argmax(1)[0].astype(np.uint8)              # at training res
+    if (H, W) != (TRAIN_H, TRAIN_W):
+        return cv2.resize(pred_small, (W, H), interpolation=cv2.INTER_NEAREST)
+    return pred_small
+def main():
+    ap = argparse.ArgumentParser()
+    ap.add_argument("input", nargs="?")
+    ap.add_argument("--dir")
+    ap.add_argument("--out", default=".")
+    ap.add_argument("--onnx", default="twinlite8.onnx")
+    ap.add_argument("--provider", default=None,
+                    help="ONNX provider: CPUExecutionProvider | CUDAExecutionProvider | TensorrtExecutionProvider")
+    args = ap.parse_args()
+    if not args.input and not args.dir:
+        ap.print_help(); return
+    available = ort.get_available_providers()
+    if args.provider:
+        providers = [args.provider]
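+    else:
+        # Auto-pick the best available provider; fall back to CPU if none matches
+        preferred = ["TensorrtExecutionProvider", "CUDAExecutionProvider", "CPUExecutionProvider"]
+        providers = next(([p] for p in preferred if p in available), ["CPUExecutionProvider"])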
+    print(f"available providers: {available}")
+    print(f"using: {providers}")
+    sess = ort.InferenceSession(args.onnx, providers=providers)
+    print(f"actual provider: {sess.get_providers()}")
+    out_dir = Path(args.out); out_dir.mkdir(parents=True, exist_ok=True)
+    paths = []
+    if args.dir:
+        paths = sorted(p for p in Path(args.dir).iterdir() if p.suffix.lower() in {".jpg",".jpeg",".png",".bmp"})
+    if args.input: paths.append(Path(args.input))
+    for p in paths:
+        img = cv2.imread(str(p))
+        if img is None: continue
+        mask = predict(sess, img)
+        cv2.imwrite(str(out_dir / f"{p.stem}_pred.png"), mask)
+        overlay = cv2.addWeighted(img, 0.55, PALETTE[mask], 0.45, 0)
+        cv2.imwrite(str(out_dir / f"{p.stem}_overlay.jpg"), overlay)
+        counts = np.bincount(mask.flatten(), minlength=8)
+        top = counts.argmax()
+        print(f"  {p.name:<50}  top: {NAMES[top]} ({100*counts[top]/counts.sum():.1f}%)")
+    print(f"\noutputs -> {out_dir.resolve()}")
+if __name__ == "__main__":
+    main()
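The provider selection in `main()` can be separated from ONNX Runtime and tested on its own. The sketch below is one possible variation, not the script's actual behavior: `pick_providers` is a hypothetical helper, and raising on an unavailable `--provider` value is a design choice added here for illustration.

```python
PREFERENCE = [
    "TensorrtExecutionProvider",
    "CUDAExecutionProvider",
    "CPUExecutionProvider",
]

def pick_providers(available, requested=None):
    """Return a one-element provider list for ort.InferenceSession."""
    if requested is not None:
        if requested not in available:
            raise ValueError(f"{requested} not in available providers: {available}")
        return [requested]
    for name in PREFERENCE:          # first match in preference order wins
        if name in available:
            return [name]
    return ["CPUExecutionProvider"]  # conservative fallback

print(pick_providers(["CUDAExecutionProvider", "CPUExecutionProvider"]))
# ['CUDAExecutionProvider']
```

In the script, `available` would come from `ort.get_available_providers()` and `requested` from `args.provider`.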

samples/0_frame_3884.jpg ADDED Viewed

Git LFS Details

SHA256: 0e7ffd46a313018e24c30bf790f96000c1f10f0b7c922b7157dee48078ae0112
Pointer size: 131 Bytes
Size of remote file: 593 kB

samples/1_frame_2803.jpg ADDED Viewed

Git LFS Details

SHA256: ae8153f02dffb4bc45d07c0681b14f6d1faf0efba59f8f3588c83ff380607ace
Pointer size: 131 Bytes
Size of remote file: 627 kB

samples/2_frame_2626.jpg ADDED Viewed

Git LFS Details

SHA256: e542c7cbda64f21b734563835fc6a903d4e93028ce00a74c3a39132a8e9caa8e
Pointer size: 131 Bytes
Size of remote file: 585 kB

samples/3_frame_4093.jpg ADDED Viewed

Git LFS Details

SHA256: 450ae432fd89fa32874aa7b518f7a4693ec63e6a3af018837a09eedef57f969a
Pointer size: 131 Bytes
Size of remote file: 618 kB

samples/4_frame_3138.jpg ADDED Viewed

Git LFS Details

SHA256: 8825e95881d9fe7f5d8464f449f6bc5f80e7a38a572b2b18db3a9367c21eee80
Pointer size: 131 Bytes
Size of remote file: 567 kB

samples/5_frame_3076.jpg ADDED Viewed

Git LFS Details

SHA256: c51194cc7c7410605b6d4f3cd6513eb540d47fe9c50e7ae1ef1f46a872273c3b
Pointer size: 131 Bytes
Size of remote file: 635 kB

samples_20/sample_00_frame_3884.jpg ADDED Viewed

Git LFS Details

SHA256: 06a45917f6f22aed83bf71fce580198daff56f4d915649a1e6075b2a4a529901
Pointer size: 131 Bytes
Size of remote file: 598 kB

samples_20/sample_01_frame_2803.jpg ADDED Viewed

Git LFS Details

SHA256: 78f8ab77e3c9ae89985b93b7c25de18e34ae9203edc4db7a79d666e01e6a0bab
Pointer size: 131 Bytes
Size of remote file: 629 kB

samples_20/sample_02_frame_2626.jpg ADDED Viewed

Git LFS Details

SHA256: e8a9fd7d2db82129ba70c64dd7429aa348466e97c8326a376b3cf4eddf12a9cc
Pointer size: 131 Bytes
Size of remote file: 588 kB

samples_20/sample_03_frame_4093.jpg ADDED Viewed

Git LFS Details

SHA256: a7445337163ef8c505b6c70fe0a02b8b3923915f3f6478e375ccb6f0cbd9ecf1
Pointer size: 131 Bytes
Size of remote file: 621 kB

samples_20/sample_04_frame_3138.jpg ADDED Viewed

Git LFS Details

SHA256: 355cd2f0d06f82c6c930b4aae7b45667169afdbeff2c9946f8a8ad3e68fe866e
Pointer size: 131 Bytes
Size of remote file: 564 kB

samples_20/sample_05_frame_3076.jpg ADDED Viewed

Git LFS Details

SHA256: d2288fe50ebd6e51352b5decf46d0365637e7b033c3aa69f2287e88049820358
Pointer size: 131 Bytes
Size of remote file: 624 kB

samples_20/sample_06_frame_3032.jpg ADDED Viewed

Git LFS Details

SHA256: c0cdc5bada40565175908590dfaa32b6ad50ebd2ba79895185dc5e4858e47e75
Pointer size: 131 Bytes
Size of remote file: 536 kB

samples_20/sample_07_frame_2860.jpg ADDED Viewed

Git LFS Details

SHA256: f784538abff1dfc3d066e162b92edaa34cd068ecbb66ee72c4f13c0d6a3b03dd
Pointer size: 131 Bytes
Size of remote file: 571 kB

samples_20/sample_08_frame_4083.jpg ADDED Viewed

Git LFS Details

SHA256: e0c8229ddb71e30dce1d092cec091cef9a50dd983ba7c3b43c68ac104a076677
Pointer size: 131 Bytes
Size of remote file: 598 kB

samples_20/sample_09_frame_2784.jpg ADDED Viewed

Git LFS Details

SHA256: 9a468ca29e5eb427119edcbc1d7101275c7680979590f5cdda3634d7602dacdc
Pointer size: 131 Bytes
Size of remote file: 618 kB

samples_20/sample_10_frame_3960.jpg ADDED Viewed

Git LFS Details

SHA256: 282880811250d63924dff1cb87e6e73d9314a6cc2a0537e95dc936f5734e309b
Pointer size: 131 Bytes
Size of remote file: 580 kB

samples_20/sample_11_frame_4091.jpg ADDED Viewed

Git LFS Details

SHA256: bec509c2b188adfb4ae7a630ebc07460e3207779a843b59890ba9ad7ba1e1aa6
Pointer size: 131 Bytes
Size of remote file: 626 kB

samples_20/sample_12_frame_4402.jpg ADDED Viewed

Git LFS Details

SHA256: 5ec26967a021ba6279ee1a4630b7c13951e2e3833b2f1ee87c836520f1427aab
Pointer size: 131 Bytes
Size of remote file: 600 kB

samples_20/sample_13_frame_3691.jpg ADDED Viewed

Git LFS Details

SHA256: b32d03143eeab425cba9506ce08a27cf36ce0de2705071f9428a6527c28b2530
Pointer size: 131 Bytes
Size of remote file: 720 kB

samples_20/sample_14_frame_2753.jpg ADDED Viewed

Git LFS Details

SHA256: 70a9dd164493046ab49105864a4af902d8c88e3f27549552d3c58cf8a5d961f3
Pointer size: 131 Bytes
Size of remote file: 590 kB

samples_20/sample_15_frame_3784.jpg ADDED Viewed

Git LFS Details

SHA256: 68fd7411124a1661f002eb2df9f906d47a2c1131b4ac6c059b99b6e822cefec0
Pointer size: 131 Bytes
Size of remote file: 613 kB

samples_20/sample_16_frame_3439.jpg ADDED Viewed

Git LFS Details

SHA256: 5790ee3e9709d8f0a102ad09ca4c5a03eacc0421f12bb913911f77735697a67f
Pointer size: 131 Bytes
Size of remote file: 638 kB

samples_20/sample_17_frame_2640.jpg ADDED Viewed

Git LFS Details

SHA256: 0357e1cf5ed27d66b5933915178dfbc89ac12c32931fc7a3be27b2b907efc7a4
Pointer size: 131 Bytes
Size of remote file: 585 kB

samples_20/sample_18_frame_2636.jpg ADDED Viewed

Git LFS Details

SHA256: 52b1d9355c6edf4acc2ce6fbb990fa5dc6ae75d9353f45797002ee9039251dd5
Pointer size: 131 Bytes
Size of remote file: 593 kB

samples_20/sample_19_frame_2766.jpg ADDED Viewed

Git LFS Details

SHA256: debf233c10fc4ef1a85858f5af26d4c917e11ea98fa998bee3dc33549ba42b9b
Pointer size: 131 Bytes
Size of remote file: 592 kB

train_8class.py ADDED Viewed

	@@ -0,0 +1,247 @@

+"""TwinLiteNet8 — single-branch 8-class semantic seg, directly comparable to Segformer.
+Classes:  0 tree  1 ground  2 person  3 sky  4 road  5 mountain  6 building  7 background
+"""
+from __future__ import annotations
+import os, sys, json, re, time, random
+from pathlib import Path
+import numpy as np, cv2, torch
+import torch.nn as nn
+import torch.nn.functional as F
+from torch.utils.data import Dataset, DataLoader, ConcatDataset
+sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
+from model.TwinLite_8class import TwinLiteNet8
+# ───────── config ─────────
+ROOT = Path(r"C:/Users/room104/Desktop/AGMOtree/semantic_segmantation")
+OLD_IMG = ROOT / "merged_dataset/train/images"
+OLD_MSK = ROOT / "merged_dataset/train/masks_pseudo"
+NEW_IMG = ROOT / "orchard_nav/train/images"
+NEW_MSK = ROOT / "orchard_nav/train/masks"
+OUT_DIR = Path(r"C:/Users/room104/Desktop/AGMOtree/TwinLiteNet_train/run_v2")
+OUT_DIR.mkdir(parents=True, exist_ok=True)
+NAMES = ["tree","ground","person","sky","road","mountain","building","background"]
+NUM_CLASSES = 8
+IGNORE_INDEX = 255
+W_IN, H_IN = 640, 360
+BATCH = 16
+EPOCHS = 60
+LR = 5e-4
+NUM_WORKERS = 4
+SEED = 42
+DEVICE = "cuda"
+# v2 design: background is NOT a real class. Pixels labeled 7 → 255 (ignore_index)
+# in the loader, so loss never trains channel 7. Weight 0 as belt-and-braces.
+# At inference, the channel 7 logit is set to -1e9 (effectively -inf) before argmax (see predict.py).
+WEIGHTS = np.array([1.5, 0.5, 1.5, 1.0, 1.0, 1.0, 1.0, 0.0], dtype=np.float32)
+random.seed(SEED); np.random.seed(SEED); torch.manual_seed(SEED)
+def frame_num(p):
+    m = re.match(r"frame_(\d+)", p.stem); return int(m.group(1)) if m else -1
+class OrchardDS(Dataset):
+    def __init__(self, paths, mask_dir, augment=False, source="old"):
+        self.paths = paths
+        self.mask_dir = mask_dir
+        self.augment = augment
+        self.source = source
+    def __len__(self): return len(self.paths)
+    def __getitem__(self, i):
+        ip = self.paths[i]
+        img = cv2.imread(str(ip))
+        msk = cv2.imread(str(self.mask_dir / (ip.stem + ".png")), cv2.IMREAD_GRAYSCALE)
+        if img is None or msk is None:
+            img = np.zeros((H_IN, W_IN, 3), dtype=np.uint8)
+            msk = np.full((H_IN, W_IN), IGNORE_INDEX, dtype=np.uint8)
+        if self.augment:
+            if random.random() < 0.5:
+                img = np.ascontiguousarray(img[:, ::-1])
+                msk = np.ascontiguousarray(msk[:, ::-1])
+            if random.random() < 0.5:
+                hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV).astype(np.int16)
+                hsv[..., 0] = (hsv[..., 0] + random.randint(-10, 10)) % 180
+                hsv[..., 1] = np.clip(hsv[..., 1] * random.uniform(0.7, 1.3), 0, 255)
+                hsv[..., 2] = np.clip(hsv[..., 2] * random.uniform(0.7, 1.3), 0, 255)
+                img = cv2.cvtColor(hsv.astype(np.uint8), cv2.COLOR_HSV2BGR)
+        img = cv2.resize(img, (W_IN, H_IN))
+        msk = cv2.resize(msk, (W_IN, H_IN), interpolation=cv2.INTER_NEAREST)
+        # v2: remap class 7 (background) -> IGNORE_INDEX so it does NOT train.
+        # The user's intent: "background = stuff the model can't recognize", not a real class.
+        if self.source == "old":
+            msk[msk == 7] = IGNORE_INDEX
+        # new-source masks already have 255 for non-tree pixels, no change needed.
+        img = img[:, :, ::-1].transpose(2, 0, 1).astype(np.float32) / 255.0
+        return (torch.from_numpy(img).float(),
+                torch.from_numpy(msk).long())
+# ─── temporal split ───
+old_all = sorted(OLD_IMG.glob("*.jpg"))
+old_train = [p for p in old_all if frame_num(p) <= 4500]
+old_val   = [p for p in old_all if frame_num(p) >  4500]
+new_all = sorted(NEW_IMG.glob("*.jpg")); random.shuffle(new_all)
+n_new_val = max(20, len(new_all) // 10)
+new_val = new_all[:n_new_val]
+new_train = new_all[n_new_val:]
+train_ds = ConcatDataset([
+    OrchardDS(old_train, OLD_MSK, augment=True, source="old"),
+    OrchardDS(new_train, NEW_MSK, augment=True, source="new"),
+])
+old_val_ds = OrchardDS(old_val, OLD_MSK, augment=False, source="old")
+new_val_ds = OrchardDS(new_val, NEW_MSK, augment=False, source="new")
+print(f"=== TwinLiteNet8 (single-branch, 8-class) ===")
+print(f"  old train: {len(old_train)}  new train: {len(new_train)}")
+print(f"  old val:   {len(old_val)}    new val:   {len(new_val)}")
+# ─── eval ───
+def confusion(preds, ys, n, ignore=IGNORE_INDEX):
+    cm = np.zeros((n, n), dtype=np.int64)
+    valid = ys != ignore
+    if not valid.any(): return cm
+    p = preds[valid]; t = ys[valid]
+    for tc in range(n):
+        mt = (t == tc)
+        if not mt.any(): continue
+        for pc in range(n):
+            cm[tc, pc] += int(((p == pc) & mt).sum())
+    return cm
+def iou_from_cm(cm):
+    n = cm.shape[0]; ious = np.zeros(n)
+    for c in range(n):
+        tp = cm[c,c]; fp = cm[:,c].sum()-tp; fn = cm[c,:].sum()-tp
+        ious[c] = tp / (tp+fp+fn) if (tp+fp+fn) > 0 else float("nan")
+    return ious
+# ─── train ───
+log_path = OUT_DIR / "log.txt"
+def log(m):
+    print(m, flush=True)
+    with log_path.open("a", encoding="utf-8") as f: f.write(m + "\n")
+def main():
+    log_path.write_text("")
+    train_loader = DataLoader(train_ds, batch_size=BATCH, shuffle=True,
+                              num_workers=NUM_WORKERS, pin_memory=True, drop_last=True,
+                              persistent_workers=True)
+    old_val_loader = DataLoader(old_val_ds, batch_size=BATCH, shuffle=False,
+                                num_workers=2, pin_memory=True, persistent_workers=True)
+    new_val_loader = DataLoader(new_val_ds, batch_size=BATCH, shuffle=False,
+                                num_workers=2, pin_memory=True, persistent_workers=True)
+    model = TwinLiteNet8(num_classes=NUM_CLASSES).to(DEVICE)
+    n_params = sum(p.numel() for p in model.parameters())
+    log(f"model: TwinLiteNet8  params: {n_params/1e6:.3f}M")
+    log(f"input: {W_IN}x{H_IN}  batch: {BATCH}  epochs: {EPOCHS}  LR: {LR}")
+    log(f"classes: {NAMES}")
+    log(f"weights: {dict(zip(NAMES, [round(float(w),2) for w in WEIGHTS]))}")
+    log(f"train: {len(train_ds)}  old_val: {len(old_val_ds)}  new_val: {len(new_val_ds)}")
+    cw = torch.tensor(WEIGHTS, dtype=torch.float32, device=DEVICE)
+    loss_fn = nn.CrossEntropyLoss(weight=cw, ignore_index=IGNORE_INDEX)
+    optim = torch.optim.AdamW(model.parameters(), lr=LR, weight_decay=1e-4)
+    sched = torch.optim.lr_scheduler.CosineAnnealingLR(optim, T_max=EPOCHS * len(train_loader))
+    best_tree = -1.0
+    history = []
+    for epoch in range(1, EPOCHS+1):
+        model.train()
+        t0 = time.time()
+        ep_loss = 0.0
+        for x, y in train_loader:
+            x = x.cuda(non_blocking=True); y = y.cuda(non_blocking=True)
+            logits = model(x)
+            loss = loss_fn(logits, y)
+            optim.zero_grad(); loss.backward(); optim.step(); sched.step()
+            ep_loss += loss.item()
+        train_loss = ep_loss / len(train_loader)
+        model.eval()
+        cm_old = np.zeros((NUM_CLASSES, NUM_CLASSES), dtype=np.int64)
+        tree_tp = tree_fn = 0
+        with torch.no_grad():
+            for x, y in old_val_loader:
+                x = x.cuda(); y = y.cuda()
+                logits = model(x)
+                logits[:, 7, :, :] = -1e9    # never predict background — that channel is untrained
+                preds = logits.argmax(1)
+                cm_old += confusion(preds.cpu().numpy(), y.cpu().numpy(), NUM_CLASSES)
+            for x, y in new_val_loader:
+                x = x.cuda(); y = y.cuda()
+                logits = model(x)
+                logits[:, 7, :, :] = -1e9
+                preds = logits.argmax(1).cpu().numpy()
+                ys = y.cpu().numpy()
+                tm = (ys == 0)
+                tree_tp += int(((preds == 0) & tm).sum())
+                tree_fn += int(((preds != 0) & tm).sum())
+        iou_old = iou_from_cm(cm_old)
+        miou_7 = float(np.nanmean(iou_old[:7]))
+        tree_old = float(iou_old[0])
+        ground_old = float(iou_old[1])
+        tree_recall_new = tree_tp / (tree_tp + tree_fn) if (tree_tp + tree_fn) > 0 else float("nan")
+        elapsed = time.time() - t0
+        log(f"epoch {epoch:02d}/{EPOCHS}  loss={train_loss:.4f}  "
+            f"mIoU(7)={miou_7:.3f}  tree_old={tree_old:.3f}  ground_old={ground_old:.3f}  "
+            f"tree_new_recall={tree_recall_new:.3f}  ({elapsed:.0f}s)")
+        log(f"  per-class IoU: " + ", ".join(f"{n}={v:.3f}" for n, v in zip(NAMES, iou_old)))
+        history.append({
+            "epoch": epoch, "loss": float(train_loss),
+            "miou_7": miou_7, "tree_iou_old": tree_old, "ground_iou_old": ground_old,
+            "tree_recall_new": float(tree_recall_new),
+            "per_class_iou": {n: float(v) for n, v in zip(NAMES, iou_old)},
+        })
+        torch.save({"model": model.state_dict(), "epoch": epoch,
+                    "tree_iou_old": tree_old, "miou_7": miou_7, "tree_recall_new": float(tree_recall_new)},
+                   OUT_DIR / "twinlite8_last.pt")
+        if tree_old > best_tree:
+            best_tree = tree_old
+            torch.save({"model": model.state_dict(), "epoch": epoch,
+                        "tree_iou_old": tree_old, "miou_7": miou_7, "tree_recall_new": float(tree_recall_new)},
+                       OUT_DIR / "twinlite8_best.pt")
+            log(f"  saved best (tree_old {tree_old:.3f})")
+        (OUT_DIR / "history.json").write_text(json.dumps(history, indent=2))
+    log(f"\n=== DONE ===  best tree_old IoU: {best_tree:.3f}")
+    # ─── FPS benchmark ───
+    log(f"\n=== FPS BENCHMARK (RTX 3080, batch=1, 640x360) ===")
+    model.eval()
+    x = torch.randn(1, 3, H_IN, W_IN, device=DEVICE)
+    with torch.no_grad():
+        for _ in range(20): model(x)
+        torch.cuda.synchronize()
+        t0 = time.time()
+        N = 200
+        for _ in range(N): model(x)
+        torch.cuda.synchronize()
+    fps = N / (time.time() - t0)
+    log(f"  TwinLiteNet8 @ 640x360 batch=1: {fps:.1f} FPS")
+    log(f"  Jetson Orin Nano estimate: ~{fps/4:.0f}-{fps/3:.0f} FPS")
+if __name__ == "__main__":
+    main()
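The per-class IoU and `mIoU(7)` values in the log come from `iou_from_cm` followed by `np.nanmean`. A small worked example in plain Python, using a made-up 3-class confusion matrix and hypothetical helper names, shows why a class with no ground-truth and no predicted pixels is reported as `nan` and excluded from the mean:

```python
import math

def iou_per_class(cm):
    """IoU per class from a square confusion matrix cm[true][pred]."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, truth differs
        fn = sum(cm[c]) - tp                       # truth c, predicted otherwise
        denom = tp + fp + fn
        ious.append(tp / denom if denom > 0 else float("nan"))
    return ious

def nanmean(values):
    """Mean over the entries that are not nan."""
    kept = [v for v in values if not math.isnan(v)]
    return sum(kept) / len(kept) if kept else float("nan")

# Made-up counts: class 2 never appears as truth or prediction.
cm = [[8, 2, 0],
      [1, 9, 0],
      [0, 0, 0]]

ious = iou_per_class(cm)
print(ious)           # class 0: 8/11, class 1: 9/12, class 2: nan
print(nanmean(ious))  # mean of the two defined IoUs
```

This matches the `background=nan` entries in the training log: background pixels are remapped to the ignore index and channel 7 is masked at evaluation, so that class has an empty row and column.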

training_log.txt ADDED Viewed

	@@ -0,0 +1,140 @@

+model: TwinLiteNet8  params: 0.437M
+input: 640x360  batch: 16  epochs: 60  LR: 0.0005
+classes: ['tree', 'ground', 'person', 'sky', 'road', 'mountain', 'building', 'background']
+weights: {'tree': 1.5, 'ground': 0.5, 'person': 1.5, 'sky': 1.0, 'road': 1.0, 'mountain': 1.0, 'building': 1.0, 'background': 0.0}
+train: 5457  old_val: 155  new_val: 31
+epoch 01/60  loss=1.3311  mIoU(7)=0.375  tree_old=0.818  ground_old=0.877  tree_new_recall=0.872  (94s)
+  per-class IoU: tree=0.818, ground=0.877, person=0.044, sky=0.779, road=0.001, mountain=0.090, building=0.012, background=nan
+  saved best (tree_old 0.818)
+epoch 02/60  loss=0.7963  mIoU(7)=0.426  tree_old=0.812  ground_old=0.859  tree_new_recall=0.865  (65s)
+  per-class IoU: tree=0.812, ground=0.859, person=0.066, sky=0.800, road=0.020, mountain=0.396, building=0.033, background=nan
+epoch 03/60  loss=0.5680  mIoU(7)=0.558  tree_old=0.850  ground_old=0.884  tree_new_recall=0.939  (65s)
+  per-class IoU: tree=0.850, ground=0.884, person=0.293, sky=0.819, road=0.645, mountain=0.393, building=0.023, background=nan
+  saved best (tree_old 0.850)
+epoch 04/60  loss=0.4366  mIoU(7)=0.596  tree_old=0.853  ground_old=0.892  tree_new_recall=0.967  (65s)
+  per-class IoU: tree=0.853, ground=0.892, person=0.372, sky=0.826, road=0.584, mountain=0.475, building=0.169, background=nan
+  saved best (tree_old 0.853)
+epoch 05/60  loss=0.3549  mIoU(7)=0.622  tree_old=0.836  ground_old=0.885  tree_new_recall=0.963  (66s)
+  per-class IoU: tree=0.836, ground=0.885, person=0.396, sky=0.824, road=0.618, mountain=0.485, building=0.310, background=nan
+epoch 06/60  loss=0.2965  mIoU(7)=0.620  tree_old=0.831  ground_old=0.889  tree_new_recall=0.967  (65s)
+  per-class IoU: tree=0.831, ground=0.889, person=0.346, sky=0.820, road=0.710, mountain=0.495, building=0.245, background=nan
+epoch 07/60  loss=0.2606  mIoU(7)=0.661  tree_old=0.860  ground_old=0.904  tree_new_recall=0.981  (66s)
+  per-class IoU: tree=0.860, ground=0.904, person=0.396, sky=0.830, road=0.643, mountain=0.552, building=0.439, background=nan
+  saved best (tree_old 0.860)
+epoch 08/60  loss=0.2286  mIoU(7)=0.644  tree_old=0.854  ground_old=0.902  tree_new_recall=0.991  (66s)
+  per-class IoU: tree=0.854, ground=0.902, person=0.412, sky=0.816, road=0.649, mountain=0.531, building=0.347, background=nan
+epoch 09/60  loss=0.2093  mIoU(7)=0.432  tree_old=0.770  ground_old=0.855  tree_new_recall=0.887  (65s)
+  per-class IoU: tree=0.770, ground=0.855, person=0.360, sky=0.435, road=0.118, mountain=0.349, building=0.136, background=nan
+epoch 10/60  loss=0.1984  mIoU(7)=0.661  tree_old=0.834  ground_old=0.883  tree_new_recall=0.990  (66s)
+  per-class IoU: tree=0.834, ground=0.883, person=0.404, sky=0.824, road=0.715, mountain=0.523, building=0.442, background=nan
+epoch 11/60  loss=0.1792  mIoU(7)=0.683  tree_old=0.855  ground_old=0.910  tree_new_recall=0.996  (65s)
+  per-class IoU: tree=0.855, ground=0.910, person=0.388, sky=0.825, road=0.764, mountain=0.538, building=0.503, background=nan
+epoch 12/60  loss=0.1724  mIoU(7)=0.669  tree_old=0.842  ground_old=0.882  tree_new_recall=0.995  (65s)
+  per-class IoU: tree=0.842, ground=0.882, person=0.398, sky=0.823, road=0.720, mountain=0.550, building=0.467, background=nan
+epoch 13/60  loss=0.1639  mIoU(7)=0.683  tree_old=0.834  ground_old=0.883  tree_new_recall=0.994  (66s)
+  per-class IoU: tree=0.834, ground=0.883, person=0.421, sky=0.827, road=0.740, mountain=0.560, building=0.513, background=nan
+epoch 14/60  loss=0.1574  mIoU(7)=0.674  tree_old=0.848  ground_old=0.895  tree_new_recall=0.999  (67s)
+  per-class IoU: tree=0.848, ground=0.895, person=0.436, sky=0.809, road=0.727, mountain=0.547, building=0.458, background=nan
+epoch 15/60  loss=0.1535  mIoU(7)=0.664  tree_old=0.847  ground_old=0.897  tree_new_recall=0.998  (65s)
+  per-class IoU: tree=0.847, ground=0.897, person=0.380, sky=0.794, road=0.686, mountain=0.559, building=0.483, background=nan
+epoch 16/60  loss=0.1486  mIoU(7)=0.685  tree_old=0.851  ground_old=0.893  tree_new_recall=0.997  (67s)
+  per-class IoU: tree=0.851, ground=0.893, person=0.496, sky=0.803, road=0.692, mountain=0.567, building=0.492, background=nan
+epoch 17/60  loss=0.1457  mIoU(7)=0.698  tree_old=0.856  ground_old=0.901  tree_new_recall=0.997  (68s)
+  per-class IoU: tree=0.856, ground=0.901, person=0.460, sky=0.826, road=0.720, mountain=0.570, building=0.550, background=nan
+epoch 18/60  loss=0.1392  mIoU(7)=0.672  tree_old=0.859  ground_old=0.908  tree_new_recall=0.998  (66s)
+  per-class IoU: tree=0.859, ground=0.908, person=0.417, sky=0.830, road=0.739, mountain=0.579, building=0.368, background=nan
+epoch 19/60  loss=0.1354  mIoU(7)=0.687  tree_old=0.862  ground_old=0.903  tree_new_recall=0.999  (65s)
+  per-class IoU: tree=0.862, ground=0.903, person=0.442, sky=0.821, road=0.694, mountain=0.581, building=0.505, background=nan
+  saved best (tree_old 0.862)
+epoch 20/60  loss=0.1325  mIoU(7)=0.702  tree_old=0.863  ground_old=0.909  tree_new_recall=0.995  (67s)
+  per-class IoU: tree=0.863, ground=0.909, person=0.411, sky=0.829, road=0.747, mountain=0.536, building=0.620, background=nan
+  saved best (tree_old 0.863)
+epoch 21/60  loss=0.1303  mIoU(7)=0.676  tree_old=0.859  ground_old=0.899  tree_new_recall=0.996  (66s)
+  per-class IoU: tree=0.859, ground=0.899, person=0.390, sky=0.825, road=0.689, mountain=0.595, building=0.473, background=nan
+epoch 22/60  loss=0.1275  mIoU(7)=0.713  tree_old=0.865  ground_old=0.907  tree_new_recall=0.998  (65s)
+  per-class IoU: tree=0.865, ground=0.907, person=0.490, sky=0.820, road=0.724, mountain=0.576, building=0.606, background=nan
+  saved best (tree_old 0.865)
+epoch 23/60  loss=0.1288  mIoU(7)=0.711  tree_old=0.864  ground_old=0.909  tree_new_recall=0.999  (67s)
+  per-class IoU: tree=0.864, ground=0.909, person=0.458, sky=0.827, road=0.728, mountain=0.577, building=0.611, background=nan
+epoch 24/60  loss=0.1230  mIoU(7)=0.696  tree_old=0.863  ground_old=0.912  tree_new_recall=0.999  (66s)
+  per-class IoU: tree=0.863, ground=0.912, person=0.431, sky=0.820, road=0.757, mountain=0.584, building=0.506, background=nan
+epoch 25/60  loss=0.1228  mIoU(7)=0.700  tree_old=0.857  ground_old=0.912  tree_new_recall=1.000  (65s)
+  per-class IoU: tree=0.857, ground=0.912, person=0.444, sky=0.824, road=0.802, mountain=0.571, building=0.494, background=nan
+epoch 26/60  loss=0.1200  mIoU(7)=0.695  tree_old=0.866  ground_old=0.913  tree_new_recall=0.999  (67s)
+  per-class IoU: tree=0.866, ground=0.913, person=0.422, sky=0.837, road=0.716, mountain=0.563, building=0.549, background=nan
+  saved best (tree_old 0.866)
+epoch 27/60  loss=0.1185  mIoU(7)=0.695  tree_old=0.862  ground_old=0.908  tree_new_recall=0.999  (66s)
+  per-class IoU: tree=0.862, ground=0.908, person=0.407, sky=0.828, road=0.732, mountain=0.578, building=0.550, background=nan
+epoch 28/60  loss=0.1161  mIoU(7)=0.714  tree_old=0.860  ground_old=0.910  tree_new_recall=0.998  (64s)
+  per-class IoU: tree=0.860, ground=0.910, person=0.458, sky=0.826, road=0.768, mountain=0.577, building=0.596, background=nan
+epoch 29/60  loss=0.1151  mIoU(7)=0.708  tree_old=0.872  ground_old=0.916  tree_new_recall=0.999  (66s)
+  per-class IoU: tree=0.872, ground=0.916, person=0.441, sky=0.835, road=0.745, mountain=0.592, building=0.555, background=nan
+  saved best (tree_old 0.872)
+epoch 30/60  loss=0.1131  mIoU(7)=0.698  tree_old=0.865  ground_old=0.910  tree_new_recall=0.999  (66s)
+  per-class IoU: tree=0.865, ground=0.910, person=0.467, sky=0.834, road=0.755, mountain=0.592, building=0.467, background=nan
+epoch 31/60  loss=0.1110  mIoU(7)=0.694  tree_old=0.855  ground_old=0.905  tree_new_recall=0.999  (65s)
+  per-class IoU: tree=0.855, ground=0.905, person=0.433, sky=0.812, road=0.772, mountain=0.586, building=0.498, background=nan
+epoch 32/60  loss=0.1115  mIoU(7)=0.719  tree_old=0.865  ground_old=0.916  tree_new_recall=0.998  (67s)
+  per-class IoU: tree=0.865, ground=0.916, person=0.466, sky=0.833, road=0.803, mountain=0.578, building=0.570, background=nan
+epoch 33/60  loss=0.1088  mIoU(7)=0.711  tree_old=0.869  ground_old=0.916  tree_new_recall=0.999  (69s)
+  per-class IoU: tree=0.869, ground=0.916, person=0.494, sky=0.838, road=0.761, mountain=0.595, building=0.502, background=nan
+epoch 34/60  loss=0.1071  mIoU(7)=0.702  tree_old=0.865  ground_old=0.910  tree_new_recall=0.999  (70s)
+  per-class IoU: tree=0.865, ground=0.910, person=0.455, sky=0.827, road=0.753, mountain=0.562, building=0.541, background=nan
+epoch 35/60  loss=0.1064  mIoU(7)=0.696  tree_old=0.861  ground_old=0.908  tree_new_recall=1.000  (65s)
+  per-class IoU: tree=0.861, ground=0.908, person=0.476, sky=0.816, road=0.754, mountain=0.578, building=0.478, background=nan
+epoch 36/60  loss=0.1052  mIoU(7)=0.704  tree_old=0.860  ground_old=0.910  tree_new_recall=0.999  (66s)
+  per-class IoU: tree=0.860, ground=0.910, person=0.477, sky=0.829, road=0.787, mountain=0.592, building=0.470, background=nan
+epoch 37/60  loss=0.1050  mIoU(7)=0.703  tree_old=0.860  ground_old=0.908  tree_new_recall=0.999  (64s)
+  per-class IoU: tree=0.860, ground=0.908, person=0.479, sky=0.827, road=0.769, mountain=0.593, building=0.488, background=nan
+epoch 38/60  loss=0.1034  mIoU(7)=0.704  tree_old=0.863  ground_old=0.911  tree_new_recall=0.999  (63s)
+  per-class IoU: tree=0.863, ground=0.911, person=0.441, sky=0.829, road=0.778, mountain=0.587, building=0.521, background=nan
+epoch 39/60  loss=0.1025  mIoU(7)=0.713  tree_old=0.865  ground_old=0.912  tree_new_recall=0.999  (64s)
+  per-class IoU: tree=0.865, ground=0.912, person=0.449, sky=0.842, road=0.760, mountain=0.597, building=0.565, background=nan
+epoch 40/60  loss=0.1010  mIoU(7)=0.713  tree_old=0.858  ground_old=0.909  tree_new_recall=0.999  (65s)
+  per-class IoU: tree=0.858, ground=0.909, person=0.477, sky=0.820, road=0.800, mountain=0.596, building=0.530, background=nan
+epoch 41/60  loss=0.0999  mIoU(7)=0.704  tree_old=0.862  ground_old=0.911  tree_new_recall=1.000  (66s)
+  per-class IoU: tree=0.862, ground=0.911, person=0.455, sky=0.815, road=0.788, mountain=0.581, building=0.514, background=nan
+epoch 42/60  loss=0.0992  mIoU(7)=0.713  tree_old=0.866  ground_old=0.916  tree_new_recall=0.999  (66s)
+  per-class IoU: tree=0.866, ground=0.916, person=0.453, sky=0.836, road=0.804, mountain=0.595, building=0.524, background=nan
+epoch 43/60  loss=0.0980  mIoU(7)=0.717  tree_old=0.860  ground_old=0.909  tree_new_recall=1.000  (65s)
+  per-class IoU: tree=0.860, ground=0.909, person=0.460, sky=0.822, road=0.814, mountain=0.591, building=0.559, background=nan
+epoch 44/60  loss=0.0975  mIoU(7)=0.704  tree_old=0.860  ground_old=0.910  tree_new_recall=0.999  (65s)
+  per-class IoU: tree=0.860, ground=0.910, person=0.444, sky=0.830, road=0.809, mountain=0.578, building=0.495, background=nan
+epoch 45/60  loss=0.0967  mIoU(7)=0.720  tree_old=0.861  ground_old=0.910  tree_new_recall=0.999  (66s)
+  per-class IoU: tree=0.861, ground=0.910, person=0.462, sky=0.830, road=0.809, mountain=0.598, building=0.574, background=nan
+epoch 46/60  loss=0.0958  mIoU(7)=0.715  tree_old=0.859  ground_old=0.904  tree_new_recall=0.999  (64s)
+  per-class IoU: tree=0.859, ground=0.904, person=0.457, sky=0.825, road=0.787, mountain=0.600, building=0.571, background=nan
+epoch 47/60  loss=0.0953  mIoU(7)=0.717  tree_old=0.863  ground_old=0.912  tree_new_recall=0.999  (65s)
+  per-class IoU: tree=0.863, ground=0.912, person=0.466, sky=0.818, road=0.799, mountain=0.601, building=0.560, background=nan
+epoch 48/60  loss=0.0942  mIoU(7)=0.717  tree_old=0.865  ground_old=0.914  tree_new_recall=0.999  (65s)
+  per-class IoU: tree=0.865, ground=0.914, person=0.468, sky=0.829, road=0.800, mountain=0.592, building=0.551, background=nan
+epoch 49/60  loss=0.0938  mIoU(7)=0.715  tree_old=0.863  ground_old=0.911  tree_new_recall=0.999  (66s)
+  per-class IoU: tree=0.863, ground=0.911, person=0.479, sky=0.825, road=0.794, mountain=0.594, building=0.535, background=nan
+epoch 50/60  loss=0.0934  mIoU(7)=0.718  tree_old=0.862  ground_old=0.912  tree_new_recall=0.999  (65s)
+  per-class IoU: tree=0.862, ground=0.912, person=0.469, sky=0.828, road=0.812, mountain=0.590, building=0.551, background=nan
+epoch 51/60  loss=0.0931  mIoU(7)=0.717  tree_old=0.861  ground_old=0.913  tree_new_recall=0.999  (64s)
+  per-class IoU: tree=0.861, ground=0.913, person=0.460, sky=0.821, road=0.818, mountain=0.593, building=0.551, background=nan
+epoch 52/60  loss=0.0926  mIoU(7)=0.715  tree_old=0.860  ground_old=0.910  tree_new_recall=0.999  (65s)
+  per-class IoU: tree=0.860, ground=0.910, person=0.460, sky=0.824, road=0.804, mountain=0.597, building=0.547, background=nan
+epoch 53/60  loss=0.0918  mIoU(7)=0.717  tree_old=0.860  ground_old=0.910  tree_new_recall=0.999  (64s)
+  per-class IoU: tree=0.860, ground=0.910, person=0.466, sky=0.820, road=0.806, mountain=0.595, building=0.561, background=nan
+epoch 54/60  loss=0.0916  mIoU(7)=0.714  tree_old=0.861  ground_old=0.911  tree_new_recall=1.000  (67s)
+  per-class IoU: tree=0.861, ground=0.911, person=0.471, sky=0.822, road=0.807, mountain=0.588, building=0.541, background=nan
+epoch 55/60  loss=0.0912  mIoU(7)=0.719  tree_old=0.863  ground_old=0.913  tree_new_recall=0.999  (69s)
+  per-class IoU: tree=0.863, ground=0.913, person=0.476, sky=0.826, road=0.808, mountain=0.593, building=0.554, background=nan
+epoch 56/60  loss=0.0911  mIoU(7)=0.717  tree_old=0.862  ground_old=0.913  tree_new_recall=0.999  (70s)
+  per-class IoU: tree=0.862, ground=0.913, person=0.475, sky=0.823, road=0.807, mountain=0.596, building=0.545, background=nan
+epoch 57/60  loss=0.0913  mIoU(7)=0.715  tree_old=0.860  ground_old=0.910  tree_new_recall=0.999  (68s)
+  per-class IoU: tree=0.860, ground=0.910, person=0.465, sky=0.825, road=0.802, mountain=0.593, building=0.548, background=nan
+epoch 58/60  loss=0.0911  mIoU(7)=0.718  tree_old=0.863  ground_old=0.913  tree_new_recall=0.999  (65s)
+  per-class IoU: tree=0.863, ground=0.913, person=0.470, sky=0.824, road=0.813, mountain=0.594, building=0.552, background=nan
+epoch 59/60  loss=0.0907  mIoU(7)=0.716  tree_old=0.862  ground_old=0.912  tree_new_recall=0.999  (64s)
+  per-class IoU: tree=0.862, ground=0.912, person=0.467, sky=0.824, road=0.803, mountain=0.596, building=0.549, background=nan
+epoch 60/60  loss=0.0903  mIoU(7)=0.718  tree_old=0.862  ground_old=0.914  tree_new_recall=0.999  (67s)
+  per-class IoU: tree=0.862, ground=0.914, person=0.466, sky=0.825, road=0.811, mountain=0.593, building=0.558, background=nan
+=== DONE ===  best tree_old IoU: 0.872
+=== FPS BENCHMARK (RTX 3080, batch=1, 640x360) ===
+  TwinLiteNet8 @ 640x360 batch=1: 137.1 FPS
+  Jetson Orin Nano estimate: ~34-46 FPS
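
Note on reading the log: in the epochs shown, the `mIoU(7)` value agrees with the plain average of the seven non-NaN per-class IoUs, with `background=nan` left out. For epoch 60, (0.862 + 0.914 + 0.466 + 0.825 + 0.811 + 0.593 + 0.558) / 7 ≈ 0.718, matching the logged value. The sketch below reproduces that check from one log line. It is only an illustration: `parse_per_class_iou` and `mean_iou` are hypothetical helper names, not functions from `train_8class.py`, and the averaging rule is inferred from the numbers rather than confirmed from the training code.

```python
import math
import re

# One "per-class IoU" line copied from the log above (epoch 60).
LINE = ("  per-class IoU: tree=0.862, ground=0.914, person=0.466, sky=0.825, "
        "road=0.811, mountain=0.593, building=0.558, background=nan")


def parse_per_class_iou(line):
    """Return {class_name: IoU} parsed from one 'per-class IoU:' log line."""
    # Keep only the text after the label so "IoU:" itself is never matched.
    _, _, payload = line.partition("per-class IoU:")
    pairs = re.findall(r"(\w+)=([0-9.]+|nan)", payload)
    return {name: float(value) for name, value in pairs}


def mean_iou(per_class):
    """Average the IoUs, skipping NaN entries (assumed to mean 'class absent')."""
    valid = [v for v in per_class.values() if not math.isnan(v)]
    return sum(valid) / len(valid)


if __name__ == "__main__":
    scores = parse_per_class_iou(LINE)
    print(len(scores), "classes parsed;", "mIoU =", round(mean_iou(scores), 3))
```

Note also that the "saved best" lines track `tree_old` rather than `mIoU(7)`, so the kept checkpoint (epoch 29, `tree_old` 0.872) is not the epoch with the highest mIoU in this range.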

twinlite8.onnx ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c499f31d388b377f6234db8b6417418846c73b003cc9b9fbc8369e854a823056
+size 1787561

twinlite8_best.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:933bbf0b34134823c2cf9fb9eafd71a8362e26b14fff5eca551bdf78f76badab
+size 1815544
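
Both files above are Git LFS pointers rather than the model weights themselves: each records the spec version, a SHA-256 object ID, and the byte size of the real file. A minimal sketch of checking a downloaded file against such a pointer follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the sketch assumes the real `twinlite8.onnx` has already been fetched to a local path.

```python
import hashlib

# Pointer text for twinlite8.onnx, copied from the diff above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:c499f31d388b377f6234db8b6417418846c73b003cc9b9fbc8369e854a823056
size 1787561
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and hash."""
    hasher = hashlib.new(pointer["algo"])
    total = 0
    with open(path, "rb") as handle:
        # Read in chunks so large checkpoints need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            total += len(chunk)
    return total == pointer["size"] and hasher.hexdigest() == pointer["digest"]


if __name__ == "__main__":
    info = parse_lfs_pointer(POINTER)
    print(info["algo"], info["size"], "bytes expected")
    # Example call once the real file is available locally:
    # print(matches_pointer("twinlite8.onnx", info))
```

A size or hash mismatch usually means the pointer file itself was downloaded in place of the LFS object.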