Initial release: TwinLiteNet8 (0.44M params, 7-class orchard semantic seg, edge-deployment ready)
- .gitattributes +27 -0
- JETSON_DEPLOY.md +68 -0
- README.md +160 -0
- demo_twinlite_12s.mp4 +3 -0
- export_onnx.py +69 -0
- history.json +1082 -0
- model/TwinLite.py +468 -0
- model/TwinLite_8class.py +26 -0
- model/__pycache__/TwinLite.cpython-311.pyc +0 -0
- model/__pycache__/TwinLite.cpython-38.pyc +0 -0
- model/__pycache__/TwinLite_8class.cpython-311.pyc +0 -0
- predict.py +103 -0
- predict_onnx.py +84 -0
- samples/0_frame_3884.jpg +3 -0
- samples/1_frame_2803.jpg +3 -0
- samples/2_frame_2626.jpg +3 -0
- samples/3_frame_4093.jpg +3 -0
- samples/4_frame_3138.jpg +3 -0
- samples/5_frame_3076.jpg +3 -0
- samples_20/sample_00_frame_3884.jpg +3 -0
- samples_20/sample_01_frame_2803.jpg +3 -0
- samples_20/sample_02_frame_2626.jpg +3 -0
- samples_20/sample_03_frame_4093.jpg +3 -0
- samples_20/sample_04_frame_3138.jpg +3 -0
- samples_20/sample_05_frame_3076.jpg +3 -0
- samples_20/sample_06_frame_3032.jpg +3 -0
- samples_20/sample_07_frame_2860.jpg +3 -0
- samples_20/sample_08_frame_4083.jpg +3 -0
- samples_20/sample_09_frame_2784.jpg +3 -0
- samples_20/sample_10_frame_3960.jpg +3 -0
- samples_20/sample_11_frame_4091.jpg +3 -0
- samples_20/sample_12_frame_4402.jpg +3 -0
- samples_20/sample_13_frame_3691.jpg +3 -0
- samples_20/sample_14_frame_2753.jpg +3 -0
- samples_20/sample_15_frame_3784.jpg +3 -0
- samples_20/sample_16_frame_3439.jpg +3 -0
- samples_20/sample_17_frame_2640.jpg +3 -0
- samples_20/sample_18_frame_2636.jpg +3 -0
- samples_20/sample_19_frame_2766.jpg +3 -0
- train_8class.py +247 -0
- training_log.txt +140 -0
- twinlite8.onnx +3 -0
- twinlite8_best.pt +3 -0
.gitattributes
CHANGED
@@ -33,3 +33,30 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+demo_twinlite_12s.mp4 filter=lfs diff=lfs merge=lfs -text
+samples/0_frame_3884.jpg filter=lfs diff=lfs merge=lfs -text
+samples/1_frame_2803.jpg filter=lfs diff=lfs merge=lfs -text
+samples/2_frame_2626.jpg filter=lfs diff=lfs merge=lfs -text
+samples/3_frame_4093.jpg filter=lfs diff=lfs merge=lfs -text
+samples/4_frame_3138.jpg filter=lfs diff=lfs merge=lfs -text
+samples/5_frame_3076.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_00_frame_3884.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_01_frame_2803.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_02_frame_2626.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_03_frame_4093.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_04_frame_3138.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_05_frame_3076.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_06_frame_3032.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_07_frame_2860.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_08_frame_4083.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_09_frame_2784.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_10_frame_3960.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_11_frame_4091.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_12_frame_4402.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_13_frame_3691.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_14_frame_2753.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_15_frame_3784.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_16_frame_3439.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_17_frame_2640.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_18_frame_2636.jpg filter=lfs diff=lfs merge=lfs -text
+samples_20/sample_19_frame_2766.jpg filter=lfs diff=lfs merge=lfs -text
JETSON_DEPLOY.md
ADDED
@@ -0,0 +1,68 @@
# TwinLiteNet8: Jetson Deployment Guide

Pipeline: PyTorch `.pt` → ONNX → TensorRT engine → fast inference on Jetson

## On a host machine (Linux/Win/Mac with PyTorch)

```bash
pip install onnx onnxruntime
python export_onnx.py --ckpt run_8class/twinlite8_best.pt --out twinlite8.onnx
# → produces twinlite8.onnx (~2 MB, fixed shape 1x3x360x640)
```

For dynamic batch / spatial dims (slightly slower at runtime, more flexible):
```bash
python export_onnx.py --ckpt ... --out twinlite8_dynamic.onnx --dynamic
```

## On the Jetson (Orin Nano / NX / AGX)

JetPack ships with `trtexec` (typically under `/usr/src/tensorrt/bin/`). Run **on the device**:

```bash
# FP16 (recommended: best speed/accuracy trade-off)
# Note: TensorRT 8.4+ deprecates --workspace in favor of --memPoolSize=workspace:2048
trtexec --onnx=twinlite8.onnx --saveEngine=twinlite8_fp16.engine \
        --fp16 --workspace=2048

# Or INT8 (faster but needs calibration data, e.g. a cache passed via --calib=<file>; small accuracy drop)
trtexec --onnx=twinlite8.onnx --saveEngine=twinlite8_int8.engine \
        --int8 --workspace=2048
```

Then in Python (Jetson):
```python
import onnxruntime as ort
sess = ort.InferenceSession("twinlite8.onnx",
                            providers=["TensorrtExecutionProvider"])
# OR load the pre-built .engine via TensorRT Python API
```
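
The session above names a single execution provider. A minimal sketch of choosing providers with a fallback order is shown here; the helper name `pick_providers` and the preference order are illustrative assumptions, not part of this repo's scripts.

```python
# Hypothetical helper (not from this repo): order execution providers by
# preference and keep only those the local onnxruntime build reports.
PREFERRED = [
    "TensorrtExecutionProvider",
    "CUDAExecutionProvider",
    "CPUExecutionProvider",
]


def pick_providers(available, preferred=PREFERRED):
    """Return the preferred providers that are available, in preference order."""
    chosen = [p for p in preferred if p in available]
    # CPU is used as a last resort if nothing on the preferred list is reported.
    return chosen or ["CPUExecutionProvider"]


if __name__ == "__main__":
    # Simulated availability list; on a real device this would come from
    # onnxruntime.get_available_providers().
    print(pick_providers(["CUDAExecutionProvider", "CPUExecutionProvider"]))
```

On a device, the result could be passed as `providers=pick_providers(ort.get_available_providers())` when constructing the `InferenceSession`.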

## Expected speeds (640×360, batch 1)

| Device | PyTorch | ONNX-CUDA | TensorRT FP16 | TensorRT INT8 |
|---|---|---|---|---|
| RTX 3080 (host) | ~150 FPS | ~250 FPS | ~400 FPS | ~600 FPS |
| RTX 5090 (host) | ~500 FPS | ~700 FPS | ~1200 FPS | n/a |
| Jetson Orin Nano | ~10 FPS | ~25 FPS | **~40 FPS** ← target | ~60 FPS |
| Jetson Orin NX | ~25 FPS | ~50 FPS | ~80 FPS | ~120 FPS |
| Jetson Nano (old) | ~3 FPS | ~8 FPS | ~15 FPS | ~25 FPS |

(Rough estimates; exact numbers depend on power mode and JetPack version.)

## Validating numerical parity

Always run after export to confirm ONNX matches PyTorch:
```bash
python -c "
import onnxruntime, torch, numpy as np
from model.TwinLite_8class import TwinLiteNet8
m = TwinLiteNet8().eval()
# map_location / weights_only mirror the load call in export_onnx.py
m.load_state_dict(torch.load('run_8class/twinlite8_best.pt', map_location='cpu', weights_only=False)['model'])
sess = onnxruntime.InferenceSession('twinlite8.onnx', providers=['CPUExecutionProvider'])
x = torch.randn(1,3,360,640)
torch_out = m(x).detach().numpy()
onnx_out = sess.run(None, {'input': x.numpy()})[0]
print('argmax agreement:', (torch_out.argmax(1) == onnx_out.argmax(1)).mean())
"
```
Should print **1.0**. Anything <0.999 means the export went wrong.
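
To make the 0.999 threshold concrete: a 360×640 output has 230,400 pixels, so 0.1% disagreement is about 230 pixels. The sketch below reports agreement together with the raw mismatch count and the largest logit difference. The function name `parity_report` is an illustrative assumption, and the arrays stand in for the `(B, 8, H, W)` logits produced by the two runtimes.

```python
import numpy as np


def parity_report(ref_logits, test_logits):
    """Compare two (B, C, H, W) logit arrays from different runtimes."""
    ref_cls = ref_logits.argmax(axis=1)
    test_cls = test_logits.argmax(axis=1)
    mismatches = int((ref_cls != test_cls).sum())
    return {
        "max_abs_diff": float(np.abs(ref_logits - test_logits).max()),
        "argmax_agreement": 1.0 - mismatches / ref_cls.size,
        "mismatched_pixels": mismatches,
    }


if __name__ == "__main__":
    # Toy stand-in for real outputs: class 0 wins everywhere in the reference.
    ref = np.zeros((1, 8, 2, 2), dtype=np.float32)
    ref[0, 0] = 1.0
    test = ref.copy()
    test[0, 3, 0, 0] = 5.0  # flip one pixel to class 3
    print(parity_report(ref, test))
```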
README.md
ADDED
@@ -0,0 +1,160 @@
---
license: apache-2.0
language: [en]
tags: [semantic-segmentation, twinlitenet, agriculture, orchard, real-time, edge-deployment, jetson]
pipeline_tag: image-segmentation
---

# TwinLiteNet8: Real-time orchard segmentation for edge devices

A **0.44 M-parameter** semantic-segmentation model adapted from [TwinLiteNet](https://github.com/chequanghuy/TwinLiteNet) for **7-class apple orchard scenes**, designed to run **>30 FPS on Jetson-class hardware** for robotic navigation.

Drop-in lightweight alternative to [WEN0256/Segformer85Mv1](https://huggingface.co/WEN0256/Segformer85Mv1) for low-compute deployments.

## Why "7-class" but 8 logit channels?

The model is trained to recognize **7 real classes** (`tree`, `ground`, `person`, `sky`, `road`, `mountain`, `building`). The 8th label, `background`, is **not** treated as a real class: pixels that fall outside any labeled object are masked out of the loss (`ignore_index=255`). The 8th logit channel exists only to keep the architecture identical to the original TwinLiteNet shape; it never appears as a training target and is forced to `-inf` before `argmax` at inference, so the model never outputs `background`.

This matches what you usually want from a robot's perception stack: "tell me what you DO recognize", not "tell me you don't know".
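
The loss-side masking described above can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the code in `train_8class.py`: the function name `masked_weighted_ce` is hypothetical, and the weighted-mean reduction is assumed to follow the usual convention of dividing by the sum of the weights of the contributing pixels.

```python
import numpy as np

IGNORE_INDEX = 255   # label value excluded from the loss
BACKGROUND_ID = 7    # raw label id that is remapped to IGNORE_INDEX


def masked_weighted_ce(logits, labels, class_weights):
    """Weighted cross-entropy for (C, H, W) logits and (H, W) integer labels.

    Pixels labeled BACKGROUND_ID or IGNORE_INDEX contribute nothing.
    """
    labels = np.where(labels == BACKGROUND_ID, IGNORE_INDEX, labels)
    valid = labels != IGNORE_INDEX
    if not valid.any():
        return 0.0
    # Numerically stable log-softmax over the channel axis.
    shifted = logits - logits.max(axis=0, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=0, keepdims=True))
    picked = labels[valid]                                  # (N,) class ids
    picked_logp = log_probs[:, valid][picked, np.arange(picked.size)]
    w = class_weights[picked]
    return float(-(w * picked_logp).sum() / w.sum())


if __name__ == "__main__":
    weights = np.array([1.5, 0.5, 1.5, 1.0, 1.0, 1.0, 1.0, 0.0])
    logits = np.zeros((8, 2, 2))
    labels = np.array([[0, 7], [1, 255]])
    print(masked_weighted_ce(logits, labels, weights))
```

With all-zero logits every class is equally likely, so the loss over the two valid pixels equals ln 8 regardless of what the logits are at the two ignored pixels.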

## Performance (no data leakage, temporal-split val, fair apples-to-apples)

| Metric | TwinLiteNet8 | Segformer-b5 (85 M) | Δ vs Segformer |
|---|---|---|---|
| Tree IoU | **0.872** | 0.742 | **+13 pp** ⭐ |
| Ground IoU | **0.916** | 0.851 | **+6.5 pp** |
| Person IoU | 0.441 | 0.72 | -28 pp |
| Sky IoU | 0.835 | 0.77 | +6 pp |
| Road IoU | 0.745 | 0.80 | -5 pp |
| Mountain IoU | 0.592 | 0.44 | +15 pp |
| Building IoU | 0.555 | 0.71 | -16 pp |
| **mIoU (7 classes)** | **0.708** | 0.714 | -0.6 pp |
| Model size | **1.8 MB** | 339 MB | **188× smaller** |
| Params | **0.437 M** | 85 M | **194× fewer** |

(Segformer numbers come from `WEN0256/Segformer85Mv1`. Both models were evaluated on the same 155-frame temporal-split validation set from the original orchard recording, with the same "background pixels excluded" protocol, so the IoUs are directly comparable.)
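
A minimal sketch of per-class IoU under a "background pixels excluded" protocol is given below. The function names and the choice to score a single frame are assumptions for illustration; an evaluation over a whole validation set would more likely accumulate intersections and unions across frames before dividing.

```python
import numpy as np


def per_class_iou(pred, target, num_classes=7, ignore_index=255):
    """IoU per class, skipping pixels whose target equals ignore_index."""
    valid = target != ignore_index
    pred, target = pred[valid], target[valid]
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        # A class absent from both prediction and target has no defined IoU.
        ious.append(inter / union if union > 0 else float("nan"))
    return ious


def mean_iou(ious):
    """Mean over the classes that have a defined IoU."""
    return float(np.nanmean(ious))


if __name__ == "__main__":
    pred = np.array([[0, 0], [1, 1]])
    target = np.array([[0, 1], [1, 255]])   # last pixel is excluded
    ious = per_class_iou(pred, target)
    print(ious[:2], mean_iou(ious))
```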

**Headline:** TwinLiteNet8 *matches* Segformer-b5 in overall mIoU (0.708 vs 0.714, a 0.6 pp gap) and *beats it* on the two classes that matter most for orchard navigation (`tree`, `ground`), while being ~190× smaller and an estimated ~3–10× faster on Jetson-class devices. The trade-off is on rare classes (`person`, `building`), where the small model's limited capacity shows.
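
The summary figures can be re-derived from the table with a few lines of arithmetic. The snippet uses only the values printed in the Performance table; the variable names are illustrative.

```python
# Per-class IoUs for TwinLiteNet8, copied from the Performance table.
twinlite_iou = {
    "tree": 0.872, "ground": 0.916, "person": 0.441, "sky": 0.835,
    "road": 0.745, "mountain": 0.592, "building": 0.555,
}

# mIoU is the unweighted mean over the 7 real classes.
miou = sum(twinlite_iou.values()) / len(twinlite_iou)

# Size and parameter ratios versus the Segformer baseline.
size_ratio = 339 / 1.8      # MB / MB
param_ratio = 85 / 0.437    # millions / millions

print(f"mIoU = {miou:.3f}")
print(f"size ratio = {size_ratio:.1f}x, param ratio = {param_ratio:.1f}x")
```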

### FPS (640×360 input, batch 1)

| Device | TwinLiteNet8 | Segformer-b5 | Speedup |
|---|---|---|---|
| RTX 3080 (PyTorch fp32) | **137 FPS** | ~50 | 2.7× |
| RTX 5090 (PyTorch fp32) | ~500 FPS | ~150 | 3.3× |
| **Jetson Orin Nano (TRT FP16, est)** | **~34–46 FPS** ⭐ | ~2–5 | **~10×** |
| Jetson Orin NX (TRT FP16, est) | ~60–80 FPS | ~20 | ~3× |

The target was **10–20 FPS** on Orin Nano; the estimated TwinLiteNet8 throughput is roughly double that.
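
Since the Jetson rows are estimates, it may help to measure throughput on the target device. Below is a small, dependency-free timing sketch. `measure_fps` and `fps_to_latency_ms` are hypothetical helper names, and `infer` stands for any zero-argument callable that runs one forward pass. For GPU backends that execute asynchronously, the callable would need to block until the result is ready, otherwise the number is optimistic.

```python
import time


def measure_fps(infer, warmup=10, iters=100):
    """Average frames per second of `infer` after a warm-up phase."""
    for _ in range(warmup):
        infer()                      # warm caches / lazy initialisation
    start = time.perf_counter()
    for _ in range(iters):
        infer()
    elapsed = time.perf_counter() - start
    return iters / elapsed


def fps_to_latency_ms(fps):
    """Per-frame latency budget implied by a throughput figure."""
    return 1000.0 / fps


if __name__ == "__main__":
    # Dummy workload standing in for a model call.
    fps = measure_fps(lambda: sum(range(1000)), warmup=5, iters=200)
    print(f"{fps:.0f} FPS, {fps_to_latency_ms(fps):.3f} ms/frame")
```

For example, the ~40 FPS estimate corresponds to a 25 ms per-frame budget.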

## Files

| File | Purpose |
|---|---|
| `twinlite8_best.pt` | PyTorch checkpoint (1.8 MB), epoch 29, best tree IoU 0.872 |
| `twinlite8.onnx` | ONNX export (1.8 MB), 100% argmax parity verified |
| `predict.py` | PyTorch inference (matches Segformer's API) |
| `predict_onnx.py` | ONNX-Runtime inference (CPU/CUDA/TensorRT auto-pick) |
| `export_onnx.py` | Re-export ONNX from any checkpoint |
| `train_8class.py` | Full training script (60 epochs, ~70 min on RTX 3080) |
| `model/` | TwinLiteNet8 architecture (single-branch 8-output head, channel 7 = unused) |
| `JETSON_DEPLOY.md` | Step-by-step Jetson deployment + FPS table |
| `samples_20/` | 20 out-of-distribution (OOD) inference samples (original ‖ prediction overlay) |
| `demo_twinlite_12s.mp4` | 12-s demo video (360 frames @ 30 FPS, original ‖ overlay) |
| `samples/` | 6 in-domain validation samples |
| `training_log.txt` + `history.json` | Per-epoch metrics |
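
As a sketch of how the per-epoch metrics might be inspected, the snippet below picks the best epoch by a chosen key. The field names (`epoch`, `miou_7`, `tree_iou_old`, `per_class_iou`) are the ones that appear in `history.json`; the helper `best_epoch`, the inline sample (values rounded from the first two epochs), and the choice of `tree_iou_old` as the default criterion are assumptions. Note that the file stores `background` as `NaN`, which Python's `json` module accepts by default even though it is not standard JSON.

```python
import json


def best_epoch(history, key="tree_iou_old"):
    """Return the record with the highest value for `key`."""
    return max(history, key=lambda rec: rec[key])


# Rounded excerpt in the same shape as history.json (illustrative only).
SAMPLE = """
[
  {"epoch": 1, "miou_7": 0.3745, "tree_iou_old": 0.8176,
   "per_class_iou": {"tree": 0.8176, "background": NaN}},
  {"epoch": 2, "miou_7": 0.4264, "tree_iou_old": 0.8121,
   "per_class_iou": {"tree": 0.8121, "background": NaN}}
]
"""

if __name__ == "__main__":
    history = json.loads(SAMPLE)   # for the real file: json.load(open("history.json"))
    print(best_epoch(history)["epoch"], best_epoch(history, "miou_7")["epoch"])
```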

## Quick Use (PyTorch)

```python
import sys, cv2, torch
sys.path.insert(0, "<this_dir>")
from predict import load_model, predict, overlay

model = load_model("twinlite8_best.pt", device="cuda")
img  = cv2.imread("orchard.jpg")
mask = predict(model, img)        # H×W uint8, values 0..6 (never 7)
viz  = overlay(img, mask)
cv2.imwrite("out.jpg", viz)
```

## Quick Use (ONNX, no PyTorch)

```python
import onnxruntime as ort, cv2, numpy as np
# Fall back to CPU if the CUDA provider is not available in this build
sess = ort.InferenceSession("twinlite8.onnx", providers=["CUDAExecutionProvider", "CPUExecutionProvider"])
img = cv2.imread("orchard.jpg")
inp = cv2.resize(img, (640, 360))
rgb = cv2.cvtColor(inp, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
x = np.ascontiguousarray(rgb.transpose(2, 0, 1)[None])   # 1×3×360×640 (NCHW)
logits = sess.run(None, {"input": x})[0]
logits[:, 7, :, :] = -1e9   # mask the unused background channel
mask = logits.argmax(1)[0].astype(np.uint8)   # 360×640 uint8, values 0..6
```

## Classes (id → name)

| ID | Class | Color (BGR) |
|---|---|---|
| 0 | **tree** (priority) | green |
| 1 | ground | brown |
| 2 | person | red |
| 3 | sky | cyan |
| 4 | road | gray |
| 5 | mountain | purple |
| 6 | building | yellow |
| 7 | (unused, never output) | n/a |

## Architecture

Single-branch 8-output adaptation of [TwinLiteNet](https://github.com/chequanghuy/TwinLiteNet):

- **Encoder**: ESPNet (`ESPNet_Encoder`, p = 2, q = 3)
- **Decoder**: 3 × `UPx2` upsampling blocks
- **Head**: 8-channel softmax (7 real classes; channel 7 untrained, masked at inference)
- **Input**: 640×360 BGR → ImageNet-style normalize
- **Output**: (B, 8, H, W) logits

The original TwinLiteNet has two parallel decoder heads for two binary tasks (drivable area + lane lines). For multi-class semantic segmentation matching the Segformer setup, we kept one decoder branch and changed its final `UPx2` to output 8 channels. Final param count: **0.437 M**.

## Training Recipe

| Hyperparameter | Value |
|---|---|
| Optimizer | AdamW, weight_decay 1e-4 |
| LR | 5e-4, cosine schedule |
| Epochs | 60 |
| Batch | 16 |
| Resolution | 640×360 |
| Loss | weighted cross-entropy with `ignore_index=255` |
| Class weights | tree 1.5, ground 0.5, person 1.5, sky 1.0, road 1.0, mountain 1.0, building 1.0, **background 0.0** |
| Background handling | mask pixels remapped 7 → 255 so they never contribute to loss |
| Augmentation | hflip + HSV jitter |
| Hardware | RTX 3080, ~70 minutes total |
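
For reference, a cosine schedule with the base LR and epoch count from the table can be written out as below. The table does not say whether the schedule steps per epoch or per iteration, whether there is a warm-up, or what the floor LR is; this sketch assumes per-epoch stepping, no warm-up, and a floor of 0, and `cosine_lr` is an illustrative name.

```python
import math


def cosine_lr(epoch, total_epochs=60, base_lr=5e-4, min_lr=0.0):
    """Cosine-annealed learning rate at the start of `epoch` (0-indexed)."""
    progress = epoch / total_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for e in (0, 15, 30, 45, 60):
        print(e, f"{cosine_lr(e):.2e}")
```

Under these assumptions the rate starts at 5e-4, passes through 2.5e-4 at the midpoint (epoch 30), and reaches 0 at epoch 60.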

## Dataset

Same dataset as [WEN0256/Segformer85Mv1](https://huggingface.co/WEN0256/Segformer85Mv1) v2:
- ~5300 frames from `oak_0415_oneRadar_1` (spring 2024 Korean apple orchard, single OAK-D camera)
- 311 frames from "Orchard Navigation" (Sep autumn capture + Aug Windows-webcam capture)
- Pseudo-mask labels generated by Segformer v1 to fill SAM-annotated gaps
- Temporal split: frames `≤ 4500` → train, frames `> 4500` → val (155 frames). No neighbor leakage.
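
The temporal split rule can be sketched as a filter on the frame index. The `frame_<n>` filename pattern is an assumption modeled on the sample image names in this repo, and `frame_index` / `temporal_split` are hypothetical helpers rather than functions from `train_8class.py`.

```python
import re

FRAME_RE = re.compile(r"frame_(\d+)")   # assumed naming pattern


def frame_index(name):
    """Extract the integer frame index from a filename like 'frame_3884.jpg'."""
    match = FRAME_RE.search(name)
    if match is None:
        raise ValueError(f"no frame index in {name!r}")
    return int(match.group(1))


def temporal_split(names, cutoff=4500):
    """Frames at or before `cutoff` go to train, later frames to val."""
    train = [n for n in names if frame_index(n) <= cutoff]
    val = [n for n in names if frame_index(n) > cutoff]
    return train, val


if __name__ == "__main__":
    names = ["frame_4499.jpg", "frame_4500.jpg", "frame_4501.jpg", "frame_5000.jpg"]
    print(temporal_split(names))
```

Splitting at a single cutoff, rather than sampling frames at random, keeps near-duplicate neighboring frames from landing on both sides of the split.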

## Limitations (same as parent Segformer model)

- Trained on a single Korean apple orchard, spring + partial autumn
- ❌ Different orchards (different tree species/layouts): likely degraded
- ❌ Winter (no leaves), night, rain: no training data
- ❌ Aerial/drone perspectives: robot-eye view only
- For a new deployment, plan to fine-tune on 100–300 in-domain frames (~13 min on a single GPU)

## Deployment to Jetson

See `JETSON_DEPLOY.md` for the full pipeline:
1. Export to ONNX (this repo already has `twinlite8.onnx`)
2. On Jetson: `trtexec --onnx=twinlite8.onnx --saveEngine=...engine --fp16`
3. Run via `predict_onnx.py --provider TensorrtExecutionProvider` or load the `.engine` via TRT API

## License

Apache 2.0
demo_twinlite_12s.mp4
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b6315c2fd2bb7cbdd9b59c2882bea5d8d1f8abdc96b6dbe5245a744e4f1d3034
size 68107137
export_onnx.py
ADDED
@@ -0,0 +1,69 @@
"""Export TwinLiteNet8 to ONNX for cross-platform deployment.

Usage:
    python export_onnx.py --ckpt run_8class/twinlite8_best.pt --out twinlite8.onnx
    python export_onnx.py --ckpt run_8class/twinlite8_best.pt --out twinlite8_dynamic.onnx --dynamic
"""
import argparse, sys, os
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
from pathlib import Path
import numpy as np, torch
from model.TwinLite_8class import TwinLiteNet8


def main():
    ap = argparse.ArgumentParser()
    ap.add_argument("--ckpt", required=True)
    ap.add_argument("--out", required=True)
    ap.add_argument("--height", type=int, default=360)
    ap.add_argument("--width", type=int, default=640)
    ap.add_argument("--dynamic", action="store_true",
                    help="Allow dynamic batch + spatial dims (slightly slower at runtime)")
    ap.add_argument("--opset", type=int, default=17)
    args = ap.parse_args()

    print(f"loading ckpt: {args.ckpt}")
    model = TwinLiteNet8(num_classes=8).eval()
    ckpt = torch.load(args.ckpt, map_location="cpu", weights_only=False)
    model.load_state_dict(ckpt["model"])
    print(f" epoch {ckpt['epoch']} tree IoU {ckpt.get('tree_iou_old','?')}")

    # Dummy input used for tracing and for the parity check below
    dummy = torch.randn(1, 3, args.height, args.width)

    if args.dynamic:
        dyn = {"input": {0: "batch", 2: "height", 3: "width"},
               "output": {0: "batch", 2: "height", 3: "width"}}
    else:
        dyn = None

    print(f"exporting to ONNX (opset {args.opset}) ...")
    torch.onnx.export(
        model, dummy, args.out,
        input_names=["input"], output_names=["output"],
        dynamic_axes=dyn,
        opset_version=args.opset,
        do_constant_folding=True,
    )

    sz = os.path.getsize(args.out) / 1e6
    print(f" saved: {args.out} ({sz:.2f} MB)")

    # Validate ONNX numerical parity vs PyTorch
    try:
        import onnxruntime as ort
        sess = ort.InferenceSession(args.out, providers=["CPUExecutionProvider"])
        with torch.no_grad():
            torch_out = model(dummy).numpy()
        onnx_out = sess.run(None, {"input": dummy.numpy()})[0]
        diff = np.abs(torch_out - onnx_out)
        argmax_match = (torch_out.argmax(1) == onnx_out.argmax(1)).mean()
        print(f" parity: max_abs_diff={diff.max():.6f} mean={diff.mean():.6f}")
        print(f" argmax agreement: {100*argmax_match:.4f}% (must be ~100% for safe deploy)")
        assert argmax_match > 0.999, "argmax disagreement > 0.1%; investigate"
        print(" PARITY OK")
    except ImportError:
        print(" (skip parity check: onnxruntime not installed; pip install onnxruntime)")


if __name__ == "__main__":
    main()
history.json
ADDED
@@ -0,0 +1,1082 @@
[
  {
    "epoch": 1,
    "loss": 1.3310781220886365,
    "miou_7": 0.37451867570160186,
    "tree_iou_old": 0.8175889982052515,
    "ground_iou_old": 0.8771987595010953,
    "tree_recall_new": 0.8720591858755958,
    "per_class_iou": {
      "tree": 0.8175889982052515,
      "ground": 0.8771987595010953,
      "person": 0.04389061394240307,
      "sky": 0.7793865097876477,
      "road": 0.001261266546266693,
      "mountain": 0.09015012685328469,
      "building": 0.012154455075264126,
      "background": NaN
    }
  },
  {
    "epoch": 2,
    "loss": 0.7962898902179908,
    "miou_7": 0.4263919160239557,
    "tree_iou_old": 0.8120707191596923,
    "ground_iou_old": 0.8586761495775145,
    "tree_recall_new": 0.864925755683379,
    "per_class_iou": {
      "tree": 0.8120707191596923,
      "ground": 0.8586761495775145,
      "person": 0.06558714109864322,
      "sky": 0.7999256871820133,
      "road": 0.019873436446123636,
      "mountain": 0.39602489908191335,
      "building": 0.032585379621789444,
      "background": NaN
    }
  },
  {
    "epoch": 3,
    "loss": 0.5679835767165656,
    "miou_7": 0.5584064801846481,
    "tree_iou_old": 0.8504976314678695,
    "ground_iou_old": 0.8843725580277102,
    "tree_recall_new": 0.9388755306705637,
    "per_class_iou": {
      "tree": 0.8504976314678695,
      "ground": 0.8843725580277102,
      "person": 0.29319772033339875,
      "sky": 0.8191901750544341,
      "road": 0.6448235890748388,
      "mountain": 0.39345474488848675,
      "building": 0.023308942445799074,
      "background": NaN
    }
  },
  {
    "epoch": 4,
    "loss": 0.4366068317393753,
    "miou_7": 0.5958384978496121,
    "tree_iou_old": 0.8534589788600526,
    "ground_iou_old": 0.8922631824167708,
    "tree_recall_new": 0.9672329512788488,
    "per_class_iou": {
      "tree": 0.8534589788600526,
      "ground": 0.8922631824167708,
      "person": 0.37156245263636517,
      "sky": 0.8260741562564967,
      "road": 0.5835668689224565,
      "mountain": 0.47504755441513025,
      "building": 0.16889629144001275,
      "background": NaN
    }
  },
  {
    "epoch": 5,
    "loss": 0.3549347817023828,
    "miou_7": 0.6220458616228167,
    "tree_iou_old": 0.8363749004074456,
    "ground_iou_old": 0.8852004546947622,
    "tree_recall_new": 0.9633982457780006,
    "per_class_iou": {
      "tree": 0.8363749004074456,
      "ground": 0.8852004546947622,
      "person": 0.3957433003440445,
      "sky": 0.8244260552327038,
      "road": 0.6178698597724389,
      "mountain": 0.4848031930793919,
      "building": 0.30990326782893096,
      "background": NaN
    }
  },
  {
    "epoch": 6,
    "loss": 0.2965410981010482,
    "miou_7": 0.6195229709106116,
    "tree_iou_old": 0.8307015597161879,
    "ground_iou_old": 0.8889140076510883,
    "tree_recall_new": 0.967347652588143,
    "per_class_iou": {
      "tree": 0.8307015597161879,
      "ground": 0.8889140076510883,
      "person": 0.34606393858495704,
      "sky": 0.8203392783845447,
      "road": 0.7104870880331203,
      "mountain": 0.49466764013384806,
      "building": 0.2454872838705348,
      "background": NaN
    }
  },
|
| 110 |
+
{
|
| 111 |
+
"epoch": 7,
|
| 112 |
+
"loss": 0.2605974777790109,
|
| 113 |
+
"miou_7": 0.6608496131211873,
|
| 114 |
+
"tree_iou_old": 0.8600555277203275,
|
| 115 |
+
"ground_iou_old": 0.9041158609890793,
|
| 116 |
+
"tree_recall_new": 0.9808632901999768,
|
| 117 |
+
"per_class_iou": {
|
| 118 |
+
"tree": 0.8600555277203275,
|
| 119 |
+
"ground": 0.9041158609890793,
|
| 120 |
+
"person": 0.39638474697269704,
|
| 121 |
+
"sky": 0.8303220440766499,
|
| 122 |
+
"road": 0.6432374151811943,
|
| 123 |
+
"mountain": 0.5524623861151251,
|
| 124 |
+
"building": 0.439369310793238,
|
| 125 |
+
"background": NaN
|
| 126 |
+
}
|
| 127 |
+
},
|
| 128 |
+
{
|
| 129 |
+
"epoch": 8,
|
| 130 |
+
"loss": 0.22860906262201997,
|
| 131 |
+
"miou_7": 0.6444995447460438,
|
| 132 |
+
"tree_iou_old": 0.8542548970213149,
|
| 133 |
+
"ground_iou_old": 0.9016265876093191,
|
| 134 |
+
"tree_recall_new": 0.9905392660815484,
|
| 135 |
+
"per_class_iou": {
|
| 136 |
+
"tree": 0.8542548970213149,
|
| 137 |
+
"ground": 0.9016265876093191,
|
| 138 |
+
"person": 0.4122549495341615,
|
| 139 |
+
"sky": 0.8160172675420245,
|
| 140 |
+
"road": 0.6488806985459417,
|
| 141 |
+
"mountain": 0.5314521835332935,
|
| 142 |
+
"building": 0.3470102294362516,
|
| 143 |
+
"background": NaN
|
| 144 |
+
}
|
| 145 |
+
},
|
| 146 |
+
{
|
| 147 |
+
"epoch": 9,
|
| 148 |
+
"loss": 0.20931702563839574,
|
| 149 |
+
"miou_7": 0.43161408360048903,
|
| 150 |
+
"tree_iou_old": 0.7695186615335882,
|
| 151 |
+
"ground_iou_old": 0.8546917001099084,
|
| 152 |
+
"tree_recall_new": 0.8871473642771976,
|
| 153 |
+
"per_class_iou": {
|
| 154 |
+
"tree": 0.7695186615335882,
|
| 155 |
+
"ground": 0.8546917001099084,
|
| 156 |
+
"person": 0.359624677355642,
|
| 157 |
+
"sky": 0.43488592691950145,
|
| 158 |
+
"road": 0.11753312041637094,
|
| 159 |
+
"mountain": 0.3491913165177679,
|
| 160 |
+
"building": 0.13585318235064428,
|
| 161 |
+
"background": NaN
|
| 162 |
+
}
|
| 163 |
+
},
|
| 164 |
+
{
|
| 165 |
+
"epoch": 10,
|
| 166 |
+
"loss": 0.19840111109343442,
|
| 167 |
+
"miou_7": 0.6606398486088663,
|
| 168 |
+
"tree_iou_old": 0.8338563413583029,
|
| 169 |
+
"ground_iou_old": 0.8831831425045409,
|
| 170 |
+
"tree_recall_new": 0.9901880818259315,
|
| 171 |
+
"per_class_iou": {
|
| 172 |
+
"tree": 0.8338563413583029,
|
| 173 |
+
"ground": 0.8831831425045409,
|
| 174 |
+
"person": 0.40429846068091946,
|
| 175 |
+
"sky": 0.8237072702221991,
|
| 176 |
+
"road": 0.7148262084503572,
|
| 177 |
+
"mountain": 0.5229702347017845,
|
| 178 |
+
"building": 0.44163728234396016,
|
| 179 |
+
"background": NaN
|
| 180 |
+
}
|
| 181 |
+
},
|
| 182 |
+
{
|
| 183 |
+
"epoch": 11,
|
| 184 |
+
"loss": 0.17923572080488429,
|
| 185 |
+
"miou_7": 0.6830302547462325,
|
| 186 |
+
"tree_iou_old": 0.8548939555986949,
|
| 187 |
+
"ground_iou_old": 0.9095141301915333,
|
| 188 |
+
"tree_recall_new": 0.9958006576208399,
|
| 189 |
+
"per_class_iou": {
|
| 190 |
+
"tree": 0.8548939555986949,
|
| 191 |
+
"ground": 0.9095141301915333,
|
| 192 |
+
"person": 0.3875322179022323,
|
| 193 |
+
"sky": 0.8246677405547581,
|
| 194 |
+
"road": 0.7644137551908834,
|
| 195 |
+
"mountain": 0.5375454528224748,
|
| 196 |
+
"building": 0.5026445309630501,
|
| 197 |
+
"background": NaN
|
| 198 |
+
}
|
| 199 |
+
},
|
| 200 |
+
{
|
| 201 |
+
"epoch": 12,
|
| 202 |
+
"loss": 0.1724150039452262,
|
| 203 |
+
"miou_7": 0.668918280520857,
|
| 204 |
+
"tree_iou_old": 0.8423760996137923,
|
| 205 |
+
"ground_iou_old": 0.8821698496521413,
|
| 206 |
+
"tree_recall_new": 0.9945828412505558,
|
| 207 |
+
"per_class_iou": {
|
| 208 |
+
"tree": 0.8423760996137923,
|
| 209 |
+
"ground": 0.8821698496521413,
|
| 210 |
+
"person": 0.39795968294353656,
|
| 211 |
+
"sky": 0.8225902005034851,
|
| 212 |
+
"road": 0.7203922138904981,
|
| 213 |
+
"mountain": 0.5495641818156543,
|
| 214 |
+
"building": 0.4673757352268918,
|
| 215 |
+
"background": NaN
|
| 216 |
+
}
|
| 217 |
+
},
|
| 218 |
+
{
|
| 219 |
+
"epoch": 13,
|
| 220 |
+
"loss": 0.16389323436706996,
|
| 221 |
+
"miou_7": 0.6825473779097335,
|
| 222 |
+
"tree_iou_old": 0.8337379023047351,
|
| 223 |
+
"ground_iou_old": 0.8829405113369486,
|
| 224 |
+
"tree_recall_new": 0.9935130037299167,
|
| 225 |
+
"per_class_iou": {
|
| 226 |
+
"tree": 0.8337379023047351,
|
| 227 |
+
"ground": 0.8829405113369486,
|
| 228 |
+
"person": 0.42147440680495446,
|
| 229 |
+
"sky": 0.8271820064942336,
|
| 230 |
+
"road": 0.7398392373685672,
|
| 231 |
+
"mountain": 0.5597518680318201,
|
| 232 |
+
"building": 0.5129057130268762,
|
| 233 |
+
"background": NaN
|
| 234 |
+
}
|
| 235 |
+
},
|
| 236 |
+
{
|
| 237 |
+
"epoch": 14,
|
| 238 |
+
"loss": 0.15743425162918756,
|
| 239 |
+
"miou_7": 0.6744691063520801,
|
| 240 |
+
"tree_iou_old": 0.8481754467002482,
|
| 241 |
+
"ground_iou_old": 0.8952094538543592,
|
| 242 |
+
"tree_recall_new": 0.9989067973978379,
|
| 243 |
+
"per_class_iou": {
|
| 244 |
+
"tree": 0.8481754467002482,
|
| 245 |
+
"ground": 0.8952094538543592,
|
| 246 |
+
"person": 0.4356588273855861,
|
| 247 |
+
"sky": 0.8093282743850084,
|
| 248 |
+
"road": 0.7274880284020421,
|
| 249 |
+
"mountain": 0.547309004129935,
|
| 250 |
+
"building": 0.45811470960738193,
|
| 251 |
+
"background": NaN
|
| 252 |
+
}
|
| 253 |
+
},
|
| 254 |
+
{
|
| 255 |
+
"epoch": 15,
|
| 256 |
+
"loss": 0.15353474125834154,
|
| 257 |
+
"miou_7": 0.6638122326903052,
|
| 258 |
+
"tree_iou_old": 0.8468084387339117,
|
| 259 |
+
"ground_iou_old": 0.8968202477703683,
|
| 260 |
+
"tree_recall_new": 0.9984826857665587,
|
| 261 |
+
"per_class_iou": {
|
| 262 |
+
"tree": 0.8468084387339117,
|
| 263 |
+
"ground": 0.8968202477703683,
|
| 264 |
+
"person": 0.3803193694656955,
|
| 265 |
+
"sky": 0.7937484883274253,
|
| 266 |
+
"road": 0.6863063433123712,
|
| 267 |
+
"mountain": 0.5594115155198771,
|
| 268 |
+
"building": 0.48327122570248765,
|
| 269 |
+
"background": NaN
|
| 270 |
+
}
|
| 271 |
+
},
|
| 272 |
+
{
|
| 273 |
+
"epoch": 16,
|
| 274 |
+
"loss": 0.14856905372174253,
|
| 275 |
+
"miou_7": 0.6851351311052538,
|
| 276 |
+
"tree_iou_old": 0.8512448354850919,
|
| 277 |
+
"ground_iou_old": 0.8934746447462905,
|
| 278 |
+
"tree_recall_new": 0.9969179333372983,
|
| 279 |
+
"per_class_iou": {
|
| 280 |
+
"tree": 0.8512448354850919,
|
| 281 |
+
"ground": 0.8934746447462905,
|
| 282 |
+
"person": 0.4963661975950126,
|
| 283 |
+
"sky": 0.8032382135701676,
|
| 284 |
+
"road": 0.6923475060384158,
|
| 285 |
+
"mountain": 0.5673653099412674,
|
| 286 |
+
"building": 0.49190921036053065,
|
| 287 |
+
"background": NaN
|
| 288 |
+
}
|
| 289 |
+
},
|
| 290 |
+
{
|
| 291 |
+
"epoch": 17,
|
| 292 |
+
"loss": 0.14565541855226163,
|
| 293 |
+
"miou_7": 0.6976304308763035,
|
| 294 |
+
"tree_iou_old": 0.8559357861690475,
|
| 295 |
+
"ground_iou_old": 0.9008399324780962,
|
| 296 |
+
"tree_recall_new": 0.9970113936633899,
|
| 297 |
+
"per_class_iou": {
|
| 298 |
+
"tree": 0.8559357861690475,
|
| 299 |
+
"ground": 0.9008399324780962,
|
| 300 |
+
"person": 0.45982943007987004,
|
| 301 |
+
"sky": 0.8260117146625021,
|
| 302 |
+
"road": 0.7203236325464007,
|
| 303 |
+
"mountain": 0.570204347198094,
|
| 304 |
+
"building": 0.5502681730001141,
|
| 305 |
+
"background": NaN
|
| 306 |
+
}
|
| 307 |
+
},
|
| 308 |
+
{
|
| 309 |
+
"epoch": 18,
|
| 310 |
+
"loss": 0.13922102244630938,
|
| 311 |
+
"miou_7": 0.6715259962090111,
|
| 312 |
+
"tree_iou_old": 0.8590085776748139,
|
| 313 |
+
"ground_iou_old": 0.9080001376838421,
|
| 314 |
+
"tree_recall_new": 0.9984515323245282,
|
| 315 |
+
"per_class_iou": {
|
| 316 |
+
"tree": 0.8590085776748139,
|
| 317 |
+
"ground": 0.9080001376838421,
|
| 318 |
+
"person": 0.4165244067867354,
|
| 319 |
+
"sky": 0.8300622522021492,
|
| 320 |
+
"road": 0.7391183653159966,
|
| 321 |
+
"mountain": 0.5794701815195625,
|
| 322 |
+
"building": 0.3684980522799781,
|
| 323 |
+
"background": NaN
|
| 324 |
+
}
|
| 325 |
+
},
|
| 326 |
+
{
|
| 327 |
+
"epoch": 19,
|
| 328 |
+
"loss": 0.1353529858187147,
|
| 329 |
+
"miou_7": 0.6868000456457597,
|
| 330 |
+
"tree_iou_old": 0.8615431837256211,
|
| 331 |
+
"ground_iou_old": 0.9031047849108651,
|
| 332 |
+
"tree_recall_new": 0.998964148052485,
|
| 333 |
+
"per_class_iou": {
|
| 334 |
+
"tree": 0.8615431837256211,
|
| 335 |
+
"ground": 0.9031047849108651,
|
| 336 |
+
"person": 0.4420047326739608,
|
| 337 |
+
"sky": 0.8209615163638296,
|
| 338 |
+
"road": 0.6942001658572324,
|
| 339 |
+
"mountain": 0.5805041823745026,
|
| 340 |
+
"building": 0.5052817536143058,
|
| 341 |
+
"background": NaN
|
| 342 |
+
}
|
| 343 |
+
},
|
| 344 |
+
{
|
| 345 |
+
"epoch": 20,
|
| 346 |
+
"loss": 0.13253521772860782,
|
| 347 |
+
"miou_7": 0.7021224084576373,
|
| 348 |
+
"tree_iou_old": 0.8631602330267258,
|
| 349 |
+
"ground_iou_old": 0.9091200309612043,
|
| 350 |
+
"tree_recall_new": 0.994837025016214,
|
| 351 |
+
"per_class_iou": {
|
| 352 |
+
"tree": 0.8631602330267258,
|
| 353 |
+
"ground": 0.9091200309612043,
|
| 354 |
+
"person": 0.41137926177027906,
|
| 355 |
+
"sky": 0.8285066518604141,
|
| 356 |
+
"road": 0.7466690639312616,
|
| 357 |
+
"mountain": 0.536259817918243,
|
| 358 |
+
"building": 0.6197617997353331,
|
| 359 |
+
"background": NaN
|
| 360 |
+
}
|
| 361 |
+
},
|
| 362 |
+
{
|
| 363 |
+
"epoch": 21,
|
| 364 |
+
"loss": 0.1302845889915469,
|
| 365 |
+
"miou_7": 0.675810914326681,
|
| 366 |
+
"tree_iou_old": 0.8592563773281188,
|
| 367 |
+
"ground_iou_old": 0.8991446138165377,
|
| 368 |
+
"tree_recall_new": 0.9961822872857139,
|
| 369 |
+
"per_class_iou": {
|
| 370 |
+
"tree": 0.8592563773281188,
|
| 371 |
+
"ground": 0.8991446138165377,
|
| 372 |
+
"person": 0.3903491003433304,
|
| 373 |
+
"sky": 0.824675927094987,
|
| 374 |
+
"road": 0.6894249147918818,
|
| 375 |
+
"mountain": 0.5948788564749821,
|
| 376 |
+
"building": 0.47294661043692804,
|
| 377 |
+
"background": NaN
|
| 378 |
+
}
|
| 379 |
+
},
|
| 380 |
+
{
|
| 381 |
+
"epoch": 22,
|
| 382 |
+
"loss": 0.12750109743949606,
|
| 383 |
+
"miou_7": 0.7125322874970034,
|
| 384 |
+
"tree_iou_old": 0.8645704875357654,
|
| 385 |
+
"ground_iou_old": 0.9072065152568384,
|
| 386 |
+
"tree_recall_new": 0.9984989705203474,
|
| 387 |
+
"per_class_iou": {
|
| 388 |
+
"tree": 0.8645704875357654,
|
| 389 |
+
"ground": 0.9072065152568384,
|
| 390 |
+
"person": 0.48954839497752917,
|
| 391 |
+
"sky": 0.8196931627498157,
|
| 392 |
+
"road": 0.7243334862756741,
|
| 393 |
+
"mountain": 0.5763691965526828,
|
| 394 |
+
"building": 0.6060047691307175,
|
| 395 |
+
"background": NaN
|
| 396 |
+
}
|
| 397 |
+
},
|
| 398 |
+
{
|
| 399 |
+
"epoch": 23,
|
| 400 |
+
"loss": 0.1288085954149098,
|
| 401 |
+
"miou_7": 0.7107471261265749,
|
| 402 |
+
"tree_iou_old": 0.8642860495130823,
|
| 403 |
+
"ground_iou_old": 0.909492093404392,
|
| 404 |
+
"tree_recall_new": 0.9991893024744329,
|
| 405 |
+
"per_class_iou": {
|
| 406 |
+
"tree": 0.8642860495130823,
|
| 407 |
+
"ground": 0.909492093404392,
|
| 408 |
+
"person": 0.45766399589516893,
|
| 409 |
+
"sky": 0.8274124364025386,
|
| 410 |
+
"road": 0.7277521111617834,
|
| 411 |
+
"mountain": 0.5772087040752052,
|
| 412 |
+
"building": 0.6114144924338537,
|
| 413 |
+
"background": NaN
|
| 414 |
+
}
|
| 415 |
+
},
|
| 416 |
+
{
|
| 417 |
+
"epoch": 24,
|
| 418 |
+
"loss": 0.12303933197539572,
|
| 419 |
+
"miou_7": 0.6960249423927186,
|
| 420 |
+
"tree_iou_old": 0.862591625892603,
|
| 421 |
+
"ground_iou_old": 0.9117833962636658,
|
| 422 |
+
"tree_recall_new": 0.9990137103466246,
|
| 423 |
+
"per_class_iou": {
|
| 424 |
+
"tree": 0.862591625892603,
|
| 425 |
+
"ground": 0.9117833962636658,
|
| 426 |
+
"person": 0.43123219925818773,
|
| 427 |
+
"sky": 0.8202446601586809,
|
| 428 |
+
"road": 0.7568995059507706,
|
| 429 |
+
"mountain": 0.583646623917872,
|
| 430 |
+
"building": 0.5057765853072502,
|
| 431 |
+
"background": NaN
|
| 432 |
+
}
|
| 433 |
+
},
|
| 434 |
+
{
|
| 435 |
+
"epoch": 25,
|
| 436 |
+
"loss": 0.12275176438505699,
|
| 437 |
+
"miou_7": 0.700392134832647,
|
| 438 |
+
"tree_iou_old": 0.8569619510987356,
|
| 439 |
+
"ground_iou_old": 0.9115664983076798,
|
| 440 |
+
"tree_recall_new": 0.9996254506628602,
|
| 441 |
+
"per_class_iou": {
|
| 442 |
+
"tree": 0.8569619510987356,
|
| 443 |
+
"ground": 0.9115664983076798,
|
| 444 |
+
"person": 0.44410947241906,
|
| 445 |
+
"sky": 0.8242495156785404,
|
| 446 |
+
"road": 0.801578288829682,
|
| 447 |
+
"mountain": 0.5707688209809969,
|
| 448 |
+
"building": 0.4935103965138338,
|
| 449 |
+
"background": NaN
|
| 450 |
+
}
|
| 451 |
+
},
|
| 452 |
+
{
|
| 453 |
+
"epoch": 26,
|
| 454 |
+
"loss": 0.12000550889024986,
|
| 455 |
+
"miou_7": 0.6951813403928189,
|
| 456 |
+
"tree_iou_old": 0.8663251304061781,
|
| 457 |
+
"ground_iou_old": 0.9131697744358082,
|
| 458 |
+
"tree_recall_new": 0.9989188339549862,
|
| 459 |
+
"per_class_iou": {
|
| 460 |
+
"tree": 0.8663251304061781,
|
| 461 |
+
"ground": 0.9131697744358082,
|
| 462 |
+
"person": 0.42213996324450753,
|
| 463 |
+
"sky": 0.837097544021675,
|
| 464 |
+
"road": 0.7159604646377167,
|
| 465 |
+
"mountain": 0.5628351351092097,
|
| 466 |
+
"building": 0.5487413708946377,
|
| 467 |
+
"background": NaN
|
| 468 |
+
}
|
| 469 |
+
},
|
| 470 |
+
{
|
| 471 |
+
"epoch": 27,
|
| 472 |
+
"loss": 0.11845032005540786,
|
| 473 |
+
"miou_7": 0.6951365926228066,
|
| 474 |
+
"tree_iou_old": 0.8623787392359329,
|
| 475 |
+
"ground_iou_old": 0.9081680280222872,
|
| 476 |
+
"tree_recall_new": 0.9991772659172847,
|
| 477 |
+
"per_class_iou": {
|
| 478 |
+
"tree": 0.8623787392359329,
|
| 479 |
+
"ground": 0.9081680280222872,
|
| 480 |
+
"person": 0.40674057490499754,
|
| 481 |
+
"sky": 0.8283362865383408,
|
| 482 |
+
"road": 0.732462168304907,
|
| 483 |
+
"mountain": 0.577814835386126,
|
| 484 |
+
"building": 0.550055515967055,
|
| 485 |
+
"background": NaN
|
| 486 |
+
}
|
| 487 |
+
},
|
| 488 |
+
{
|
| 489 |
+
"epoch": 28,
|
| 490 |
+
"loss": 0.11608812912118749,
|
| 491 |
+
"miou_7": 0.7135802554765919,
|
| 492 |
+
"tree_iou_old": 0.8601389203474185,
|
| 493 |
+
"ground_iou_old": 0.9097109594291864,
|
| 494 |
+
"tree_recall_new": 0.9980252965949288,
|
| 495 |
+
"per_class_iou": {
|
| 496 |
+
"tree": 0.8601389203474185,
|
| 497 |
+
"ground": 0.9097109594291864,
|
| 498 |
+
"person": 0.4577523593452128,
|
| 499 |
+
"sky": 0.8263436060803816,
|
| 500 |
+
"road": 0.7681716123910687,
|
| 501 |
+
"mountain": 0.5772636562892896,
|
| 502 |
+
"building": 0.5956806744535849,
|
| 503 |
+
"background": NaN
|
| 504 |
+
}
|
| 505 |
+
},
|
| 506 |
+
{
|
| 507 |
+
"epoch": 29,
|
| 508 |
+
"loss": 0.11513655359619174,
|
| 509 |
+
"miou_7": 0.7080678106432002,
|
| 510 |
+
"tree_iou_old": 0.8720526616871654,
|
| 511 |
+
"ground_iou_old": 0.9159501362721273,
|
| 512 |
+
"tree_recall_new": 0.9990172505104916,
|
| 513 |
+
"per_class_iou": {
|
| 514 |
+
"tree": 0.8720526616871654,
|
| 515 |
+
"ground": 0.9159501362721273,
|
| 516 |
+
"person": 0.4413353907706845,
|
| 517 |
+
"sky": 0.8354333730973998,
|
| 518 |
+
"road": 0.7446705296683885,
|
| 519 |
+
"mountain": 0.5920758624448199,
|
| 520 |
+
"building": 0.5549567205618161,
|
| 521 |
+
"background": NaN
|
| 522 |
+
}
|
| 523 |
+
},
|
| 524 |
+
{
|
| 525 |
+
"epoch": 30,
|
| 526 |
+
"loss": 0.11306144819319074,
|
| 527 |
+
"miou_7": 0.6983805157012603,
|
| 528 |
+
"tree_iou_old": 0.8645833022730479,
|
| 529 |
+
"ground_iou_old": 0.9095457655833904,
|
| 530 |
+
"tree_recall_new": 0.9991652293601366,
|
| 531 |
+
"per_class_iou": {
|
| 532 |
+
"tree": 0.8645833022730479,
|
| 533 |
+
"ground": 0.9095457655833904,
|
| 534 |
+
"person": 0.4674763392232738,
|
| 535 |
+
"sky": 0.8335070597837729,
|
| 536 |
+
"road": 0.7545198948370945,
|
| 537 |
+
"mountain": 0.5924738174364902,
|
| 538 |
+
"building": 0.4665574307717516,
|
| 539 |
+
"background": NaN
|
| 540 |
+
}
|
| 541 |
+
},
|
| 542 |
+
{
|
| 543 |
+
"epoch": 31,
|
| 544 |
+
"loss": 0.11102715956150962,
|
| 545 |
+
"miou_7": 0.6944315353561886,
|
| 546 |
+
"tree_iou_old": 0.8547747382286776,
|
| 547 |
+
"ground_iou_old": 0.9053165982665526,
|
| 548 |
+
"tree_recall_new": 0.9990321191987335,
|
| 549 |
+
"per_class_iou": {
|
| 550 |
+
"tree": 0.8547747382286776,
|
| 551 |
+
"ground": 0.9053165982665526,
|
| 552 |
+
"person": 0.4329003766160034,
|
| 553 |
+
"sky": 0.8115273504852353,
|
| 554 |
+
"road": 0.7724850398518984,
|
| 555 |
+
"mountain": 0.5864169089017646,
|
| 556 |
+
"building": 0.4975997351431882,
|
| 557 |
+
"background": NaN
|
| 558 |
+
}
|
| 559 |
+
},
|
| 560 |
+
{
|
| 561 |
+
"epoch": 32,
|
| 562 |
+
"loss": 0.11149858427711945,
|
| 563 |
+
"miou_7": 0.7188664866966874,
|
| 564 |
+
"tree_iou_old": 0.8651834760565779,
|
| 565 |
+
"ground_iou_old": 0.9162814340108099,
|
| 566 |
+
"tree_recall_new": 0.9983141739664846,
|
| 567 |
+
"per_class_iou": {
|
| 568 |
+
"tree": 0.8651834760565779,
|
| 569 |
+
"ground": 0.9162814340108099,
|
| 570 |
+
"person": 0.4658711751302083,
|
| 571 |
+
"sky": 0.8333352426932886,
|
| 572 |
+
"road": 0.8032374478908707,
|
| 573 |
+
"mountain": 0.5783057614808421,
|
| 574 |
+
"building": 0.5698508696142144,
|
| 575 |
+
"background": NaN
|
| 576 |
+
}
|
| 577 |
+
},
|
| 578 |
+
{
|
| 579 |
+
"epoch": 33,
|
| 580 |
+
"loss": 0.10876328530898892,
|
| 581 |
+
"miou_7": 0.7107569762939843,
|
| 582 |
+
"tree_iou_old": 0.8693782454023541,
|
| 583 |
+
"ground_iou_old": 0.9157389094490351,
|
| 584 |
+
"tree_recall_new": 0.9991135429676768,
|
| 585 |
+
"per_class_iou": {
|
| 586 |
+
"tree": 0.8693782454023541,
|
| 587 |
+
"ground": 0.9157389094490351,
|
| 588 |
+
"person": 0.4937770576865971,
|
| 589 |
+
"sky": 0.8377474204852441,
|
| 590 |
+
"road": 0.761413185992059,
|
| 591 |
+
"mountain": 0.5953795868906355,
|
| 592 |
+
"building": 0.5018644281519644,
|
| 593 |
+
"background": NaN
|
| 594 |
+
}
|
| 595 |
+
},
|
| 596 |
+
{
|
| 597 |
+
"epoch": 34,
|
| 598 |
+
"loss": 0.10711049049262428,
|
| 599 |
+
"miou_7": 0.7019345534232959,
|
| 600 |
+
"tree_iou_old": 0.8648191490436973,
|
| 601 |
+
"ground_iou_old": 0.9101472043077795,
|
| 602 |
+
"tree_recall_new": 0.9986235842884695,
|
| 603 |
+
"per_class_iou": {
|
| 604 |
+
"tree": 0.8648191490436973,
|
| 605 |
+
"ground": 0.9101472043077795,
|
| 606 |
+
"person": 0.4554277216874071,
|
| 607 |
+
"sky": 0.8266031525406107,
|
| 608 |
+
"road": 0.7529576282399317,
|
| 609 |
+
"mountain": 0.5621906580777194,
|
| 610 |
+
"building": 0.5413963600659256,
|
| 611 |
+
"background": NaN
|
| 612 |
+
}
|
| 613 |
+
},
|
| 614 |
+
{
|
| 615 |
+
"epoch": 35,
|
| 616 |
+
"loss": 0.10637939819667347,
|
| 617 |
+
"miou_7": 0.6957735870270542,
|
| 618 |
+
"tree_iou_old": 0.8611342036897985,
|
| 619 |
+
"ground_iou_old": 0.9078926943174735,
|
| 620 |
+
"tree_recall_new": 0.9995305742712218,
|
| 621 |
+
"per_class_iou": {
|
| 622 |
+
"tree": 0.8611342036897985,
|
| 623 |
+
"ground": 0.9078926943174735,
|
| 624 |
+
"person": 0.4761426204094724,
|
| 625 |
+
"sky": 0.8155434964100852,
|
| 626 |
+
"road": 0.7537532650546288,
|
| 627 |
+
"mountain": 0.5780520205569928,
|
| 628 |
+
"building": 0.4778968087509277,
|
| 629 |
+
"background": NaN
|
| 630 |
+
}
|
| 631 |
+
},
|
| 632 |
+
{
|
| 633 |
+
"epoch": 36,
|
| 634 |
+
"loss": 0.10524913378038014,
|
| 635 |
+
"miou_7": 0.7036256072219315,
|
| 636 |
+
"tree_iou_old": 0.8601403942334339,
|
| 637 |
+
"ground_iou_old": 0.9103111483757813,
|
| 638 |
+
"tree_recall_new": 0.9991737257534177,
|
| 639 |
+
"per_class_iou": {
|
| 640 |
+
"tree": 0.8601403942334339,
|
| 641 |
+
"ground": 0.9103111483757813,
|
| 642 |
+
"person": 0.47734196127129497,
|
| 643 |
+
"sky": 0.8288868144372747,
|
| 644 |
+
"road": 0.7870753306011031,
|
| 645 |
+
"mountain": 0.591627622543358,
|
| 646 |
+
"building": 0.4699959790912746,
|
| 647 |
+
"background": NaN
|
| 648 |
+
}
|
| 649 |
+
},
|
| 650 |
+
{
|
| 651 |
+
"epoch": 37,
|
| 652 |
+
"loss": 0.10495941000075634,
|
| 653 |
+
"miou_7": 0.703440393493414,
|
| 654 |
+
"tree_iou_old": 0.8602822036239717,
|
| 655 |
+
"ground_iou_old": 0.907919303549861,
|
| 656 |
+
"tree_recall_new": 0.9988990090373303,
|
| 657 |
+
"per_class_iou": {
|
| 658 |
+
"tree": 0.8602822036239717,
|
| 659 |
+
"ground": 0.907919303549861,
|
| 660 |
+
"person": 0.47949983928674933,
|
| 661 |
+
"sky": 0.826995678624557,
|
| 662 |
+
"road": 0.7687920564109944,
|
| 663 |
+
"mountain": 0.5929370462471802,
|
| 664 |
+
"building": 0.4876566267105844,
|
| 665 |
+
"background": NaN
|
| 666 |
+
}
|
| 667 |
+
},
|
| 668 |
+
{
|
| 669 |
+
"epoch": 38,
|
| 670 |
+
"loss": 0.10339566608590464,
|
| 671 |
+
"miou_7": 0.7042723060688616,
|
| 672 |
+
"tree_iou_old": 0.8632348498053344,
|
| 673 |
+
"ground_iou_old": 0.9109321716698625,
|
| 674 |
+
"tree_recall_new": 0.9990023818222498,
|
| 675 |
+
"per_class_iou": {
|
| 676 |
+
"tree": 0.8632348498053344,
|
| 677 |
+
"ground": 0.9109321716698625,
|
| 678 |
+
"person": 0.4414322276666005,
|
| 679 |
+
"sky": 0.8285902039590752,
|
| 680 |
+
"road": 0.7778623497742087,
|
| 681 |
+
"mountain": 0.5870131637402792,
|
| 682 |
+
"building": 0.5208411758666712,
|
| 683 |
+
"background": NaN
|
| 684 |
+
}
|
| 685 |
+
},
|
| 686 |
+
{
|
| 687 |
+
"epoch": 39,
|
| 688 |
+
"loss": 0.10248555226628382,
|
| 689 |
+
"miou_7": 0.7127856236134917,
|
| 690 |
+
"tree_iou_old": 0.8653419624639606,
|
| 691 |
+
"ground_iou_old": 0.911729733752646,
|
| 692 |
+
"tree_recall_new": 0.9990243308382258,
|
| 693 |
+
"per_class_iou": {
|
| 694 |
+
"tree": 0.8653419624639606,
|
| 695 |
+
"ground": 0.911729733752646,
|
| 696 |
+
"person": 0.4490896965989601,
|
| 697 |
+
"sky": 0.8417416017949311,
|
| 698 |
+
"road": 0.7603018524388181,
|
| 699 |
+
"mountain": 0.5967423810275937,
|
| 700 |
+
"building": 0.5645521372175334,
|
| 701 |
+
"background": NaN
|
| 702 |
+
}
|
| 703 |
+
},
|
| 704 |
+
{
|
| 705 |
+
"epoch": 40,
|
| 706 |
+
"loss": 0.10101223195141013,
|
| 707 |
+
"miou_7": 0.7128466877853661,
|
| 708 |
+
"tree_iou_old": 0.8584637085289819,
|
| 709 |
+
"ground_iou_old": 0.9087631816776888,
|
| 710 |
+
"tree_recall_new": 0.999443486240091,
|
| 711 |
+
"per_class_iou": {
|
| 712 |
+
"tree": 0.8584637085289819,
|
| 713 |
+
"ground": 0.9087631816776888,
|
| 714 |
+
"person": 0.47705684717008523,
|
| 715 |
+
"sky": 0.8197330510672493,
|
| 716 |
+
"road": 0.800078154842082,
|
| 717 |
+
"mountain": 0.5957071073649332,
|
| 718 |
+
"building": 0.5301247638465424,
|
| 719 |
+
"background": NaN
|
| 720 |
+
}
|
| 721 |
+
},
|
| 722 |
+
{
|
| 723 |
+
"epoch": 41,
|
| 724 |
+
"loss": 0.09991852833160207,
|
| 725 |
+
"miou_7": 0.7035958499558952,
|
| 726 |
+
"tree_iou_old": 0.8617602975517414,
|
| 727 |
+
"ground_iou_old": 0.9110861746358995,
|
| 728 |
+
"tree_recall_new": 0.9996962539402023,
|
| 729 |
+
"per_class_iou": {
|
| 730 |
+
"tree": 0.8617602975517414,
|
| 731 |
+
"ground": 0.9110861746358995,
|
| 732 |
+
"person": 0.4549658643035961,
|
| 733 |
+
"sky": 0.8145149321096569,
|
| 734 |
+
"road": 0.7879282571846876,
|
| 735 |
+
"mountain": 0.5812266630987688,
|
| 736 |
+
"building": 0.5136887608069164,
|
| 737 |
+
"background": NaN
|
| 738 |
+
}
|
| 739 |
+
},
|
| 740 |
+
{
|
| 741 |
+
"epoch": 42,
|
| 742 |
+
"loss": 0.09915690325047613,
|
| 743 |
+
"miou_7": 0.7134583358474373,
|
| 744 |
+
"tree_iou_old": 0.8660831554987213,
|
| 745 |
+
"ground_iou_old": 0.9159726755955384,
|
| 746 |
+
"tree_recall_new": 0.9993656026350147,
|
| 747 |
+
"per_class_iou": {
|
| 748 |
+
"tree": 0.8660831554987213,
|
| 749 |
+
"ground": 0.9159726755955384,
|
| 750 |
+
"person": 0.4530905947265377,
|
| 751 |
+
"sky": 0.8362661921781568,
|
| 752 |
+
"road": 0.8043563655137067,
|
| 753 |
+
"mountain": 0.594773170031178,
|
| 754 |
+
"building": 0.5236661973882227,
|
| 755 |
+
"background": NaN
|
| 756 |
+
}
|
| 757 |
+
},
|
| 758 |
+
{
|
| 759 |
+
"epoch": 43,
|
| 760 |
+
"loss": 0.09795961531201416,
|
| 761 |
+
"miou_7": 0.7165406079983356,
|
| 762 |
+
"tree_iou_old": 0.8598378459161425,
|
| 763 |
+
"ground_iou_old": 0.9089394699785583,
|
| 764 |
+
"tree_recall_new": 0.9995454429594637,
|
| 765 |
+
"per_class_iou": {
|
| 766 |
+
"tree": 0.8598378459161425,
|
| 767 |
+
"ground": 0.9089394699785583,
|
| 768 |
+
"person": 0.46012050287958917,
|
| 769 |
+
"sky": 0.8221400259687008,
|
| 770 |
+
"road": 0.814444981073297,
|
| 771 |
+
"mountain": 0.591331730954615,
|
| 772 |
+
"building": 0.5589696992174461,
|
| 773 |
+
"background": NaN
|
| 774 |
+
}
|
| 775 |
+
},
|
| 776 |
+
{
|
| 777 |
+
"epoch": 44,
|
| 778 |
+
"loss": 0.09747363334177526,
|
| 779 |
+
"miou_7": 0.703682647474599,
|
| 780 |
+
"tree_iou_old": 0.8598806708971567,
|
| 781 |
+
"ground_iou_old": 0.9099442298065352,
|
| 782 |
+
"tree_recall_new": 0.9989683962491256,
|
| 783 |
+
"per_class_iou": {
|
| 784 |
+
"tree": 0.8598806708971567,
|
| 785 |
+
"ground": 0.9099442298065352,
|
| 786 |
+
"person": 0.44416668701022877,
|
| 787 |
+
"sky": 0.8300241915048083,
|
| 788 |
+
"road": 0.8091502958402874,
|
| 789 |
+
"mountain": 0.5780742065040855,
|
| 790 |
+
"building": 0.4945382507590906,
|
| 791 |
+
"background": NaN
|
| 792 |
+
}
|
| 793 |
+
},
|
| 794 |
+
{
|
| 795 |
+
"epoch": 45,
|
| 796 |
+
"loss": 0.09669119091240191,
|
| 797 |
+
"miou_7": 0.7204684750938577,
|
| 798 |
+
"tree_iou_old": 0.8609388333802934,
|
| 799 |
+
"ground_iou_old": 0.9099041175374463,
|
| 800 |
+
"tree_recall_new": 0.9992608137845485,
|
| 801 |
+
"per_class_iou": {
|
| 802 |
+
"tree": 0.8609388333802934,
|
| 803 |
+
"ground": 0.9099041175374463,
|
| 804 |
+
"person": 0.4618446874123914,
|
| 805 |
+
"sky": 0.8297634971589024,
|
| 806 |
+
"road": 0.8090713840099226,
|
| 807 |
+
"mountain": 0.5977877820863925,
|
| 808 |
+
"building": 0.5739690240716552,
|
| 809 |
+
"background": NaN
|
| 810 |
+
}
|
| 811 |
+
},
|
| 812 |
+
{
|
| 813 |
+
"epoch": 46,
|
| 814 |
+
"loss": 0.09577625356286852,
|
| 815 |
+
"miou_7": 0.7147064988681378,
|
| 816 |
+
"tree_iou_old": 0.8588356972291628,
|
| 817 |
+
"ground_iou_old": 0.9042846066741225,
|
| 818 |
+
"tree_recall_new": 0.9992289523097445,
|
| 819 |
+
"per_class_iou": {
|
| 820 |
+
"tree": 0.8588356972291628,
|
| 821 |
+
"ground": 0.9042846066741225,
|
| 822 |
+
"person": 0.45702919955948496,
|
| 823 |
+
"sky": 0.8253803474122704,
|
| 824 |
+
"road": 0.7867103882600653,
|
| 825 |
+
"mountain": 0.5997810694197818,
|
| 826 |
+
"building": 0.5709241835220764,
|
| 827 |
+
"background": NaN
|
| 828 |
+
}
|
| 829 |
+
},
|
| 830 |
+
{
|
| 831 |
+
"epoch": 47,
|
| 832 |
+
"loss": 0.09526236196897946,
|
| 833 |
+
"miou_7": 0.7171311747428346,
|
| 834 |
+
"tree_iou_old": 0.8632839060256274,
|
| 835 |
+
"ground_iou_old": 0.9123400265863985,
|
| 836 |
+
"tree_recall_new": 0.9993875516509908,
|
| 837 |
+
"per_class_iou": {
|
| 838 |
+
"tree": 0.8632839060256274,
|
| 839 |
+
"ground": 0.9123400265863985,
|
| 840 |
+
"person": 0.46598174933267117,
|
| 841 |
+
"sky": 0.817570729856514,
|
| 842 |
+
"road": 0.7994172315356988,
|
| 843 |
+
"mountain": 0.6012255592276109,
|
| 844 |
+
"building": 0.5600990206353221,
|
| 845 |
+
"background": NaN
|
| 846 |
+
}
|
| 847 |
+
},
|
| 848 |
+
{
|
| 849 |
+
"epoch": 48,
|
| 850 |
+
"loss": 0.09422024230191435,
|
| 851 |
+
"miou_7": 0.7167525270340019,
|
| 852 |
+
"tree_iou_old": 0.8651489837327133,
|
| 853 |
+
"ground_iou_old": 0.9135081509914867,
|
| 854 |
+
"tree_recall_new": 0.9992140836215027,
|
| 855 |
+
"per_class_iou": {
|
| 856 |
+
"tree": 0.8651489837327133,
|
| 857 |
+
"ground": 0.9135081509914867,
|
| 858 |
+
"person": 0.46782555019067695,
|
| 859 |
+
"sky": 0.8286387479180981,
|
| 860 |
+
"road": 0.7996623413743239,
|
| 861 |
+
"mountain": 0.5915479115479115,
|
| 862 |
+
"building": 0.5509360034828037,
|
| 863 |
+
"background": NaN
|
| 864 |
+
}
|
| 865 |
+
},
|
| 866 |
+
{
|
| 867 |
+
"epoch": 49,
|
| 868 |
+
"loss": 0.09376394734485757,
|
| 869 |
+
"miou_7": 0.7145702801110829,
|
| 870 |
+
"tree_iou_old": 0.8627369567040538,
|
| 871 |
+
"ground_iou_old": 0.9114642448318345,
|
| 872 |
+
"tree_recall_new": 0.9994342818140366,
|
| 873 |
+
"per_class_iou": {
|
| 874 |
+
"tree": 0.8627369567040538,
|
| 875 |
+
"ground": 0.9114642448318345,
|
| 876 |
+
"person": 0.47928346990327153,
|
| 877 |
+
"sky": 0.8247682032686208,
|
| 878 |
+
"road": 0.7942546414600767,
|
| 879 |
+
"mountain": 0.5941505966745633,
|
| 880 |
+
"building": 0.5353338479351601,
|
| 881 |
+
"background": NaN
|
| 882 |
+
}
|
| 883 |
+
},
|
| 884 |
+
{
|
| 885 |
+
"epoch": 50,
|
| 886 |
+
"loss": 0.0934471827098701,
|
| 887 |
+
"miou_7": 0.7176813282089505,
|
| 888 |
+
"tree_iou_old": 0.8620466644958731,
|
| 889 |
+
"ground_iou_old": 0.9123173312442661,
|
| 890 |
+
"tree_recall_new": 0.9994987127964179,
|
| 891 |
+
"per_class_iou": {
|
| 892 |
+
"tree": 0.8620466644958731,
|
| 893 |
+
"ground": 0.9123173312442661,
|
| 894 |
+
"person": 0.46904731474144323,
|
| 895 |
+
"sky": 0.8275378210481714,
|
| 896 |
+
"road": 0.8118060080254842,
|
| 897 |
+
"mountain": 0.5901987168959689,
|
| 898 |
+
"building": 0.5508154410114461,
|
| 899 |
+
"background": NaN
|
| 900 |
+
}
|
| 901 |
+
},
|
| 902 |
+
{
|
| 903 |
+
"epoch": 51,
|
| 904 |
+
"loss": 0.09312802140226811,
|
| 905 |
+
"miou_7": 0.7168954119728846,
|
| 906 |
+
"tree_iou_old": 0.8613075378830813,
|
| 907 |
+
"ground_iou_old": 0.91348727971426,
|
| 908 |
+
"tree_recall_new": 0.9994880923048166,
|
| 909 |
+
"per_class_iou": {
|
| 910 |
+
"tree": 0.8613075378830813,
|
| 911 |
+
"ground": 0.91348727971426,
|
| 912 |
+
"person": 0.4603232176681578,
|
| 913 |
+
"sky": 0.8209021175471539,
|
| 914 |
+
"road": 0.8180727322238185,
|
| 915 |
+
"mountain": 0.5928931907697746,
|
| 916 |
+
"building": 0.551281808003947,
|
| 917 |
+
"background": NaN
|
| 918 |
+
}
|
| 919 |
+
},
|
| 920 |
+
{
|
| 921 |
+
"epoch": 52,
|
| 922 |
+
"loss": 0.09262444826165253,
|
| 923 |
+
"miou_7": 0.7145893070504529,
|
| 924 |
+
"tree_iou_old": 0.8601787716896,
|
| 925 |
+
"ground_iou_old": 0.9100240065925295,
|
| 926 |
+
"tree_recall_new": 0.9993224126358361,
|
| 927 |
+
"per_class_iou": {
|
| 928 |
+
"tree": 0.8601787716896,
|
| 929 |
+
"ground": 0.9100240065925295,
|
| 930 |
+
"person": 0.46003656751568806,
|
| 931 |
+
"sky": 0.8240606386485418,
|
| 932 |
+
"road": 0.8035017391518859,
      "mountain": 0.5972179702553838,
      "building": 0.5471054554995408,
      "background": NaN
    }
  },
  {
    "epoch": 53,
    "loss": 0.09180469486501909,
    "miou_7": 0.7169079280075012,
    "tree_iou_old": 0.8595100269113597,
    "ground_iou_old": 0.9101150646845736,
    "tree_recall_new": 0.9994831360754026,
    "per_class_iou": {
      "tree": 0.8595100269113597,
      "ground": 0.9101150646845736,
      "person": 0.4661931584573637,
      "sky": 0.820119386782323,
      "road": 0.8062676835489306,
      "mountain": 0.5948037931562971,
      "building": 0.5613463825116619,
      "background": NaN
    }
  },
  {
    "epoch": 54,
    "loss": 0.09159080942005705,
    "miou_7": 0.714251101780106,
    "tree_iou_old": 0.8610315614250247,
    "ground_iou_old": 0.9110287660753348,
    "tree_recall_new": 0.9995758883687208,
    "per_class_iou": {
      "tree": 0.8610315614250247,
      "ground": 0.9110287660753348,
      "person": 0.4709480122324159,
      "sky": 0.82152237535896,
      "road": 0.8068207561534491,
      "mountain": 0.5878109061264083,
      "building": 0.5405953350891487,
      "background": NaN
    }
  },
  {
    "epoch": 55,
    "loss": 0.09123367462252592,
    "miou_7": 0.7189968760256816,
    "tree_iou_old": 0.8627378628159317,
    "ground_iou_old": 0.9131321779238109,
    "tree_recall_new": 0.99940808460142,
    "per_class_iou": {
      "tree": 0.8627378628159317,
      "ground": 0.9131321779238109,
      "person": 0.4761212765181701,
      "sky": 0.8255133689898098,
      "road": 0.8077639678601056,
      "mountain": 0.5934658957240918,
      "building": 0.5542435823478512,
      "background": NaN
    }
  },
  {
    "epoch": 56,
    "loss": 0.09107463639721143,
    "miou_7": 0.7174357176043799,
    "tree_iou_old": 0.8624821961413048,
    "ground_iou_old": 0.9131725873439213,
    "tree_recall_new": 0.9992395728013458,
    "per_class_iou": {
      "tree": 0.8624821961413048,
      "ground": 0.9131725873439213,
      "person": 0.47541807293120514,
      "sky": 0.8229103490897475,
      "road": 0.8071512969458519,
      "mountain": 0.5959807410508687,
      "building": 0.5449347797277598,
      "background": NaN
    }
  },
  {
    "epoch": 57,
    "loss": 0.09129836492218579,
    "miou_7": 0.7147705651556077,
    "tree_iou_old": 0.8604410437060084,
    "ground_iou_old": 0.9102523482533115,
    "tree_recall_new": 0.9994116247652871,
    "per_class_iou": {
      "tree": 0.8604410437060084,
      "ground": 0.9102523482533115,
      "person": 0.4652373172210405,
      "sky": 0.8247788796935914,
      "road": 0.8024377981637844,
      "mountain": 0.5926813760428257,
      "building": 0.5475651930086924,
      "background": NaN
    }
  },
  {
    "epoch": 58,
    "loss": 0.09109030034709886,
    "miou_7": 0.7184435180464683,
    "tree_iou_old": 0.8625080590407855,
    "ground_iou_old": 0.9131226114597868,
    "tree_recall_new": 0.9994491505022784,
    "per_class_iou": {
      "tree": 0.8625080590407855,
      "ground": 0.9131226114597868,
      "person": 0.470192696821592,
      "sky": 0.8240829335586235,
      "road": 0.8125652267268542,
      "mountain": 0.5941365218121131,
      "building": 0.5524965769055226,
      "background": NaN
    }
  },
  {
    "epoch": 59,
    "loss": 0.09070181351734047,
    "miou_7": 0.7161546689463674,
    "tree_iou_old": 0.8619843878363893,
    "ground_iou_old": 0.9120051419911369,
    "tree_recall_new": 0.99935427411064,
    "per_class_iou": {
      "tree": 0.8619843878363893,
      "ground": 0.9120051419911369,
      "person": 0.46737921660694426,
      "sky": 0.8240449181819649,
      "road": 0.8027428616241995,
      "mountain": 0.595520677756605,
      "building": 0.5494054786273329,
      "background": NaN
    }
  },
  {
    "epoch": 60,
    "loss": 0.0903256651112411,
    "miou_7": 0.7184340202591383,
    "tree_iou_old": 0.8624814919971956,
    "ground_iou_old": 0.9136910071083241,
    "tree_recall_new": 0.9993712668972021,
    "per_class_iou": {
      "tree": 0.8624814919971956,
      "ground": 0.9136910071083241,
      "person": 0.46576649746192894,
      "sky": 0.8252238363312573,
      "road": 0.811227410462362,
      "mountain": 0.5928357462160864,
      "building": 0.5578121522368128,
      "background": NaN
    }
  }
]
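A note on reading history.json: the `miou_7` values in the records above are consistent with a plain mean over the seven classes that have a defined IoU, with the `NaN` background entry left out. The sketch below checks that for the epoch 60 record. The helper name `mean_iou_ignoring_nan` is hypothetical and is not part of the repository; the numbers are copied from the record itself.

```python
import json
import math

# Epoch 60 record, copied from history.json. Python's json module accepts
# the non-standard NaN literal by default.
record = json.loads("""
{
  "epoch": 60,
  "miou_7": 0.7184340202591383,
  "per_class_iou": {
    "tree": 0.8624814919971956,
    "ground": 0.9136910071083241,
    "person": 0.46576649746192894,
    "sky": 0.8252238363312573,
    "road": 0.811227410462362,
    "mountain": 0.5928357462160864,
    "building": 0.5578121522368128,
    "background": NaN
  }
}
""")


def mean_iou_ignoring_nan(per_class_iou):
    """Average the per-class IoU values, skipping classes whose IoU is NaN."""
    valid = [v for v in per_class_iou.values() if not math.isnan(v)]
    return sum(valid) / len(valid)


miou = mean_iou_ignoring_nan(record["per_class_iou"])
print(f"recomputed mIoU: {miou:.6f}  logged miou_7: {record['miou_7']:.6f}")
```

Strict JSON parsers in other languages may reject the `NaN` literal, so a consumer outside Python would probably need to preprocess the file first.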
model/TwinLite.py
ADDED
|
@@ -0,0 +1,468 @@
import torch
import torch.nn as nn


from torch.nn import Module, Conv2d, Parameter, Softmax

class PAM_Module(Module):
    """ Position attention module"""
    #Ref from SAGAN
    def __init__(self, in_dim):
        super(PAM_Module, self).__init__()
        self.chanel_in = in_dim

        self.query_conv = Conv2d(in_channels=in_dim, out_channels=in_dim//8, kernel_size=1)
        self.key_conv = Conv2d(in_channels=in_dim, out_channels=in_dim//8, kernel_size=1)
        self.value_conv = Conv2d(in_channels=in_dim, out_channels=in_dim, kernel_size=1)
        self.gamma = Parameter(torch.zeros(1))

        self.softmax = Softmax(dim=-1)
    def forward(self, x):
        """
            inputs :
                x : input feature maps( B X C X H X W)
            returns :
                out : attention value + input feature
                attention: B X (HxW) X (HxW)
        """
        m_batchsize, C, height, width = x.size()
        proj_query = self.query_conv(x).view(m_batchsize, -1, width*height).permute(0, 2, 1)
        proj_key = self.key_conv(x).view(m_batchsize, -1, width*height)
        energy = torch.bmm(proj_query, proj_key)
        attention = self.softmax(energy)
        proj_value = self.value_conv(x).view(m_batchsize, -1, width*height)

        out = torch.bmm(proj_value, attention.permute(0, 2, 1))
        out = out.view(m_batchsize, C, height, width)

        out = self.gamma*out + x
        return out
class CAM_Module(Module):
    """ Channel attention module"""
    def __init__(self, in_dim):
        super(CAM_Module, self).__init__()
        self.chanel_in = in_dim


        self.gamma = Parameter(torch.zeros(1))
        self.softmax = Softmax(dim=-1)
    def forward(self,x):
        """
            inputs :
                x : input feature maps( B X C X H X W)
            returns :
                out : attention value + input feature
                attention: B X C X C
        """
        m_batchsize, C, height, width = x.size()
        proj_query = x.view(m_batchsize, C, -1)
        proj_key = x.view(m_batchsize, C, -1).permute(0, 2, 1)
        energy = torch.bmm(proj_query, proj_key)
        energy_new = torch.max(energy, -1, keepdim=True)[0].expand_as(energy)-energy
        attention = self.softmax(energy_new)
        proj_value = x.view(m_batchsize, C, -1)

        out = torch.bmm(attention, proj_value)
        out = out.view(m_batchsize, C, height, width)

        out = self.gamma*out + x
        return out


class UPx2(nn.Module):
    '''
    This class defines a 2x up-sampling layer: a transposed convolution
    (kernel size 2, stride 2) followed by batch normalization and PReLU activation
    '''
    def __init__(self, nIn, nOut):
        '''

        :param nIn: number of input channels
        :param nOut: number of output channels
        '''
        super().__init__()
        self.deconv = nn.ConvTranspose2d(nIn, nOut, 2, stride=2, padding=0, output_padding=0, bias=False)
        self.bn = nn.BatchNorm2d(nOut, eps=1e-03)
        self.act = nn.PReLU(nOut)

    def forward(self, input):
        '''
        :param input: input feature map
        :return: transformed feature map
        '''
        output = self.deconv(input)
        output = self.bn(output)
        output = self.act(output)
        return output
    def fuseforward(self, input):
        output = self.deconv(input)
        output = self.act(output)
        return output

class CBR(nn.Module):
    '''
    This class defines the convolution layer with batch normalization and PReLU activation
    '''
    def __init__(self, nIn, nOut, kSize, stride=1):
        '''

        :param nIn: number of input channels
        :param nOut: number of output channels
        :param kSize: kernel size
        :param stride: stride rate for down-sampling. Default is 1
        '''
        super().__init__()
        padding = int((kSize - 1)/2)
        #self.conv = nn.Conv2d(nIn, nOut, kSize, stride=stride, padding=padding, bias=False)
        self.conv = nn.Conv2d(nIn, nOut, (kSize, kSize), stride=stride, padding=(padding, padding), bias=False)
        #self.conv1 = nn.Conv2d(nOut, nOut, (1, kSize), stride=1, padding=(0, padding), bias=False)
        self.bn = nn.BatchNorm2d(nOut, eps=1e-03)
        self.act = nn.PReLU(nOut)

    def forward(self, input):
        '''
        :param input: input feature map
        :return: transformed feature map
        '''
        output = self.conv(input)
        #output = self.conv1(output)
        output = self.bn(output)
        output = self.act(output)
        return output
    def fuseforward(self, input):
        output = self.conv(input)
        output = self.act(output)
        return output


class CB(nn.Module):
    '''
    This class groups the convolution and batch normalization
    '''
    def __init__(self, nIn, nOut, kSize, stride=1):
        '''
        :param nIn: number of input channels
        :param nOut: number of output channels
        :param kSize: kernel size
        :param stride: optional stride for down-sampling
        '''
        super().__init__()
        padding = int((kSize - 1)/2)
        self.conv = nn.Conv2d(nIn, nOut, (kSize, kSize), stride=stride, padding=(padding, padding), bias=False)
        self.bn = nn.BatchNorm2d(nOut, eps=1e-03)

    def forward(self, input):
        '''

        :param input: input feature map
        :return: transformed feature map
        '''
        output = self.conv(input)
        output = self.bn(output)
        return output

class C(nn.Module):
    '''
    This class is for a convolutional layer.
    '''
    def __init__(self, nIn, nOut, kSize, stride=1):
        '''

        :param nIn: number of input channels
        :param nOut: number of output channels
        :param kSize: kernel size
        :param stride: optional stride rate for down-sampling
        '''
        super().__init__()
        padding = int((kSize - 1)/2)
        # print(nIn, nOut, (kSize, kSize))
        self.conv = nn.Conv2d(nIn, nOut, (kSize, kSize), stride=stride, padding=(padding, padding), bias=False)

    def forward(self, input):
        '''
        :param input: input feature map
        :return: transformed feature map
        '''
        output = self.conv(input)
        return output

class CDilated(nn.Module):
    '''
    This class defines the dilated convolution.
    '''
    def __init__(self, nIn, nOut, kSize, stride=1, d=1):
        '''
        :param nIn: number of input channels
        :param nOut: number of output channels
        :param kSize: kernel size
        :param stride: optional stride rate for down-sampling
        :param d: optional dilation rate
        '''
        super().__init__()
        padding = int((kSize - 1)/2) * d
        self.conv = nn.Conv2d(nIn, nOut, (kSize, kSize), stride=stride, padding=(padding, padding), bias=False, dilation=d)

    def forward(self, input):
        '''
        :param input: input feature map
        :return: transformed feature map
        '''
        output = self.conv(input)
        return output

class DownSamplerB(nn.Module):
    def __init__(self, nIn, nOut):
        super().__init__()
        n = int(nOut/5)
        n1 = nOut - 4*n
        self.c1 = C(nIn, n, 3, 2)
        self.d1 = CDilated(n, n1, 3, 1, 1)
        self.d2 = CDilated(n, n, 3, 1, 2)
        self.d4 = CDilated(n, n, 3, 1, 4)
        self.d8 = CDilated(n, n, 3, 1, 8)
        self.d16 = CDilated(n, n, 3, 1, 16)
        self.bn = nn.BatchNorm2d(nOut, eps=1e-3)
        self.act = nn.PReLU(nOut)

    def forward(self, input):
        output1 = self.c1(input)
        d1 = self.d1(output1)
        d2 = self.d2(output1)
        d4 = self.d4(output1)
        d8 = self.d8(output1)
        d16 = self.d16(output1)

        add1 = d2
        add2 = add1 + d4
        add3 = add2 + d8
        add4 = add3 + d16

        combine = torch.cat([d1, add1, add2, add3, add4],1)
        #combine_in_out = input + combine
        output = self.bn(combine)
        output = self.act(output)
        return output
class BR(nn.Module):
    '''
    This class groups the batch normalization and PReLU activation
    '''
    def __init__(self, nOut):
        '''
        :param nOut: output feature maps
        '''
        super().__init__()
        self.nOut=nOut
        self.bn = nn.BatchNorm2d(nOut, eps=1e-03)
        self.act = nn.PReLU(nOut)

    def forward(self, input):
        '''
        :param input: input feature map
        :return: normalized and thresholded feature map
        '''
        # print("bf bn :",input.size(),self.nOut)
        output = self.bn(input)
        # print("after bn :",output.size())
        output = self.act(output)
        # print("after act :",output.size())
        return output
class DilatedParllelResidualBlockB(nn.Module):
    '''
    This class defines the ESP block, which is based on the following principle
        Reduce ---> Split ---> Transform --> Merge
    '''
    def __init__(self, nIn, nOut, add=True):
        '''
        :param nIn: number of input channels
        :param nOut: number of output channels
        :param add: if true, add a residual connection through identity operation. A projection could be
                used as in the ResNet paper when the dimensions differ, but we avoid it because we do not
                want to increase the module complexity
        '''
        super().__init__()
        n = max(int(nOut/5),1)
        n1 = max(nOut - 4*n,1)
        # print(nIn,n,n1,"--")
        self.c1 = C(nIn, n, 1, 1)
        self.d1 = CDilated(n, n1, 3, 1, 1) # dilation rate of 2^0
        self.d2 = CDilated(n, n, 3, 1, 2) # dilation rate of 2^1
        self.d4 = CDilated(n, n, 3, 1, 4) # dilation rate of 2^2
        self.d8 = CDilated(n, n, 3, 1, 8) # dilation rate of 2^3
        self.d16 = CDilated(n, n, 3, 1, 16) # dilation rate of 2^4
        # print("nOut bf :",nOut)
        self.bn = BR(nOut)
        # print("nOut at :",self.bn.size())
        self.add = add

    def forward(self, input):
        '''
        :param input: input feature map
        :return: transformed feature map
        '''
        # reduce
        output1 = self.c1(input)
        # split and transform
        d1 = self.d1(output1)
        d2 = self.d2(output1)
        d4 = self.d4(output1)
        d8 = self.d8(output1)
        d16 = self.d16(output1)


        # hierarchical fusion for de-gridding
        add1 = d2
        add2 = add1 + d4
        add3 = add2 + d8
        add4 = add3 + d16
        # print(d1.size(),add1.size(),add2.size(),add3.size(),add4.size())

        #merge
        combine = torch.cat([d1, add1, add2, add3, add4], 1)
        # print("combine :",combine.size())
        # if residual version
        if self.add:
            # print("add :",combine.size())
            combine = input + combine
        # print(combine.size(),"-----------------")
        output = self.bn(combine)
        return output

class InputProjectionA(nn.Module):
    '''
    This class projects the input image to the same spatial dimensions as the feature map.
    For example, if the input image is 512 x512 x3 and spatial dimensions of feature map size are 56x56xF, then
    this class will generate an output of 56x56x3
    '''
    def __init__(self, samplingTimes):
        '''
        :param samplingTimes: The rate at which you want to down-sample the image
        '''
        super().__init__()
        self.pool = nn.ModuleList()
        for i in range(0, samplingTimes):
            #pyramid-based approach for down-sampling
            self.pool.append(nn.AvgPool2d(3, stride=2, padding=1))

    def forward(self, input):
        '''
        :param input: Input RGB Image
        :return: down-sampled image (pyramid-based approach)
        '''
        for pool in self.pool:
            input = pool(input)
        return input

class ESPNet_Encoder(nn.Module):
    '''
    This class defines the ESPNet-C network in the paper
    '''
    def __init__(self, p=5, q=3):
    # def __init__(self, classes=20, p=1, q=1):
        '''
        :param p: number of ESP blocks stacked at level 2
        :param q: number of ESP blocks stacked at level 3
        '''
        super().__init__()
        self.level1 = CBR(3, 16, 3, 2)
        self.sample1 = InputProjectionA(1)
        self.sample2 = InputProjectionA(2)

        self.b1 = CBR(16 + 3,19,3)
        self.level2_0 = DownSamplerB(16 +3, 64)

        self.level2 = nn.ModuleList()
        for i in range(0, p):
            self.level2.append(DilatedParllelResidualBlockB(64 , 64))
        self.b2 = CBR(128 + 3,131,3)

        self.level3_0 = DownSamplerB(128 + 3, 128)
        self.level3 = nn.ModuleList()
        for i in range(0, q):
            self.level3.append(DilatedParllelResidualBlockB(128 , 128))
        # self.mixstyle = MixStyle2(p=0.5, alpha=0.1)
        self.b3 = CBR(256,32,3)
        self.sa = PAM_Module(32)
        self.sc = CAM_Module(32)
        self.conv_sa = CBR(32,32,3)
        self.conv_sc = CBR(32,32,3)
        self.classifier = CBR(32, 32, 1, 1)

    def forward(self, input):
        '''
        :param input: Receives the input RGB image
        :return: the transformed feature map with spatial dimensions 1/8th of the input image
        '''
        output0 = self.level1(input)
        inp1 = self.sample1(input)
        inp2 = self.sample2(input)

        output0_cat = self.b1(torch.cat([output0, inp1], 1))
        output1_0 = self.level2_0(output0_cat) # down-sampled

        for i, layer in enumerate(self.level2):
            if i==0:
                output1 = layer(output1_0)
            else:
                output1 = layer(output1)

        output1_cat = self.b2(torch.cat([output1, output1_0, inp2], 1))
        output2_0 = self.level3_0(output1_cat) # down-sampled
        for i, layer in enumerate(self.level3):
            if i==0:
                output2 = layer(output2_0)
            else:
                output2 = layer(output2)
        cat_=torch.cat([output2_0, output2], 1)

        output2_cat = self.b3(cat_)
        out_sa=self.sa(output2_cat)
        out_sa=self.conv_sa(out_sa)
        out_sc=self.sc(output2_cat)
        out_sc=self.conv_sc(out_sc)
        out_s=out_sa+out_sc
        classifier = self.classifier(out_s)

        return classifier

class TwinLiteNet(nn.Module):
    '''
    This class defines the TwinLiteNet network: the ESPNet encoder above followed by
    two parallel up-sampling branches, each ending in a 2-channel classifier
    '''

    def __init__(self, p=2, q=3, ):

        super().__init__()
        self.encoder = ESPNet_Encoder(p, q)

        self.up_1_1 = UPx2(32,16)
        self.up_2_1 = UPx2(16,8)

        self.up_1_2 = UPx2(32,16)
        self.up_2_2 = UPx2(16,8)

        self.classifier_1 = UPx2(8,2)
        self.classifier_2 = UPx2(8,2)



    def forward(self, input):

        x=self.encoder(input)
        x1=self.up_1_1(x)
        x1=self.up_2_1(x1)
        classifier1=self.classifier_1(x1)



        x2=self.up_1_2(x)
        x2=self.up_2_2(x2)
        classifier2=self.classifier_2(x2)

        return (classifier1,classifier2)
model/TwinLite_8class.py
ADDED
|
@@ -0,0 +1,26 @@
"""TwinLiteNet adapted for SINGLE 8-class semantic output (not dual binary).

Same encoder and decoder upsampling, but final classifier outputs 8 channels
matching our Segformer setup:
0=tree 1=ground 2=person 3=sky 4=road 5=mountain 6=building 7=background

We keep one branch only and drop classifier_2 entirely, so the model is slightly faster and smaller.
"""
import torch
import torch.nn as nn
from .TwinLite import ESPNet_Encoder, UPx2


class TwinLiteNet8(nn.Module):
    def __init__(self, num_classes: int = 8, p: int = 2, q: int = 3):
        super().__init__()
        self.encoder = ESPNet_Encoder(p, q)
        self.up_1 = UPx2(32, 16)
        self.up_2 = UPx2(16, 8)
        self.classifier = UPx2(8, num_classes)

    def forward(self, x):
        x = self.encoder(x)
        x = self.up_1(x)
        x = self.up_2(x)
        return self.classifier(x)  # (B, num_classes, H, W)
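A note on the `(B, num_classes, H, W)` comment above: the encoder contains three stride-2 3x3 convolutions (in `level1`, `level2_0` and `level3_0`) and the decoder contains three `UPx2` layers, so the output matches the input size only when the three halvings are exact. The sketch below applies the standard output-size formulas for `Conv2d` and `ConvTranspose2d` to a single spatial dimension. The helper names are hypothetical and no PyTorch is needed.

```python
def conv_out(size, kernel=3, stride=2, padding=1, dilation=1):
    """Output length of a Conv2d along one spatial dimension."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def deconv_out(size, kernel=2, stride=2):
    """Output length of ConvTranspose2d with padding=0 and output_padding=0."""
    return (size - 1) * stride + kernel


def round_trip(size):
    """Three stride-2 convolutions (encoder), then three UPx2 layers (decoder)."""
    for _ in range(3):
        size = conv_out(size)
    for _ in range(3):
        size = deconv_out(size)
    return size


for size in (640, 360, 100):
    print(f"{size} -> {round_trip(size)}")
```

Under these formulas 640 and 360 come back unchanged, while 100 comes back as 104, which is one reason to resize frames to the training resolution before running the network.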
model/__pycache__/TwinLite.cpython-311.pyc
ADDED
Binary file (25.4 kB)
model/__pycache__/TwinLite.cpython-38.pyc
ADDED
Binary file (13.9 kB)
model/__pycache__/TwinLite_8class.cpython-311.pyc
ADDED
Binary file (2.07 kB)
predict.py
ADDED
|
@@ -0,0 +1,103 @@
"""TwinLiteNet8 inference: single image or directory.

Same interface as Segformer's predict.py for easy swap.
Trained at 640x360; this script resizes any input to 640x360 for
inference, then upsamples the logits back to the original resolution.

Usage:
    python predict.py input.jpg --weights run_8class/twinlite8_best.pt
    python predict.py --dir frames/ --out out/ --weights run_8class/twinlite8_best.pt
"""
from __future__ import annotations
import argparse, sys, os
from pathlib import Path

import cv2
import numpy as np
import torch
import torch.nn.functional as F

sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
from model.TwinLite_8class import TwinLiteNet8

NAMES = ["tree", "ground", "person", "sky", "road", "mountain", "building", "background"]
PALETTE = np.array([
    [60, 220, 60],    # tree
    [40, 100, 160],   # ground
    [40, 40, 230],    # person
    [230, 200, 60],   # sky
    [140, 140, 140],  # road
    [180, 60, 180],   # mountain
    [50, 220, 220],   # building
    [100, 100, 100],  # background
], dtype=np.uint8)
TRAIN_W, TRAIN_H = 640, 360


def load_model(weights, device="cuda"):
    model = TwinLiteNet8(num_classes=8).to(device).eval()
    ckpt = torch.load(weights, map_location=device, weights_only=False)
    model.load_state_dict(ckpt["model"] if "model" in ckpt else ckpt)
    return model


def predict(model, bgr_img, device="cuda"):
    """BGR uint8 → (H,W) class id mask 0..7 at original resolution."""
    H, W = bgr_img.shape[:2]
    inp_bgr = cv2.resize(bgr_img, (TRAIN_W, TRAIN_H))
    rgb = cv2.cvtColor(inp_bgr, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
    x = torch.from_numpy(rgb.transpose(2, 0, 1)).unsqueeze(0).float().to(device)
    with torch.no_grad():
        logits = model(x)
    # Upsample logits to original resolution before argmax (cleaner boundaries)
    logits = F.interpolate(logits, size=(H, W), mode="bilinear", align_corners=False)
    # v2: channel 7 (background) was never trained -> mask it out so it can't win argmax
    logits[:, 7, :, :] = -1e9
    return logits.argmax(1)[0].cpu().numpy().astype(np.uint8)


def colorize(mask):
    return PALETTE[mask]


def overlay(bgr, mask, alpha=0.45):
    return cv2.addWeighted(bgr, 1 - alpha, colorize(mask), alpha, 0)


def main():
    ap = argparse.ArgumentParser()
    ap.add_argument("input", nargs="?")
    ap.add_argument("--dir")
    ap.add_argument("--out", default=".")
    ap.add_argument("--weights", default="run_8class/twinlite8_best.pt")
    ap.add_argument("--device", default="cuda" if torch.cuda.is_available() else "cpu")
    args = ap.parse_args()

    if not args.input and not args.dir:
        ap.print_help(); return

    print(f"loading model from {args.weights} on {args.device} ...")
    model = load_model(args.weights, device=args.device)
    out_dir = Path(args.out); out_dir.mkdir(parents=True, exist_ok=True)

    paths = []
    if args.dir:
        paths = sorted(p for p in Path(args.dir).iterdir() if p.suffix.lower() in {".jpg",".jpeg",".png",".bmp"})
    if args.input:
        paths.append(Path(args.input))

    for p in paths:
        img = cv2.imread(str(p))
        if img is None:
            print(f" skip: {p}"); continue
        mask = predict(model, img, device=args.device)
        cv2.imwrite(str(out_dir / f"{p.stem}_pred.png"), mask)
        cv2.imwrite(str(out_dir / f"{p.stem}_overlay.jpg"), overlay(img, mask))
        counts = np.bincount(mask.flatten(), minlength=8)
        top = counts.argmax()
        print(f" {p.name:<50} top: {NAMES[top]} ({100*counts[top]/counts.sum():.1f}%)")
    print(f"\noutputs -> {out_dir.resolve()}")


if __name__ == "__main__":
    main()
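A note on the `logits[:, 7, :, :] = -1e9` line in predict.py: a channel that never received a training signal can carry arbitrary logits, so it may win the argmax by accident. Overwriting it with a very negative value removes it from the competition. The sketch below shows the effect for a single pixel in plain Python. The logit values are made up for illustration and `argmax_with_masked_channels` is a hypothetical helper, not a function from the repository.

```python
NAMES = ["tree", "ground", "person", "sky", "road", "mountain", "building", "background"]


def argmax_with_masked_channels(pixel_logits, masked=(7,), fill=-1e9):
    """Return the winning class index after forcing the masked channels to `fill`."""
    scores = list(pixel_logits)          # copy so the caller's list is untouched
    for channel in masked:
        scores[channel] = fill
    return max(range(len(scores)), key=scores.__getitem__)


# Made-up logits for one pixel: the untrained background channel happens to be largest.
pixel = [0.2, 1.1, -0.3, 0.0, 0.9, -1.0, 0.4, 3.5]

raw_winner = max(range(len(pixel)), key=pixel.__getitem__)
masked_winner = argmax_with_masked_channels(pixel)
print(f"without mask: {NAMES[raw_winner]}, with mask: {NAMES[masked_winner]}")
```

The same idea applies per pixel across the whole logit tensor; the scripts do it in one vectorized assignment before calling `argmax`.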
predict_onnx.py
ADDED
|
@@ -0,0 +1,84 @@
"""TwinLiteNet8 ONNX inference: for edge deployment / cross-platform.

Runs entirely via ONNX Runtime (no PyTorch needed at deploy time).
Use CPUExecutionProvider for CPU, CUDAExecutionProvider for GPU,
TensorrtExecutionProvider for TensorRT-accelerated runs on Jetson.

Usage:
    python predict_onnx.py input.jpg --onnx twinlite8.onnx
    python predict_onnx.py --dir frames/ --out out/ --onnx twinlite8.onnx --provider CUDAExecutionProvider
"""
import argparse
from pathlib import Path
import cv2
import numpy as np
import onnxruntime as ort

NAMES = ["tree","ground","person","sky","road","mountain","building","background"]
PALETTE = np.array([
    [60,220,60],[40,100,160],[40,40,230],[230,200,60],
    [140,140,140],[180,60,180],[50,220,220],[100,100,100],
], dtype=np.uint8)
TRAIN_W, TRAIN_H = 640, 360


def predict(sess, bgr_img):
    H, W = bgr_img.shape[:2]
    inp = cv2.resize(bgr_img, (TRAIN_W, TRAIN_H))
    rgb = cv2.cvtColor(inp, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
    x = rgb.transpose(2, 0, 1)[None].astype(np.float32)  # (1,3,H,W)
    logits = sess.run(None, {"input": x})[0]  # (1,8,H,W)
    logits[:, 7, :, :] = -1e9  # v2: bg channel never trained
    pred_small = logits.argmax(1)[0].astype(np.uint8)  # at training res
    if (H, W) != (TRAIN_H, TRAIN_W):
        return cv2.resize(pred_small, (W, H), interpolation=cv2.INTER_NEAREST)
    return pred_small


def main():
    ap = argparse.ArgumentParser()
    ap.add_argument("input", nargs="?")
    ap.add_argument("--dir")
    ap.add_argument("--out", default=".")
    ap.add_argument("--onnx", default="twinlite8.onnx")
    ap.add_argument("--provider", default=None,
                    help="ONNX provider: CPUExecutionProvider | CUDAExecutionProvider | TensorrtExecutionProvider")
    args = ap.parse_args()

    if not args.input and not args.dir:
        ap.print_help(); return

    available = ort.get_available_providers()
    if args.provider:
        providers = [args.provider]
    else:
        # Auto-pick best
        for p in ["TensorrtExecutionProvider", "CUDAExecutionProvider", "CPUExecutionProvider"]:
            if p in available: providers = [p]; break
    print(f"available providers: {available}")
    print(f"using: {providers}")
    sess = ort.InferenceSession(args.onnx, providers=providers)
    print(f"actual provider: {sess.get_providers()}")

    out_dir = Path(args.out); out_dir.mkdir(parents=True, exist_ok=True)
|
| 64 |
+
paths = []
|
| 65 |
+
if args.dir:
|
| 66 |
+
paths = sorted(p for p in Path(args.dir).iterdir() if p.suffix.lower() in {".jpg",".jpeg",".png",".bmp"})
|
| 67 |
+
if args.input: paths.append(Path(args.input))
|
| 68 |
+
|
| 69 |
+
for p in paths:
|
| 70 |
+
img = cv2.imread(str(p))
|
| 71 |
+
if img is None: continue
|
| 72 |
+
mask = predict(sess, img)
|
| 73 |
+
cv2.imwrite(str(out_dir / f"{p.stem}_pred.png"), mask)
|
| 74 |
+
overlay = cv2.addWeighted(img, 0.55, PALETTE[mask], 0.45, 0)
|
| 75 |
+
cv2.imwrite(str(out_dir / f"{p.stem}_overlay.jpg"), overlay)
|
| 76 |
+
counts = np.bincount(mask.flatten(), minlength=8)
|
| 77 |
+
top = counts.argmax()
|
| 78 |
+
print(f" {p.name:<50} top: {NAMES[top]} ({100*counts[top]/counts.sum():.1f}%)")
|
| 79 |
+
|
| 80 |
+
print(f"\noutputs -> {out_dir.resolve()}")
|
| 81 |
+
|
| 82 |
+
|
| 83 |
+
if __name__ == "__main__":
|
| 84 |
+
main()
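The provider auto-selection in `main()` above can be factored into a small helper. The following is a minimal sketch, not part of the commit: the name `pick_providers` and the fail-fast behaviour for an unavailable `--provider` are assumptions made for illustration. It takes the list that `ort.get_available_providers()` would return and uses plain Python, so it can be checked without ONNX Runtime installed.

```python
# Hypothetical helper (not in the commit): build the provider list that would be
# passed to ort.InferenceSession(..., providers=...).
PREFERENCE = [
    "TensorrtExecutionProvider",
    "CUDAExecutionProvider",
    "CPUExecutionProvider",
]


def pick_providers(available, requested=None):
    """Return providers in priority order, keeping CPU as a fallback when present."""
    if requested:
        # Assumption: fail early instead of letting the session fall back silently.
        if requested not in available:
            raise ValueError(f"{requested} not in available providers: {available}")
        chosen = [requested]
    else:
        # First entry of PREFERENCE that this ONNX Runtime build reports.
        chosen = [p for p in PREFERENCE if p in available][:1]
    if "CPUExecutionProvider" in available and "CPUExecutionProvider" not in chosen:
        chosen.append("CPUExecutionProvider")
    if not chosen:
        raise RuntimeError("no usable execution provider found")
    return chosen
```

In the script this would be called as `pick_providers(ort.get_available_providers(), args.provider)`. It also avoids the case where `providers` is left unassigned if none of the three names is reported.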
|
samples/0_frame_3884.jpg
ADDED
|
Git LFS Details
|
samples/1_frame_2803.jpg
ADDED
|
Git LFS Details
|
samples/2_frame_2626.jpg
ADDED
|
Git LFS Details
|
samples/3_frame_4093.jpg
ADDED
|
Git LFS Details
|
samples/4_frame_3138.jpg
ADDED
|
Git LFS Details
|
samples/5_frame_3076.jpg
ADDED
|
Git LFS Details
|
samples_20/sample_00_frame_3884.jpg
ADDED
|
Git LFS Details
|
samples_20/sample_01_frame_2803.jpg
ADDED
|
Git LFS Details
|
samples_20/sample_02_frame_2626.jpg
ADDED
|
Git LFS Details
|
samples_20/sample_03_frame_4093.jpg
ADDED
|
Git LFS Details
|
samples_20/sample_04_frame_3138.jpg
ADDED
|
Git LFS Details
|
samples_20/sample_05_frame_3076.jpg
ADDED
|
Git LFS Details
|
samples_20/sample_06_frame_3032.jpg
ADDED
|
Git LFS Details
|
samples_20/sample_07_frame_2860.jpg
ADDED
|
Git LFS Details
|
samples_20/sample_08_frame_4083.jpg
ADDED
|
Git LFS Details
|
samples_20/sample_09_frame_2784.jpg
ADDED
|
Git LFS Details
|
samples_20/sample_10_frame_3960.jpg
ADDED
|
Git LFS Details
|
samples_20/sample_11_frame_4091.jpg
ADDED
|
Git LFS Details
|
samples_20/sample_12_frame_4402.jpg
ADDED
|
Git LFS Details
|
samples_20/sample_13_frame_3691.jpg
ADDED
|
Git LFS Details
|
samples_20/sample_14_frame_2753.jpg
ADDED
|
Git LFS Details
|
samples_20/sample_15_frame_3784.jpg
ADDED
|
Git LFS Details
|
samples_20/sample_16_frame_3439.jpg
ADDED
|
Git LFS Details
|
samples_20/sample_17_frame_2640.jpg
ADDED
|
Git LFS Details
|
samples_20/sample_18_frame_2636.jpg
ADDED
|
Git LFS Details
|
samples_20/sample_19_frame_2766.jpg
ADDED
|
Git LFS Details
|
train_8class.py
ADDED
|
@@ -0,0 +1,247 @@
|
| 1 |
+
"""TwinLiteNet8 — single-branch 8-class semantic seg, directly comparable to Segformer.
|
| 2 |
+
|
| 3 |
+
Classes: 0 tree 1 ground 2 person 3 sky 4 road 5 mountain 6 building 7 background
|
| 4 |
+
"""
|
| 5 |
+
from __future__ import annotations
|
| 6 |
+
import os, sys, json, re, time, random
|
| 7 |
+
from pathlib import Path
|
| 8 |
+
import numpy as np, cv2, torch
|
| 9 |
+
import torch.nn as nn
|
| 10 |
+
import torch.nn.functional as F
|
| 11 |
+
from torch.utils.data import Dataset, DataLoader, ConcatDataset
|
| 12 |
+
|
| 13 |
+
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
|
| 14 |
+
from model.TwinLite_8class import TwinLiteNet8
|
| 15 |
+
|
| 16 |
+
# ───────── config ─────────
|
| 17 |
+
ROOT = Path(r"C:/Users/room104/Desktop/AGMOtree/semantic_segmantation")
|
| 18 |
+
OLD_IMG = ROOT / "merged_dataset/train/images"
|
| 19 |
+
OLD_MSK = ROOT / "merged_dataset/train/masks_pseudo"
|
| 20 |
+
NEW_IMG = ROOT / "orchard_nav/train/images"
|
| 21 |
+
NEW_MSK = ROOT / "orchard_nav/train/masks"
|
| 22 |
+
|
| 23 |
+
OUT_DIR = Path(r"C:/Users/room104/Desktop/AGMOtree/TwinLiteNet_train/run_v2")
|
| 24 |
+
OUT_DIR.mkdir(parents=True, exist_ok=True)
|
| 25 |
+
|
| 26 |
+
NAMES = ["tree","ground","person","sky","road","mountain","building","background"]
|
| 27 |
+
NUM_CLASSES = 8
|
| 28 |
+
IGNORE_INDEX = 255
|
| 29 |
+
|
| 30 |
+
W_IN, H_IN = 640, 360
|
| 31 |
+
BATCH = 16
|
| 32 |
+
EPOCHS = 60
|
| 33 |
+
LR = 5e-4
|
| 34 |
+
NUM_WORKERS = 4
|
| 35 |
+
SEED = 42
|
| 36 |
+
DEVICE = "cuda"
|
| 37 |
+
|
| 38 |
+
# v2 design: background is NOT a real class. Pixels labeled 7 → 255 (ignore_index)
|
| 39 |
+
# in the loader, so loss never trains channel 7. Weight 0 as belt-and-braces.
|
| 40 |
+
# At inference, the channel 7 logit is set to -1e9 (effectively -inf) before argmax (see predict.py).
|
| 41 |
+
WEIGHTS = np.array([1.5, 0.5, 1.5, 1.0, 1.0, 1.0, 1.0, 0.0], dtype=np.float32)
|
| 42 |
+
|
| 43 |
+
random.seed(SEED); np.random.seed(SEED); torch.manual_seed(SEED)
|
| 44 |
+
|
| 45 |
+
|
| 46 |
+
def frame_num(p):
|
| 47 |
+
m = re.match(r"frame_(\d+)", p.stem); return int(m.group(1)) if m else -1
|
| 48 |
+
|
| 49 |
+
|
| 50 |
+
class OrchardDS(Dataset):
|
| 51 |
+
def __init__(self, paths, mask_dir, augment=False, source="old"):
|
| 52 |
+
self.paths = paths
|
| 53 |
+
self.mask_dir = mask_dir
|
| 54 |
+
self.augment = augment
|
| 55 |
+
self.source = source
|
| 56 |
+
|
| 57 |
+
def __len__(self): return len(self.paths)
|
| 58 |
+
|
| 59 |
+
def __getitem__(self, i):
|
| 60 |
+
ip = self.paths[i]
|
| 61 |
+
img = cv2.imread(str(ip))
|
| 62 |
+
msk = cv2.imread(str(self.mask_dir / (ip.stem + ".png")), cv2.IMREAD_GRAYSCALE)
|
| 63 |
+
if img is None or msk is None:
|
| 64 |
+
img = np.zeros((H_IN, W_IN, 3), dtype=np.uint8)
|
| 65 |
+
msk = np.full((H_IN, W_IN), IGNORE_INDEX, dtype=np.uint8)
|
| 66 |
+
|
| 67 |
+
if self.augment:
|
| 68 |
+
if random.random() < 0.5:
|
| 69 |
+
img = np.ascontiguousarray(img[:, ::-1])
|
| 70 |
+
msk = np.ascontiguousarray(msk[:, ::-1])
|
| 71 |
+
if random.random() < 0.5:
|
| 72 |
+
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV).astype(np.int16)
|
| 73 |
+
hsv[..., 0] = (hsv[..., 0] + random.randint(-10, 10)) % 180
|
| 74 |
+
hsv[..., 1] = np.clip(hsv[..., 1] * random.uniform(0.7, 1.3), 0, 255)
|
| 75 |
+
hsv[..., 2] = np.clip(hsv[..., 2] * random.uniform(0.7, 1.3), 0, 255)
|
| 76 |
+
img = cv2.cvtColor(hsv.astype(np.uint8), cv2.COLOR_HSV2BGR)
|
| 77 |
+
|
| 78 |
+
img = cv2.resize(img, (W_IN, H_IN))
|
| 79 |
+
msk = cv2.resize(msk, (W_IN, H_IN), interpolation=cv2.INTER_NEAREST)
|
| 80 |
+
|
| 81 |
+
# v2: remap class 7 (background) -> IGNORE_INDEX so it does NOT train.
|
| 82 |
+
# The user's intent: "background = stuff the model can't recognize", not a real class.
|
| 83 |
+
if self.source == "old":
|
| 84 |
+
msk[msk == 7] = IGNORE_INDEX
|
| 85 |
+
# new-source masks already have 255 for non-tree pixels, no change needed.
|
| 86 |
+
|
| 87 |
+
img = img[:, :, ::-1].transpose(2, 0, 1).astype(np.float32) / 255.0
|
| 88 |
+
return (torch.from_numpy(img).float(),
|
| 89 |
+
torch.from_numpy(msk).long())
|
| 90 |
+
|
| 91 |
+
|
| 92 |
+
# ─── temporal split ───
|
| 93 |
+
old_all = sorted(OLD_IMG.glob("*.jpg"))
|
| 94 |
+
old_train = [p for p in old_all if frame_num(p) <= 4500]
|
| 95 |
+
old_val = [p for p in old_all if frame_num(p) > 4500]
|
| 96 |
+
|
| 97 |
+
new_all = sorted(NEW_IMG.glob("*.jpg")); random.shuffle(new_all)
|
| 98 |
+
n_new_val = max(20, len(new_all) // 10)
|
| 99 |
+
new_val = new_all[:n_new_val]
|
| 100 |
+
new_train = new_all[n_new_val:]
|
| 101 |
+
|
| 102 |
+
train_ds = ConcatDataset([
|
| 103 |
+
OrchardDS(old_train, OLD_MSK, augment=True, source="old"),
|
| 104 |
+
OrchardDS(new_train, NEW_MSK, augment=True, source="new"),
|
| 105 |
+
])
|
| 106 |
+
old_val_ds = OrchardDS(old_val, OLD_MSK, augment=False, source="old")
|
| 107 |
+
new_val_ds = OrchardDS(new_val, NEW_MSK, augment=False, source="new")
|
| 108 |
+
|
| 109 |
+
print(f"=== TwinLiteNet8 (single-branch, 8-class) ===")
|
| 110 |
+
print(f" old train: {len(old_train)} new train: {len(new_train)}")
|
| 111 |
+
print(f" old val: {len(old_val)} new val: {len(new_val)}")
|
| 112 |
+
|
| 113 |
+
|
| 114 |
+
# ─── eval ───
|
| 115 |
+
def confusion(preds, ys, n, ignore=IGNORE_INDEX):
|
| 116 |
+
cm = np.zeros((n, n), dtype=np.int64)
|
| 117 |
+
valid = ys != ignore
|
| 118 |
+
if not valid.any(): return cm
|
| 119 |
+
p = preds[valid]; t = ys[valid]
|
| 120 |
+
for tc in range(n):
|
| 121 |
+
mt = (t == tc)
|
| 122 |
+
if not mt.any(): continue
|
| 123 |
+
for pc in range(n):
|
| 124 |
+
cm[tc, pc] += int(((p == pc) & mt).sum())
|
| 125 |
+
return cm
|
| 126 |
+
|
| 127 |
+
def iou_from_cm(cm):
|
| 128 |
+
n = cm.shape[0]; ious = np.zeros(n)
|
| 129 |
+
for c in range(n):
|
| 130 |
+
tp = cm[c,c]; fp = cm[:,c].sum()-tp; fn = cm[c,:].sum()-tp
|
| 131 |
+
ious[c] = tp / (tp+fp+fn) if (tp+fp+fn) > 0 else float("nan")
|
| 132 |
+
return ious
|
| 133 |
+
|
| 134 |
+
|
| 135 |
+
# ─── train ───
|
| 136 |
+
log_path = OUT_DIR / "log.txt"
|
| 137 |
+
def log(m):
|
| 138 |
+
print(m, flush=True)
|
| 139 |
+
with log_path.open("a", encoding="utf-8") as f: f.write(m + "\n")
|
| 140 |
+
|
| 141 |
+
|
| 142 |
+
def main():
|
| 143 |
+
log_path.write_text("")
|
| 144 |
+
train_loader = DataLoader(train_ds, batch_size=BATCH, shuffle=True,
|
| 145 |
+
num_workers=NUM_WORKERS, pin_memory=True, drop_last=True,
|
| 146 |
+
persistent_workers=True)
|
| 147 |
+
old_val_loader = DataLoader(old_val_ds, batch_size=BATCH, shuffle=False,
|
| 148 |
+
num_workers=2, pin_memory=True, persistent_workers=True)
|
| 149 |
+
new_val_loader = DataLoader(new_val_ds, batch_size=BATCH, shuffle=False,
|
| 150 |
+
num_workers=2, pin_memory=True, persistent_workers=True)
|
| 151 |
+
|
| 152 |
+
model = TwinLiteNet8(num_classes=NUM_CLASSES).to(DEVICE)
|
| 153 |
+
n_params = sum(p.numel() for p in model.parameters())
|
| 154 |
+
log(f"model: TwinLiteNet8 params: {n_params/1e6:.3f}M")
|
| 155 |
+
log(f"input: {W_IN}x{H_IN} batch: {BATCH} epochs: {EPOCHS} LR: {LR}")
|
| 156 |
+
log(f"classes: {NAMES}")
|
| 157 |
+
log(f"weights: {dict(zip(NAMES, [round(float(w),2) for w in WEIGHTS]))}")
|
| 158 |
+
log(f"train: {len(train_ds)} old_val: {len(old_val_ds)} new_val: {len(new_val_ds)}")
|
| 159 |
+
|
| 160 |
+
cw = torch.tensor(WEIGHTS, dtype=torch.float32, device=DEVICE)
|
| 161 |
+
loss_fn = nn.CrossEntropyLoss(weight=cw, ignore_index=IGNORE_INDEX)
|
| 162 |
+
optim = torch.optim.AdamW(model.parameters(), lr=LR, weight_decay=1e-4)
|
| 163 |
+
sched = torch.optim.lr_scheduler.CosineAnnealingLR(optim, T_max=EPOCHS * len(train_loader))
|
| 164 |
+
|
| 165 |
+
best_tree = -1.0
|
| 166 |
+
history = []
|
| 167 |
+
for epoch in range(1, EPOCHS+1):
|
| 168 |
+
model.train()
|
| 169 |
+
t0 = time.time()
|
| 170 |
+
ep_loss = 0.0
|
| 171 |
+
for x, y in train_loader:
|
| 172 |
+
x = x.cuda(non_blocking=True); y = y.cuda(non_blocking=True)
|
| 173 |
+
logits = model(x)
|
| 174 |
+
loss = loss_fn(logits, y)
|
| 175 |
+
optim.zero_grad(); loss.backward(); optim.step(); sched.step()
|
| 176 |
+
ep_loss += loss.item()
|
| 177 |
+
train_loss = ep_loss / len(train_loader)
|
| 178 |
+
|
| 179 |
+
model.eval()
|
| 180 |
+
cm_old = np.zeros((NUM_CLASSES, NUM_CLASSES), dtype=np.int64)
|
| 181 |
+
tree_tp = tree_fn = 0
|
| 182 |
+
with torch.no_grad():
|
| 183 |
+
for x, y in old_val_loader:
|
| 184 |
+
x = x.cuda(); y = y.cuda()
|
| 185 |
+
logits = model(x)
|
| 186 |
+
logits[:, 7, :, :] = -1e9 # never predict background — that channel is untrained
|
| 187 |
+
preds = logits.argmax(1)
|
| 188 |
+
cm_old += confusion(preds.cpu().numpy(), y.cpu().numpy(), NUM_CLASSES)
|
| 189 |
+
for x, y in new_val_loader:
|
| 190 |
+
x = x.cuda(); y = y.cuda()
|
| 191 |
+
logits = model(x)
|
| 192 |
+
logits[:, 7, :, :] = -1e9
|
| 193 |
+
preds = logits.argmax(1).cpu().numpy()
|
| 194 |
+
ys = y.cpu().numpy()
|
| 195 |
+
tm = (ys == 0)
|
| 196 |
+
tree_tp += int(((preds == 0) & tm).sum())
|
| 197 |
+
tree_fn += int(((preds != 0) & tm).sum())
|
| 198 |
+
|
| 199 |
+
iou_old = iou_from_cm(cm_old)
|
| 200 |
+
miou_7 = float(np.nanmean(iou_old[:7]))
|
| 201 |
+
tree_old = float(iou_old[0])
|
| 202 |
+
ground_old = float(iou_old[1])
|
| 203 |
+
tree_recall_new = tree_tp / (tree_tp + tree_fn) if (tree_tp + tree_fn) > 0 else float("nan")
|
| 204 |
+
elapsed = time.time() - t0
|
| 205 |
+
|
| 206 |
+
log(f"epoch {epoch:02d}/{EPOCHS} loss={train_loss:.4f} "
|
| 207 |
+
f"mIoU(7)={miou_7:.3f} tree_old={tree_old:.3f} ground_old={ground_old:.3f} "
|
| 208 |
+
f"tree_new_recall={tree_recall_new:.3f} ({elapsed:.0f}s)")
|
| 209 |
+
log(f" per-class IoU: " + ", ".join(f"{n}={v:.3f}" for n, v in zip(NAMES, iou_old)))
|
| 210 |
+
|
| 211 |
+
history.append({
|
| 212 |
+
"epoch": epoch, "loss": float(train_loss),
|
| 213 |
+
"miou_7": miou_7, "tree_iou_old": tree_old, "ground_iou_old": ground_old,
|
| 214 |
+
"tree_recall_new": float(tree_recall_new),
|
| 215 |
+
"per_class_iou": {n: float(v) for n, v in zip(NAMES, iou_old)},
|
| 216 |
+
})
|
| 217 |
+
torch.save({"model": model.state_dict(), "epoch": epoch,
|
| 218 |
+
"tree_iou_old": tree_old, "miou_7": miou_7, "tree_recall_new": float(tree_recall_new)},
|
| 219 |
+
OUT_DIR / "twinlite8_last.pt")
|
| 220 |
+
if tree_old > best_tree:
|
| 221 |
+
best_tree = tree_old
|
| 222 |
+
torch.save({"model": model.state_dict(), "epoch": epoch,
|
| 223 |
+
"tree_iou_old": tree_old, "miou_7": miou_7, "tree_recall_new": float(tree_recall_new)},
|
| 224 |
+
OUT_DIR / "twinlite8_best.pt")
|
| 225 |
+
log(f" saved best (tree_old {tree_old:.3f})")
|
| 226 |
+
(OUT_DIR / "history.json").write_text(json.dumps(history, indent=2))
|
| 227 |
+
|
| 228 |
+
log(f"\n=== DONE === best tree_old IoU: {best_tree:.3f}")
|
| 229 |
+
|
| 230 |
+
# ─── FPS benchmark ───
|
| 231 |
+
log(f"\n=== FPS BENCHMARK (RTX 3080, batch=1, 640x360) ===")
|
| 232 |
+
model.eval()
|
| 233 |
+
x = torch.randn(1, 3, H_IN, W_IN, device=DEVICE)
|
| 234 |
+
with torch.no_grad():
|
| 235 |
+
for _ in range(20): model(x)
|
| 236 |
+
torch.cuda.synchronize()
|
| 237 |
+
t0 = time.time()
|
| 238 |
+
N = 200
|
| 239 |
+
for _ in range(N): model(x)
|
| 240 |
+
torch.cuda.synchronize()
|
| 241 |
+
fps = N / (time.time() - t0)
|
| 242 |
+
log(f" TwinLiteNet8 @ 640x360 batch=1: {fps:.1f} FPS")
|
| 243 |
+
log(f" Jetson Orin Nano estimate: ~{fps/4:.0f}-{fps/3:.0f} FPS")
|
| 244 |
+
|
| 245 |
+
|
| 246 |
+
if __name__ == "__main__":
|
| 247 |
+
main()
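The training loop calls `sched.step()` once per batch, with `T_max=EPOCHS * len(train_loader)`, so the learning rate follows a single cosine curve over the whole run rather than one per epoch. The sketch below works through that arithmetic in plain Python. It assumes the scheduler's default `eta_min` of 0 (the script does not pass one) and takes the training-set size from the `train: 5457` line of training_log.txt. The function name `cosine_lr` is illustrative only.

```python
import math

LR = 5e-4             # base learning rate from the config block
EPOCHS = 60
BATCH = 16
TRAIN_SAMPLES = 5457  # "train: 5457" in training_log.txt

# drop_last=True discards the final partial batch, hence floor division.
steps_per_epoch = TRAIN_SAMPLES // BATCH
t_max = EPOCHS * steps_per_epoch


def cosine_lr(step, base_lr=LR, total=t_max, eta_min=0.0):
    """Closed-form cosine annealing: eta_min + (base - eta_min) * (1 + cos(pi * t / T)) / 2."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1.0 + math.cos(math.pi * step / total))


# Learning rate at the start of a few epochs and at the very end of training.
for epoch in (0, 15, 30, 45, 60):
    print(epoch, cosine_lr(epoch * steps_per_epoch))
```

Under these assumptions the rate is half the base value at the midpoint (epoch 30) and reaches zero only after the last step.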
|
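A small worked example of the IoU computation in `iou_from_cm`, written in plain Python for illustration. The 3-class confusion matrix is made up and the helper names are hypothetical. It shows why a class that never occurs in the ground truth and is never predicted gets `nan` rather than 0, and why the script averages with `np.nanmean`.

```python
def iou_per_class(cm):
    """cm[t][p] = number of pixels with true class t predicted as class p."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, truth differs
        fn = sum(cm[c]) - tp                       # truth c, predicted otherwise
        denom = tp + fp + fn
        ious.append(tp / denom if denom > 0 else float("nan"))
    return ious


def nanmean(values):
    """Mean over the entries that are not NaN (NaN is the only value != itself)."""
    kept = [v for v in values if v == v]
    return sum(kept) / len(kept) if kept else float("nan")


# Made-up counts: class 2 is absent from both ground truth and predictions.
cm = [
    [8, 2, 0],
    [1, 9, 0],
    [0, 0, 0],
]
ious = iou_per_class(cm)
mean_iou = nanmean(ious)
```

Here class 0 has IoU 8/11, class 1 has 9/12, and class 2 is `nan`. The same mechanism would explain the `background=nan` entries in training_log.txt: label 7 is remapped to the ignore index and its logit is masked before argmax, so its row and column of the confusion matrix stay empty.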
training_log.txt
ADDED
|
@@ -0,0 +1,140 @@
|
| 1 |
+
model: TwinLiteNet8 params: 0.437M
|
| 2 |
+
input: 640x360 batch: 16 epochs: 60 LR: 0.0005
|
| 3 |
+
classes: ['tree', 'ground', 'person', 'sky', 'road', 'mountain', 'building', 'background']
|
| 4 |
+
weights: {'tree': 1.5, 'ground': 0.5, 'person': 1.5, 'sky': 1.0, 'road': 1.0, 'mountain': 1.0, 'building': 1.0, 'background': 0.0}
|
| 5 |
+
train: 5457 old_val: 155 new_val: 31
|
| 6 |
+
epoch 01/60 loss=1.3311 mIoU(7)=0.375 tree_old=0.818 ground_old=0.877 tree_new_recall=0.872 (94s)
|
| 7 |
+
per-class IoU: tree=0.818, ground=0.877, person=0.044, sky=0.779, road=0.001, mountain=0.090, building=0.012, background=nan
|
| 8 |
+
saved best (tree_old 0.818)
|
| 9 |
+
epoch 02/60 loss=0.7963 mIoU(7)=0.426 tree_old=0.812 ground_old=0.859 tree_new_recall=0.865 (65s)
|
| 10 |
+
per-class IoU: tree=0.812, ground=0.859, person=0.066, sky=0.800, road=0.020, mountain=0.396, building=0.033, background=nan
|
| 11 |
+
epoch 03/60 loss=0.5680 mIoU(7)=0.558 tree_old=0.850 ground_old=0.884 tree_new_recall=0.939 (65s)
|
| 12 |
+
per-class IoU: tree=0.850, ground=0.884, person=0.293, sky=0.819, road=0.645, mountain=0.393, building=0.023, background=nan
|
| 13 |
+
saved best (tree_old 0.850)
|
| 14 |
+
epoch 04/60 loss=0.4366 mIoU(7)=0.596 tree_old=0.853 ground_old=0.892 tree_new_recall=0.967 (65s)
|
| 15 |
+
per-class IoU: tree=0.853, ground=0.892, person=0.372, sky=0.826, road=0.584, mountain=0.475, building=0.169, background=nan
|
| 16 |
+
saved best (tree_old 0.853)
|
| 17 |
+
epoch 05/60 loss=0.3549 mIoU(7)=0.622 tree_old=0.836 ground_old=0.885 tree_new_recall=0.963 (66s)
|
| 18 |
+
per-class IoU: tree=0.836, ground=0.885, person=0.396, sky=0.824, road=0.618, mountain=0.485, building=0.310, background=nan
|
| 19 |
+
epoch 06/60 loss=0.2965 mIoU(7)=0.620 tree_old=0.831 ground_old=0.889 tree_new_recall=0.967 (65s)
|
| 20 |
+
per-class IoU: tree=0.831, ground=0.889, person=0.346, sky=0.820, road=0.710, mountain=0.495, building=0.245, background=nan
|
| 21 |
+
epoch 07/60 loss=0.2606 mIoU(7)=0.661 tree_old=0.860 ground_old=0.904 tree_new_recall=0.981 (66s)
|
| 22 |
+
per-class IoU: tree=0.860, ground=0.904, person=0.396, sky=0.830, road=0.643, mountain=0.552, building=0.439, background=nan
|
| 23 |
+
saved best (tree_old 0.860)
|
| 24 |
+
epoch 08/60 loss=0.2286 mIoU(7)=0.644 tree_old=0.854 ground_old=0.902 tree_new_recall=0.991 (66s)
|
| 25 |
+
per-class IoU: tree=0.854, ground=0.902, person=0.412, sky=0.816, road=0.649, mountain=0.531, building=0.347, background=nan
|
| 26 |
+
epoch 09/60 loss=0.2093 mIoU(7)=0.432 tree_old=0.770 ground_old=0.855 tree_new_recall=0.887 (65s)
|
| 27 |
+
per-class IoU: tree=0.770, ground=0.855, person=0.360, sky=0.435, road=0.118, mountain=0.349, building=0.136, background=nan
|
| 28 |
+
epoch 10/60 loss=0.1984 mIoU(7)=0.661 tree_old=0.834 ground_old=0.883 tree_new_recall=0.990 (66s)
|
| 29 |
+
per-class IoU: tree=0.834, ground=0.883, person=0.404, sky=0.824, road=0.715, mountain=0.523, building=0.442, background=nan
|
| 30 |
+
epoch 11/60 loss=0.1792 mIoU(7)=0.683 tree_old=0.855 ground_old=0.910 tree_new_recall=0.996 (65s)
|
| 31 |
+
per-class IoU: tree=0.855, ground=0.910, person=0.388, sky=0.825, road=0.764, mountain=0.538, building=0.503, background=nan
|
| 32 |
+
epoch 12/60 loss=0.1724 mIoU(7)=0.669 tree_old=0.842 ground_old=0.882 tree_new_recall=0.995 (65s)
|
| 33 |
+
per-class IoU: tree=0.842, ground=0.882, person=0.398, sky=0.823, road=0.720, mountain=0.550, building=0.467, background=nan
|
| 34 |
+
epoch 13/60 loss=0.1639 mIoU(7)=0.683 tree_old=0.834 ground_old=0.883 tree_new_recall=0.994 (66s)
|
| 35 |
+
per-class IoU: tree=0.834, ground=0.883, person=0.421, sky=0.827, road=0.740, mountain=0.560, building=0.513, background=nan
|
| 36 |
+
epoch 14/60 loss=0.1574 mIoU(7)=0.674 tree_old=0.848 ground_old=0.895 tree_new_recall=0.999 (67s)
|
| 37 |
+
per-class IoU: tree=0.848, ground=0.895, person=0.436, sky=0.809, road=0.727, mountain=0.547, building=0.458, background=nan
|
| 38 |
+
epoch 15/60 loss=0.1535 mIoU(7)=0.664 tree_old=0.847 ground_old=0.897 tree_new_recall=0.998 (65s)
|
| 39 |
+
per-class IoU: tree=0.847, ground=0.897, person=0.380, sky=0.794, road=0.686, mountain=0.559, building=0.483, background=nan
|
| 40 |
+
epoch 16/60 loss=0.1486 mIoU(7)=0.685 tree_old=0.851 ground_old=0.893 tree_new_recall=0.997 (67s)
|
| 41 |
+
per-class IoU: tree=0.851, ground=0.893, person=0.496, sky=0.803, road=0.692, mountain=0.567, building=0.492, background=nan
|
| 42 |
+
epoch 17/60 loss=0.1457 mIoU(7)=0.698 tree_old=0.856 ground_old=0.901 tree_new_recall=0.997 (68s)
|
| 43 |
+
per-class IoU: tree=0.856, ground=0.901, person=0.460, sky=0.826, road=0.720, mountain=0.570, building=0.550, background=nan
|
| 44 |
+
epoch 18/60 loss=0.1392 mIoU(7)=0.672 tree_old=0.859 ground_old=0.908 tree_new_recall=0.998 (66s)
|
| 45 |
+
per-class IoU: tree=0.859, ground=0.908, person=0.417, sky=0.830, road=0.739, mountain=0.579, building=0.368, background=nan
|
| 46 |
+
epoch 19/60 loss=0.1354 mIoU(7)=0.687 tree_old=0.862 ground_old=0.903 tree_new_recall=0.999 (65s)
|
| 47 |
+
per-class IoU: tree=0.862, ground=0.903, person=0.442, sky=0.821, road=0.694, mountain=0.581, building=0.505, background=nan
|
| 48 |
+
saved best (tree_old 0.862)
|
| 49 |
+
epoch 20/60 loss=0.1325 mIoU(7)=0.702 tree_old=0.863 ground_old=0.909 tree_new_recall=0.995 (67s)
|
| 50 |
+
per-class IoU: tree=0.863, ground=0.909, person=0.411, sky=0.829, road=0.747, mountain=0.536, building=0.620, background=nan
|
| 51 |
+
saved best (tree_old 0.863)
|
| 52 |
+
epoch 21/60 loss=0.1303 mIoU(7)=0.676 tree_old=0.859 ground_old=0.899 tree_new_recall=0.996 (66s)
|
| 53 |
+
per-class IoU: tree=0.859, ground=0.899, person=0.390, sky=0.825, road=0.689, mountain=0.595, building=0.473, background=nan
|
| 54 |
+
epoch 22/60 loss=0.1275 mIoU(7)=0.713 tree_old=0.865 ground_old=0.907 tree_new_recall=0.998 (65s)
|
| 55 |
+
per-class IoU: tree=0.865, ground=0.907, person=0.490, sky=0.820, road=0.724, mountain=0.576, building=0.606, background=nan
|
| 56 |
+
saved best (tree_old 0.865)
|
| 57 |
+
epoch 23/60 loss=0.1288 mIoU(7)=0.711 tree_old=0.864 ground_old=0.909 tree_new_recall=0.999 (67s)
|
| 58 |
+
per-class IoU: tree=0.864, ground=0.909, person=0.458, sky=0.827, road=0.728, mountain=0.577, building=0.611, background=nan
|
| 59 |
+
epoch 24/60 loss=0.1230 mIoU(7)=0.696 tree_old=0.863 ground_old=0.912 tree_new_recall=0.999 (66s)
|
| 60 |
+
per-class IoU: tree=0.863, ground=0.912, person=0.431, sky=0.820, road=0.757, mountain=0.584, building=0.506, background=nan
|
| 61 |
+
epoch 25/60 loss=0.1228 mIoU(7)=0.700 tree_old=0.857 ground_old=0.912 tree_new_recall=1.000 (65s)
|
| 62 |
+
per-class IoU: tree=0.857, ground=0.912, person=0.444, sky=0.824, road=0.802, mountain=0.571, building=0.494, background=nan
|
| 63 |
+
epoch 26/60 loss=0.1200 mIoU(7)=0.695 tree_old=0.866 ground_old=0.913 tree_new_recall=0.999 (67s)
|
| 64 |
+
per-class IoU: tree=0.866, ground=0.913, person=0.422, sky=0.837, road=0.716, mountain=0.563, building=0.549, background=nan
|
| 65 |
+
saved best (tree_old 0.866)
|
| 66 |
+
epoch 27/60 loss=0.1185 mIoU(7)=0.695 tree_old=0.862 ground_old=0.908 tree_new_recall=0.999 (66s)
|
| 67 |
+
per-class IoU: tree=0.862, ground=0.908, person=0.407, sky=0.828, road=0.732, mountain=0.578, building=0.550, background=nan
|
| 68 |
+
epoch 28/60 loss=0.1161 mIoU(7)=0.714 tree_old=0.860 ground_old=0.910 tree_new_recall=0.998 (64s)
|
| 69 |
+
per-class IoU: tree=0.860, ground=0.910, person=0.458, sky=0.826, road=0.768, mountain=0.577, building=0.596, background=nan
|
| 70 |
+
epoch 29/60 loss=0.1151 mIoU(7)=0.708 tree_old=0.872 ground_old=0.916 tree_new_recall=0.999 (66s)
|
| 71 |
+
per-class IoU: tree=0.872, ground=0.916, person=0.441, sky=0.835, road=0.745, mountain=0.592, building=0.555, background=nan
|
| 72 |
+
saved best (tree_old 0.872)
|
| 73 |
+
epoch 30/60 loss=0.1131 mIoU(7)=0.698 tree_old=0.865 ground_old=0.910 tree_new_recall=0.999 (66s)
|
| 74 |
+
per-class IoU: tree=0.865, ground=0.910, person=0.467, sky=0.834, road=0.755, mountain=0.592, building=0.467, background=nan
|
| 75 |
+
epoch 31/60 loss=0.1110 mIoU(7)=0.694 tree_old=0.855 ground_old=0.905 tree_new_recall=0.999 (65s)
|
| 76 |
+
per-class IoU: tree=0.855, ground=0.905, person=0.433, sky=0.812, road=0.772, mountain=0.586, building=0.498, background=nan
|
| 77 |
+
epoch 32/60 loss=0.1115 mIoU(7)=0.719 tree_old=0.865 ground_old=0.916 tree_new_recall=0.998 (67s)
|
| 78 |
+
per-class IoU: tree=0.865, ground=0.916, person=0.466, sky=0.833, road=0.803, mountain=0.578, building=0.570, background=nan
|
| 79 |
+
epoch 33/60 loss=0.1088 mIoU(7)=0.711 tree_old=0.869 ground_old=0.916 tree_new_recall=0.999 (69s)
|
| 80 |
+
per-class IoU: tree=0.869, ground=0.916, person=0.494, sky=0.838, road=0.761, mountain=0.595, building=0.502, background=nan
|
| 81 |
+
epoch 34/60 loss=0.1071 mIoU(7)=0.702 tree_old=0.865 ground_old=0.910 tree_new_recall=0.999 (70s)
|
| 82 |
+
per-class IoU: tree=0.865, ground=0.910, person=0.455, sky=0.827, road=0.753, mountain=0.562, building=0.541, background=nan
|
| 83 |
+
epoch 35/60 loss=0.1064 mIoU(7)=0.696 tree_old=0.861 ground_old=0.908 tree_new_recall=1.000 (65s)
|
| 84 |
+
per-class IoU: tree=0.861, ground=0.908, person=0.476, sky=0.816, road=0.754, mountain=0.578, building=0.478, background=nan
|
| 85 |
+
epoch 36/60 loss=0.1052 mIoU(7)=0.704 tree_old=0.860 ground_old=0.910 tree_new_recall=0.999 (66s)
|
| 86 |
+
per-class IoU: tree=0.860, ground=0.910, person=0.477, sky=0.829, road=0.787, mountain=0.592, building=0.470, background=nan
|
| 87 |
+
epoch 37/60 loss=0.1050 mIoU(7)=0.703 tree_old=0.860 ground_old=0.908 tree_new_recall=0.999 (64s)
|
| 88 |
+
per-class IoU: tree=0.860, ground=0.908, person=0.479, sky=0.827, road=0.769, mountain=0.593, building=0.488, background=nan
|
| 89 |
+
epoch 38/60 loss=0.1034 mIoU(7)=0.704 tree_old=0.863 ground_old=0.911 tree_new_recall=0.999 (63s)
|
| 90 |
+
per-class IoU: tree=0.863, ground=0.911, person=0.441, sky=0.829, road=0.778, mountain=0.587, building=0.521, background=nan
|
| 91 |
+
epoch 39/60 loss=0.1025 mIoU(7)=0.713 tree_old=0.865 ground_old=0.912 tree_new_recall=0.999 (64s)
|
| 92 |
+
per-class IoU: tree=0.865, ground=0.912, person=0.449, sky=0.842, road=0.760, mountain=0.597, building=0.565, background=nan
|
| 93 |
+
epoch 40/60 loss=0.1010 mIoU(7)=0.713 tree_old=0.858 ground_old=0.909 tree_new_recall=0.999 (65s)
|
| 94 |
+
per-class IoU: tree=0.858, ground=0.909, person=0.477, sky=0.820, road=0.800, mountain=0.596, building=0.530, background=nan
|
| 95 |
+
epoch 41/60 loss=0.0999 mIoU(7)=0.704 tree_old=0.862 ground_old=0.911 tree_new_recall=1.000 (66s)
|
| 96 |
+
per-class IoU: tree=0.862, ground=0.911, person=0.455, sky=0.815, road=0.788, mountain=0.581, building=0.514, background=nan
|
| 97 |
+
epoch 42/60 loss=0.0992 mIoU(7)=0.713 tree_old=0.866 ground_old=0.916 tree_new_recall=0.999 (66s)
|
| 98 |
+
per-class IoU: tree=0.866, ground=0.916, person=0.453, sky=0.836, road=0.804, mountain=0.595, building=0.524, background=nan
|
| 99 |
+
epoch 43/60 loss=0.0980 mIoU(7)=0.717 tree_old=0.860 ground_old=0.909 tree_new_recall=1.000 (65s)
|
| 100 |
+
per-class IoU: tree=0.860, ground=0.909, person=0.460, sky=0.822, road=0.814, mountain=0.591, building=0.559, background=nan
|
| 101 |
+
epoch 44/60 loss=0.0975 mIoU(7)=0.704 tree_old=0.860 ground_old=0.910 tree_new_recall=0.999 (65s)
|
| 102 |
+
per-class IoU: tree=0.860, ground=0.910, person=0.444, sky=0.830, road=0.809, mountain=0.578, building=0.495, background=nan
|
| 103 |
+
epoch 45/60 loss=0.0967 mIoU(7)=0.720 tree_old=0.861 ground_old=0.910 tree_new_recall=0.999 (66s)
|
| 104 |
+
per-class IoU: tree=0.861, ground=0.910, person=0.462, sky=0.830, road=0.809, mountain=0.598, building=0.574, background=nan
|
| 105 |
+
epoch 46/60 loss=0.0958 mIoU(7)=0.715 tree_old=0.859 ground_old=0.904 tree_new_recall=0.999 (64s)
|
| 106 |
+
per-class IoU: tree=0.859, ground=0.904, person=0.457, sky=0.825, road=0.787, mountain=0.600, building=0.571, background=nan
|
| 107 |
+
epoch 47/60 loss=0.0953 mIoU(7)=0.717 tree_old=0.863 ground_old=0.912 tree_new_recall=0.999 (65s)
|
| 108 |
+
per-class IoU: tree=0.863, ground=0.912, person=0.466, sky=0.818, road=0.799, mountain=0.601, building=0.560, background=nan
|
| 109 |
+
epoch 48/60 loss=0.0942 mIoU(7)=0.717 tree_old=0.865 ground_old=0.914 tree_new_recall=0.999 (65s)
|
| 110 |
+
per-class IoU: tree=0.865, ground=0.914, person=0.468, sky=0.829, road=0.800, mountain=0.592, building=0.551, background=nan
|
| 111 |
+
epoch 49/60 loss=0.0938 mIoU(7)=0.715 tree_old=0.863 ground_old=0.911 tree_new_recall=0.999 (66s)
|
| 112 |
+
per-class IoU: tree=0.863, ground=0.911, person=0.479, sky=0.825, road=0.794, mountain=0.594, building=0.535, background=nan
|
| 113 |
+
epoch 50/60 loss=0.0934 mIoU(7)=0.718 tree_old=0.862 ground_old=0.912 tree_new_recall=0.999 (65s)
|
| 114 |
+
per-class IoU: tree=0.862, ground=0.912, person=0.469, sky=0.828, road=0.812, mountain=0.590, building=0.551, background=nan
|
| 115 |
+
epoch 51/60 loss=0.0931 mIoU(7)=0.717 tree_old=0.861 ground_old=0.913 tree_new_recall=0.999 (64s)
|
| 116 |
+
per-class IoU: tree=0.861, ground=0.913, person=0.460, sky=0.821, road=0.818, mountain=0.593, building=0.551, background=nan
|
| 117 |
+
epoch 52/60 loss=0.0926 mIoU(7)=0.715 tree_old=0.860 ground_old=0.910 tree_new_recall=0.999 (65s)
|
| 118 |
+
per-class IoU: tree=0.860, ground=0.910, person=0.460, sky=0.824, road=0.804, mountain=0.597, building=0.547, background=nan
|
| 119 |
+
epoch 53/60 loss=0.0918 mIoU(7)=0.717 tree_old=0.860 ground_old=0.910 tree_new_recall=0.999 (64s)
|
| 120 |
+
per-class IoU: tree=0.860, ground=0.910, person=0.466, sky=0.820, road=0.806, mountain=0.595, building=0.561, background=nan
|
| 121 |
+
epoch 54/60 loss=0.0916 mIoU(7)=0.714 tree_old=0.861 ground_old=0.911 tree_new_recall=1.000 (67s)
|
| 122 |
+
per-class IoU: tree=0.861, ground=0.911, person=0.471, sky=0.822, road=0.807, mountain=0.588, building=0.541, background=nan
|
| 123 |
+
epoch 55/60 loss=0.0912 mIoU(7)=0.719 tree_old=0.863 ground_old=0.913 tree_new_recall=0.999 (69s)
|
| 124 |
+
per-class IoU: tree=0.863, ground=0.913, person=0.476, sky=0.826, road=0.808, mountain=0.593, building=0.554, background=nan
|
| 125 |
+
epoch 56/60 loss=0.0911 mIoU(7)=0.717 tree_old=0.862 ground_old=0.913 tree_new_recall=0.999 (70s)
|
| 126 |
+
per-class IoU: tree=0.862, ground=0.913, person=0.475, sky=0.823, road=0.807, mountain=0.596, building=0.545, background=nan
|
| 127 |
+
epoch 57/60 loss=0.0913 mIoU(7)=0.715 tree_old=0.860 ground_old=0.910 tree_new_recall=0.999 (68s)
|
| 128 |
+
per-class IoU: tree=0.860, ground=0.910, person=0.465, sky=0.825, road=0.802, mountain=0.593, building=0.548, background=nan
|
| 129 |
+
epoch 58/60 loss=0.0911 mIoU(7)=0.718 tree_old=0.863 ground_old=0.913 tree_new_recall=0.999 (65s)
|
| 130 |
+
per-class IoU: tree=0.863, ground=0.913, person=0.470, sky=0.824, road=0.813, mountain=0.594, building=0.552, background=nan
|
| 131 |
+
epoch 59/60 loss=0.0907 mIoU(7)=0.716 tree_old=0.862 ground_old=0.912 tree_new_recall=0.999 (64s)
|
| 132 |
+
per-class IoU: tree=0.862, ground=0.912, person=0.467, sky=0.824, road=0.803, mountain=0.596, building=0.549, background=nan
|
| 133 |
+
epoch 60/60 loss=0.0903 mIoU(7)=0.718 tree_old=0.862 ground_old=0.914 tree_new_recall=0.999 (67s)
|
| 134 |
+
per-class IoU: tree=0.862, ground=0.914, person=0.466, sky=0.825, road=0.811, mountain=0.593, building=0.558, background=nan
|
| 135 |
+
|
| 136 |
+
=== DONE === best tree_old IoU: 0.872
|
| 137 |
+
|
| 138 |
+
=== FPS BENCHMARK (RTX 3080, batch=1, 640x360) ===
|
| 139 |
+
TwinLiteNet8 @ 640x360 batch=1: 137.1 FPS
|
| 140 |
+
Jetson Orin Nano estimate: ~34-46 FPS
|
twinlite8.onnx
ADDED
|
@@ -0,0 +1,3 @@
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:c499f31d388b377f6234db8b6417418846c73b003cc9b9fbc8369e854a823056
|
| 3 |
+
size 1787561
|
twinlite8_best.pt
ADDED
|
@@ -0,0 +1,3 @@
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:933bbf0b34134823c2cf9fb9eafd71a8362e26b14fff5eca551bdf78f76badab
|
| 3 |
+
size 1815544
|