BOUNG/CLFT-Sparse-AKS

+---
+license: mit
+tags:
+  - semantic-segmentation
+  - camera-lidar-fusion
+  - autonomous-driving
+  - waymo
+  - pytorch
+datasets:
+  - waymo
+language:
+  - en
+---
+# CLFT-Sparse-AKS
+**Camera-LiDAR Fusion Transformer with Sparse Adaptive Kernel Selection**
+## Model Description
+CLFT-Sparse-AKS is a multi-modal semantic segmentation model that fuses camera (RGB) and LiDAR data for autonomous driving applications.
+### Key Features
+- **Sparse Adaptive Kernel Selection** - Distance-based selection among kernel sizes 3, 5, 7, and 9
+- **Semantic-Guided Depth Supervision** - Direct supervision for kernel prediction
+- **SS2D (State Space 2D)** - Mamba-based global context aggregation
+- **CUDA Graph Optimization** - Efficient sparse attention processing
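To make the distance-based kernel selection idea concrete, here is a minimal rule-based sketch. It is not the model's method: the card indicates the kernel choice is predicted by a supervised head, whereas this stand-in uses fixed depth bins. The function name `select_kernel_size`, the bin edges in `DEPTH_EDGES_M`, and the direction of the mapping (farther points get larger kernels) are all assumptions made for illustration.

```python
from bisect import bisect_right

# Candidate kernel sizes listed in the model card.
KERNEL_SIZES = [3, 5, 7, 9]
# Hypothetical depth bin edges in metres (assumption, not taken from the model).
DEPTH_EDGES_M = [10.0, 25.0, 50.0]


def select_kernel_size(depth_m: float) -> int:
    """Map a depth in metres to one of KERNEL_SIZES using fixed bins.

    Assumed mapping: farther points receive larger kernels. The real
    model may learn a different, or even the opposite, relationship.
    """
    if depth_m < 0:
        raise ValueError("depth must be non-negative")
    # bisect_right counts how many bin edges are <= depth_m,
    # which is exactly the index of the bin the depth falls into.
    return KERNEL_SIZES[bisect_right(DEPTH_EDGES_M, depth_m)]


depths = [2.5, 10.0, 30.0, 80.0]
sizes = [select_kernel_size(d) for d in depths]
print(sizes)
```

A learned selector would replace the fixed bins with per-pixel scores over the four sizes; the lookup into `KERNEL_SIZES` would stay the same.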
+## Performance (Waymo Dataset)
+| Condition | Vehicle IoU | Human IoU |
+|-----------|-------------|-----------|
+| Day-Clear | 93.01% | 71.95% |
+| Day-Rain | 93.84% | 70.45% |
+| Night-Clear | 92.80% | 71.47% |
+| Night-Rain | 91.99% | 67.54% |
+| **Average** | **92.91%** | **70.35%** |
+- **Best Human IoU**: 73.09% (Epoch 269)
+- **Inference Time**: 28.95 ms (34.5 FPS)
+- **Parameters**: 120.03M
+- **VRAM**: 3.46 GB
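The summary figures above can be cross-checked with a few lines of arithmetic. This sketch assumes the **Average** row is the unweighted mean of the four conditions and that FPS is derived as 1000 divided by the latency in milliseconds; both assumptions are consistent with the reported numbers but are not stated in the card.

```python
# Per-condition IoU values copied from the performance table (percent).
vehicle_iou = {"Day-Clear": 93.01, "Day-Rain": 93.84,
               "Night-Clear": 92.80, "Night-Rain": 91.99}
human_iou = {"Day-Clear": 71.95, "Day-Rain": 70.45,
             "Night-Clear": 71.47, "Night-Rain": 67.54}


def mean(values):
    """Unweighted arithmetic mean of a collection of numbers."""
    values = list(values)
    return sum(values) / len(values)


avg_vehicle = round(mean(vehicle_iou.values()), 2)
avg_human = round(mean(human_iou.values()), 2)

# Throughput implied by the reported single-frame latency.
latency_ms = 28.95
fps = round(1000.0 / latency_ms, 1)

print(avg_vehicle, avg_human, fps)
```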
+## Usage
+```python
+import torch
+from models.clft_sparse import CLFT_Sparse
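+# Build the model; the constructor arguments are elided here
+model = CLFT_Sparse(...)
+# Load the checkpoint onto the CPU first so loading does not depend on the saving device
+checkpoint = torch.load('clft_sparse_epoch_269_best_human.pth', map_location='cpu')
+model.load_state_dict(checkpoint['model_state_dict'])
+model.eval()
+# Inference: rgb_input and lidar_input are the preprocessed camera and LiDAR tensors
+with torch.no_grad():
+    output = model(rgb_input, lidar_input)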
+```
+## Requirements
+- Python 3.10
+- PyTorch 2.9.0+cu128
+- NATTEN 0.21.1
+- Mamba-SSM 2.3.0
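Because the pinned versions above are strict, it can help to check the environment before loading the model. The following is a minimal sketch using only the standard library. The distribution names `torch`, `natten`, and `mamba-ssm` are assumptions about how each package is installed, and the helper `check_requirements` is hypothetical, not part of the repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Pinned versions from the requirements list. The keys are assumed
# distribution names and may differ in a custom build.
REQUIRED = {"torch": "2.9.0", "natten": "0.21.1", "mamba-ssm": "2.3.0"}


def check_requirements(required):
    """Return a {name: status} report for each required distribution."""
    report = {}
    for name, wanted in required.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        # Compare the public version only, ignoring local build tags
        # such as "+cu128".
        base = installed.split("+", 1)[0]
        report[name] = "ok" if base == wanted else f"mismatch ({installed})"
    return report


report = check_requirements(REQUIRED)
for name, status in report.items():
    print(f"{name}: {status}")
```

The check compares versions by exact string match after dropping the local tag, so it reports a mismatch for any other release, including newer ones that might still work.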
+## Citation
+```bibtex
+@misc{clft_sparse_aks_2026,
+  title={CLFT-Sparse-AKS: Camera-LiDAR Fusion with Sparse Adaptive Kernel Selection},
+  author={Young},
+  year={2026},
+  url={https://github.com/mw701/CLFT_AKS}
+}
+```
+## Links
+- **GitHub**: [https://github.com/mw701/CLFT_AKS](https://github.com/mw701/CLFT_AKS)
+- **Technical Report**: See `docs/CLFT_Sparse_AKS_Technical_Report.md`