ldxxx and nielsr (HF Staff) committed a2ddf3d (verified) · Parent: c8cb3e0

Add image-segmentation pipeline tag, PyTorch library, and usage examples (#1)


- Add image-segmentation pipeline tag, PyTorch library, and usage examples (ee9a3baa52b652529eeb43928f736aa7c79ec77f)


Co-authored-by: Niels Rogge <nielsr@users.noreply.huggingface.co>

Files changed (1):
  1. README.md (+78 −0)
README.md CHANGED
---
license: apache-2.0
pipeline_tag: image-segmentation
library_name: pytorch
---

# UAGLNet

**Authors:** [Dstate](https://github.com/Dstate) | **License:** Apache 2.0

**Paper:** *“UAGLNet: Uncertainty-Aggregated Global-Local Fusion Network with Cooperative CNN-Transformer for Building Extraction”* ([arXiv:2512.12941](https://arxiv.org/abs/2512.12941))

🔥 **UAGLNet has been accepted by IEEE TGRS.**

We present UAGLNet, which exploits high-quality global-local visual semantics under the guidance of uncertainty modeling. Specifically, we propose a novel cooperative encoder that adopts hybrid CNN and transformer layers at different stages to capture local and global visual semantics, respectively. An intermediate Cooperative Interaction Block (CIB) narrows the gap between the local and global features as the network deepens. We then propose a Global-Local Fusion (GLF) module to complementarily fuse the global and local representations. Finally, to mitigate segmentation ambiguity in uncertain regions, we propose an Uncertainty-Aggregated Decoder (UAD) that explicitly estimates pixel-wise uncertainty to improve segmentation accuracy. Extensive experiments demonstrate that our method outperforms other state-of-the-art methods.

<img width="1000" src="https://github.com/Dstate/UAGLNet/raw/main/assets/architecture2.png" alt="UAGLNet architecture">
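As a conceptual illustration of uncertainty-guided decoding (a minimal sketch only — the class and layer choices below are hypothetical and are *not* the paper's actual UAD implementation), pixel-wise uncertainty can be modeled as the binary entropy of the predicted probability, which peaks where the prediction is ambiguous, and used to gate a refinement branch:

```python
import torch
import torch.nn as nn

class UncertaintyAggregatedHead(nn.Module):
    """Toy sketch: refine a coarse building mask only where it is uncertain."""
    def __init__(self, channels):
        super().__init__()
        self.coarse = nn.Conv2d(channels, 1, kernel_size=1)            # coarse logits
        self.refine = nn.Conv2d(channels, 1, kernel_size=3, padding=1) # refinement logits

    def forward(self, feats):
        logits = self.coarse(feats)
        p = torch.sigmoid(logits)
        # Pixel-wise uncertainty as binary entropy: ~0 when p is near 0 or 1,
        # maximal (ln 2) when p is near 0.5.
        eps = 1e-6
        u = -(p * (p + eps).log() + (1 - p) * (1 - p + eps).log())
        # Aggregate: add the refinement signal weighted by uncertainty.
        return logits + u * self.refine(feats)

feats = torch.randn(1, 16, 8, 8)   # dummy feature map: N, C, H, W
out = UncertaintyAggregatedHead(16)(feats)
print(out.shape)  # torch.Size([1, 1, 8, 8])
```

This preserves confident coarse predictions while allowing the refinement branch to act near ambiguous boundaries, which is the general intuition behind uncertainty-aware segmentation heads.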

## Quick Start

### Installation

Clone this repository and create the environment:

```bash
git clone git@github.com:Dstate/UAGLNet.git
cd UAGLNet

conda create -n uaglnet python=3.8 -y
conda activate uaglnet
conda install pytorch==1.10.0 torchvision==0.11.0 torchaudio==0.10.0 cudatoolkit=11.3 -c pytorch -c conda-forge
pip install -r requirements.txt
```

### Data Preprocessing

We conduct experiments on the Inria, WHU, and Massachusetts datasets. Detailed guidance for dataset preprocessing is provided in [DATA_PREPARATION.md](https://github.com/Dstate/UAGLNet/blob/main/assets/DATA_PREPARATION.md).

### Training & Testing

Training and testing examples on the Inria dataset:

```bash
# training
python UAGLNet_train.py -c config/inria/UAGLNet.py

# testing
python UAGLNet_test.py -c config/inria/UAGLNet.py
```
51
+ ### Main Results
52
+
53
+ The following table presents the performance of UAGLNet on building extraction benchmarks.
54
+
55
+ | **Benchmark** | **IoU** | **F1** | **P** | **R** | **Weight** |
56
+ | :-------: | :--------: | :--------: | :-----------: | :------: | :------: |
57
+ | Inria | 83.74 | 91.15 | 92.09 | 90.22 | [UAGLNet_Inria](https://huggingface.co/ldxxx/UAGLNet_Inria) |
58
+ | Mass | 76.97 | 86.99 | 88.28 | 85.73 | [UAGLNet_Mass](https://huggingface.co/ldxxx/UAGLNet_Massachusetts) |
59
+ | WHU | 92.07 | 95.87 | 96.21 | 95.54 | [UAGLNet_WHU](https://huggingface.co/ldxxx/UAGLNet_WHU) |
60
+
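For reference, the metrics in the table are the standard binary segmentation measures computed from per-pixel true/false positives and negatives. A minimal, self-contained sketch of how they are derived from a predicted mask and a ground-truth mask (illustrative only, not the repository's evaluation code):

```python
import numpy as np

def segmentation_metrics(pred, gt):
    """IoU, F1, precision, and recall (in %) for binary masks of equal shape."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()    # predicted building, is building
    fp = np.logical_and(pred, ~gt).sum()   # predicted building, is background
    fn = np.logical_and(~pred, gt).sum()   # missed building pixels
    iou = tp / (tp + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"iou": 100 * iou, "f1": 100 * f1,
            "precision": 100 * precision, "recall": 100 * recall}

pred = np.array([[1, 1, 0], [0, 1, 0]])
gt   = np.array([[1, 0, 0], [0, 1, 1]])
print(segmentation_metrics(pred, gt))
```

Note that IoU is always the strictest of the four: it counts both false positives and false negatives in its denominator, which is why it sits below F1 on every benchmark above.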

You can quickly reproduce these results by running `Reproduce.py`, which loads the pretrained checkpoints from Hugging Face and performs inference.

```bash
# Inria
python Reproduce.py -d Inria

# Massachusetts
python Reproduce.py -d Mass

# WHU
python Reproduce.py -d WHU
```
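The checkpoint-download step can also be done by hand with `huggingface_hub`. A minimal sketch (the checkpoint filename below is an assumption for illustration — check the model repo's file listing for the actual name):

```python
import torch
from huggingface_hub import hf_hub_download

def load_uaglnet_checkpoint(repo_id="ldxxx/UAGLNet_Inria",
                            filename="UAGLNet_Inria.pth"):
    """Download a checkpoint from the Hub (cached locally after the first
    call) and load it on CPU. NOTE: `filename` is a hypothetical name."""
    ckpt_path = hf_hub_download(repo_id=repo_id, filename=filename)
    return torch.load(ckpt_path, map_location="cpu")

# state_dict = load_uaglnet_checkpoint()
# model.load_state_dict(state_dict)  # then run inference as in UAGLNet_test.py
```

`hf_hub_download` caches files under `~/.cache/huggingface` by default, so repeated runs do not re-download the weights.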

## Citation

If you find this project useful in your research, please cite:

```bibtex
@article{UAGLNet,
  title   = {UAGLNet: Uncertainty-Aggregated Global-Local Fusion Network with Cooperative CNN-Transformer for Building Extraction},
  author  = {Siyuan Yao and Dongxiu Liu and Taotao Li and Shengjie Li and Wenqi Ren and Xiaochun Cao},
  journal = {arXiv preprint arXiv:2512.12941},
  year    = {2025}
}
```

## Acknowledgement

This work builds upon [BuildingExtraction](https://github.com/stdcoutzrh/BuildingExtraction), [GeoSeg](https://github.com/WangLibo1995/GeoSeg/tree/main) and [SMT](https://github.com/AFeng-x/SMT). We sincerely appreciate their contributions, which provide clear pipelines and well-organized code.

## License

This project is licensed under the [Apache License 2.0](https://github.com/Dstate/UAGLNet/blob/main/LICENSE).