---
license: apache-2.0
library_name: diffusers
tags:
  - hsigene
  - hyperspectral
  - latent-diffusion
  - controlnet
  - arxiv:2409.12470
pipeline_tag: image-to-image
---

> [!WARNING]
> Checkpoint conversion has not been fully validated. If you encounter pipeline loading failures or unexpected output, please contact me at bili_sakura@zju.edu.cn.

# BiliSakura/HSIGene

**Hyperspectral image generation** — HSIGene converted to diffusers format. Supports task-specific conditioning with local controls (HED, MLSD, sketch, segmentation), global controls (content or text), or metadata embeddings. Outputs 48-band hyperspectral images (256×256 pixels).

> Source: [HSIGene](https://arxiv.org/abs/2409.12470). Converted to diffusers format; model dir is self-contained (no external project for inference).

## Repository Structure (after conversion)

| Component               | Path                      |
|-------------------------|---------------------------|
| UNet (LocalControlUNet) | `unet/`                   |
| VAE                     | `vae/`                    |
| Text encoder (CLIP)     | `text_encoder/`           |
| Local adapter           | `local_adapter/`          |
| Global content adapter  | `global_content_adapter/` |
| Global text adapter     | `global_text_adapter/`    |
| Metadata encoder        | `metadata_encoder/`       |
| Scheduler               | `scheduler/`              |
| Pipeline                | `pipeline_hsigene.py`     |
| Config                  | `model_index.json`        |

## Usage

**Inference Demo (`DiffusionPipeline.from_pretrained`)**

```python
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "path/to/BiliSakura/HSIGene",
    trust_remote_code=True,
    custom_pipeline="path/to/pipeline_hsigene.py",
    model_path="path/to/BiliSakura/HSIGene",
)
pipe = pipe.to("cuda")
```

**Dependencies:** `pip install diffusers transformers torch einops safetensors`

### Per-Condition Inference Demos (Not Combined)

`local_conditions` shape: `(B, 18, H, W)`; `global_conditions` shape: `(B, 768)`; `metadata` shape: `(7,)` or `(B, 7)`.
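A minimal sketch of building inputs with the shapes above, using zero-filled dummy tensors (real inputs would come from the annotators, a content encoder, and your metadata fields; the variable names here match the demos below but are otherwise placeholders):

```python
import torch

B, H, W = 1, 256, 256

# Local control: 18 stacked channels (e.g. HED/MLSD/sketch/segmentation maps).
hed_local = torch.zeros(B, 18, H, W)

# Global content condition: one 768-dim embedding per sample.
content_global = torch.zeros(B, 768)

# Metadata: a 7-dim vector, optionally batched as (B, 7).
metadata_vec = torch.zeros(7)
```

Replace the zeros with real condition maps and embeddings before running inference; all-zero conditions are only useful for shape checks.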

```python
# HED condition
output = pipe(prompt="", local_conditions=hed_local, global_conditions=None, metadata=None)
```

```python
# MLSD condition
output = pipe(prompt="", local_conditions=mlsd_local, global_conditions=None, metadata=None)
```

```python
# Sketch condition
output = pipe(prompt="", local_conditions=sketch_local, global_conditions=None, metadata=None)
```

```python
# Segmentation condition
output = pipe(prompt="", local_conditions=seg_local, global_conditions=None, metadata=None)
```

```python
# Content condition (global)
output = pipe(prompt="", local_conditions=None, global_conditions=content_global, metadata=None)
```

```python
# Text condition
output = pipe(prompt="Wasteland", local_conditions=None, global_conditions=None, metadata=None)
```

```python
# Metadata condition
output = pipe(prompt="", local_conditions=None, global_conditions=None, metadata=metadata_vec)
```
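Since the pipeline produces 48-band 256×256 cubes rather than RGB images, you will typically save the raw cube and pick bands for visualization. The sketch below assumes the output can be obtained as a NumPy array of shape `(B, 48, 256, 256)` (an assumption; check `pipeline_hsigene.py` for the actual return type), and the band indices chosen for the false-color preview are arbitrary:

```python
import numpy as np

# Stand-in for the pipeline output; replace with the real array from `output`.
cube = np.random.rand(1, 48, 256, 256).astype(np.float32)

# Save the full hyperspectral cube for later analysis.
np.save("hsi_sample.npy", cube[0])

# False-color preview: pick three bands and move channels last -> (256, 256, 3).
rgb = cube[0][[30, 20, 10]].transpose(1, 2, 0)
```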

## Model Sources

- **Paper**: [HSIGene: A Foundation Model For Hyperspectral Image Generation](https://arxiv.org/abs/2409.12470)
- **Checkpoint**: [GoogleDrive](https://drive.google.com/file/d/1euJAbsxCgG1wIu_Eh5nPfmiSP9suWsR4/view?usp=drive_link)
- **Annotators**: [BaiduNetdisk](https://pan.baidu.com/s/1K1Y__blA6uJVV9l1QG7QvQ?pwd=98f1) (code: 98f1) → `data_prepare/annotator/ckpts`

## Citation

```bibtex
@article{pangHSIGeneFoundationModel2026,
  title = {{{HSIGene}}: {{A Foundation Model}} for {{Hyperspectral Image Generation}}},
  shorttitle = {{{HSIGene}}},
  author = {Pang, Li and Cao, Xiangyong and Tang, Datao and Xu, Shuang and Bai, Xueru and Zhou, Feng and Meng, Deyu},
  year = 2026,
  month = jan,
  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  volume = {48},
  number = {1},
  pages = {730--746},
  issn = {1939-3539},
  doi = {10.1109/TPAMI.2025.3610927},
  urldate = {2026-01-02},
  keywords = {Adaptation models,Computational modeling,Controllable generation,deep learning,diffusion model,Diffusion models,Foundation models,hyperspectral image synthesis,Hyperspectral imaging,Image synthesis,Noise reduction,Reliability,Superresolution,Training}
}

```