| | --- |
| | license: apache-2.0 |
| | library_name: diffusers |
| | tags: |
| | - hsigene |
| | - hyperspectral |
| | - latent-diffusion |
| | - controlnet |
| | - arxiv:2409.12470 |
| | pipeline_tag: image-to-image |
| | --- |
| | |
| | > [!WARNING] The checkpoint conversion has not been fully validated. If you encounter a pipeline loading failure or undesired output, please contact me at bili_sakura@zju.edu.cn. |
| | |
| | # BiliSakura/HSIGene |
| | |
| | **Hyperspectral image generation**: HSIGene converted to diffusers format. It supports task-specific conditioning with local controls (HED, MLSD, sketch, segmentation), global controls (content or text), or metadata embeddings, and outputs 48-band hyperspectral images (256×256 pixels). |
| | |
| | > Source: [HSIGene](https://arxiv.org/abs/2409.12470). Converted to diffusers format; the model directory is self-contained (no external project is needed for inference). |
| | |
| | ## Repository Structure (after conversion) |
| | |
| | | Component | Path | |
| | |------------------------|--------------------------| |
| | | UNet (LocalControlUNet)| `unet/` | |
| | | VAE | `vae/` | |
| | | Text encoder (CLIP) | `text_encoder/` | |
| | | Local adapter | `local_adapter/` | |
| | | Global content adapter| `global_content_adapter/`| |
| | | Global text adapter | `global_text_adapter/` | |
| | | Metadata encoder | `metadata_encoder/` | |
| | | Scheduler | `scheduler/` | |
| | | Pipeline | `pipeline_hsigene.py` | |
| | | Config | `model_index.json` | |
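
Before loading, it can help to confirm that a local copy actually has the layout listed in the table. The following is a minimal sketch, not part of the repository: `missing_components` is a hypothetical helper, and the path in the usage line is a placeholder.

```python
from pathlib import Path

# Entries taken from the table above: sub-folders plus two top-level files.
EXPECTED_DIRS = [
    "unet", "vae", "text_encoder", "local_adapter",
    "global_content_adapter", "global_text_adapter",
    "metadata_encoder", "scheduler",
]
EXPECTED_FILES = ["pipeline_hsigene.py", "model_index.json"]


def missing_components(model_dir):
    """Return the expected entries that are absent from model_dir (hypothetical helper)."""
    root = Path(model_dir)
    missing = [d + "/" for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing


if __name__ == "__main__":
    # Placeholder path; replace with the actual local model directory.
    print(missing_components("/path/to/BiliSakura/HSIGene"))
```

An empty list means every component from the table was found; this only checks names, not the contents of the files.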
| | |
| | ## Usage |
| | |
| | **Inference Demo (`DiffusionPipeline.from_pretrained`)** |
| | |
| | ```python |
| | from diffusers import DiffusionPipeline |
| | |
| | # Replace the placeholder paths with your local copies. |
| | pipe = DiffusionPipeline.from_pretrained( |
| |     "/path/to/BiliSakura/HSIGene", |
| |     trust_remote_code=True,  # allow loading the custom pipeline code |
| |     custom_pipeline="/path/to/pipeline_hsigene.py", |
| |     model_path="/path/to/BiliSakura/HSIGene", |
| | ) |
| | pipe = pipe.to("cuda")  # move the pipeline to the GPU |
| | ``` |
| | |
| | **Dependencies:** `pip install diffusers transformers torch einops safetensors` |
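
To see which of these packages are still missing from the current environment, a small check along these lines may be useful. It is a sketch using only the standard library; `missing_packages` is a hypothetical helper, and it relies on each package's import name matching its pip name, which holds for the five listed here.

```python
import importlib.util

# Package names from the pip command above.
REQUIRED = ["diffusers", "transformers", "torch", "einops", "safetensors"]


def missing_packages(names=REQUIRED):
    """Return the names that cannot be located as importable top-level modules."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("missing:", ", ".join(missing) if missing else "none")
```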
| | |
| | ### Per-Condition Inference Demos (Not Combined) |
| | |
| | `local_conditions` shape: `(B, 18, H, W)`; `global_conditions` shape: `(B, 768)`; `metadata` shape: `(7,)` or `(B, 7)`. |
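
The shapes above can be checked before calling the pipeline. The sketch below works on plain shape tuples (for example `tuple(tensor.shape)`), so it does not depend on any tensor library. `check_condition_shapes` is a hypothetical helper, not part of the pipeline, and the 256×256 spatial size in the usage note is an assumption based on the stated output size.

```python
def check_condition_shapes(local_shape=None, global_shape=None, metadata_shape=None):
    """Raise ValueError if a given shape disagrees with the layout stated above.

    Each argument is a shape tuple, or None when that condition is unused.
    """
    if local_shape is not None:
        # local_conditions: (B, 18, H, W)
        if len(local_shape) != 4 or local_shape[1] != 18:
            raise ValueError(f"local_conditions must be (B, 18, H, W), got {tuple(local_shape)}")
    if global_shape is not None:
        # global_conditions: (B, 768)
        if len(global_shape) != 2 or global_shape[1] != 768:
            raise ValueError(f"global_conditions must be (B, 768), got {tuple(global_shape)}")
    if metadata_shape is not None:
        # metadata: (7,) or (B, 7)
        if len(metadata_shape) not in (1, 2) or metadata_shape[-1] != 7:
            raise ValueError(f"metadata must be (7,) or (B, 7), got {tuple(metadata_shape)}")
```

For instance, `check_condition_shapes(local_shape=(1, 18, 256, 256))` passes silently, while a 3-channel local condition raises `ValueError`.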
| | |
| | ```python |
| | # HED condition |
| | output = pipe(prompt="", local_conditions=hed_local, global_conditions=None, metadata=None) |
| | ``` |
| | |
| | ```python |
| | # MLSD condition |
| | output = pipe(prompt="", local_conditions=mlsd_local, global_conditions=None, metadata=None) |
| | ``` |
| | |
| | ```python |
| | # Sketch condition |
| | output = pipe(prompt="", local_conditions=sketch_local, global_conditions=None, metadata=None) |
| | ``` |
| | |
| | ```python |
| | # Segmentation condition |
| | output = pipe(prompt="", local_conditions=seg_local, global_conditions=None, metadata=None) |
| | ``` |
| | |
| | ```python |
| | # Content condition (global) |
| | output = pipe(prompt="", local_conditions=None, global_conditions=content_global, metadata=None) |
| | ``` |
| | |
| | ```python |
| | # Text condition |
| | output = pipe(prompt="Wasteland", local_conditions=None, global_conditions=None, metadata=None) |
| | ``` |
| | |
| | ```python |
| | # Metadata condition |
| | output = pipe(prompt="", local_conditions=None, global_conditions=None, metadata=metadata_vec) |
| | ``` |
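
The seven calls above differ only in which argument carries the condition. One way to express that pattern is a small builder that fills exactly one slot and leaves the rest at their "off" values. This is a sketch: `single_condition_kwargs` and the condition names it accepts are hypothetical, chosen to mirror the demos rather than taken from the pipeline.

```python
# Condition kinds that are passed through `local_conditions` in the demos above.
LOCAL_KINDS = {"hed", "mlsd", "sketch", "segmentation"}


def single_condition_kwargs(kind, value):
    """Build keyword arguments for pipe(...) with exactly one condition active."""
    kwargs = {
        "prompt": "",
        "local_conditions": None,
        "global_conditions": None,
        "metadata": None,
    }
    if kind in LOCAL_KINDS:
        kwargs["local_conditions"] = value
    elif kind == "content":
        kwargs["global_conditions"] = value
    elif kind == "text":
        kwargs["prompt"] = value
    elif kind == "metadata":
        kwargs["metadata"] = value
    else:
        raise ValueError(f"unknown condition kind: {kind!r}")
    return kwargs
```

With a loaded pipeline, the text demo would then read `output = pipe(**single_condition_kwargs("text", "Wasteland"))`.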
| | |
| | ## Model Sources |
| | |
| | - **Paper**: [HSIGene: A Foundation Model For Hyperspectral Image Generation](https://arxiv.org/abs/2409.12470) |
| | - **Checkpoint**: [GoogleDrive](https://drive.google.com/file/d/1euJAbsxCgG1wIu_Eh5nPfmiSP9suWsR4/view?usp=drive_link) |
| | - **Annotators**: [BaiduNetdisk](https://pan.baidu.com/s/1K1Y__blA6uJVV9l1QG7QvQ?pwd=98f1) (extraction code: 98f1); place the files in `data_prepare/annotator/ckpts` |
| | |
| | ## Citation |
| | |
| | ```bibtex |
| | @article{pangHSIGeneFoundationModel2026, |
| | title = {{{HSIGene}}: {{A Foundation Model}} for {{Hyperspectral Image Generation}}}, |
| | shorttitle = {{{HSIGene}}}, |
| | author = {Pang, Li and Cao, Xiangyong and Tang, Datao and Xu, Shuang and Bai, Xueru and Zhou, Feng and Meng, Deyu}, |
| | year = 2026, |
| | month = jan, |
| | journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence}, |
| | volume = {48}, |
| | number = {1}, |
| | pages = {730--746}, |
| | issn = {1939-3539}, |
| | doi = {10.1109/TPAMI.2025.3610927}, |
| | urldate = {2026-01-02}, |
| | keywords = {Adaptation models,Computational modeling,Controllable generation,deep learning,diffusion model,Diffusion models,Foundation models,hyperspectral image synthesis,Hyperspectral imaging,Image synthesis,Noise reduction,Reliability,Superresolution,Training} |
| | } |
| | |
| | ``` |
| | |