Improve model card and add metadata
#1 by nielsr (HF Staff), opened

README.md CHANGED
Previous README.md:

---
license: other
license_name: nvidia-license-non-commercial
license_link: LICENSE
datasets:
- handsomeWilliam/Relation252K
base_model:
- black-forest-labs/FLUX.1-Kontext-dev
---

# LoRWeB

<div align="center">
<a href="https://arxiv.org/">ArXiv</a> | <a href="https://github.com/NVlabs/LoRWeB" style="display:inline;text-decoration:underline;"><img width="20" height="20" style="display:inline;margin:0;" src="https://img.icons8.com/ios-glyphs/30/github.png" alt="github"> GitHub Repository</a> | <a href="https://research.nvidia.com/labs/par/lorweb">🌐 Project Website</a> | <a href="https://huggingface.co/datasets/hilamanor/LoRWeB_evalset">🤗 Evaluation Dataset (Coming Soon)</a>
</div>

<div align="center">

**Hila Manor**<sup>1,2</sup>, **Rinon Gal**<sup>2</sup>, **Haggai Maron**<sup>1,2</sup>, **Tomer Michaeli**<sup>1</sup>, **Gal Chechik**<sup>2,3</sup>

</div>

<div align="center">
<img src="https://github.com/NVlabs/LoRWeB/raw/main/assets/teaser.jpg" alt="Teaser" width="800"/>

<i>Given a prompt and an image triplet {**a**, **a'**, **b**} that visually describe a desired transformation, LoRWeB dynamically constructs a single LoRA from a learnable basis of LoRA modules, and produces an editing result **b'** that applies the same analogy to the new image.</i>
</div>

##

If you use this model in your research, please cite:

```
@article{manor2026lorweb,
  title={Spanning the Visual Analogy Space with a Weight Basis of LoRAs},
  author={Manor, Hila and Gal, Rinon and Maron, Haggai and Michaeli, Tomer and Chechik, Gal},
  journal={arXiv preprint},
  year={2026}
}
```
Updated README.md:

---
base_model:
- black-forest-labs/FLUX.1-Kontext-dev
datasets:
- handsomeWilliam/Relation252K
license: other
license_name: nvidia-license-non-commercial
license_link: LICENSE
pipeline_tag: image-to-image
---

# LoRWeB: Spanning the Visual Analogy Space with a Weight Basis of LoRAs

<div align="center">

[Paper](https://huggingface.co/papers/2602.15727)
[Project Page](https://research.nvidia.com/labs/par/lorweb)
[Code](https://github.com/NVlabs/LoRWeB)
[Evaluation Dataset](https://huggingface.co/datasets/hilamanor/LoRWeB_evalset)

</div>
Visual analogy learning enables image manipulation through demonstration rather than textual description, allowing users to specify complex transformations that are difficult to articulate in words. Given a triplet $\{a, a', b\}$, the goal is to generate $b'$ such that $a : a' :: b : b'$.

**LoRWeB** specializes the model for each analogy task at inference time through dynamic composition of learned transformation primitives. It introduces a learnable basis of LoRA modules to span the space of different visual transformations, and a lightweight encoder that dynamically selects and weighs these basis LoRAs based on the input analogy pair.
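The weighted composition idea can be sketched in a few lines; this is a minimal illustration under made-up shapes and coefficient values, not LoRWeB's actual implementation:

```python
import numpy as np

def compose_lora(W, basis, coeffs):
    """Add a coefficient-weighted sum of basis LoRAs to a frozen weight W.

    basis: list of (A, B) pairs with A of shape (r, d_in) and B of shape (d_out, r);
    coeffs: one scalar per basis element (in LoRWeB, predicted by a lightweight
    encoder from the input analogy pair).
    """
    delta = np.zeros_like(W)
    for (A, B), c in zip(basis, coeffs):
        delta += c * (B @ A)  # each basis element contributes a low-rank update
    return W + delta

# Toy example: one 6x4 layer with a basis of 3 rank-2 LoRAs.
rng = np.random.default_rng(0)
W = rng.standard_normal((6, 4))
basis = [(rng.standard_normal((2, 4)), rng.standard_normal((6, 2))) for _ in range(3)]
W_task = compose_lora(W, basis, [0.7, 0.0, 0.3])  # one task-specific LoRA
```

With all coefficients at zero the layer reduces to the frozen base weight, so the basis parameterizes a space of edits that includes "no edit".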
<div align="center">
<img src="https://github.com/NVlabs/LoRWeB/raw/main/assets/teaser.jpg" alt="LoRWeB Teaser" width="800"/>

<i>Given a prompt and an image triplet {**a**, **a'**, **b**} that visually describe a desired transformation, LoRWeB dynamically constructs a single LoRA from a learnable basis of LoRA modules, and produces an editing result **b'** that applies the same analogy to the new image.</i>
</div>
## Sample Usage

To perform inference using the LoRWeB weights, use the `inference.py` script from the [official GitHub repository](https://github.com/NVlabs/LoRWeB):

```bash
python inference.py \
    -w "path/to/lorweb_model.safetensors" \
    -c "config/your_config.yaml" \
    -a "data/path_to_a_img.jpg" \
    -t "data/path_to_atag_img.jpg" \
    -b "data/path_to_b_img.jpg" \
    -o "outputs/generated_btag_img_path.jpg"
```
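A quick way to sanity-check a downloaded `.safetensors` checkpoint before passing it to `-w` is to read its JSON header, which needs no extra dependencies; the tensor name below is invented for the demo and does not reflect LoRWeB's real key layout:

```python
import json
import struct

def safetensors_shapes(path):
    """Return {tensor_name: shape} from a .safetensors file using only the stdlib.

    The format is an 8-byte little-endian header length followed by a JSON
    header mapping tensor names to dtype/shape/offset records.
    """
    with open(path, "rb") as f:
        (n,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(n))
    return {k: v["shape"] for k, v in header.items() if k != "__metadata__"}

# Build a tiny stand-in file so the function can be demonstrated end to end.
header = {"demo.lora_A": {"dtype": "F32", "shape": [2, 4], "data_offsets": [0, 32]}}
blob = json.dumps(header).encode()
with open("demo.safetensors", "wb") as f:
    f.write(struct.pack("<Q", len(blob)) + blob + b"\x00" * 32)

print(safetensors_shapes("demo.safetensors"))  # {'demo.lora_A': [2, 4]}
```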
## Citation

If you use this model in your research, please cite:

```
@article{manor2026lorweb,
  title={Spanning the Visual Analogy Space with a Weight Basis of LoRAs},
  author={Manor, Hila and Gal, Rinon and Maron, Haggai and Michaeli, Tomer and Chechik, Gal},
  journal={arXiv preprint arXiv:2602.15727},
  year={2026}
}
```
## Acknowledgements

This project builds upon:
- [FLUX.1-Kontext](https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev) by Black Forest Labs
- [Diffusers](https://github.com/huggingface/diffusers) by Hugging Face
- [PEFT](https://github.com/huggingface/peft) by Hugging Face
- [AI-Toolkit](https://github.com/ostris/ai-toolkit) for training infrastructure