Improve model card and add metadata
#1 by nielsr (HF Staff), opened

README.md CHANGED
Previous README.md:

---
license: other
license_name: nvidia-license-non-commercial
license_link: LICENSE
datasets:
- handsomeWilliam/Relation252K
base_model:
- black-forest-labs/FLUX.1-Kontext-dev
---

# LoRWeB

<div align="center">
<a href="https://arxiv.org/">ArXiv</a> | <a href="https://github.com/NVlabs/LoRWeB" style="display:inline;text-decoration:underline;"><img width="20" height="20" style="display:inline;margin:0;" src="https://img.icons8.com/ios-glyphs/30/github.png" alt="github"> GitHub Repository</a> | <a href="https://research.nvidia.com/labs/par/lorweb">🌐 Project Website</a> | <a href="https://huggingface.co/datasets/hilamanor/LoRWeB_evalset">🤗 Evaluation Dataset (Coming Soon)</a>
</div>

<div align="center">

**Hila Manor**<sup>1,2</sup>, **Rinon Gal**<sup>2</sup>, **Haggai Maron**<sup>1,2</sup>, **Tomer Michaeli**<sup>1</sup>, **Gal Chechik**<sup>2,3</sup>

</div>

<div align="center">
<img src="https://github.com/NVlabs/LoRWeB/raw/main/assets/teaser.jpg" alt="Teaser" width="800"/>

<i>Given a prompt and an image triplet {**a**, **a'**, **b**} that visually describe a desired transformation, LoRWeB dynamically constructs a single LoRA from a learnable basis of LoRA modules, and produces an editing result **b'** that applies the same analogy to the new image.</i>
</div>

##

If you use this model in your research, please cite:

```
@article{manor2026lorweb,
  title={Spanning the Visual Analogy Space with a Weight Basis of LoRAs},
  author={Manor, Hila and Gal, Rinon and Maron, Haggai and Michaeli, Tomer and Chechik, Gal},
  journal={arXiv preprint},
  year={2026}
}
```
Updated README.md:

---
base_model:
- black-forest-labs/FLUX.1-Kontext-dev
datasets:
- handsomeWilliam/Relation252K
license: other
license_name: nvidia-license-non-commercial
license_link: LICENSE
pipeline_tag: image-to-image
---

# LoRWeB: Spanning the Visual Analogy Space with a Weight Basis of LoRAs

<div align="center">

[Paper](https://huggingface.co/papers/2602.15727)
[Project Page](https://research.nvidia.com/labs/par/lorweb)
[Code](https://github.com/NVlabs/LoRWeB)
[Evaluation Dataset](https://huggingface.co/datasets/hilamanor/LoRWeB_evalset)

</div>
Visual analogy learning enables image manipulation through demonstration rather than textual description, allowing users to specify complex transformations that are difficult to articulate in words. Given a triplet $\{a, a', b\}$, the goal is to generate $b'$ such that $a : a' :: b : b'$.

**LoRWeB** specializes the model for each analogy task at inference time through dynamic composition of learned transformation primitives. It introduces a learnable basis of LoRA modules to span the space of different visual transformations, and a lightweight encoder that dynamically selects and weighs these basis LoRAs based on the input analogy pair.
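The weighted composition idea can be sketched in a few lines; this is a minimal illustration under made-up shapes and coefficient values, not LoRWeB's actual implementation:

```python
import numpy as np

def compose_lora(W, basis, coeffs):
    """Add a coefficient-weighted sum of basis LoRAs to a frozen weight W.

    basis: list of (A, B) pairs with A of shape (r, d_in) and B of shape (d_out, r);
    coeffs: one scalar per basis element (in LoRWeB, predicted by a lightweight
    encoder from the input analogy pair).
    """
    delta = np.zeros_like(W)
    for (A, B), c in zip(basis, coeffs):
        delta += c * (B @ A)  # each basis element contributes a low-rank update
    return W + delta

# Toy example: one 6x4 layer with a basis of 3 rank-2 LoRAs.
rng = np.random.default_rng(0)
W = rng.standard_normal((6, 4))
basis = [(rng.standard_normal((2, 4)), rng.standard_normal((6, 2))) for _ in range(3)]
W_task = compose_lora(W, basis, [0.7, 0.0, 0.3])  # one task-specific LoRA
```

With all coefficients at zero the layer reduces to the frozen base weight, so the basis parameterizes a space of edits that includes "no edit".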
<div align="center">
<img src="https://github.com/NVlabs/LoRWeB/raw/main/assets/teaser.jpg" alt="LoRWeB Teaser" width="800"/>

<i>Given a prompt and an image triplet {**a**, **a'**, **b**} that visually describe a desired transformation, LoRWeB dynamically constructs a single LoRA from a learnable basis of LoRA modules, and produces an editing result **b'** that applies the same analogy to the new image.</i>
</div>
## Sample Usage

To perform inference using the LoRWeB weights, use the `inference.py` script from the [official GitHub repository](https://github.com/NVlabs/LoRWeB):

```bash
python inference.py \
    -w "path/to/lorweb_model.safetensors" \
    -c "config/your_config.yaml" \
    -a "data/path_to_a_img.jpg" \
    -t "data/path_to_atag_img.jpg" \
    -b "data/path_to_b_img.jpg" \
    -o "outputs/generated_btag_img_path.jpg"
```
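A quick way to sanity-check a downloaded `.safetensors` checkpoint before passing it to `-w` is to read its JSON header, which needs no extra dependencies; the tensor name below is invented for the demo and does not reflect LoRWeB's real key layout:

```python
import json
import struct

def safetensors_shapes(path):
    """Return {tensor_name: shape} from a .safetensors file using only the stdlib.

    The format is an 8-byte little-endian header length followed by a JSON
    header mapping tensor names to dtype/shape/offset records.
    """
    with open(path, "rb") as f:
        (n,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(n))
    return {k: v["shape"] for k, v in header.items() if k != "__metadata__"}

# Build a tiny stand-in file so the function can be demonstrated end to end.
header = {"demo.lora_A": {"dtype": "F32", "shape": [2, 4], "data_offsets": [0, 32]}}
blob = json.dumps(header).encode()
with open("demo.safetensors", "wb") as f:
    f.write(struct.pack("<Q", len(blob)) + blob + b"\x00" * 32)

print(safetensors_shapes("demo.safetensors"))  # {'demo.lora_A': [2, 4]}
```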
## Citation

If you use this model in your research, please cite:

```
@article{manor2026lorweb,
  title={Spanning the Visual Analogy Space with a Weight Basis of LoRAs},
  author={Manor, Hila and Gal, Rinon and Maron, Haggai and Michaeli, Tomer and Chechik, Gal},
  journal={arXiv preprint arXiv:2602.15727},
  year={2026}
}
```
## Acknowledgements

This project builds upon:
- [FLUX.1-Kontext](https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev) by Black Forest Labs
- [Diffusers](https://github.com/huggingface/diffusers) by Hugging Face
- [PEFT](https://github.com/huggingface/peft) by Hugging Face
- [AI-Toolkit](https://github.com/ostris/ai-toolkit) for training infrastructure