hilamanor/lorweb

hilamanor committed on Apr 26

Commit 3f800e7 (verified) · 1 parent: 0ef124c

update per PR #1

Files changed (1): README.md (+38 -6)

@@ -1,19 +1,30 @@
 ---
 license: other
 license_name: nvidia-license-non-commercial
 license_link: LICENSE
-datasets:
-- handsomeWilliam/Relation252K
-base_model:
-- black-forest-labs/FLUX.1-Kontext-dev
 ---
-# LoRWeB Model
 <div align="center">
-  <a href="https://arxiv.org/abs/2602.15727">ArXiv</a> | <a href="https://github.com/NVlabs/LoRWeB" style="display:inline;text-decoration:underline;"><img width="20" height="20" style="display:inline;margin:0;" src="https://img.icons8.com/ios-glyphs/30/github.png" alt="github"> GitHub Repository</a> | <a href="https://research.nvidia.com/labs/par/lorweb"> 🌐 Project Website</a> | <a href="https://huggingface.co/datasets/hilamanor/LoRWeB_evalset">🤗 Evaluation Dataset</a>
 </div>
 <div align="center">
 **Hila Manor**<sup>1,2</sup>,&ensp; **Rinon Gal**<sup>2</sup>,&ensp; **Haggai Maron**<sup>1,2</sup>,&ensp; **Tomer Michaeli**<sup>1</sup>,&ensp; **Gal Chechik**<sup>2,3</sup>
@@ -29,6 +40,20 @@ base_model:
 </div>
 ### ℹ️ Additional Information
 **This model is a reproduction of the original model from the paper. It was trained from scratch using Technion resources.** This might introduce differences from the results reported in the paper. Please see the `samples` directory for examples of this model's outputs on the {**a**, **a'**, **b**} triplets from the teaser figure.
@@ -48,4 +73,11 @@ If you use this model in your research, please cite:
 }
 ```

 ---
+base_model:
+- black-forest-labs/FLUX.1-Kontext-dev
+datasets:
+- handsomeWilliam/Relation252K
 license: other
 license_name: nvidia-license-non-commercial
 license_link: LICENSE
+pipeline_tag: image-to-image
 ---
+# LoRWeB: Spanning the Visual Analogy Space with a Weight Basis of LoRAs
 <div align="center">
+[![arXiv](https://img.shields.io/badge/arXiv-2602.15727-b31b1b.svg)](https://huggingface.co/papers/2602.15727)
+[![Project Website](https://img.shields.io/badge/🌐-Project%20Website-blue)](https://research.nvidia.com/labs/par/lorweb)
+[![GitHub Repository](https://img.shields.io/badge/GitHub-LoRWeB-black?logo=github)](https://github.com/NVlabs/LoRWeB)
+[![Evaluation Dataset](https://img.shields.io/badge/🤗-Evaluation%20Dataset-yellow)](https://huggingface.co/datasets/hilamanor/LoRWeB_evalset)
 </div>
+Visual analogy learning enables image manipulation through demonstration rather than textual description, allowing users to specify complex transformations difficult to articulate in words.
+Given a triplet {**a**, **a'**, **b**}, the goal is to generate **b'** such that **a** : **a'** :: **b** : **b'**.
+**LoRWeB** specializes the model for each analogy task at inference time through dynamic composition of learned transformation primitives. It introduces a learnable basis of LoRA modules to span the space of different visual transformations and a lightweight encoder that dynamically selects and weighs these basis LoRAs based on the input analogy pair.
 <div align="center">
 **Hila Manor**<sup>1,2</sup>,&ensp; **Rinon Gal**<sup>2</sup>,&ensp; **Haggai Maron**<sup>1,2</sup>,&ensp; **Tomer Michaeli**<sup>1</sup>,&ensp; **Gal Chechik**<sup>2,3</sup>
 </div>
+## 🛠 Sample Usage
+To perform inference using the LoRWeB weights, use the `inference.py` script from the [official GitHub repository](https://github.com/NVlabs/LoRWeB):
+```bash
+python inference.py \
+  -w "path/to/lorweb_model.safetensors" \
+  -c "config/your_config.yaml" \
+  -a "data/path_to_a_img.jpg" \
+  -t "data/path_to_atag_img.jpg" \
+  -b "data/path_to_b_img.jpg" \
+  -o "outputs/generated_btag_img_path.jpg"
+```
 ### ℹ️ Additional Information
 **This model is a reproduction of the original model from the paper. It was trained from scratch using Technion resources.** This might introduce differences from the results reported in the paper. Please see the `samples` directory for examples of this model's outputs on the {**a**, **a'**, **b**} triplets from the teaser figure.
 }
 ```
+## 🙏🏻 Acknowledgements
+This project builds upon:
+- [FLUX.1-Kontext](https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev) by Black Forest Labs
+- [Diffusers](https://github.com/huggingface/diffusers) by Hugging Face
+- [PEFT](https://github.com/huggingface/peft) by Hugging Face
+- [AI-Toolkit](https://github.com/ostris/ai-toolkit) for training infrastructure
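
Note on the added description: the README text introduced in this commit says LoRWeB uses a learnable basis of LoRA modules and a lightweight encoder that selects and weighs them for each analogy pair. The sketch below illustrates only that weighted-combination idea, and it is not taken from the NVlabs/LoRWeB repository. The function name `mix_lora_basis`, the tensor shapes, and the use of a softmax to turn encoder scores into mixing weights are all assumptions made for illustration; the actual encoder and weighting scheme may differ.

```python
import numpy as np


def softmax(scores):
    """Numerically stable softmax over a 1-D array of scores."""
    shifted = scores - np.max(scores)
    exp = np.exp(shifted)
    return exp / exp.sum()


def mix_lora_basis(down, up, scores):
    """Combine a basis of LoRA modules into a single weight update.

    Hypothetical helper, not the repository's API.

    down:   (n_basis, rank, d_in)   LoRA "A" (down-projection) matrices
    up:     (n_basis, d_out, rank)  LoRA "B" (up-projection) matrices
    scores: (n_basis,)              per-basis scores, standing in for the
                                    output of an analogy-pair encoder
    Returns delta_W with shape (d_out, d_in).
    """
    # Assumption: scores are normalized with a softmax.
    weights = softmax(scores)
    # delta_W = sum_i weights[i] * (up[i] @ down[i])
    return np.einsum("i,ior,ird->od", weights, up, down)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    down = rng.normal(size=(3, 2, 5))  # 3 basis LoRAs, rank 2, d_in = 5
    up = rng.normal(size=(3, 4, 2))    # d_out = 4
    delta_w = mix_lora_basis(down, up, np.array([2.0, 0.5, -1.0]))
    print(delta_w.shape)
```

With equal scores the result reduces to the plain average of the individual low-rank updates, which is a convenient sanity check when experimenting with this kind of mixing.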