update per PR #1
README.md
CHANGED
|
@@ -1,19 +1,30 @@
|
|
| 1 |
---
|
| 2 |
license: other
|
| 3 |
license_name: nvidia-license-non-commercial
|
| 4 |
license_link: LICENSE
|
| 5 |
-
|
| 6 |
-
- handsomeWilliam/Relation252K
|
| 7 |
-
base_model:
|
| 8 |
-
- black-forest-labs/FLUX.1-Kontext-dev
|
| 9 |
---
|
| 10 |
|
| 11 |
-
# LoRWeB
|
| 12 |
|
| 13 |
<div align="center">
|
| 14 |
-
|
| 15 |
</div>
|
| 16 |
|
| 17 |
<div align="center">
|
| 18 |
|
| 19 |
**Hila Manor**<sup>1,2</sup>,  **Rinon Gal**<sup>2</sup>,  **Haggai Maron**<sup>1,2</sup>,  **Tomer Michaeli**<sup>1</sup>,  **Gal Chechik**<sup>2,3</sup>
|
|
@@ -29,6 +40,20 @@ base_model:
|
|
| 29 |
</div>
|
| 30 |
|
| 31 |
|
| 32 |
### ℹ️ Additional Information
|
| 33 |
|
| 34 |
**This model is a reproduction of the original model from the paper. It was trained from scratch using Technion resources.** This might introduce differences from the results reported in the paper. Please see the `samples` directory for examples of this model's outputs on the {**a**, **a'**, **b**} triplets from the teaser figure.
|
|
@@ -48,4 +73,11 @@ If you use this model in your research, please cite:
|
|
| 48 |
}
|
| 49 |
```
|
| 50 |
|
| 51 |
|
|
|
|
| 1 |
---
|
| 2 |
+
base_model:
|
| 3 |
+
- black-forest-labs/FLUX.1-Kontext-dev
|
| 4 |
+
datasets:
|
| 5 |
+
- handsomeWilliam/Relation252K
|
| 6 |
license: other
|
| 7 |
license_name: nvidia-license-non-commercial
|
| 8 |
license_link: LICENSE
|
| 9 |
+
pipeline_tag: image-to-image
|
| 10 |
---
|
| 11 |
|
| 12 |
+
# LoRWeB: Spanning the Visual Analogy Space with a Weight Basis of LoRAs
|
| 13 |
|
| 14 |
<div align="center">
|
| 15 |
+
|
| 16 |
+
[Paper](https://huggingface.co/papers/2602.15727)
|
| 17 |
+
[Project page](https://research.nvidia.com/labs/par/lorweb)
|
| 18 |
+
[Code](https://github.com/NVlabs/LoRWeB)
|
| 19 |
+
[Evaluation set](https://huggingface.co/datasets/hilamanor/LoRWeB_evalset)
|
| 20 |
+
|
| 21 |
</div>
|
| 22 |
|
| 23 |
+
Visual analogy learning enables image manipulation through demonstration rather than textual description, allowing users to specify complex transformations difficult to articulate in words.
|
| 24 |
+
Given a triplet {**a**, **a'**, **b**}, the goal is to generate **b'** such that **a** : **a'** :: **b** : **b'**.
|
| 25 |
+
|
| 26 |
+
**LoRWeB** specializes the model for each analogy task at inference time through dynamic composition of learned transformation primitives. It introduces a learnable basis of LoRA modules to span the space of different visual transformations and a lightweight encoder that dynamically selects and weighs these basis LoRAs based on the input analogy pair.
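
To make the idea of "selecting and weighing basis LoRAs" concrete, here is a minimal toy sketch of one plausible reading: each basis element is a low-rank pair (B, A), and the per-analogy update is a weighted sum of the products B·A. This is not the authors' implementation. The softmax normalization, the function names, and the hard-coded scores (which stand in for the encoder's output) are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into non-negative weights that sum to 1 (assumed normalization)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(x, y):
    """Multiply two matrices stored as nested lists."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def compose_lora_update(basis, coefficients):
    """Weighted sum of low-rank updates.

    basis: list of (B, A) pairs, with B of shape d_out x r and A of shape r x d_in.
    coefficients: one weight per basis element.
    """
    d_out = len(basis[0][0])
    d_in = len(basis[0][1][0])
    delta = [[0.0] * d_in for _ in range(d_out)]
    for (b_mat, a_mat), coeff in zip(basis, coefficients):
        update = matmul(b_mat, a_mat)
        for i in range(d_out):
            for j in range(d_in):
                delta[i][j] += coeff * update[i][j]
    return delta


# Toy basis of two rank-1 LoRAs acting on a 2x2 weight matrix.
basis = [
    ([[1.0], [0.0]], [[1.0, 0.0]]),
    ([[0.0], [1.0]], [[0.0, 1.0]]),
]

# Hypothetical scores; in the described method they would depend on the {a, a'} pair.
scores = [0.0, 0.0]
coefficients = softmax(scores)
delta_w = compose_lora_update(basis, coefficients)
```

In this toy setup, changing `scores` shifts the update toward one basis element or the other, which is the sense in which a single shared basis can cover many transformations.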
|
| 27 |
+
|
| 28 |
<div align="center">
|
| 29 |
|
| 30 |
**Hila Manor**<sup>1,2</sup>,  **Rinon Gal**<sup>2</sup>,  **Haggai Maron**<sup>1,2</sup>,  **Tomer Michaeli**<sup>1</sup>,  **Gal Chechik**<sup>2,3</sup>
|
|
|
|
| 40 |
</div>
|
| 41 |
|
| 42 |
|
| 43 |
+
## 🛠 Sample Usage
|
| 44 |
+
|
| 45 |
+
To perform inference using the LoRWeB weights, use the `inference.py` script from the [official GitHub repository](https://github.com/NVlabs/LoRWeB):
|
| 46 |
+
|
| 47 |
+
```bash
|
| 48 |
+
python inference.py \
|
| 49 |
+
-w "path/to/lorweb_model.safetensors" \
|
| 50 |
+
-c "config/your_config.yaml" \
|
| 51 |
+
-a "data/path_to_a_img.jpg" \
|
| 52 |
+
-t "data/path_to_atag_img.jpg" \
|
| 53 |
+
-b "data/path_to_b_img.jpg" \
|
| 54 |
+
-o "outputs/generated_btag_img_path.jpg"
|
| 55 |
+
```
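
When running several analogies, it can be convenient to build the same command from Python. The sketch below is a hypothetical wrapper, not part of the repository: the helper names and the example paths are assumptions, and only the `-w`, `-c`, `-a`, `-t`, `-b`, and `-o` flags come from the command above.

```python
import subprocess
import sys


def build_inference_command(weights, config, a_img, a_tag_img, b_img, output,
                            script="inference.py"):
    """Return the argument list for one inference.py call (hypothetical helper)."""
    return [
        sys.executable, script,
        "-w", weights,      # LoRWeB weights (.safetensors)
        "-c", config,       # YAML config
        "-a", a_img,        # image a
        "-t", a_tag_img,    # image a' (the edited version of a)
        "-b", b_img,        # image b
        "-o", output,       # where to write the generated b'
    ]


def run_inference(**paths):
    """Run one analogy; raises CalledProcessError if the script exits with an error."""
    subprocess.run(build_inference_command(**paths), check=True)


# Example: build the command for the placeholder paths used above without running it.
command = build_inference_command(
    weights="path/to/lorweb_model.safetensors",
    config="config/your_config.yaml",
    a_img="data/path_to_a_img.jpg",
    a_tag_img="data/path_to_atag_img.jpg",
    b_img="data/path_to_b_img.jpg",
    output="outputs/generated_btag_img_path.jpg",
)
```

Passing the arguments as a list avoids shell quoting issues with paths that contain spaces. Call `run_inference(...)` with the same keyword arguments from the root of a checkout of the repository to actually generate an image.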
|
| 56 |
+
|
| 57 |
### ℹ️ Additional Information
|
| 58 |
|
| 59 |
**This model is a reproduction of the original model from the paper. It was trained from scratch using Technion resources.** This might introduce differences from the results reported in the paper. Please see the `samples` directory for examples of this model's outputs on the {**a**, **a'**, **b**} triplets from the teaser figure.
|
|
|
|
| 73 |
}
|
| 74 |
```
|
| 75 |
|
| 76 |
+
## 🙏🏻 Acknowledgements
|
| 77 |
+
|
| 78 |
+
This project builds upon:
|
| 79 |
+
- [FLUX.1-Kontext](https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev) by Black Forest Labs
|
| 80 |
+
- [Diffusers](https://github.com/huggingface/diffusers) by Hugging Face
|
| 81 |
+
- [PEFT](https://github.com/huggingface/peft) by Hugging Face
|
| 82 |
+
- [AI-Toolkit](https://github.com/ostris/ai-toolkit) for training infrastructure
|
| 83 |
|