Image-to-Image
Safetensors
Committed by hilamanor
Commit 3f800e7 · verified · 1 Parent(s): 0ef124c

update per PR #1

Files changed (1)
  1. README.md +38 -6
README.md CHANGED
@@ -1,19 +1,30 @@
  ---
  license: other
  license_name: nvidia-license-non-commercial
  license_link: LICENSE
- datasets:
- - handsomeWilliam/Relation252K
- base_model:
- - black-forest-labs/FLUX.1-Kontext-dev
  ---

- # LoRWeB Model

  <div align="center">
- <a href="https://arxiv.org/abs/2602.15727">ArXiv</a> | <a href="https://github.com/NVlabs/LoRWeB" style="display:inline;text-decoration:underline;"><img width="20" height="20" style="display:inline;margin:0;" src="https://img.icons8.com/ios-glyphs/30/github.png" alt="github"> GitHub Repository</a> | <a href="https://research.nvidia.com/labs/par/lorweb"> 🌐 Project Website</a> | <a href="https://huggingface.co/datasets/hilamanor/LoRWeB_evalset">🤗 Evaluation Dataset</a>
  </div>

  <div align="center">

  **Hila Manor**<sup>1,2</sup>,&ensp; **Rinon Gal**<sup>2</sup>,&ensp; **Haggai Maron**<sup>1,2</sup>,&ensp; **Tomer Michaeli**<sup>1</sup>,&ensp; **Gal Chechik**<sup>2,3</sup>
@@ -29,6 +40,20 @@ base_model:
  </div>


  ### ℹ️ Additional Information

  **This model is a reproduction of the original model from the paper. It was trained from scratch using Technion resources.** This might introduce differences from the results reported in the paper. Please see the `samples` directory for examples of this model's outputs on the {**a**, **a'**, **b**} triplets from the teaser figure.
@@ -48,4 +73,11 @@ If you use this model in your research, please cite:
  }
  ```

  ---
+ base_model:
+ - black-forest-labs/FLUX.1-Kontext-dev
+ datasets:
+ - handsomeWilliam/Relation252K
  license: other
  license_name: nvidia-license-non-commercial
  license_link: LICENSE
+ pipeline_tag: image-to-image
  ---

+ # LoRWeB: Spanning the Visual Analogy Space with a Weight Basis of LoRAs

  <div align="center">
+
+ [![arXiv](https://img.shields.io/badge/arXiv-2602.15727-b31b1b.svg)](https://huggingface.co/papers/2602.15727)
+ [![Project Website](https://img.shields.io/badge/🌐-Project%20Website-blue)](https://research.nvidia.com/labs/par/lorweb)
+ [![GitHub Repository](https://img.shields.io/badge/GitHub-LoRWeB-black?logo=github)](https://github.com/NVlabs/LoRWeB)
+ [![Evaluation Dataset](https://img.shields.io/badge/🤗-Evaluation%20Dataset-yellow)](https://huggingface.co/datasets/hilamanor/LoRWeB_evalset)
+
  </div>

+ Visual analogy learning enables image manipulation through demonstration rather than textual description, allowing users to specify complex transformations difficult to articulate in words.
+ Given a triplet {**a**, **a'**, **b**}, the goal is to generate **b'** such that **a** : **a'** :: **b** : **b'**.
+
+ **LoRWeB** specializes the model for each analogy task at inference time through dynamic composition of learned transformation primitives. It introduces a learnable basis of LoRA modules to span the space of different visual transformations and a lightweight encoder that dynamically selects and weighs these basis LoRAs based on the input analogy pair.
+
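The description added in the lines above (a basis of LoRA modules combined with input-dependent weights) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the function names, the softmax weighting, the tiny matrix sizes, and the choice to mix the low-rank products `B_i A_i` directly are all hypothetical. The README does not specify these details; the actual method is in the official repository.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into non-negative weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(x, y):
    """Plain-Python matrix product of two nested lists."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def mix_lora_basis(basis, scores):
    """Combine a basis of LoRA pairs into a single weight update.

    basis:  list of (B, A) pairs, B of shape d_out x r and A of shape r x d_in.
    scores: one score per basis element (hypothetically produced by an
            encoder that looks at the analogy pair a, a').
    Returns the mixing coefficients and delta_W = sum_i c_i * (B_i @ A_i).
    """
    coeffs = softmax(scores)  # assumption: softmax weighting
    d_out = len(basis[0][0])
    d_in = len(basis[0][1][0])
    delta = [[0.0] * d_in for _ in range(d_out)]
    for c, (b_mat, a_mat) in zip(coeffs, basis):
        update = matmul(b_mat, a_mat)
        for i in range(d_out):
            for j in range(d_in):
                delta[i][j] += c * update[i][j]
    return coeffs, delta


# Toy example: two rank-1 LoRAs acting on a 2x2 weight matrix.
toy_basis = [
    ([[1], [0]], [[1, 0]]),  # updates the top-left entry
    ([[0], [1]], [[0, 1]]),  # updates the bottom-right entry
]
toy_coeffs, toy_delta = mix_lora_basis(toy_basis, scores=[0.0, 0.0])
```

With equal scores, both basis elements receive the same coefficient, so the combined update is the average of the two rank-1 updates. Different scores would shift the update toward one transformation primitive.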
  <div align="center">

  **Hila Manor**<sup>1,2</sup>,&ensp; **Rinon Gal**<sup>2</sup>,&ensp; **Haggai Maron**<sup>1,2</sup>,&ensp; **Tomer Michaeli**<sup>1</sup>,&ensp; **Gal Chechik**<sup>2,3</sup>
 
  </div>


+ ## 🛠 Sample Usage
+
+ To perform inference using the LoRWeB weights, use the `inference.py` script from the [official GitHub repository](https://github.com/NVlabs/LoRWeB):
+
+ ```bash
+ python inference.py \
+ -w "path/to/lorweb_model.safetensors" \
+ -c "config/your_config.yaml" \
+ -a "data/path_to_a_img.jpg" \
+ -t "data/path_to_atag_img.jpg" \
+ -b "data/path_to_b_img.jpg" \
+ -o "outputs/generated_btag_img_path.jpg"
+ ```
+
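For running the command above over several triplets, a small Python wrapper can assemble the argument list. This is a minimal sketch: the helper names `build_inference_command` and `run_inference` are hypothetical, and the meaning of each flag (`-w` weights, `-c` config, `-a` image **a**, `-t` image **a'**, `-b` image **b**, `-o` output path) is inferred from the placeholder paths in the README diff rather than from the script itself.

```python
import shlex
import subprocess


def build_inference_command(weights, config, a_img, a_tag_img, b_img, out_img,
                            python="python", script="inference.py"):
    """Assemble the argument list for one {a, a', b} -> b' run.

    Only the flags shown in the README example are used; their meaning is
    assumed from the placeholder paths there.
    """
    return [
        python, script,
        "-w", weights,
        "-c", config,
        "-a", a_img,
        "-t", a_tag_img,
        "-b", b_img,
        "-o", out_img,
    ]


def run_inference(cmd):
    """Execute one assembled command; raises CalledProcessError on failure."""
    subprocess.run(cmd, check=True)


# Example: print the command for the placeholder paths from the README.
example_cmd = build_inference_command(
    "path/to/lorweb_model.safetensors",
    "config/your_config.yaml",
    "data/path_to_a_img.jpg",
    "data/path_to_atag_img.jpg",
    "data/path_to_b_img.jpg",
    "outputs/generated_btag_img_path.jpg",
)
print(shlex.join(example_cmd))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces. `run_inference` is defined but not called here, since it requires a checkout of the official repository and the model weights.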
  ### ℹ️ Additional Information

  **This model is a reproduction of the original model from the paper. It was trained from scratch using Technion resources.** This might introduce differences from the results reported in the paper. Please see the `samples` directory for examples of this model's outputs on the {**a**, **a'**, **b**} triplets from the teaser figure.
 
  }
  ```

+ ## 🙏🏻 Acknowledgements
+
+ This project builds upon:
+ - [FLUX.1-Kontext](https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev) by Black Forest Labs
+ - [Diffusers](https://github.com/huggingface/diffusers) by Hugging Face
+ - [PEFT](https://github.com/huggingface/peft) by Hugging Face
+ - [AI-Toolkit](https://github.com/ostris/ai-toolkit) for training infrastructure