Improve model card and add metadata

#1
by nielsr (HF Staff)
Files changed (1)
  1. README.md +34 -17
README.md CHANGED
@@ -1,39 +1,50 @@
  ---
  license: other
  license_name: nvidia-license-non-commercial
  license_link: LICENSE
- datasets:
- - handsomeWilliam/Relation252K
- base_model:
- - black-forest-labs/FLUX.1-Kontext-dev
  ---

- # LoRWeB Model (Coming Soon)
-
- <div align="center">
- <a href="https://arxiv.org/">ArXiv</a> | <a href="https://github.com/NVlabs/LoRWeB" style="display:inline;text-decoration:underline;"><img width="20" height="20" style="display:inline;margin:0;" src="https://img.icons8.com/ios-glyphs/30/github.png" alt="github"> GitHub Repository</a> | <a href="https://research.nvidia.com/labs/par/lorweb"> 🌐 Project Website</a> | <a href="https://huggingface.co/datasets/hilamanor/LoRWeB_evalset">🤗 Evaluation Dataset (Comming Soon)</a>
- </div>

  <div align="center">
-
- **Hila Manor**<sup>1,2</sup>,&ensp; **Rinon Gal**<sup>2</sup>,&ensp; **Haggai Maron**<sup>1,2</sup>,&ensp; **Tomer Michaeli**<sup>1</sup>,&ensp; **Gal Chechik**<sup>2,3</sup>

- <sup>1</sup>Technion - Israel Institute of Technology &ensp;&ensp; <sup>2</sup>NVIDIA &ensp;&ensp; <sup>3</sup>Bar-Ilan University

  </div>

  <div align="center">
- <img src="https://github.com/NVlabs/LoRWeB/raw/main/assets/teaser.jpg" alt="Teaser" width="800"/>

  <i>Given a prompt and an image triplet {**a**, **a'**, **b**} that visually describe a desired transformation, LoRWeB dynamically constructs a single LoRA from a learnable basis of LoRA modules, and produces an editing result **b'** that applies the same analogy to the new image.</i>
  </div>


- ### ℹ️ Additional Information

- Please see our full modelcard and further details in the [GitHub Repo](https://github.com/NVlabs/LoRWeB)

- ## 📚 Citation

  If you use this model in your research, please cite:

@@ -41,9 +52,15 @@ If you use this model in your research, please cite:
  @article{manor2026lorweb,
  title={Spanning the Visual Analogy Space with a Weight Basis of LoRAs},
  author={Manor, Hila and Gal, Rinon and Maron, Haggai and Michaeli, Tomer and Chechik, Gal},
- journal={arXiv preprint},
  year={2026}
  }
  ```

  ---
+ base_model:
+ - black-forest-labs/FLUX.1-Kontext-dev
+ datasets:
+ - handsomeWilliam/Relation252K
  license: other
  license_name: nvidia-license-non-commercial
  license_link: LICENSE
+ pipeline_tag: image-to-image
  ---

+ # LoRWeB: Spanning the Visual Analogy Space with a Weight Basis of LoRAs

  <div align="center">

+ [![arXiv](https://img.shields.io/badge/arXiv-2602.15727-b31b1b.svg)](https://huggingface.co/papers/2602.15727)
+ [![Project Website](https://img.shields.io/badge/🌐-Project%20Website-blue)](https://research.nvidia.com/labs/par/lorweb)
+ [![GitHub Repository](https://img.shields.io/badge/GitHub-Repository-black?logo=github)](https://github.com/NVlabs/LoRWeB)
+ [![Evaluation Dataset](https://img.shields.io/badge/🤗-Evaluation%20Dataset-yellow)](https://huggingface.co/datasets/hilamanor/LoRWeB_evalset)

  </div>

+ Visual analogy learning enables image manipulation through demonstration rather than textual description, allowing users to specify complex transformations that are difficult to articulate in words. Given a triplet $\{a, a', b\}$, the goal is to generate $b'$ such that $a : a' :: b : b'$.
+
+ **LoRWeB** specializes the model for each analogy task at inference time through dynamic composition of learned transformation primitives. It introduces a learnable basis of LoRA modules to span the space of different visual transformations, and a lightweight encoder that dynamically selects and weighs these basis LoRAs based on the input analogy pair.
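The added paragraph above describes composing a task-specific LoRA as a weighted combination of basis LoRA modules. A minimal numerical sketch of that combination step for one layer (editor's illustration only; the function name, shapes, and weight values are assumptions, not the repository's API — in LoRWeB the weights come from the learned encoder):

```python
import numpy as np

def combine_lora_basis(basis_A, basis_B, weights):
    """Combine a basis of LoRA factor pairs into a single low-rank update.

    basis_A: list of (rank, d_in) down-projection matrices A_i
    basis_B: list of (d_out, rank) up-projection matrices B_i
    weights: scalar coefficients w_i, one per basis element
    Returns delta_W = sum_i w_i * (B_i @ A_i), the update added to the
    frozen base weight of one layer.
    """
    delta_W = np.zeros((basis_B[0].shape[0], basis_A[0].shape[1]))
    for A, B, w in zip(basis_A, basis_B, weights):
        delta_W += w * (B @ A)
    return delta_W

rng = np.random.default_rng(0)
n_basis, rank, d_in, d_out = 4, 8, 64, 64
basis_A = [rng.standard_normal((rank, d_in)) for _ in range(n_basis)]
basis_B = [rng.standard_normal((d_out, rank)) for _ in range(n_basis)]
weights = [0.5, 0.1, 0.0, 0.4]  # hypothetical encoder outputs

delta_W = combine_lora_basis(basis_A, basis_B, weights)
assert delta_W.shape == (d_out, d_in)
# The rank of the combined update is bounded by the total basis rank:
assert np.linalg.matrix_rank(delta_W) <= n_basis * rank
```

Because the basis elements are summed into a single per-layer update before application, applying the result costs the same as applying one ordinary LoRA, regardless of the basis size.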
+
  <div align="center">
+ <img src="https://github.com/NVlabs/LoRWeB/raw/main/assets/teaser.jpg" alt="LoRWeB Teaser" width="800"/>

  <i>Given a prompt and an image triplet {**a**, **a'**, **b**} that visually describe a desired transformation, LoRWeB dynamically constructs a single LoRA from a learnable basis of LoRA modules, and produces an editing result **b'** that applies the same analogy to the new image.</i>
  </div>

+ ## Sample Usage

+ To perform inference using the LoRWeB weights, use the `inference.py` script from the [official GitHub repository](https://github.com/NVlabs/LoRWeB):

+ ```bash
+ python inference.py \
+ -w "path/to/lorweb_model.safetensors" \
+ -c "config/your_config.yaml" \
+ -a "data/path_to_a_img.jpg" \
+ -t "data/path_to_atag_img.jpg" \
+ -b "data/path_to_b_img.jpg" \
+ -o "outputs/generated_btag_img_path.jpg"
+ ```

+ ## Citation

  If you use this model in your research, please cite:

  @article{manor2026lorweb,
  title={Spanning the Visual Analogy Space with a Weight Basis of LoRAs},
  author={Manor, Hila and Gal, Rinon and Maron, Haggai and Michaeli, Tomer and Chechik, Gal},
+ journal={arXiv preprint arXiv:2602.15727},
  year={2026}
  }
  ```

+ ## Acknowledgements

+ This project builds upon:
+ - [FLUX.1-Kontext](https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev) by Black Forest Labs
+ - [Diffusers](https://github.com/huggingface/diffusers) by Hugging Face
+ - [PEFT](https://github.com/huggingface/peft) by Hugging Face
+ - [AI-Toolkit](https://github.com/ostris/ai-toolkit) for training infrastructure