---
tags:
- text-to-image
- lora
- diffusers
- template:diffusion-lora
widget:
- output:
    url: images/0.png
  text: 'face headshot, a Middle Eastern male with thick black hair styled in a fade and a well-groomed beard. He’s wearing a fitted white turtleneck sweater. His lips are rose-tinted, and his almond-shaped brown eyes are calm but confident. The backdrop is a deep wine red, adding richness to the composition.'
- output:
    url: images/1.png
  text: 'face headshot, a woman with long brown hair, wearing a gray turtleneck, is stunning against a vibrant red wall. Her left hand is resting on her chin, adding a touch of warmth to her face. The womans gaze is directed to the right, her lips are pursed, revealing a soft pink lip. Her eyebrows are a darker shade of brown, and her hair cascades down to her shoulders. The wall behind her is a soft shade of red, and the womans hair is pulled back in a ponytail.'
- output:
    url: images/11.png
  text: 'face headshot, High-resolution photograph, woman, UHD, photorealistic, shot on a Sony A7III --chaos 20 --ar 1:2 --style raw --stylize 250'
- output:
    url: images/2.png
  text: 'face headshot, an African American woman with short curly black hair, wearing a yellow off-shoulder top. Her lips are a glossy shade of red, and her eyes are warm brown, reflecting a sense of calm. She has gold hoop earrings, and her skin glows under soft studio lighting. The backdrop is a muted teal, enhancing the radiance of her complexion.'
- output:
    url: images/3.png
  text: 'face headshot, a Caucasian woman with shoulder-length blonde hair styled in loose waves, wearing a navy blue blazer over a white blouse. Her lips are nude with a soft matte finish, and her hazel eyes are accentuated with subtle eyeliner. She is smiling gently, revealing her warmth. The background is a gradient of soft gray, giving the portrait a professional tone.'
- output:
    url: images/4.png
  text: 'face headshot, an East Asian man with medium-length black hair styled messily, wearing a black leather jacket over a plain white t-shirt. His lips are pale pink, and his eyes are a sharp gray-brown, looking away from the camera. The background is urban-inspired, a textured concrete gray wall, giving the portrait a cinematic edge.'
- output:
    url: images/5.png
  text: 'face headshot, a Latina woman with long straight black hair, parted in the middle and tucked behind her ears. She is wearing a red satin blouse with subtle shimmer. Her lips are painted a bold crimson, and her almond-shaped eyes are dark brown, framed with long lashes. The backdrop is soft golden beige, illuminating her features.'
- output:
    url: images/6.png
  text: 'face headshot, a Caucasian male with wavy blonde hair swept back, wearing a white linen shirt slightly unbuttoned at the collar. His lips are pale pink, and his piercing blue eyes are directed slightly upward, as if lost in thought. The background is a pale sandy beige, evoking a beach-inspired vibe.'
- output:
    url: images/7.png
  text: 'face headshot, a Hispanic male with close-cropped hair and a full beard, wearing a black crew-neck t-shirt. His dark brown eyes are intense, and his lips are neutral with a serious expression. The background is a dramatic dark gray, adding depth to the mood.'
base_model: Qwen/Qwen-Image
instance_prompt: face headshot
license: apache-2.0
---



# Qwen-Image-HeadshotX

<Gallery />

> [!note]
> **Qwen-Image-HeadshotX** is a realistic headshot LoRA adapter for [Qwen-Image](https://huggingface.co/Qwen/Qwen-Image), the image generation model by Qwen. It is an upgraded version of [Qwen-Image-Studio-Realism](https://huggingface.co/prithivMLmods/Qwen-Image-Studio-Realism) that offers more precise portrait rendering with a strong focus on realism. The model was trained on diverse face types from across the world. The images were labeled with **florence2-en**, and the captions were optimized using [DeepCaption-VLA-7B](https://huggingface.co/prithivMLmods/DeepCaption-VLA-7B). **Total images used for training:** 55 RAW images (11 types × 5 different face types: Asian, Hispanic, Caucasian, Latina, Middle Eastern, etc.).

---

# Model description for Qwen-Image-HeadshotX

Image Processing Parameters

| Parameter | Value | Parameter | Value |
|---------------------------|--------|---------------------------|--------|
| LR Scheduler | constant | Noise Offset | 0.03 |
| Optimizer | AdamW | Multires Noise Discount | 0.1 |
| Network Dim | 64 | Multires Noise Iterations | 10 |
| Network Alpha | 32 | Repeat & Steps | 25 & 4000 |
| Epoch | 30 | Save Every N Epochs | 1 |

Labeling: florence2-en (natural language, English) + 🔥 captions optimized with the long-caption VLA multimodal model [DeepCaption-VLA-7B](https://huggingface.co/prithivMLmods/DeepCaption-VLA-7B)

Total images used for training: 55 RAW images [11 types × 5 different face types (Asian, Hispanic, Caucasian, Latina, Middle Eastern, etc.)]
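
The Network Dim and Network Alpha values in the parameter table determine how strongly the adapter's low-rank update is scaled. The snippet below is a minimal sketch that assumes the trainer follows the common LoRA convention of scaling the update by alpha / rank; this card does not state which trainer was used, and `lora_scale` is a hypothetical helper introduced only for illustration.

```python
# Values taken from the parameter table on this card.
network_dim = 64    # LoRA rank ("Network Dim")
network_alpha = 32  # LoRA alpha ("Network Alpha")


def lora_scale(alpha: float, dim: int) -> float:
    """Scaling factor for the low-rank update, assuming the alpha / rank convention."""
    return alpha / dim


scale = lora_scale(network_alpha, network_dim)
print(scale)  # 0.5
```

Under that assumption, the learned update is applied at half strength relative to a configuration where alpha equals the rank.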

## Data Sources

| Source | Link |
|--------------|-------------------------------------|
| Playground | [playground.com](https://playground.com/) |
| ArtStation | [artstation.com](https://www.artstation.com/) |
| 4K Wallpapers | [4kwallpapers.com](https://4kwallpapers.com/) |

## Best Dimensions & Inference

| **Dimensions** | **Aspect Ratio** | **Recommendation** |
|-----------------|------------------|---------------------------|
| 1472 x 1140 | 4:3 (approx.) | Best |
| 1024 x 1024 | 1:1 | Default |

### Inference Range

- **Recommended Inference Steps:** 45-50 (approximately 100 seconds of inference)

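
The recommendations above can be collected into keyword arguments for a pipeline call. This is a minimal sketch: `PRESETS` and `inference_settings` are hypothetical names, the sizes are assumed to be width x height, and the argument names `width`, `height`, and `num_inference_steps` are assumed to match those accepted by the diffusers pipeline for Qwen-Image.

```python
# Presets copied from the dimensions table; sizes are assumed to be (width, height).
PRESETS = {
    "best": (1472, 1140),
    "default": (1024, 1024),
}


def inference_settings(preset: str = "default", steps: int = 50) -> dict:
    """Build keyword arguments for a pipeline call from a named size preset."""
    width, height = PRESETS[preset]
    return {"width": width, "height": height, "num_inference_steps": steps}


settings = inference_settings("best")
print(settings)
```

The resulting dictionary could then be unpacked into the call, for example `pipe(prompt=..., **settings)`.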

## Setting Up

```python
import torch
from diffusers import DiffusionPipeline

# Load the Qwen-Image base model in bfloat16.
base_model = "Qwen/Qwen-Image"
pipe = DiffusionPipeline.from_pretrained(base_model, torch_dtype=torch.bfloat16)

# Attach the HeadshotX LoRA weights to the base pipeline.
lora_repo = "prithivMLmods/Qwen-Image-HeadshotX"
trigger_word = "face headshot"  # include this phrase in every prompt
pipe.load_lora_weights(lora_repo)

# Move the pipeline to the GPU.
device = torch.device("cuda")
pipe.to(device)
```
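
Once the pipeline is loaded, generating an image might look like the sketch below. `generate_headshot` is a hypothetical helper, not part of diffusers. It assumes that `pipe` is the pipeline created in the setup code, that it accepts `prompt`, `width`, `height`, and `num_inference_steps` keyword arguments, and that it returns an object with an `images` list, as diffusers text-to-image pipelines generally do.

```python
def generate_headshot(pipe, prompt: str, width: int = 1024, height: int = 1024, steps: int = 50):
    """Run the pipeline once and return the first generated image."""
    result = pipe(
        prompt=prompt,
        width=width,
        height=height,
        num_inference_steps=steps,
    )
    return result.images[0]


# Example usage, assuming `pipe` was created as in the setup code:
# image = generate_headshot(pipe, "face headshot, a woman with short curly black hair, soft studio lighting")
# image.save("headshot.png")
```

Taking the pipeline as an argument keeps the helper independent of how the model was loaded, so the same function can be reused with a different device or dtype.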

## Trigger words

Include `face headshot` in your prompt to trigger the headshot style.
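
One way to make sure the trigger phrase is always present is to add it programmatically. The following is a small sketch; `with_trigger` is a hypothetical helper, and placing the phrase at the start of the prompt simply mirrors the example prompts in this card's gallery.

```python
TRIGGER = "face headshot"


def with_trigger(description: str, trigger: str = TRIGGER) -> str:
    """Prefix the prompt with the trigger phrase unless it already starts with it."""
    description = description.strip()
    if description.lower().startswith(trigger.lower()):
        return description
    return f"{trigger}, {description}"


print(with_trigger("a man with a full beard, dark gray background"))
# face headshot, a man with a full beard, dark gray background
```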

## Download model

[Download](/prithivMLmods/Qwen-Image-HeadshotX/tree/main) the model weights in the Files & versions tab.