# PortraitCraft Track 2 Solution

This repository contains the inference code and documentation for our
PortraitCraft Track 2: Portrait Composition Generation submission.

## Model

The released model checkpoint is hosted on Hugging Face:

https://huggingface.co/Jessamine/portraitcraft-track2

The model is based on Qwen-Image and is fine-tuned for portrait composition
generation on the official 4,500 PortraitCraft training samples together with
additional private portrait aesthetic-composition data curated by our team. We
compared LoRA fine-tuning and full-parameter fine-tuning under the same
inference settings and selected full-parameter fine-tuning because it performed
better on this task, especially in aesthetic quality, composition stability,
and prompt-to-layout alignment.

## Adaptive Canvas Policy

We do not use a fixed 1:1 canvas for all generations. In portrait composition
generation, different prompts imply different spatial structures: some are best
served by square canvases, some require vertical canvases to preserve full-body
framing and headroom/footroom, and some require horizontal canvases for
environmental portraits, roads, coastlines, leading lines, and large negative
space.

To handle this, we designed a prompt-conditioned adaptive canvas policy. The
policy reads the input prompt and the released learned policy state, then
selects the generation canvas before image synthesis. Its keyword weights,
decision thresholds, and candidate aspect ratios were optimized on the training
set through iterative evolutionary search. The longer side of the selected
canvas is normalized to 1584 pixels. For reproducibility, we release the final
policy state together with the inference code so reviewers can reproduce the
same canvas choices used by our submission.
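
The selection step described above can be illustrated with a minimal sketch. It is not the released implementation: the keywords, weights, thresholds, and candidate ratios below are hypothetical placeholders, and the real values are expected to come from the released policy state. Snapping each side to a multiple of 16 is also an assumption made for this example; only the 1584-pixel longer side comes from the text above.

```python
LONGER_SIDE = 1584  # from the README: longer side normalized to 1584 px

# Hypothetical policy state for illustration only. The released policy state
# may use different keys, keywords, weights, thresholds, and ratios.
POLICY = {
    "keyword_weights": {
        "full-body": -1.0,   # negative scores push toward a vertical canvas
        "standing": -0.5,
        "coastline": 1.0,    # positive scores push toward a horizontal canvas
        "road": 0.8,
        "landscape": 0.8,
    },
    "vertical_threshold": -0.5,
    "horizontal_threshold": 0.5,
    "ratios": {"vertical": (3, 4), "square": (1, 1), "horizontal": (4, 3)},
}


def canvas_size(ratio, longer_side=LONGER_SIDE, multiple=16):
    """Scale a (width, height) ratio so the longer side equals longer_side.

    Each side is snapped to the nearest multiple of `multiple` (an assumption).
    """
    w, h = ratio
    scale = longer_side / max(w, h)

    def snap(v):
        return max(multiple, round(v * scale / multiple) * multiple)

    return snap(w), snap(h)


def select_canvas(prompt, policy=POLICY):
    """Score the prompt by naive substring keyword matching, then pick a canvas."""
    text = prompt.lower()
    score = sum(
        weight
        for keyword, weight in policy["keyword_weights"].items()
        if keyword in text
    )
    if score <= policy["vertical_threshold"]:
        orientation = "vertical"
    elif score >= policy["horizontal_threshold"]:
        orientation = "horizontal"
    else:
        orientation = "square"
    return canvas_size(policy["ratios"][orientation])


if __name__ == "__main__":
    for prompt in (
        "A full-body portrait of a dancer standing on stage",
        "Environmental portrait on a coastline road at dusk",
        "Close-up headshot with soft window light",
    ):
        print(prompt, "->", select_canvas(prompt))
```

Because the canvas is a deterministic function of the prompt and the policy state, shipping the policy state alongside the code is enough to reproduce the same canvas choices. Substring matching is used here only for brevity; it would also match unrelated words such as "broad" for the keyword "road".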

## Inference

Install dependencies:

```bash
pip install -r requirements.txt
```

Run inference:

```bash
python scripts/infer_portraitcraft.py \
    --input-json /path/to/track2_test.json \
    --base-model /path/to/Qwen-Image-2512 \
    --checkpoint /path/to/portraitcraft-track2.safetensors \
    --aspect-policy configs/aspect_policy_manifest.json \
    --output-dir outputs/portraitcraft_track2
```

Package the output directory as a flat submission zip:

```bash
python scripts/package_submission.py \
    --image-dir outputs/portraitcraft_track2 \
    --zip-path portraitcraft_track2_submission.zip
```
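
For readers unsure what "flat" means here, the sketch below shows one way to build such an archive with the Python standard library: every image is stored at the top level of the zip, with no directory prefix. It is an illustration under assumptions, not the code in `scripts/package_submission.py`; the function name `package_flat` and the accepted file suffixes are hypothetical.

```python
import zipfile
from pathlib import Path

# Assumption: which file types count as submission images.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def package_flat(image_dir, zip_path):
    """Write top-level image files from image_dir into a zip with no folders.

    Returns the archive member names in the order they were written.
    """
    image_dir = Path(image_dir)
    names = []
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(image_dir.iterdir()):
            if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
                # arcname drops the directory part, which keeps the archive flat.
                zf.write(path, arcname=path.name)
                names.append(path.name)
    return names
```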

Default inference parameters:

- `num_inference_steps = 50`
- `cfg_scale = 4.0`
- `seed = 346346`
- adaptive canvas longer side = `1584` pixels