
	# PortraitCraft Track 2 Solution

	This repository contains the inference code and documentation for our
	submission to PortraitCraft Track 2 (Portrait Composition Generation).

	## Model

	The released model checkpoint is hosted on Hugging Face:

	https://huggingface.co/Jessamine/portraitcraft-track2

	The model is Qwen-Image fine-tuned for portrait composition generation. The
	training data combines the official 4,500 PortraitCraft training samples with
	additional private portrait aesthetic-composition data curated by our team. We
	compared LoRA fine-tuning and full-parameter fine-tuning under the same
	inference settings and selected full-parameter fine-tuning, which performed
	better on this task, especially in aesthetic quality, composition stability,
	and prompt-to-layout alignment.

	## Adaptive Canvas Policy

	We do not use a fixed 1:1 canvas for all generations. In portrait composition
	generation, different prompts imply different spatial structures: some are best
	served by square canvases, some require vertical canvases to preserve full-body
	framing and headroom/footroom, and some require horizontal canvases for
	environmental portraits, roads, coastlines, leading lines, and large negative
	space.

	To handle this, we use a prompt-conditioned adaptive canvas policy. The policy
	reads the input prompt and the released learned policy state, then selects the
	generation canvas before image synthesis. Its keyword weights, decision
	thresholds, and candidate aspect ratios were optimized on the training set
	through iterative evolutionary search. The longer side of the selected canvas
	is fixed at 1584 pixels. For reproducibility, we release the final policy state
	together with the inference code so that reviewers can reproduce the canvas
	choices used by our submission.
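The selection step described above can be sketched roughly as follows. This is a minimal illustration, not the released policy: the keyword weights, the threshold, the three candidate ratios, the naive substring matching, and the rounding of the shorter side to a multiple of 16 are all assumptions made for the example. Only the 1584-pixel longer side comes from this README.

```python
# Illustrative sketch of a prompt-conditioned canvas policy.
# All weights, thresholds, and ratios below are hypothetical placeholders;
# the real values live in the released policy state.

LONG_SIDE = 1584  # longer side in pixels, as stated in this README
MULTIPLE = 16     # assumption: shorter side is snapped to a multiple of 16

# Hypothetical keyword weights: positive favors horizontal, negative favors vertical.
KEYWORD_WEIGHTS = {
    "full-body": -1.0,
    "standing": -0.6,
    "headroom": -0.4,
    "coastline": 1.0,
    "road": 0.8,
    "leading lines": 0.7,
    "negative space": 0.5,
}
THRESHOLD = 0.5  # hypothetical decision threshold

# Hypothetical candidate aspect ratios as (width, height).
CANDIDATES = {"vertical": (3, 4), "square": (1, 1), "horizontal": (4, 3)}


def orientation_score(prompt: str) -> float:
    """Sum the weights of all keywords found in the prompt (substring match)."""
    text = prompt.lower()
    return sum(w for kw, w in KEYWORD_WEIGHTS.items() if kw in text)


def choose_orientation(prompt: str) -> str:
    """Map the score to one of the candidate orientations."""
    score = orientation_score(prompt)
    if score >= THRESHOLD:
        return "horizontal"
    if score <= -THRESHOLD:
        return "vertical"
    return "square"


def canvas_size(ratio_w: int, ratio_h: int) -> tuple[int, int]:
    """Fix the longer side at LONG_SIDE and scale the shorter side to the ratio."""
    if ratio_w >= ratio_h:
        short = round(LONG_SIDE * ratio_h / ratio_w / MULTIPLE) * MULTIPLE
        return LONG_SIDE, short
    short = round(LONG_SIDE * ratio_w / ratio_h / MULTIPLE) * MULTIPLE
    return short, LONG_SIDE


def select_canvas(prompt: str) -> tuple[int, int]:
    """Return (width, height) for the given prompt."""
    return canvas_size(*CANDIDATES[choose_orientation(prompt)])
```

Under these placeholder settings, a prompt mentioning "full-body" and "standing" would select a vertical canvas, while a prompt with no matching keywords would fall back to a square one.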

	## Inference

	Install dependencies:

	```bash
	pip install -r requirements.txt
	```

	Run inference:

	```bash
	python scripts/infer_portraitcraft.py \
	  --input-json /path/to/track2_test.json \
	  --base-model /path/to/Qwen-Image-2512 \
	  --checkpoint /path/to/portraitcraft-track2.safetensors \
	  --aspect-policy configs/aspect_policy_manifest.json \
	  --output-dir outputs/portraitcraft_track2
	```

	Package the output directory as a flat submission zip:

	```bash
	python scripts/package_submission.py \
	  --image-dir outputs/portraitcraft_track2 \
	  --zip-path portraitcraft_track2_submission.zip
	```
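For readers who want to see what a "flat" zip means in practice, here is a minimal sketch using only the Python standard library. It is not the code of `scripts/package_submission.py`; the function name `package_flat_zip` and the accepted image suffixes are assumptions for illustration. The key point is that each image is stored under its bare file name, so the archive contains no directory entries.

```python
import zipfile
from pathlib import Path

# Assumption: which file types count as submission images.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def package_flat_zip(image_dir, zip_path):
    """Write every image in image_dir to zip_path with no folder structure.

    Returns the list of archived file names in sorted order.
    """
    image_dir = Path(image_dir)
    images = sorted(
        p for p in image_dir.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in images:
            # arcname=path.name drops the parent directories, keeping the zip flat.
            zf.write(path, arcname=path.name)
    return [p.name for p in images]
```

Listing the archive with `zipfile.ZipFile(zip_path).namelist()` is a quick way to confirm that no entry contains a path separator before uploading.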

	Default inference parameters:

	- `num_inference_steps = 50`
	- `cfg_scale = 4.0`
	- `seed = 346346`
	- adaptive canvas longer side = `1584` pixels
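When reproducing results, it can help to record these defaults next to the generated images. The sketch below shows one way to do that; the class name `InferenceDefaults`, its field names, and the JSON layout are hypothetical and are not part of the released scripts. Only the four values are taken from the list above.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class InferenceDefaults:
    """Hypothetical container for the default values listed in this README."""
    num_inference_steps: int = 50
    cfg_scale: float = 4.0
    seed: int = 346346
    canvas_long_side: int = 1584  # pixels


def defaults_as_json(config: InferenceDefaults = InferenceDefaults()) -> str:
    """Serialize the settings so they can be saved alongside a run's outputs."""
    return json.dumps(asdict(config), indent=2, sort_keys=True)
```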