| # xj + MJHQ-30K: 4-model comparison samples |
|
|
| Inference outputs of 4 stage-2 text-to-image models on the combined |
| `xj_mjhq30k_prompts` set (30,954 prompts), one image per prompt per model: |
|
|
| | Model | CFG | Tar file | |
| |---|---|---| |
| | flux | 6 | `t2i-ddt-en28d1152hd72-dn2d2048hd128-flux-vpred-t4-norepa-v0_ep-0000020_xj_mjhq30k_prompts_steps50_cfg6.tar` | |
| | flux2 | 7 | `t2i-ddt-en28d1152hd72-dn2d2048hd128-flux2-vpred-t4-norepa-v0_ep-0000020_xj_mjhq30k_prompts_steps50_cfg7.tar` | |
| | e2e-vavae | 6 | `t2i-ddt-en28d1152hd72-dn2d2048hd128-e2e-vavae-vpred-t4-norepa-v0_ep-0000020_xj_mjhq30k_prompts_steps50_cfg6.tar` | |
| | langpe-l | 7 | `t2i-ddt-n28_2d1152_2048hd72_128-rae-langpe-vit-l-vpred-t4-norepa-v0_ep-0000020_xj_mjhq30k_prompts_steps50_cfg7.tar` | |
|
|
| The 4 image folders are packed as plain **`.tar`** archives (no gzip, since PNG is already |
| compressed) to keep the file count low for fast cloning. Each tar extracts into a |
| folder named after itself, containing `00000.png` through `30953.png` and matching |
| `.txt` prompt sidecars. |
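To sanity-check an archive before extracting it, the members can be counted by extension with Python's standard `tarfile` module. This is a minimal sketch, not part of this repo; the helper name `count_members_by_suffix` is hypothetical, and it assumes each archive should hold one `.png` and one `.txt` per prompt (30,954 of each).

```python
import tarfile
from collections import Counter
from pathlib import PurePosixPath


def count_members_by_suffix(tar_path):
    """Count regular files in an uncompressed tar, grouped by file extension."""
    counts = Counter()
    # Mode "r:" reads plain (uncompressed) tar only, matching the packaging above.
    with tarfile.open(tar_path, "r:") as archive:
        for member in archive:
            if member.isfile():  # skip directory entries
                counts[PurePosixPath(member.name).suffix] += 1
    return counts
```

Calling `count_members_by_suffix("<some>.tar")` returns a `Counter`; comparing its `.png` and `.txt` entries against 30,954 gives a quick completeness check.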
|
|
| ## Extract |
|
|
| ```bash |
| for f in *.tar; do tar -xf "$f"; done |
| ``` |
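After extraction, a short script can confirm that a run folder is complete. The sketch below assumes the naming described above (`NNNNN.png` plus a matching `NNNNN.txt`, indices `00000` to `30953`); the function name `missing_pairs` is hypothetical and not part of this repo.

```python
from pathlib import Path

EXPECTED_PROMPTS = 30954  # prompt count stated in this README


def missing_pairs(run_dir, expected=EXPECTED_PROMPTS):
    """Return the indices whose NNNNN.png or NNNNN.txt is absent from run_dir."""
    run_dir = Path(run_dir)
    missing = []
    for index in range(expected):
        stem = f"{index:05d}"  # zero-padded to 5 digits, e.g. 00042
        has_image = (run_dir / f"{stem}.png").is_file()
        has_prompt = (run_dir / f"{stem}.txt").is_file()
        if not (has_image and has_prompt):
            missing.append(index)
    return missing
```

An empty list from `missing_pairs("<model_dir>")` means every prompt has both its image and its sidecar.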
|
|
| ## Contents after extract |
|
|
| ``` |
| <this-repo>/ |
| ├── <model1_dir>/00000.png, 00000.txt, ... |
| ├── <model2_dir>/... |
| ├── <model3_dir>/... |
| ├── <model4_dir>/... |
| ├── selected/<NNNNN>/{flux,flux2,e2e-vavae,langpe-l}.png + prompt.txt + meta.json |
| └── viewer/ # browser-based 4-model comparison viewer (works once tars are extracted) |
| ``` |
|
|
| ## Viewer |
|
|
| After extracting all tars, browse the 4-model comparison side by side: |
|
|
| ```bash |
| python viewer/viewer.py # default :8765, or --port N |
| # open http://localhost:<port>/test/xj_mjhq30k_inference_outputs/viewer/index.html |
| ``` |
|
|
| (The viewer expects to be served from a directory layout where the run folders |
| and `viewer/` are siblings, as in this repo.) |
|
|
| `selected/<NNNNN>/` holds curated picks: 54 prompts selected via the viewer's |
| Save button, each with the 4 models' images plus the prompt text and metadata. |
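For scripted access to the curated picks, the `selected/` tree can be walked directly. This is a rough sketch under the layout shown in "Contents after extract"; the helper name `load_selected` is hypothetical, and since the schema of `meta.json` is not documented here, it is returned as parsed JSON without interpretation.

```python
import json
from pathlib import Path

MODELS = ("flux", "flux2", "e2e-vavae", "langpe-l")


def load_selected(selected_dir):
    """Collect prompt text, metadata, and image paths for each selected/<NNNNN>/ folder."""
    entries = []
    for entry in sorted(Path(selected_dir).iterdir()):
        if not entry.is_dir():
            continue
        prompt_file = entry / "prompt.txt"
        meta_file = entry / "meta.json"
        entries.append({
            "id": entry.name,
            "prompt": prompt_file.read_text(encoding="utf-8").strip()
                      if prompt_file.is_file() else None,
            # Schema of meta.json is an unknown here; pass it through as-is.
            "meta": json.loads(meta_file.read_text(encoding="utf-8"))
                    if meta_file.is_file() else None,
            # Only list the model images that actually exist on disk.
            "images": {m: entry / f"{m}.png" for m in MODELS
                       if (entry / f"{m}.png").is_file()},
        })
    return entries
```

Each returned dictionary maps model names to image paths, so a downstream script can open the images for a given prompt with any image library.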
|
|