	"""Extract matched MJHQ real images for the sampled prompts (FID-vs-real), resized to RES.
	Usage: python3 scripts/33_extract_ref.py [RES] [N]
	"""
	import sys, json, os, io, zipfile
	from PIL import Image

	RES = int(sys.argv[1]) if len(sys.argv) > 1 else 512
	N = int(sys.argv[2]) if len(sys.argv) > 2 else 10**9
	sel = json.load(open('outputs/eval/prompts.json'))[:N]
	z = zipfile.ZipFile('data/mjhq_raw/mjhq30k_imgs.zip')
	out = 'outputs/eval/imgs/mjhq_ref'; os.makedirs(out, exist_ok=True)
	n = 0
	for d in sel:
	    arc = f"{d['category']}/{d['id']}.jpg"
	    try:
	        b = z.read(arc)
	    except KeyError:
	        print("missing", arc); continue
	    Image.open(io.BytesIO(b)).convert('RGB').resize((RES, RES), Image.LANCZOS).save(
	        f"{out}/{d['idx']:05d}.jpg", quality=95)
	    n += 1
	print(f"extracted {n} ref imgs @ {RES} -> {out}")
