TokForge SD1.5-LCM — Photoreal (Realistic Vision V5.1)

A Realistic Vision V5.1 + LCM SD1.5 MNN bundle: the TokForge SD15_LCM_MNN fast few-step image model, but with the photoreal Realistic Vision V5.1 checkpoint as the base instead of DreamShaper-7. It renders a coherent, photorealistic 512px image in 4 steps on the in-process MNN diffusion engine, at full MNN speed, with the photoreal look baked into the UNet weights.

This is the Photoreal tier of the TokForge styled image catalog: natural skin, lifelike texture and lighting, true-to-life portraits and scenes. Verified on-device (OnePlus D9500 CPU, 4 steps, ~36 s): the bundle renders a clean photorealistic candid portrait where the DreamShaper base renders a more stylized/cinematic look.

9-file MNN bundle: unet.mnn(.weight) (INT8, Realistic Vision V5.1 + LCM fused), reference SD1.5 f16 text_encoder.mnn(.weight) + vae_decoder.mnn(.weight), vocab.json, merges.txt, alphas.txt.
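As a quick sanity check after download, the file list above can be turned into a completeness test. This is a minimal sketch, not part of the TokForge tooling: it assumes that "unet.mnn(.weight)" means a graph file plus a sibling "unet.mnn.weight" file, and that all nine files sit in one flat directory. The function name and the directory layout are hypothetical.

```python
from pathlib import Path

# File names read off the bundle listing; the ".mnn.weight" sibling naming
# and the flat directory layout are assumptions, not confirmed by the card.
EXPECTED_FILES = [
    "unet.mnn", "unet.mnn.weight",
    "text_encoder.mnn", "text_encoder.mnn.weight",
    "vae_decoder.mnn", "vae_decoder.mnn.weight",
    "vocab.json", "merges.txt", "alphas.txt",
]


def missing_bundle_files(bundle_dir):
    """Return the expected file names that are absent from bundle_dir."""
    root = Path(bundle_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

An empty list from missing_bundle_files("path/to/bundle") means all nine files are present; any names returned are the ones still to fetch.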

Provenance & licenses

  • Base model: SG161222/Realistic_Vision_V5.1_noVAE, licensed under CreativeML-OpenRAIL-M (commercial-OK; redistribution of derivatives is licensed under OpenRAIL-M §III.4). Realistic Vision V5.1 by SG_161222.
  • LCM adapter: latent-consistency/lcm-lora-sdv1-5 — openrail++ (UNet-only consistency adapter, fused to keep the 4-step floor).
  • CLIP / VAE / tokenizer: the standard SD1.5 reference assets (CLIP ViT-L/14 text encoder + SD1.5 VAE) reused verbatim from the TokForge base LCM bundle. Realistic Vision V5.1 ships "noVAE", so the standard SD1.5 VAE is used (the same VAE referenced by the DreamShaper bundle).

Note: The Lykon "Add More Details" detail-enhancer LoRA (Civitai 82098) was evaluated for additional skin/texture detail but is distributed in kohya format with text-encoder (lora_te_*) keys that the diffusers fuse converter cannot apply (raises IndexError); it is not included. Realistic Vision V5.1 already biases strongly photoreal on its own.

Modifications (OpenRAIL-M §III "mark modified")

This bundle is a modified derivative of Realistic Vision V5.1:

  1. The LCM consistency adapter (lcm-lora-sdv1-5) is fused into the UNet (fp32 ΔW via diffusers.fuse_lora()) so the model is coherent at 4-8 steps with CFG≈1.0.
  2. The fused UNet is exported to ONNX and converted to an INT8-quantized MNN model (MNNConvert, asymmetric 8-bit weight quant) for on-device CPU inference.
  3. CLIP and VAE are the SD1.5 reference MNN assets, not Realistic Vision's own.
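The fuse in step 1 folds a low-rank update ΔW into each adapted weight matrix, so no adapter has to be evaluated at inference time. The sketch below shows that arithmetic on a toy 2x2 layer in plain Python. It is a generic LoRA merge, not the diffusers implementation: the function names, the single scale factor, and the toy values are all assumptions made for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def fuse_lora_delta(weight, lora_up, lora_down, scale=1.0):
    """Return weight + scale * (lora_up @ lora_down) as a new matrix."""
    delta = matmul(lora_up, lora_down)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]


# Toy layer with a rank-1 adapter; the numbers are made up.
W = [[1.0, 0.0], [0.0, 1.0]]
up = [[1.0], [2.0]]        # shape (out_features, rank)
down = [[0.5, -0.5]]       # shape (rank, in_features)
fused = fuse_lora_delta(W, up, down)
```

After the merge, fused has the same shape as W, which is why the exported UNet carries no separate adapter tensors.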

No retraining or fine-tuning of the original weights was performed beyond the LCM fuse + quant.
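To make the "asymmetric 8-bit weight quant" step concrete, here is a minimal per-tensor sketch of asymmetric (min/max) quantization in plain Python. It only illustrates the general technique. How MNNConvert groups weights (per tensor, per channel, or per block) and how it stores the scale and offset are not stated in this card, so those details are assumptions, as are the function names.

```python
def quantize_asymmetric(values, num_bits=8):
    """Map floats onto integer codes 0..2**num_bits - 1 using the min/max range."""
    levels = 2 ** num_bits - 1
    lo, hi = min(values), max(values)
    # A constant tensor has zero range; fall back to a scale of 1.0.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Recover approximate floats from the integer codes."""
    return [c * scale + lo for c in codes]


# Made-up weights, only to exercise the round trip.
weights = [-1.0, -0.25, 0.0, 0.5, 1.55]
codes, scale, lo = quantize_asymmetric(weights)
restored = dequantize(codes, scale, lo)
```

Because the range is anchored at the tensor's own minimum rather than centered on zero, each restored value differs from the original by at most about half a quantization step (scale / 2).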

Use restrictions (OpenRAIL-M Attachment A)

Use of this model is subject to the CreativeML Open RAIL-M license, including the Attachment A use-based restrictions: you agree not to use the model, or any derivative, to violate any law; to exploit/harm minors; to generate or disseminate verifiably false information to harm others; to generate or disseminate personal identifiable information to harm someone; to defame, disparage or harass others; for fully automated decision-making that adversely affects legal rights or creates binding obligations; for discrimination or harm to individuals or groups based on protected characteristics; to exploit vulnerabilities of a specific group; to generate non-consensual or false content about individuals; or to provide medical advice/interpretation of medical results as a substitute for professional advice. See the full license text: https://huggingface.co/spaces/CompVis/stable-diffusion-license

These use-based restrictions propagate to all who use or redistribute this bundle.

Attribution

  • Realistic Vision V5.1 — © SG_161222, CreativeML-OpenRAIL-M.
  • Latent Consistency Model LoRA (SD1.5) — Latent Consistency team, openrail++.
  • Packaged for on-device MNN inference by TokForge (dev.tokforge).