SceneWorks/instantid-mlx
Converted weights for running InstantID (identity-preserving SDXL generation) natively on Apple Silicon
with MLX, with zero Python at inference time. These are the three artifacts the
mlx-gen-instantid provider loads to compose InstantID
from the SDXL backbone and the native MLX face stack (mlx-gen-face).
This repo holds only the InstantID-specific glue weights. The SDXL base (e.g.
SG161222/RealVisXL_V5.0 or stabilityai/stable-diffusion-xl-base-1.0), the IdentityNet
ControlNet (InstantX/InstantID → ControlNetModel/), and the OpenPose ControlNet for pose mode
(xinsir/controlnet-openpose-sdxl-1.0) are loaded directly from their own diffusers repos — no
conversion needed for those.
Files
| File | Size | What it is | Source | Converter |
|---|---|---|---|---|
| ip-adapter.safetensors | 1.57 GB | The InstantID face IP-Adapter: the image-projection Resampler (image_proj.*, ArcFace 512-d → 16×2048 face tokens) + the 70 decoupled cross-attention K/V pairs (ip_adapter.*). Re-serialized from the upstream torch pickle ip-adapter.bin into safetensors (MLX's loader reads safetensors, not pickle). | InstantX/InstantID → ip-adapter.bin | tools/convert_instantid.py |
| scrfd_10g.safetensors | 16 MB | SCRFD 5-point face detector (bbox + landmarks), the detection half of the native face stack. Ported from the insightface antelopev2 scrfd_10g_bnkps ONNX graph. | insightface antelopev2 (scrfd_10g_bnkps.onnx) | tools/convert_scrfd.py |
| arcface_iresnet100.safetensors | 248 MB | ArcFace iresnet100 512-d recognition embedder, the identity-fidelity half. Ported from the insightface antelopev2 glintr100 ONNX graph. | insightface antelopev2 (glintr100.onnx) | tools/convert_glintr100.py |
Checksums (sha256)
fa5608b6121ffaa40228e76ac96e10f56e39b3aba2f6c4905ff7ef9046391c29 ip-adapter.safetensors
7b40147a85771139e70a8d9fe6be27ffcf32f4c911770ef24b5b05c29f534eda scrfd_10g.safetensors
9deff2fef8fe1b3e357a99c01f28cc478dd8acbeab0d3749d252f6d69990ee39 arcface_iresnet100.safetensors
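The digests above can be checked after download. The following is a minimal sketch using only the Python standard library. The helper names sha256_of and verify are illustrative and not part of any SceneWorks tooling, and the three files are assumed to sit together in one local directory.

```python
import hashlib
from pathlib import Path

# Expected digests, copied from the checksum list in this card.
EXPECTED = {
    "ip-adapter.safetensors": "fa5608b6121ffaa40228e76ac96e10f56e39b3aba2f6c4905ff7ef9046391c29",
    "scrfd_10g.safetensors": "7b40147a85771139e70a8d9fe6be27ffcf32f4c911770ef24b5b05c29f534eda",
    "arcface_iresnet100.safetensors": "9deff2fef8fe1b3e357a99c01f28cc478dd8acbeab0d3749d252f6d69990ee39",
}


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large files are never fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory, expected=EXPECTED):
    """Return {filename: status}; status is 'ok', 'mismatch', or 'missing'."""
    results = {}
    for name, want in expected.items():
        path = Path(directory) / name
        if not path.is_file():
            results[name] = "missing"
        else:
            results[name] = "ok" if sha256_of(path) == want else "mismatch"
    return results
```

Calling verify("instantid-mlx") on the download directory returns one status per file. Any result other than "ok" suggests an incomplete or altered download.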
Usage
In mlx-gen-instantid (Rust / MLX)
use mlx_gen::weights::Weights;
use mlx_gen::WeightsSource;
use mlx_gen_instantid::{InstantId, InstantIdPaths, InstantIdRequest};
let model = InstantId::load(&InstantIdPaths {
    sdxl_base: "/path/to/RealVisXL_V5.0".into(), // diffusers SDXL snapshot
    identitynet: WeightsSource::Dir("/path/to/InstantX--InstantID/ControlNetModel".into()),
    ip_adapter: "ip-adapter.safetensors".into(), // <- from this repo
})?
.with_face(
    &Weights::from_file("scrfd_10g.safetensors")?, // <- from this repo
    &Weights::from_file("arcface_iresnet100.safetensors")?, // <- from this repo
)?;

let out = model.generate(
    &InstantIdRequest {
        /* prompt, w/h, steps, guidance, scales, seed */
        ..Default::default()
    },
    &reference_image,
)?;
For pose mode, add .with_openpose(&WeightsSource::Dir("/path/to/xinsir--controlnet-openpose-sdxl-1.0".into()))?
and call generate_pose(req, &reference, &keypoints). For the ADetailer-style face-restore pass,
call restore_face(req, &base, &reference_embedding).
In SceneWorks (download-on-first-use)
The SceneWorks Rust GPU worker fetches these three files from this repo on first use into its app
cache (mirroring the SceneWorks/yolo11m-person-detect-mlx and SceneWorks/sam2-mlx pattern). You
can pre-stage them with the env override SCENEWORKS_INSTANTID_WEIGHTS=/dir/with/the/three/files.
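Before pointing the override at a directory, it can help to confirm that all three files are present. The sketch below is a hypothetical pre-flight check, not SceneWorks code. It assumes only the variable name and the three filenames given in this card, and it does not reproduce the worker's actual cache lookup.

```python
import os
from pathlib import Path

# Filenames as listed in the Files table of this card.
REQUIRED = (
    "ip-adapter.safetensors",
    "scrfd_10g.safetensors",
    "arcface_iresnet100.safetensors",
)
ENV_VAR = "SCENEWORKS_INSTANTID_WEIGHTS"


def missing_weights(directory, required=REQUIRED):
    """Return the required filenames that are not regular files in `directory`."""
    root = Path(directory)
    return [name for name in required if not (root / name).is_file()]


def check_override(environ=os.environ):
    """Return (directory, missing) for the override, or (None, []) when unset."""
    directory = environ.get(ENV_VAR)
    if not directory:
        return None, []
    return directory, missing_weights(directory)
```

An empty missing list means the directory holds every file the card names. How the worker behaves when a file is absent is not described here, so treat a non-empty list as a reason to re-stage rather than as a prediction of the worker's error.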
Validation (real-weight, MLX, RealVisXL_V5.0 @ 1024²/30, fp16)
| Mode | Metric | Result |
|---|---|---|
| Single identity (generate) | ArcFace-cosine(ref, generated) | 0.8731 (torch baseline ≈ 0.876) |
| Angle set (generate_angle, three-quarter right) | ArcFace-cosine | 0.8343 |
| Pose mode (generate_pose, full-body) | ArcFace-cosine | 0.7129 (small full-body face) |
| Face-restore (restore_face) | ArcFace-cosine | base 0.7370 → 0.8338 |
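The metric in this table is the cosine similarity between the ArcFace embedding of the reference face and that of the generated face. A minimal sketch of that computation follows, assuming the two 512-d embeddings are already available as plain sequences of floats. The name arcface_cosine is illustrative, and the evaluation script behind the reported numbers is not shown in this card.

```python
import math


def arcface_cosine(a, b):
    """Cosine similarity of two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError(f"length mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

The score is invariant to the scale of either vector, so it does not matter whether the embeddings were L2-normalized beforehand. Values closer to 1.0 indicate a closer identity match.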
Reproducing the conversion
All three converters live in mlx-gen/tools/
and run in a torch venv (torch + safetensors; insightface for SCRFD/ArcFace ONNX import):
python tools/convert_instantid.py # InstantX/InstantID ip-adapter.bin -> ip-adapter.safetensors
python tools/convert_scrfd.py # antelopev2 scrfd_10g_bnkps.onnx -> scrfd_10g.safetensors
python tools/convert_glintr100.py # antelopev2 glintr100.onnx -> arcface_iresnet100.safetensors
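As a rough illustration of the first conversion, the sketch below flattens a nested checkpoint into prefixed tensor names and writes it as safetensors. It assumes the upstream ip-adapter.bin is a dict with image_proj and ip_adapter sub-dicts, which would match the image_proj.* and ip_adapter.* prefixes described in the Files table. It is not the actual tools/convert_instantid.py, and the function names are illustrative.

```python
def flatten_checkpoint(nested, sep="."):
    """Flatten {"group": {"name": tensor}} into {"group.name": tensor}."""
    flat = {}
    for group, tensors in nested.items():
        for name, tensor in tensors.items():
            flat[f"{group}{sep}{name}"] = tensor
    return flat


def convert(src="ip-adapter.bin", dst="ip-adapter.safetensors"):
    """Re-serialize a torch pickle checkpoint as a safetensors file."""
    # Imported lazily so flatten_checkpoint stays usable without torch installed.
    import torch
    from safetensors.torch import save_file

    # weights_only=True restricts unpickling to tensors and plain containers.
    nested = torch.load(src, map_location="cpu", weights_only=True)
    # save_file expects contiguous tensors.
    flat = {k: v.contiguous() for k, v in flatten_checkpoint(nested).items()}
    save_file(flat, dst)
    return sorted(flat)
```

The SCRFD and ArcFace converters start from ONNX graphs rather than a pickle, so they need a different import path that this sketch does not cover.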
Provenance & licensing
These are format conversions of third-party weights; the upstream licenses govern use. Verify you comply with each before using them:
- ip-adapter.safetensors: derived from InstantX/InstantID (Apache-2.0). InstantID research: "InstantID: Zero-shot Identity-Preserving Generation in Seconds" (Wang et al., 2024).
- scrfd_10g.safetensors and arcface_iresnet100.safetensors: derived from the InsightFace antelopev2 model pack (scrfd_10g_bnkps, glintr100). The InsightFace pretrained models are released for non-commercial research purposes only; see the InsightFace repository for their terms. Do not use these two files in a commercial setting without securing appropriate rights from the upstream authors.
license: other reflects this mix; this card is the authoritative license statement. No additional
license is granted by the conversion. The conversions were produced by the
mlx-gen tooling, whose code is Apache-2.0.