# LAM-20K CoreML (INT8 Quantized)

CoreML conversion of LAM (Large Avatar Model) for on-device 3D avatar reconstruction on iOS/macOS.

Single photo in, animatable 3D Gaussian head avatar out.

## Model Details

| Property | Value |
|---|---|
| Source | 3DAIGC/LAM-20K (SIGGRAPH 2025) |
| Parameters | 557.6M |
| Input | 518x518 RGB image (DINOv2 ViT-L/14 patch-aligned) |
| Output | 20,018 Gaussians x 14 channels |
| Precision | INT8 (linear symmetric quantization) |
| Model size | 609 MB |
| Format | CoreML `.mlpackage` (iOS 17+) |
| Minimum deployment | iOS 17.0 / macOS 14.0 |

## Output Channels (14 per Gaussian)

| Channels | Meaning |
|---|---|
| 0-2 | Position offsets (xyz) |
| 3-5 | Colors (RGB, sigmoid-activated) |
| 6 | Opacity (sigmoid-activated) |
| 7-9 | Scales (exp-activated) |
| 10-13 | Rotations (unit quaternion) |
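As an illustration, the attribute tensor can be sliced into named fields following the table above (a minimal NumPy sketch; `attrs` is assumed to be the model's `(1, 20018, 14)` output, and the sigmoid/exp activations are assumed to already be applied inside the model's heads, as the table indicates):

```python
import numpy as np

def unpack_gaussians(attrs: np.ndarray) -> dict:
    """Split the (1, 20018, 14) attribute tensor into named Gaussian fields."""
    g = attrs[0]  # drop the batch dimension -> (20018, 14)
    rot = g[:, 10:14]
    # Guard against numerical drift: re-normalize quaternions to unit length
    rot = rot / np.linalg.norm(rot, axis=-1, keepdims=True)
    return {
        "positions": g[:, 0:3],   # xyz offsets
        "colors":    g[:, 3:6],   # RGB, sigmoid-activated
        "opacity":   g[:, 6:7],   # sigmoid-activated
        "scales":    g[:, 7:10],  # exp-activated (positive)
        "rotations": rot,         # unit quaternions
    }

# Example with dummy data of the right shape
fields = unpack_gaussians(np.random.rand(1, 20018, 14).astype(np.float32))
```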

## Architecture

```
Input Image (518x518)
  |
DINOv2 ViT-L/14 Encoder --> multi-scale image features
  |
10-layer SD3-style Transformer Decoder (FLAME canonical queries)
  |
GSLayer MLP Heads --> 20,018 Gaussians x 14 channels
```

The 20,018 Gaussians correspond to the FLAME parametric face mesh (5,023 vertices) with one level of subdivision.
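The headline figures above can be sanity-checked with a little arithmetic (a sketch; the implied edge count is inferred here from the stated vertex totals, not taken from the FLAME spec):

```python
# ViT-L/14 on a 518x518 input tiles cleanly into a 37x37 patch grid
patches_per_side = 518 // 14          # 37 (518 is divisible by 14)
num_tokens = patches_per_side ** 2    # 1369 image tokens

# One level of midpoint subdivision adds one new vertex per edge:
# V_new = V + E. With V = 5023 vertices and 20,018 Gaussians, the
# FLAME mesh would need E = 14,995 edges (inferred, not official).
flame_vertices = 5023
implied_edges = 20018 - flame_vertices
```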

## Usage

### Swift (iOS/macOS)

```swift
import CoreML

let config = MLModelConfiguration()
config.computeUnits = .all

let model = try MLModel(contentsOf: compiledModelURL, configuration: config)

// Input: 518x518 RGB image as CVPixelBuffer
let input = try MLDictionaryFeatureProvider(dictionary: [
    "input_image": MLFeatureValue(pixelBuffer: pixelBuffer)
])

let output = try model.prediction(from: input)
let attrs = output.featureValue(for: "gaussian_attributes")!.multiArrayValue!
// Shape: (1, 20018, 14)
```

### Python (verification)

```python
import coremltools as ct
from PIL import Image

# The model expects a 518x518 RGB image
pil_image = Image.open("input.jpg").convert("RGB").resize((518, 518))

model = ct.models.MLModel("LAMReconstruct_int8.mlpackage")
prediction = model.predict({"input_image": pil_image})
attrs = prediction["gaussian_attributes"]  # (1, 20018, 14)
```

## Files

| File | Size | Description |
|---|---|---|
| `LAMReconstruct_int8.mlpackage/` | 609 MB | CoreML model (INT8 quantized) |
| `LAMReconstruct_int8.mlpackage.zip` | 525 MB | Zipped version for direct download |

## Conversion

Converted from the original PyTorch checkpoint using coremltools 9.0 with extensive patching for macOS compatibility (CUDA stubs, in-place op replacement, torch.compile removal). See conversion script.

Key conversion steps:

  1. Stub CUDA-only modules (diff_gaussian_rasterization, simple_knn)
  2. Stub chumpy for FLAME model deserialization
  3. Patch GSLayer in-place ops for CoreML tracing
  4. Replace custom trunc_exp autograd.Function with torch.exp
  5. Trace in float16 on CPU (~13.6GB peak memory)
  6. Convert to CoreML with iOS 17 target
  7. INT8 linear symmetric quantization

## Animation

The output Gaussians are positioned on the FLAME parametric face mesh. To animate:

  1. Load the FLAME-to-ARKit blendshape mapping (52 ARKit shapes mapped to FLAME expression parameters)
  2. For each ARKit blendshape, apply FLAME Linear Blend Skinning to compute per-Gaussian position deltas
  3. At runtime: `deformed[i] = neutral[i] + sum(weight_j * delta_j[i])`

Compatible with ARKit face tracking (52 blendshapes) and any system that outputs ARKit-style blend weights.
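The runtime blend in step 3 is just a weighted sum of precomputed deltas. A minimal NumPy sketch, assuming the per-blendshape position deltas have already been baked into a `(52, 20018, 3)` array (the names and delta-baking step are hypothetical, per step 2):

```python
import numpy as np

NUM_GAUSSIANS = 20018
NUM_BLENDSHAPES = 52  # ARKit face blendshapes

def animate(neutral: np.ndarray, deltas: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """deformed[i] = neutral[i] + sum_j weight_j * delta_j[i].

    neutral: (N, 3) rest-pose Gaussian positions
    deltas:  (52, N, 3) per-blendshape position deltas (precomputed via FLAME LBS)
    weights: (52,) ARKit blend weights in [0, 1]
    """
    # tensordot contracts the blendshape axis: (52,) x (52, N, 3) -> (N, 3)
    return neutral + np.tensordot(weights, deltas, axes=1)

neutral = np.zeros((NUM_GAUSSIANS, 3), dtype=np.float32)
deltas = np.random.randn(NUM_BLENDSHAPES, NUM_GAUSSIANS, 3).astype(np.float32)
weights = np.zeros(NUM_BLENDSHAPES, dtype=np.float32)
weights[0] = 1.0  # fully activate the first blendshape only
deformed = animate(neutral, deltas, weights)
```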

## Citation

```bibtex
@article{lam2025,
    title={LAM: Large Avatar Model for One-Shot Animatable Gaussian Head Avatar},
    author={Alibaba 3DAIGC Team},
    journal={SIGGRAPH 2025},
    year={2025}
}
```

## License

Apache-2.0 (same as the original LAM model).
