	---
	license: cc-by-nc-nd-4.0
	tags:
	- face-animation
	- mobile
	- real-time
	- avatar
	- computer-vision
	- neural-rendering
	- knowledge-distillation
	pipeline_tag: image-to-video
	---

	# LiveFace

	Real-Time Photorealistic Facial Animation on Low-End Mobile Devices

	Patent Pending (USPTO) | [Paper (Zenodo)](https://doi.org/10.5281/zenodo.19477081) | [Website](https://creatora.app)

	## What is LiveFace?

	LiveFace is a patent-pending neural rendering system that turns a single photo into a photorealistic talking avatar running at 30 fps on budget mobile devices — fully offline, no cloud required.

	## Architecture

	Four compact per-avatar neural decoders + one shared compositor-upscaler:

	| Module | Parameters | Output | Function |
	|--------|------------|--------|----------|
	| MouthDecoder | 5-12M | 128x96 RGBA | Lip sync, jaw, emotions |
	| EyeDecoder | 1.3-2M | 192x80 RGBA | Blink, gaze, wink |
	| HairDecoder | 3-5M | 192x192 RGBA | Hair physics, inertia |
	| BodyDecoder | 3-12M | 256x64 RGBA | Breathing, shoulders |
	| Compositor-Upscaler | ~7M (shared) | 360x640 (9:16) | Seam blending, upscale, lighting |

	Total: ~20M INT8 parameters | ~19 ms per frame on Snapdragon 439
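
As a rough cross-check of the total above, the per-module ranges from the table can be summed directly. The sketch below is illustrative only: it assumes one byte per INT8 weight and ignores quantization scales, zero-points, and file-format overhead, none of which are specified here. The names `MODULE_PARAMS_M`, `total_params_m`, and `weight_size_mb` are hypothetical and not part of LiveFace.

```python
# Parameter ranges in millions, copied from the module table above.
MODULE_PARAMS_M = {
    "MouthDecoder": (5.0, 12.0),
    "EyeDecoder": (1.3, 2.0),
    "HairDecoder": (3.0, 5.0),
    "BodyDecoder": (3.0, 12.0),
    "Compositor-Upscaler": (7.0, 7.0),  # listed as ~7M, shared across avatars
}

BYTES_PER_INT8_PARAM = 1  # assumption: INT8 stores one byte per weight


def total_params_m(modules):
    """Return (low, high) total parameter counts in millions."""
    low = sum(lo for lo, _ in modules.values())
    high = sum(hi for _, hi in modules.values())
    return low, high


def weight_size_mb(params_m, bytes_per_param=BYTES_PER_INT8_PARAM):
    """Approximate raw weight size in MB (10**6 bytes), ignoring overhead."""
    return params_m * 1e6 * bytes_per_param / 1e6


low, high = total_params_m(MODULE_PARAMS_M)
print(f"total parameters: {low:.1f}M to {high:.1f}M")
print(f"raw INT8 weights: {weight_size_mb(low):.1f} MB to {weight_size_mb(high):.1f} MB")
```

Under these assumptions the quoted ~20M figure lines up with the low end of each module's range; an avatar that uses the largest decoder variants would land closer to the high end of the computed span.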

	## Key Features

	- Photorealistic — neural rendering, not cartoon or stylized
	- Real-time — 30+ fps on budget phones ($100+)
	- Offline — fully on-device, no cloud, no internet
	- One photo — create avatar from a single selfie
	- Identity embedding — 128-dim learnable per-avatar parameter
	- Dual input — viseme-based (audio) or landmark-based (MediaPipe)
	- Portrait 9:16 — optimized for mobile displays

	## Training

	Per-avatar decoders are trained via knowledge distillation:
	1. Server-side teacher model generates diverse training data from RAVDESS emotional speech videos
	2. Per-frame quality filter (Haar + blur + SSIM) ensures data integrity (~0.6% rejection)
	3. Student decoders learn from teacher-generated pairs with L1 + perceptual loss
	4. Each avatar trains in ~40 minutes on a single GPU
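
The "L1 + perceptual loss" objective in step 3 can be illustrated with a dependency-free sketch. This is not the LiveFace training code: the feature extractor (`toy_features`), the weighting (`perceptual_weight=0.1`), and the function names are all assumptions made for illustration. Real perceptual losses typically compare activations of a fixed pretrained network, which is replaced here by neighbouring-pixel differences so the example runs on plain lists.

```python
def l1_loss(pred, target):
    """Mean absolute error between two equal-length, non-empty sequences."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    if not pred:
        raise ValueError("inputs must not be empty")
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def toy_features(pixels):
    """Hypothetical stand-in for a pretrained feature extractor.

    Returns differences between neighbouring values, a crude edge-like
    feature, so the perceptual term has something to compare.
    """
    return [b - a for a, b in zip(pixels, pixels[1:])]


def distillation_loss(student_out, teacher_out,
                      features=toy_features, perceptual_weight=0.1):
    """L1 pixel loss plus a weighted L1 distance in feature space."""
    pixel_term = l1_loss(student_out, teacher_out)
    perceptual_term = l1_loss(features(student_out), features(teacher_out))
    return pixel_term + perceptual_weight * perceptual_term


# Toy 1-D "images": the teacher output is the target the student imitates.
teacher = [0.0, 0.5, 1.0, 0.5]
student = [0.1, 0.4, 0.9, 0.5]
print(distillation_loss(student, teacher))
```

The pixel term pulls the student toward the teacher's exact values, while the feature term penalizes differences in local structure; how the two are balanced in practice is a tuning choice not stated in this card.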

	## Performance

	| Device | Compute | Latency | FPS |
	|--------|---------|---------|-----|
	| Snapdragon 439 | ~10 GFLOPS | ~19 ms | 30+ |
	| Snapdragon 665 | ~22 GFLOPS | ~12 ms | 30+ |
	| Snapdragon 778G | ~65 GFLOPS | ~4 ms | 60+ |
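
To relate the latency and FPS columns, it helps to compare each latency against the per-frame time budget of a target frame rate (1000 ms divided by the rate). The sketch below uses the latencies from the table and assumes, without confirmation from the source, that each figure covers the whole per-frame pipeline. The helper names are hypothetical.

```python
# Per-frame latencies in milliseconds, copied from the table above.
LATENCY_MS = {
    "Snapdragon 439": 19.0,
    "Snapdragon 665": 12.0,
    "Snapdragon 778G": 4.0,
}


def frame_budget_ms(target_fps):
    """Time available per frame at a given frame rate."""
    return 1000.0 / target_fps


def max_fps(latency_ms):
    """Upper bound on frame rate if latency were the only cost."""
    return 1000.0 / latency_ms


def headroom_ms(latency_ms, target_fps):
    """Time left in each frame after inference (negative means too slow)."""
    return frame_budget_ms(target_fps) - latency_ms


for device, latency in LATENCY_MS.items():
    print(
        f"{device}: up to {max_fps(latency):.0f} fps, "
        f"{headroom_ms(latency, 30):.1f} ms spare per frame at 30 fps"
    )
```

At 30 fps the budget is about 33.3 ms, so a ~19 ms frame leaves roughly 14 ms for everything else on the device. The FPS column reads as a conservative floor rather than the theoretical maximum implied by latency alone, though the card does not say how it was measured.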

	## Model Weights

	Model weights are proprietary and not distributed in this repository. This page serves as documentation for the LiveFace architecture.

	For licensing inquiries: business@creatora.app

	## Publications

	- Zenodo: [DOI: 10.5281/zenodo.19477081](https://doi.org/10.5281/zenodo.19477081)
	- TechRxiv: Under review
	- arXiv: Pending submission (cs.CV)

	## Authors

	- Dmitry Rodin — Founder & Lead Researcher, Creatora (dmitry.r@creatora.app)
	- Nikita Rodin — Texas Tech University (nikita.r@creatora.app)

	## Citation

	```bibtex
	@misc{rodin2026liveface,
	title={LiveFace: Real-Time Photorealistic Facial Animation on Low-End Mobile Devices via Compact Per-Avatar Neural Decoders and Universal Compositor-Upscaler},
	author={Dmitry Rodin and Nikita Rodin},
	year={2026},
	doi={10.5281/zenodo.19477081},
	url={https://doi.org/10.5281/zenodo.19477081}
	}
	```