krystv/ArtFlow

krystv committed on Apr 28

Commit a01ca3e (verified) · 1 parent: 771ff98

Add ArtFlow architecture specification

README.md +31 -0

README.md ADDED

	@@ -0,0 +1,31 @@

+# 🎨 ArtFlow: Reasoning-Native Artistic Image Generation for Mobile Devices
+## A Novel Architecture for Intelligent, Lightweight Illustration Generation
+**Version:** 1.0
+**Status:** Architecture Specification + Prototype Implementation
+**Target:** 2-4 GB RAM, 1024 px native generation, anime/illustration focus
+### 🔬 Validated Prototype Results
+```
+📊 Parameter Count: 114.7M (backbone only, without text encoder/VAE)
+💾 Model Memory: 229 MB (FP16) / 115 MB (INT8)
+📱 Total inference memory: ~235 MB (well under the 2 GB mobile budget)
+🌊 Wavelet reconstruction: perfect (error < 1e-7)
+🔀 Zigzag scan: perfect round-trip
+✅ Forward pass: correct shapes
+✅ Backward pass: no NaN/Inf gradients
+```
+See `ARCHITECTURE.md` for the complete 1000+ line technical specification, and `artflow_model.py` for the validated PyTorch implementation.
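The weight-memory figures in the results block follow from simple arithmetic on the parameter count. The sketch below is an illustration, not code from `artflow_model.py`; it assumes decimal megabytes (1 MB = 10^6 bytes) and counts weights only, and the helper name `weight_memory_mb` is hypothetical.

```python
# Back-of-the-envelope check of the weight-memory figures quoted above.
# Assumptions: 1 MB = 10**6 bytes; weights only (no activations,
# text encoder, or VAE).

PARAMS = 114.7e6  # backbone parameter count from the results block

BYTES_PER_PARAM = {"fp16": 2, "int8": 1}


def weight_memory_mb(num_params: float, dtype: str) -> float:
    """Return weight storage in decimal megabytes for the given dtype."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e6


fp16_mb = weight_memory_mb(PARAMS, "fp16")
int8_mb = weight_memory_mb(PARAMS, "int8")

print(f"FP16 weights: {fp16_mb:.1f} MB")  # FP16 weights: 229.4 MB
print(f"INT8 weights: {int8_mb:.1f} MB")  # INT8 weights: 114.7 MB
```

Rounded, these match the quoted 229 MB and 115 MB. The ~235 MB total presumably adds a few megabytes of activations and buffers on top of the FP16 weights, though the README does not break that figure down.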
+### Key Novel Contributions
+1. **WaveMamba**: Wavelet-decomposed Mamba denoising backbone (O(n) complexity in sequence length)
+2. **Recursive Latent Reasoning**: TRM/HRM-style reasoning within denoising steps
+3. **ArtStyle Matrix**: Explicit, manipulable style space for illustration generation
+4. **Liquid-dynamics Mood Control**: Physics-inspired mood modulation
+5. **Art-Aware Velocity Scaling**: Frequency-weighted flow matching loss
+6. **KAN-based Composition**: Kolmogorov-Arnold Networks for compositional rules
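To make the "wavelet reconstruction: perfect" check behind WaveMamba concrete, here is a minimal round-trip sketch in plain Python. It assumes a single-level 2D Haar transform, which is an assumption made for illustration; the README does not say which wavelet ArtFlow uses, and the function names `haar_dwt2` and `haar_idwt2` are hypothetical.

```python
import random


def haar_dwt2(x):
    """Single-level orthonormal 2D Haar transform of an even-sized 2D list.

    Returns four half-resolution sub-bands (ll, lh, hl, hh).
    """
    h, w = len(x), len(x[0])
    ll, lh, hl, hh = ([[0.0] * (w // 2) for _ in range(h // 2)] for _ in range(4))
    for i in range(h // 2):
        for j in range(w // 2):
            a, b = x[2 * i][2 * j], x[2 * i][2 * j + 1]
            c, d = x[2 * i + 1][2 * j], x[2 * i + 1][2 * j + 1]
            ll[i][j] = (a + b + c + d) / 2  # low-pass along both axes
            lh[i][j] = (a - b + c - d) / 2  # difference across columns
            hl[i][j] = (a + b - c - d) / 2  # difference across rows
            hh[i][j] = (a - b - c + d) / 2  # diagonal difference
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Invert haar_dwt2, rebuilding the full-resolution 2D list."""
    h, w = len(ll), len(ll[0])
    x = [[0.0] * (2 * w) for _ in range(2 * h)]
    for i in range(h):
        for j in range(w):
            s, p, q, r = ll[i][j], lh[i][j], hl[i][j], hh[i][j]
            x[2 * i][2 * j] = (s + p + q + r) / 2
            x[2 * i][2 * j + 1] = (s - p + q - r) / 2
            x[2 * i + 1][2 * j] = (s + p - q - r) / 2
            x[2 * i + 1][2 * j + 1] = (s - p - q + r) / 2
    return x


random.seed(0)
image = [[random.random() for _ in range(8)] for _ in range(8)]
restored = haar_idwt2(*haar_dwt2(image))

# Largest absolute reconstruction error over all pixels.
max_err = max(
    abs(u - v) for row_u, row_v in zip(image, restored) for u, v in zip(row_u, row_v)
)
print(max_err < 1e-7)
```

Because the Haar transform is orthonormal, the inverse is exact up to floating-point rounding, which is why a round-trip error threshold such as 1e-7 is a reasonable correctness check for this kind of decomposition.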
+### Research Foundation
+Synthesized from 40+ papers, including MobileDiffusion, SnapGen, DreamLite, ZigMa, DiMSUM, DC-AE, TRM/HRM, Liquid Neural Networks, RWKV, KAN, and Illustrious.
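The "zigzag scan: perfect round-trip" result, and the ZigMa reference above, concern how a 2D grid of tokens is flattened into the 1D sequence a Mamba-style backbone consumes. A minimal sketch follows, assuming a serpentine (boustrophedon) row scan, which is one common variant; the scan pattern ArtFlow actually uses is not shown in this README, and all function names here are hypothetical.

```python
def zigzag_indices(height, width):
    """Flat row-major indices of an H x W grid, visited in serpentine order."""
    order = []
    for row in range(height):
        # Even rows go left to right, odd rows right to left.
        cols = range(width) if row % 2 == 0 else range(width - 1, -1, -1)
        order.extend(row * width + col for col in cols)
    return order


def zigzag_scan(tokens, height, width):
    """Reorder a row-major token list into serpentine order."""
    return [tokens[i] for i in zigzag_indices(height, width)]


def zigzag_unscan(sequence, height, width):
    """Invert zigzag_scan, restoring row-major order."""
    restored = [None] * (height * width)
    for pos, idx in enumerate(zigzag_indices(height, width)):
        restored[idx] = sequence[pos]
    return restored


grid = list(range(12))  # a 3 x 4 grid, flattened row-major
scanned = zigzag_scan(grid, 3, 4)
print(scanned)  # [0, 1, 2, 3, 7, 6, 5, 4, 8, 9, 10, 11]
print(zigzag_unscan(scanned, 3, 4) == grid)  # True
```

The scan is a pure permutation, so the round trip is exact by construction; spatially adjacent tokens stay adjacent in the sequence at each row boundary, which is the usual motivation for preferring it over a plain raster scan.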