Update README.md
README.md
CHANGED
|
@@ -1,3 +1,9 @@
|
|
| 1 |
# Visione
|
| 2 |
|
| 3 |
A local-first AI creative production suite for consumer GPUs.
|
|
@@ -13,6 +19,29 @@ The pipeline covers the full creative arc: text-to-image and video generation, r
|
|
| 13 |
|
| 14 |
**Stack:** Python 3.12 + FastAPI + SSE · React 18 + TypeScript + Zustand · Tauri 2 desktop shell · ComfyUI headless for video inference · PyTorch 2.7 + CUDA
|
| 15 |
|
| 16 |
---
|
| 17 |
|
| 18 |
## Components
|
|
@@ -28,43 +57,28 @@ The pipeline covers the full creative arc: text-to-image and video generation, r
|
|
| 28 |
| **Characters** | Persistent character library with 5-shot reference generation for cross-shot consistency |
|
| 29 |
| **Gallery** | Unified asset browser across all components |
|
| 30 |
|
| 31 |
-
|
| 32 |
-
|
| 33 |
-
|
| 34 |
-
|
| 35 |
-
|
| 36 |
-
|
| 37 |
-
|
| 38 |
-
|
| 39 |
-
|
| 40 |
-
|
| 41 |
-
|
| 42 |
-
|
| 43 |
-
|
| 44 |
-
|
| 45 |
-
|
| 46 |
-
|
| 47 |
-
|
| 48 |
-
|
| 49 |
-
|
| 50 |
-
|
| 51 |
-
|
| 52 |
-
|
| 53 |
-
| SeedVR2 3B FP8 | Video upscaling |
|
| 54 |
-
| RIFE v4.26 | Frame interpolation |
|
| 55 |
-
| ACE-Step SFT + Base | Music generation |
|
| 56 |
-
| ACE-Step LM 1.7B | Music language model |
|
| 57 |
-
| ACE-Step VAE + TextEnc | Music pipeline |
|
| 58 |
-
| Qwen3-TTS 1.7B (3 variants) | Text-to-speech |
|
| 59 |
-
| HunyuanVideo-Foley XL | Video-to-audio |
|
| 60 |
-
| Wan 2.1 T2V 1.3B | StyleMaster backbone |
|
| 61 |
-
| StyleMaster checkpoints | Style injection weights |
|
| 62 |
-
| CLIP ViT-H-14 | Style extraction |
|
| 63 |
-
| IS-Net (rembg) | Background removal (CPU) |
|
| 64 |
-
| LatentSync 1.6 | Lip sync (quality) |
|
| 65 |
-
| MuseTalk 1.5 | Lip sync (fast) |
|
| 66 |
-
| InsightFace buffalo_l | Face detection/swap |
|
| 67 |
-
| Inswapper_128.onnx | Face swap model |
|
| 68 |
|
| 69 |
---
|
| 70 |
|
|
@@ -76,7 +90,27 @@ The desktop shell (Tauri 2) wraps the frontend as a native window and manages ba
|
|
| 76 |
|
| 77 |
Components share models where possible. Image generation models are reused across Imagine, Retouch, Retexture, and Storyboard; video models feed through from Imagine into Retexture and Sound Studio. The Video Editor and Gallery operate CPU-side, assembling outputs produced by the GPU components.
|
| 78 |
|
| 79 |
---
|
|
|
|
| 80 |
## License
|
| 81 |
|
| 82 |
MIT
|
|
|
|
| 1 |
+
<p align="center">
|
| 2 |
+
<img
|
| 3 |
+
src="https://cdn-uploads.huggingface.co/production/uploads/695017bb0c3fc8b9c78497e9/YJHpzH436J828nNymCNk7.png"
|
| 4 |
+
width="600" />
|
| 5 |
+
</p>
|
| 6 |
+
|
| 7 |
# Visione
|
| 8 |
|
| 9 |
A local-first AI creative production suite for consumer GPUs.
|
|
|
|
| 19 |
|
| 20 |
**Stack:** Python 3.12 + FastAPI + SSE · React 18 + TypeScript + Zustand · Tauri 2 desktop shell · ComfyUI headless for video inference · PyTorch 2.7 + CUDA
|
| 21 |
|
| 22 |
+
<table align="center"><tr>
|
| 23 |
+
<td><a href="https://cdn-uploads.huggingface.co/production/uploads/695017bb0c3fc8b9c78497e9/X0pIezsKwIRl-Guw3k58A.
|
| 24 |
+
png"><img
|
| 25 |
+
src="https://cdn-uploads.huggingface.co/production/uploads/695017bb0c3fc8b9c78497e9/X0pIezsKwIRl-Guw3k58A.png"
|
| 26 |
+
width="300" /></a></td>
|
| 27 |
+
<td><a href="https://cdn-uploads.huggingface.co/production/uploads/695017bb0c3fc8b9c78497e9/euOPxXTNWxjmRl-C88uU2.
|
| 28 |
+
png"><img
|
| 29 |
+
src="https://cdn-uploads.huggingface.co/production/uploads/695017bb0c3fc8b9c78497e9/euOPxXTNWxjmRl-C88uU2.png"
|
| 30 |
+
width="300" /></a></td>
|
| 31 |
+
<td><a href="https://cdn-uploads.huggingface.co/production/uploads/695017bb0c3fc8b9c78497e9/lW_zGi1O8HblIoamV0RLr.
|
| 32 |
+
png"><img
|
| 33 |
+
src="https://cdn-uploads.huggingface.co/production/uploads/695017bb0c3fc8b9c78497e9/lW_zGi1O8HblIoamV0RLr.png"
|
| 34 |
+
width="300" /></a></td>
|
| 35 |
+
<td><a href="https://cdn-uploads.huggingface.co/production/uploads/695017bb0c3fc8b9c78497e9/qKWonqa8ZQvl3CTdD0Pje.
|
| 36 |
+
png"><img
|
| 37 |
+
src="https://cdn-uploads.huggingface.co/production/uploads/695017bb0c3fc8b9c78497e9/qKWonqa8ZQvl3CTdD0Pje.png"
|
| 38 |
+
width="300" /></a></td>
|
| 39 |
+
<td><a href="https://cdn-uploads.huggingface.co/production/uploads/695017bb0c3fc8b9c78497e9/IjNbVVpnLepr9NI8cdxA3.
|
| 40 |
+
png"><img
|
| 41 |
+
src="https://cdn-uploads.huggingface.co/production/uploads/695017bb0c3fc8b9c78497e9/IjNbVVpnLepr9NI8cdxA3.png"
|
| 42 |
+
width="300" /></a></td>
|
| 43 |
+
</tr></table>
|
| 44 |
+
|
| 45 |
---
|
| 46 |
|
| 47 |
## Components
|
|
|
|
| 57 |
| **Characters** | Persistent character library with 5-shot reference generation for cross-shot consistency |
|
| 58 |
| **Gallery** | Unified asset browser across all components |
|
| 59 |
|
| 60 |
+
<table align="center"><tr>
|
| 61 |
+
<td><a href="https://cdn-uploads.huggingface.co/production/uploads/695017bb0c3fc8b9c78497e9/No1ABmspTrCWqpvsukafQ.
|
| 62 |
+
png"><img
|
| 63 |
+
src="https://cdn-uploads.huggingface.co/production/uploads/695017bb0c3fc8b9c78497e9/No1ABmspTrCWqpvsukafQ.png"
|
| 64 |
+
width="300" /></a></td>
|
| 65 |
+
<td><a href="https://cdn-uploads.huggingface.co/production/uploads/695017bb0c3fc8b9c78497e9/mXVAiuj8Vpik0a_UNREIU.
|
| 66 |
+
png"><img
|
| 67 |
+
src="https://cdn-uploads.huggingface.co/production/uploads/695017bb0c3fc8b9c78497e9/mXVAiuj8Vpik0a_UNREIU.png"
|
| 68 |
+
width="300" /></a></td>
|
| 69 |
+
<td><a href="https://cdn-uploads.huggingface.co/production/uploads/695017bb0c3fc8b9c78497e9/Gmzmavqm9antFHYsbl4Ka.
|
| 70 |
+
png"><img
|
| 71 |
+
src="https://cdn-uploads.huggingface.co/production/uploads/695017bb0c3fc8b9c78497e9/Gmzmavqm9antFHYsbl4Ka.png"
|
| 72 |
+
width="300" /></a></td>
|
| 73 |
+
<td><a href="https://cdn-uploads.huggingface.co/production/uploads/695017bb0c3fc8b9c78497e9/BbYSmMGcXZENjW-LBiIUz.
|
| 74 |
+
png"><img
|
| 75 |
+
src="https://cdn-uploads.huggingface.co/production/uploads/695017bb0c3fc8b9c78497e9/BbYSmMGcXZENjW-LBiIUz.png"
|
| 76 |
+
width="300" /></a></td>
|
| 77 |
+
<td><a href="https://cdn-uploads.huggingface.co/production/uploads/695017bb0c3fc8b9c78497e9/jcy5-_cKf0oa_Utf3ZXbK.
|
| 78 |
+
png"><img
|
| 79 |
+
src="https://cdn-uploads.huggingface.co/production/uploads/695017bb0c3fc8b9c78497e9/jcy5-_cKf0oa_Utf3ZXbK.png"
|
| 80 |
+
width="300" /></a></td>
|
| 81 |
+
</tr></table>
|
| 82 |
|
| 83 |
---
|
| 84 |
|
|
|
|
| 90 |
|
| 91 |
Components share models where possible. Image generation models are reused across Imagine, Retouch, Retexture, and Storyboard; video models feed through from Imagine into Retexture and Sound Studio. The Video Editor and Gallery operate CPU-side, assembling outputs produced by the GPU components.
|
| 92 |
|
| 93 |
+
<table align="center"><tr>
|
| 94 |
+
<td><a href="https://cdn-uploads.huggingface.co/production/uploads/695017bb0c3fc8b9c78497e9/7_CDVBV6B08IosFIkr5jq.
|
| 95 |
+
png"><img
|
| 96 |
+
src="https://cdn-uploads.huggingface.co/production/uploads/695017bb0c3fc8b9c78497e9/7_CDVBV6B08IosFIkr5jq.png"
|
| 97 |
+
width="300" /></a></td>
|
| 98 |
+
<td><a href="https://cdn-uploads.huggingface.co/production/uploads/695017bb0c3fc8b9c78497e9/fRhZcUYtK_TE8uIlXyPH-.
|
| 99 |
+
png"><img
|
| 100 |
+
src="https://cdn-uploads.huggingface.co/production/uploads/695017bb0c3fc8b9c78497e9/fRhZcUYtK_TE8uIlXyPH-.png"
|
| 101 |
+
width="300" /></a></td>
|
| 102 |
+
<td><a href="https://cdn-uploads.huggingface.co/production/uploads/695017bb0c3fc8b9c78497e9/B1J7kJuRPiPY12-Wja0jW.
|
| 103 |
+
png"><img
|
| 104 |
+
src="https://cdn-uploads.huggingface.co/production/uploads/695017bb0c3fc8b9c78497e9/B1J7kJuRPiPY12-Wja0jW.png"
|
| 105 |
+
width="300" /></a></td>
|
| 106 |
+
<td><a href="https://cdn-uploads.huggingface.co/production/uploads/695017bb0c3fc8b9c78497e9/MXtHgy7hlq9YZVaQED_WA.
|
| 107 |
+
png"><img
|
| 108 |
+
src="https://cdn-uploads.huggingface.co/production/uploads/695017bb0c3fc8b9c78497e9/MXtHgy7hlq9YZVaQED_WA.png"
|
| 109 |
+
width="300" /></a></td>
|
| 110 |
+
</tr></table>
|
| 111 |
+
|
| 112 |
---
|
| 113 |
+
|
| 114 |
## License
|
| 115 |
|
| 116 |
MIT
|