| # AetherOps Hugging Face Docker Space — Build Spec |
|
|
| ## Objective |
|
|
| Build a polished Hugging Face Docker Space for the AetherOps organization that acts as: |
|
|
| 1. a public-facing product showcase |
| 2. a docs-style explainer for Aether Voice Studio |
| 3. a credibility layer for the Aether ecosystem |
| 4. a controlled demo surface for voice workflows |
| 5. a launchpad into AetherOps, Aether Voice Studio, and future public docs |
|
|
| This is not intended to replace production infrastructure. |
|
|
| --- |
|
|
| ## Product Positioning |
|
|
| The Space should feel like: |
| - a premium technical landing page |
| - a product docs entrypoint |
| - a live architecture showcase |
| - a curated interactive demo, not a toy |
|
|
| The tone should communicate: |
| - serious infrastructure |
| - creator utility |
| - enterprise utility |
| - model routing clarity |
| - composable voice tooling |
|
|
| --- |
|
|
| ## Platform Constraints |
|
|
| ### Docker Space requirements |
| The Space must be implemented as a Hugging Face Docker Space using: |
| - `sdk: docker` in README YAML |
| - a standard `Dockerfile` |
| - a single externally exposed app port via `app_port` |
|
|
| If multiple internal services are needed, they must be proxied behind the single external app port. |
|
|
| Do not use Docker Compose as the deployment contract for the Hugging Face Space runtime; the Space is built and run from the single `Dockerfile`. |
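
The single-port rule above can be illustrated with a small path-based routing table. This is a minimal sketch, not a prescribed implementation: the internal port numbers, the `/api/` prefix, and the names `Upstream` and `resolve_upstream` are assumptions made for illustration. In practice the same mapping would live in the Nginx config or in the app server itself.

```python
from typing import NamedTuple

# Hypothetical layout: only EXTERNAL_PORT is exposed through `app_port`.
# The internal ports below are illustrative and never reachable from outside.
EXTERNAL_PORT = 7860


class Upstream(NamedTuple):
    name: str
    port: int


# Path prefix -> internal service. The "/api/" prefix is an assumption.
ROUTES: dict[str, Upstream] = {
    "/api/": Upstream("backend", 8000),
    "/": Upstream("frontend", 3000),
}


def resolve_upstream(path: str) -> Upstream:
    """Return the internal service that should handle a request path."""
    # Longest prefix wins, so "/api/" is checked before the catch-all "/".
    for prefix in sorted(ROUTES, key=len, reverse=True):
        if path.startswith(prefix):
            return ROUTES[prefix]
    raise ValueError(f"path must start with '/': {path!r}")
```

Everything the browser sees arrives on one port; the split between frontend and backend stays an internal detail that can change without touching the Space metadata.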
|
|
| --- |
|
|
| ## Primary Goal |
|
|
| Deliver a highly polished front-end app that explains and showcases: |
|
|
| - Aether Voice Studio |
| - Voice routes and model matrix |
| - ASR / TTS / Dialogue / Voice Design flows |
| - future Sound Design lane |
| - LLM routing |
| - internal architecture at a high level |
| - product screenshots / diagrams / cards |
| - selected live demo interactions if safe and lightweight |
|
|
| --- |
|
|
| ## Recommended Technical Stack |
|
|
| Preferred: |
| - Next.js or Vite + React frontend |
| - FastAPI backend for lightweight API/demo endpoints |
| - Nginx reverse proxy only if needed for clean single-port routing |
|
|
| Alternative: |
| - pure frontend app with static + client-side demo interactions if no backend is necessary |
|
|
| Do not build a heavyweight all-in-one inference stack inside the Space for v1. |
|
|
| --- |
|
|
| ## Information Architecture |
|
|
| ### Main sections |
| 1. Hero |
| 2. Product Overview |
| 3. Aether Voice Studio |
| 4. Route / Model Matrix |
| 5. Demo Workflows |
| 6. Creator Workflows |
| 7. Enterprise / VoiceOps Workflows |
| 8. Docs Entry |
| 9. Architecture |
| 10. FAQ |
| 11. CTA / Links |
|
|
| --- |
|
|
| ## Page Layout |
|
|
| ### 1. Hero |
| Content: |
| - product title |
| - short positioning statement |
| - 2-3 primary CTA buttons |
| - subtle animated visual or waveform motif |
| - one-line trust statement |
|
|
| Primary CTAs: |
| - View Voice Studio |
| - Explore Docs |
| - View Architecture |
|
|
| ### 2. Product Overview |
| Cards for: |
| - ASR Live |
| - ASR File |
| - TTS Live |
| - TTS File |
| - TTS Studio |
| - VoiceOps |
| - LLM Routing |
| - Sound Design (coming soon) |
|
|
| Each card should have: |
| - short explanation |
| - intended user |
| - route target summary |
| - status badge (live / in build / planned) |
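
One way to keep the cards consistent is to drive them from a typed record. The sketch below assumes a Python-side data model that the frontend could consume as JSON. The class names, field names, and the sample card text are hypothetical; only the three badge values come from the list above. Linking the Sound Design card to `moss_soundeffect` is also an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    """The three badge states named in the spec."""
    LIVE = "live"
    IN_BUILD = "in build"
    PLANNED = "planned"


@dataclass(frozen=True)
class ProductCard:
    title: str
    explanation: str
    intended_user: str
    route_summary: str
    status: Status


def badge_label(card: ProductCard) -> str:
    """Text shown on the card's status badge."""
    return card.status.value.upper()


# Placeholder content for illustration only.
SOUND_DESIGN = ProductCard(
    title="Sound Design",
    explanation="Ambience and transition sounds for creator workflows.",
    intended_user="Creators",
    route_summary="moss_soundeffect",
    status=Status.PLANNED,
)
```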
|
|
| ### 3. Aether Voice Studio |
| Large section explaining: |
| - why TTS Live is separate from TTS Studio |
| - voice cloning |
| - voice design |
| - batch narration |
| - dialogue generation |
| - reusable voice assets |
|
|
| Use diagrams or cards. |
|
|
| ### 4. Route / Model Matrix |
| Present the canonical routing system: |
| - `moss_realtime` |
| - `moss_tts` |
| - `moss_ttsd` |
| - `moss_voice_generator` |
| - `moss_soundeffect` |
| - `chatterbox` fallback |
|
|
| Display: |
| - use case |
| - route |
| - output type |
| - UI surface |
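
The four display columns map naturally onto one record per route. The following is a sketch under stated assumptions: the route identifiers are the ones listed above, but every use case, output type, and UI surface value is a placeholder guess and should be replaced with the real matrix. `RouteRow`, `rows_for_surface`, and `to_json` are hypothetical names.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RouteRow:
    use_case: str
    route: str
    output_type: str
    ui_surface: str


# Route identifiers come from the spec; all other values are placeholders.
ROUTE_MATRIX = [
    RouteRow("Live voice agent response", "moss_realtime", "streamed audio", "TTS Live"),
    RouteRow("Batch narration", "moss_tts", "audio file", "TTS File"),
    RouteRow("Dialogue scene generation", "moss_ttsd", "audio file", "TTS Studio"),
    RouteRow("Custom character voice design", "moss_voice_generator", "voice asset", "TTS Studio"),
    RouteRow("Sound effects (planned)", "moss_soundeffect", "audio file", "Sound Design"),
    RouteRow("Fallback synthesis", "chatterbox", "audio file", "any"),
]


def rows_for_surface(surface: str) -> list[RouteRow]:
    """Filter the matrix for one UI surface, as an interactive table might."""
    return [row for row in ROUTE_MATRIX if row.ui_surface == surface]


def to_json() -> str:
    """Serialize the matrix in a shape suitable for a static JSON file."""
    return json.dumps([asdict(row) for row in ROUTE_MATRIX], indent=2)
```

Keeping the matrix as data rather than hard-coded markup lets the same source feed both the rendered table and a viewable JSON file.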
|
|
| ### 5. Demo Workflows |
| Show example workflows: |
| - Live voice agent response |
| - Clone a reference voice |
| - Design a custom character voice |
| - Batch narration export |
| - Dialogue scene generation |
| - Future sound effects |
|
|
| If a backend demo is included, keep it small and stable. |
|
|
| ### 6. Creator Workflows |
| Emphasize: |
| - docuseries narration |
| - faceless YouTube content |
| - dialogue scenes |
| - voice design presets |
| - future ambience and transition sounds |
|
|
| ### 7. Enterprise / VoiceOps Workflows |
| Emphasize: |
| - telephony |
| - branded voice agents |
| - support routing |
| - dispatch and field service voices |
| - enterprise voice infrastructure |
|
|
| ### 8. Docs Entry |
| This should look like a docs portal. |
| Include: |
| - Overview |
| - Quick Start |
| - Architecture |
| - Voice Routes |
| - Voice Registry |
| - TTS Studio |
| - LLM Routing |
| - API Reference |
| - Troubleshooting |
|
|
| ### 9. Architecture |
| Show: |
| - Voice Studio frontend |
| - routing layer |
| - route targets |
| - provider-aware LLM config |
| - model registry / voice registry |
| - output artifacts |
|
|
| ### 10. FAQ |
| Keep concise. |
| Include: |
| - Why multiple MOSS routes? |
| - Why separate TTS Live from Studio? |
| - What is voice design? |
| - What is a voice registry? |
| - What is the role of Chatterbox? |
| - Is Sound Design included? |
|
|
| ### 11. CTA / Links |
| Buttons to: |
| - AetherOps |
| - docs |
| - future public API |
| - internal app screenshots |
| - contact / partner form later |
|
|
| --- |
|
|
| ## Navigation |
|
|
| Top nav: |
| - Overview |
| - Studio |
| - Routes |
| - Demos |
| - Docs |
| - Architecture |
| - FAQ |
|
|
| Optional secondary links: |
| - GitHub |
| - Hugging Face org |
| - API docs |
|
|
| --- |
|
|
| ## Visual Design Direction |
|
|
| Style goals: |
| - dark, premium, technical |
| - high-contrast but clean |
| - soft gradients |
| - waveform / voice / routing motifs |
| - “serious studio” look, not toy AI landing page |
|
|
| Use: |
| - large cards |
| - clean spacing |
| - strong section boundaries |
| - subtle motion |
| - architecture diagrams |
| - demo output panels |
|
|
| --- |
|
|
| ## Demo Scope for v1 |
|
|
| Allowed: |
| - static screenshots |
| - route matrix interaction |
| - expandable workflow cards |
| - example voice design prompt gallery |
| - selected sample audio players if pre-generated |
| - lightweight metadata-driven demo UI |
|
|
| Avoid for v1: |
| - full realtime inference in the Space |
| - heavy model hosting |
| - anything that depends on long cold starts |
| - telephony integration |
| - complex secrets-dependent internal routing |
|
|
| --- |
|
|
| ## Suggested v1 Deliverables |
|
|
| ### Required |
| - Docker Space app builds and runs on single app port |
| - polished landing page |
| - route/model matrix section |
| - docs-style navigation section |
| - screenshots or diagrams from Aether Voice Studio |
| - creator and enterprise workflow sections |
| - FAQ section |
| - CTA/footer |
|
|
| ### Nice to have |
| - small demo API endpoint |
| - small JSON-driven voice preset gallery |
| - pre-generated audio preview cards |
| - architecture diagram tab |
|
|
| --- |
|
|
| ## Hugging Face Space Repo Files |
|
|
| Minimum structure: |
|
|
| - `README.md` |
| - `Dockerfile` |
| - `.dockerignore` |
| - `app/` |
| - `app/frontend/` |
| - `app/backend/` |
| - `app/public/` |
| - `app/data/` |
| - `app/data/voice_seed_library.json` |
| - `app/data/route_model_matrix.json` |
|
|
| If frontend-only: |
| - omit `app/backend/` and keep the remaining structure |
|
|
| --- |
|
|
| ## README YAML |
|
|
| Use Hugging Face Space metadata in the README YAML block. |
|
|
| Recommended fields: |
| - title |
| - emoji |
| - colorFrom |
| - colorTo |
| - sdk: docker |
| - app_port |
| - short_description |
| - pinned |
| - header |
| - fullWidth |
| - suggested_hardware |
| - models |
| - tags |
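
A small pre-push check can catch a missing `sdk: docker` or `app_port` before the Space fails to build. The sketch below is deliberately minimal and uses only the standard library: it reads flat `key: value` pairs and ignores lists and nesting, so it is not a YAML parser. The sample title and the function names are hypothetical. Port 7860 is shown because it is the Spaces default.

```python
import re


def parse_front_matter(readme_text: str) -> dict[str, str]:
    """Extract top-level `key: value` pairs from the leading --- block.

    Handles flat scalar fields only; list items and nested keys are skipped.
    """
    match = re.match(r"^---\n(.*?)\n---\n", readme_text, flags=re.DOTALL)
    if match is None:
        raise ValueError("README has no YAML front matter block")
    fields: dict[str, str] = {}
    for line in match.group(1).splitlines():
        if ":" in line and not line.startswith((" ", "-", "#")):
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return fields


def check_space_metadata(fields: dict[str, str]) -> list[str]:
    """Return human-readable problems; an empty list means the check passed."""
    problems = []
    if fields.get("sdk") != "docker":
        problems.append("sdk must be 'docker'")
    if not fields.get("app_port", "").isdigit():
        problems.append("app_port must be an integer")
    return problems


# Illustrative README content; the title is a placeholder.
SAMPLE_README = """---
title: Aether Voice Studio
sdk: docker
app_port: 7860
pinned: false
---
# Aether Voice Studio
"""
```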
| |
| --- |
| |
| ## Runtime / Secrets |
| |
| The Space should be designed so it can run without production secrets for v1. |
| |
| If environment variables are needed: |
| - use Hugging Face Space Settings variables/secrets |
| - keep secrets server-side only |
| - avoid any browser-visible secret flow |
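
The rules above suggest a config loader that falls back to a demo mode when no secret is set and never serializes the secret into anything sent to the browser. This is a sketch only: `AETHER_API_KEY` is a hypothetical variable name, and `RuntimeConfig`, `load_config`, and `public_view` are illustrative.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RuntimeConfig:
    demo_mode: bool
    upstream_api_key: Optional[str]

    def public_view(self) -> dict:
        """Only what may be sent to the browser; the key itself is excluded."""
        return {"demo_mode": self.demo_mode}


def load_config(env: Mapping[str, str] = os.environ) -> RuntimeConfig:
    """Build the config from environment variables set in Space Settings."""
    # AETHER_API_KEY is a hypothetical secret name used for illustration.
    key = env.get("AETHER_API_KEY") or None
    # With no secret present, the app runs in demo mode on static content.
    return RuntimeConfig(demo_mode=key is None, upstream_api_key=key)
```

Passing the environment as a parameter keeps the loader easy to test and makes the no-secrets path the default rather than an error case.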
| |
| --- |
| |
| ## Persistence |
| |
| Do not assume persistent app state for v1. |
| Use: |
| - repo assets |
| - generated static content |
| - environment variables |
| - optional external APIs later |
| |
| Do not design v1 around runtime disk persistence. |
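
In practice this means treating the files shipped in the repo as read-only inputs. The sketch below assumes the JSON files under `app/data/` listed in the repo structure. `load_asset` and its `default` parameter are hypothetical, and nothing is ever written back to disk.

```python
import json
from pathlib import Path
from typing import Any

# Directory of bundled, read-only assets (path taken from the repo layout).
DATA_DIR = Path("app/data")


def load_asset(name: str, data_dir: Path = DATA_DIR, default: Any = None) -> Any:
    """Read a bundled JSON asset; return `default` if the file is absent."""
    path = data_dir / name
    if not path.is_file():
        return default
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)
```

A call such as `load_asset("voice_seed_library.json", default=[])` keeps the page rendering even when an asset has not been added yet.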
| |
| --- |
| |
| ## Build / Runtime Requirements |
| |
| - must run as a proper Docker Space app |
| - respect single public port model |
| - avoid permission issues by using UID 1000-compatible ownership in image build |
| - no GPU requirement for v1 Space |
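
Because the container runs as a non-root user (UID 1000 on Docker Spaces), ownership mistakes in the image usually show up as write failures at startup. A startup preflight along these lines can surface them early. It is a sketch, and `writable_dirs_report` is a hypothetical helper name.

```python
import os
from pathlib import Path
from typing import Iterable


def writable_dirs_report(paths: Iterable[str]) -> dict[str, bool]:
    """Map each path to whether it is a directory the current user can write to."""
    return {
        str(path): Path(path).is_dir() and os.access(path, os.W_OK)
        for path in paths
    }
```

Calling it at startup on whatever cache or temp directories the app uses, and logging any `False` entries, turns a vague permission error into an explicit message.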
| |
| --- |
| |
| ## Non-goals |
| |
| - no full production inference cluster inside Hugging Face Space |
| - no Docker Compose-based deployment target |
| - no full telephony backend in Space |
| - no user secrets vault in v1 |
| - no multi-tenant auth flows in v1 |
| |
| --- |
| |
| ## Success Criteria |
| |
| The finished Space should: |
| 1. look premium and intentional |
| 2. explain Aether Voice Studio clearly |
| 3. demonstrate model/route sophistication |
| 4. support partner/investor/customer credibility |
| 5. be easy to extend later into a richer public product surface |
| |
| ---
| 
| ## Data Contracts
| 
| Add a small Data Contracts section with downloadable or viewable JSON snippets:
| - voice registry shape
| - route/model matrix
| - seed voice library
| - provider config shape
| 
| This signals that the product is machine-readable and automation-native, which supports the brand.
| |
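
The data contracts listed above could be pinned down as typed shapes and rendered as JSON snippets from the same source. The sketch below is illustrative only: every field name in `VoiceRegistryEntry` and `ProviderConfig` is an assumption, as are the sample values, and the real contracts should replace them.

```python
import json
from typing import TypedDict


class VoiceRegistryEntry(TypedDict):
    # Hypothetical fields; the real registry shape is not defined in this spec.
    voice_id: str
    display_name: str
    route: str
    tags: list[str]


class ProviderConfig(TypedDict):
    # Hypothetical fields; secrets are referenced by env var name, never inlined.
    provider: str
    model: str
    api_key_env: str


def missing_keys(shape: type, payload: dict) -> set[str]:
    """Keys required by the TypedDict `shape` that `payload` lacks."""
    return set(shape.__required_keys__) - set(payload)


# Placeholder sample used for a viewable snippet.
SAMPLE_VOICE: VoiceRegistryEntry = {
    "voice_id": "narrator-01",
    "display_name": "Narrator",
    "route": "moss_tts",
    "tags": ["narration"],
}


def as_snippet(payload: dict) -> str:
    """Pretty-print a payload for display in the Data Contracts section."""
    return json.dumps(payload, indent=2)
```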