# 🗺️ Celeste Imperia: The Forge Guide
Welcome to the Forge. This guide is designed to help you navigate our optimized model suite and choose the perfect "flavor" for your specific hardware.
Whether you are running a high-end workstation or a 4GB RAM laptop, we have a build for you.
## 🏗️ 1. Choose Your Engine (Format)
We provide models in two primary formats, each optimized for different ecosystems:
### OpenVINO (Intel Optimized)
Best for Windows and Linux systems with Intel Core processors or Intel Arc/Iris graphics. These models leverage the `optimum-intel` library for maximum hardware utilization.
- Use case: High-speed local image generation and real-time speech-to-text.
### GGUF (Universal CPU)
The industry standard for "run-anywhere" AI. These models are designed for `llama.cpp` and work seamlessly on Apple Silicon (M1/M2/M3), AMD, and Snapdragon devices.
- Use case: Large Language Models (LLMs) running in private, low-resource environments.
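The GGUF weights mentioned later in this guide (`Q4_K_M`, `Q8_0`) encode their bits-per-weight in the name: the digit after `Q` is the quantization width. A minimal sketch of that convention — the `gguf_bits` helper is our own illustration, not part of `llama.cpp`, which reads this from file metadata rather than the filename:

```python
import re

def gguf_bits(quant_name: str) -> int:
    """Extract bits-per-weight from a GGUF quant suffix like 'Q4_K_M'.

    Hypothetical helper for illustration only; llama.cpp stores the
    quantization type inside the GGUF file's metadata.
    """
    match = re.match(r"Q(\d+)", quant_name.upper())
    if not match:
        raise ValueError(f"Unrecognized quant name: {quant_name}")
    return int(match.group(1))

# The two quant levels recommended in this guide:
print(gguf_bits("Q4_K_M"))  # 4 -> Lite tier
print(gguf_bits("Q8_0"))    # 8 -> Pro tier
```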
## 📊 2. Choose Your Precision (The Trinity)
We categorize our models into three tiers. Use the table below to find your match:
| Tier | Precision | Hardware Requirement | Best For... |
|---|---|---|---|
| Master | FP16 | 32GB+ RAM / 16GB VRAM | Production Quality: No loss in detail. Best for professional creative work. |
| Pro | INT8 | 16GB RAM | Daily Driving: 50% smaller size with ~99% quality retention. Perfectly balanced. |
| Lite | INT4 | 8GB RAM / Laptops | Maximum Speed: The smallest possible footprint. Ideal for background tasks and edge devices. |
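The size differences between tiers follow directly from bits per weight: INT8 halves an FP16 file and INT4 quarters it. A rough back-of-the-envelope sketch (raw weight storage only, ignoring metadata and any mixed-precision layers):

```python
def weights_size_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate raw weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

# Example: a 7-billion-parameter model at each tier.
n = 7e9
for tier, bits in [("Master/FP16", 16), ("Pro/INT8", 8), ("Lite/INT4", 4)]:
    print(f"{tier}: ~{weights_size_gb(n, bits):.1f} GB")
# Master/FP16: ~14.0 GB, Pro/INT8: ~7.0 GB (50% smaller), Lite/INT4: ~3.5 GB
```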
## 🚀 3. Hardware Recommendation Matrix
### The Laptop Setup (Lite/Mobile)
- Target: 8GB RAM / Intel i5 (10th Gen+)
- Recommendation: Use our INT4 (Lite) OpenVINO models or `Q4_K_M` GGUF weights.
- Result: Fast, snappy responses without freezing your system.
### The Creator Setup (Pro/Standard)
- Target: 16GB - 32GB RAM / Intel i7 or i9
- Recommendation: Use our INT8 (Pro) OpenVINO models for SDXL and `Q8_0` for LLMs.
- Result: Professional-grade outputs with lightning-fast inference.
### The Workstation Setup (Master)
- Target: 64GB RAM / Intel ARC A770 or RTX 4000
- Recommendation: Use our FP16 (Master) suite.
- Result: Zero-compromise AI at maximum hardware throughput.
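The matrix above condenses into a small selector. A minimal sketch — the `recommend` function and its return strings are our own illustration of the RAM thresholds listed above, not an official API:

```python
def recommend(ram_gb: int) -> str:
    """Map system RAM to the tier suggested by the matrix above."""
    if ram_gb >= 64:
        return "Master (FP16)"
    if ram_gb >= 16:
        return "Pro (INT8 / Q8_0)"
    return "Lite (INT4 / Q4_K_M)"

print(recommend(8))   # Lite (INT4 / Q4_K_M)
print(recommend(32))  # Pro (INT8 / Q8_0)
print(recommend(64))  # Master (FP16)
```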
## 🛠️ Quick Performance Tips
- The "First Run" Tax: The very first time you run an OpenVINO model, it will take 30-60 seconds to compile the graph. Don't cancel it. Every run after that will be nearly instant.
- Guidance Scale: For our SDXL Trinity, always keep your `guidance_scale` between 1.0 and 2.0. We use fused LCM technology, and high CFG values will cause artifacts.
- Background Tasks: AI inference is CPU-heavy. For the fastest results, close memory-heavy apps like Chrome or Photoshop before starting a large generation.
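The "first run tax" is a compile-once, reuse-thereafter pattern. As a rough analogy in plain Python (a simulated delay standing in for OpenVINO's graph compiler, not the real mechanism), memoization shows why only the first call is slow:

```python
import time
from functools import lru_cache

@lru_cache(maxsize=None)
def compile_graph(model_name: str) -> str:
    """Stand-in for the one-time graph compilation (simulated delay)."""
    time.sleep(0.2)  # pretend this is the 30-60 second compile step
    return f"compiled:{model_name}"

start = time.perf_counter()
compile_graph("sdxl-int8")  # slow: pays the "first run" tax
first = time.perf_counter() - start

start = time.perf_counter()
compile_graph("sdxl-int8")  # fast: the cached result is reused
second = time.perf_counter() - start

print(first > second)  # True
```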
Need more help? Check out our individual repository Model Cards for specific implementation code.