# 🗺️ Celeste Imperia: The Forge Guide
Welcome to the Forge. This guide is designed to help you navigate our optimized model suite and choose the perfect "flavor" for your specific hardware.
Whether you are running a high-end workstation or a 4GB RAM laptop, we have a build for you.
## 🏗️ 1. Choose Your Engine (Format)
We provide models in two primary formats, each optimized for different ecosystems:
### OpenVINO (Intel Optimized)
Best for Windows and Linux systems with Intel Core processors or Intel Arc/Iris graphics. These models leverage the `optimum-intel` library for maximum hardware utilization.
- Use case: High-speed local image generation and real-time speech-to-text.
### GGUF (Universal CPU)
The industry standard for "run-anywhere" AI. These models are designed for `llama.cpp` and work seamlessly on Apple Silicon (M1/M2/M3), AMD, and Snapdragon devices.
- Use case: Large Language Models (LLMs) running in private, low-resource environments.
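The GGUF weights mentioned later in this guide (`Q4_K_M`, `Q8_0`) encode their bits-per-weight in the name: the digit after `Q` is the quantization width. A minimal sketch of that convention — the `gguf_bits` helper is our own illustration, not part of `llama.cpp`, which reads this from file metadata rather than the filename:

```python
import re

def gguf_bits(quant_name: str) -> int:
    """Extract bits-per-weight from a GGUF quant suffix like 'Q4_K_M'.

    Hypothetical helper for illustration only; llama.cpp stores the
    quantization type inside the GGUF file's metadata.
    """
    match = re.match(r"Q(\d+)", quant_name.upper())
    if not match:
        raise ValueError(f"Unrecognized quant name: {quant_name}")
    return int(match.group(1))

# The two quant levels recommended in this guide:
print(gguf_bits("Q4_K_M"))  # 4 -> Lite tier
print(gguf_bits("Q8_0"))    # 8 -> Pro tier
```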
## 📊 2. Choose Your Precision (The Trinity)
We categorize our models into three tiers. Use the table below to find your match:
| Tier | Precision | Hardware Requirement | Best For... |
|---|---|---|---|
| Master | FP16 | 32GB+ RAM / 16GB VRAM | Production Quality: No loss in detail. Best for professional creative work. |
| Pro | INT8 | 16GB RAM | Daily Driving: 50% smaller size with ~99% quality retention. Perfectly balanced. |
| Lite | INT4 | 8GB RAM / Laptops | Maximum Speed: The smallest possible footprint. Ideal for background tasks and edge devices. |
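The size differences between tiers follow directly from bits per weight: INT8 halves an FP16 file and INT4 quarters it. A rough back-of-the-envelope sketch (raw weight storage only, ignoring metadata and any mixed-precision layers):

```python
def weights_size_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate raw weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

# Example: a 7-billion-parameter model at each tier.
n = 7e9
for tier, bits in [("Master/FP16", 16), ("Pro/INT8", 8), ("Lite/INT4", 4)]:
    print(f"{tier}: ~{weights_size_gb(n, bits):.1f} GB")
# Master/FP16: ~14.0 GB, Pro/INT8: ~7.0 GB (50% smaller), Lite/INT4: ~3.5 GB
```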
## 🚀 3. Hardware Recommendation Matrix
### The Laptop Setup (Lite/Mobile)
- Target: 8GB RAM / Intel i5 (10th Gen+)
- Recommendation: Use our INT4 (Lite) OpenVINO models or `Q4_K_M` GGUF weights.
- Result: Fast, snappy responses without freezing your system.
### The Creator Setup (Pro/Standard)
- Target: 16GB - 32GB RAM / Intel i7 or i9
- Recommendation: Use our INT8 (Pro) OpenVINO models for SDXL and `Q8_0` for LLMs.
- Result: Professional-grade outputs with lightning-fast inference.
### The Workstation Setup (Master)
- Target: 64GB RAM / Intel ARC A770 or RTX 4000
- Recommendation: Use our FP16 (Master) suite.
- Result: Zero-compromise AI at maximum hardware throughput.
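The matrix above condenses into a small selector. A minimal sketch — the `recommend` function and its return strings are our own illustration of the RAM thresholds listed above, not an official API:

```python
def recommend(ram_gb: int) -> str:
    """Map system RAM to the tier suggested by the matrix above."""
    if ram_gb >= 64:
        return "Master (FP16)"
    if ram_gb >= 16:
        return "Pro (INT8 / Q8_0)"
    return "Lite (INT4 / Q4_K_M)"

print(recommend(8))   # Lite (INT4 / Q4_K_M)
print(recommend(32))  # Pro (INT8 / Q8_0)
print(recommend(64))  # Master (FP16)
```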
## 🛠️ Quick Performance Tips
- The "First Run" Tax: The very first time you run an OpenVINO model, it will take 30-60 seconds to compile the graph. Don't cancel it. Every run after that will be nearly instant.
- Guidance Scale: For our SDXL Trinity, always keep your `guidance_scale` between 1.0 and 2.0. We use fused LCM technology, and high CFG values will cause artifacts.
- Background Tasks: AI inference is CPU-heavy. For the fastest results, close memory-heavy apps like Chrome or Photoshop before starting a large generation.
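The "first run tax" is a compile-once, reuse-thereafter pattern. As a rough analogy in plain Python (a simulated delay standing in for OpenVINO's graph compiler, not the real mechanism), memoization shows why only the first call is slow:

```python
import time
from functools import lru_cache

@lru_cache(maxsize=None)
def compile_graph(model_name: str) -> str:
    """Stand-in for the one-time graph compilation (simulated delay)."""
    time.sleep(0.2)  # pretend this is the 30-60 second compile step
    return f"compiled:{model_name}"

start = time.perf_counter()
compile_graph("sdxl-int8")  # slow: pays the "first run" tax
first = time.perf_counter() - start

start = time.perf_counter()
compile_graph("sdxl-int8")  # fast: the cached result is reused
second = time.perf_counter() - start

print(first > second)  # True
```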
Need more help? Check out our individual repository Model Cards for specific implementation code.