Saken OmniFF
FFmpeg for AI: a universal multimodal runtime for inference, generation, and transformation
Quickstart • Pipelines • Architecture • API • Whitepaper
What is OmniFF?
OmniFF applies the FFmpeg philosophy to AI: any input modality → any output modality, through a managed graph of models, filters, and validators.
omniff -i "What is the capital of Kazakhstan?" -p "Answer briefly" --thinking off
# → Astana
omniff -i photo.jpg -p "Describe this image"
# → A sunset over mountains with golden light...
omniff -i meeting.wav
# → [Full transcription of the audio]
omniff -i sketch.png -p "Make it photorealistic" -o result.png
# → result.png (SDXL-turbo image-to-image)
omniff -i "A cyberpunk city at night" -f image -o city.png
# → city.png (text-to-image generation)
omniff -i lecture.mp4 -p "Summarize this video"
# → The lecture covers three main topics...
omniff -i report.pdf -p "Extract key findings"
# → Key findings: 1) Revenue grew 23%...
Quickstart
# Install
pip install -e "python/.[all]"
# Text → Text
omniff -i "Explain quantum computing" --thinking normal
# Image → Text
omniff -i photo.jpg -p "What's in this image?"
# Audio → Text
omniff -i recording.wav --lang kk
# Text → Image
omniff -i "A red panda eating bamboo" -f image -o panda.png
# Start HTTP API
python -m omniff.api
Pipelines
| Pipeline | Input | Output | Model | Status |
|---|---|---|---|---|
| Text → Text | text / prompt | text | Qwen3-4B | ✅ |
| Image → Text | image + prompt | text | Qwen2.5-VL-3B | ✅ |
| Audio → Text | audio file | text | Whisper-large-v3 | ✅ |
| Image → Image | image + prompt | image | SDXL-turbo | ✅ |
| Text → Image | prompt | image | SDXL-turbo | ✅ |
| Video → Text | video + prompt | text | Qwen2.5-VL-3B | ✅ |
| Document → Text | PDF/DOCX/TXT | text | Extraction + Qwen3 | ✅ |
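The table implies a routing decision based on the input modality and the requested output. A minimal sketch of that idea follows, assuming modality is inferred from the file extension and the output format is passed explicitly. The function name `route`, the extension sets, and the pipeline labels are hypothetical and are not taken from OmniFF's `KeywordRouter`.

```python
from pathlib import Path
from typing import Optional

# Hypothetical extension sets; the real router may detect modality differently.
IMAGE_EXT = {".jpg", ".jpeg", ".png"}
AUDIO_EXT = {".wav", ".mp3"}
VIDEO_EXT = {".mp4"}
DOC_EXT = {".pdf", ".docx", ".txt"}


def route(input_value: str, output_format: Optional[str] = None) -> str:
    """Return an illustrative pipeline label for an input and requested output."""
    ext = Path(input_value).suffix.lower()
    if ext in IMAGE_EXT:
        # Image in, image out is treated as image-to-image editing.
        return "image_to_image" if output_format == "image" else "image_to_text"
    if ext in AUDIO_EXT:
        return "audio_to_text"
    if ext in VIDEO_EXT:
        return "video_to_text"
    if ext in DOC_EXT:
        return "document_to_text"
    # Anything without a known extension is treated as a plain-text prompt.
    return "text_to_image" if output_format == "image" else "text_to_text"


print(route("photo.jpg"))
print(route("A cyberpunk city at night", "image"))
```

This sketch ignores prompt keywords entirely; a keyword-aware router would also inspect the prompt (for example, to tell editing from description) before choosing a pipeline.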
Architecture
┌─────────────────────────────────────────────────────────┐
│                     OmniFF Runtime                      │
│                                                         │
│  Input ─→ Demuxer ─→ Router ─→ GraphPlanner             │
│                                     │                   │
│                           ┌─────────┴─────────┐         │
│                           │   OmniGraph DAG   │         │
│                           │                   │         │
│                           │ ┌─────┐  ┌──────┐ │         │
│                           │ │Model│─→│Filter│ │         │
│                           │ └─────┘  └──┬───┘ │         │
│                           │             │     │         │
│                           │        ┌────┴───┐ │         │
│                           │        │Validate│ │         │
│                           │        └────┬───┘ │         │
│                           └─────────────┼─────┘         │
│                                         │               │
│                                       Muxer ─→ Output   │
│                                                         │
│  ┌──────────┐ ┌───────────┐ ┌──────────┐ ┌───────────┐  │
│  │Scheduler │ │ Thinking+ │ │ Plugins  │ │ HTTP API  │  │
│  │hot/warm/ │ │off/fast/  │ │ custom   │ │ FastAPI   │  │
│  │cold/LRU  │ │normal/deep│ │ models   │ │ /run      │  │
│  └──────────┘ └───────────┘ └──────────┘ └───────────┘  │
└─────────────────────────────────────────────────────────┘
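The flow in the diagram can be read as a DAG executed in topological order. The sketch below illustrates that idea with the standard library's `graphlib.TopologicalSorter` (Python 3.9+). The node names mirror the diagram, but the node functions and the `execute` helper are hypothetical stand-ins, not OmniFF's `GraphExecutor`.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each node maps to the set of nodes it depends on (its predecessors).
graph = {
    "demux": set(),
    "model": {"demux"},
    "filter": {"model"},
    "validate": {"filter"},
    "mux": {"validate"},
}

# Stand-in node functions; real nodes would call models, filters, and validators.
nodes = {
    "demux": lambda data: data.strip(),
    "model": lambda data: f"answer({data})",
    "filter": lambda data: data.lower(),
    "validate": lambda data: data if data else "<empty>",
    "mux": lambda data: {"output_text": data},
}


def execute(graph, nodes, payload):
    """Run every node once, in dependency order, threading the payload through."""
    order = list(TopologicalSorter(graph).static_order())
    for name in order:
        payload = nodes[name](payload)
    return order, payload


order, result = execute(graph, nodes, "  Hello  ")
print(order)
print(result)
```

Threading a single payload only works because this graph is a linear chain. A branching graph would need to collect each node's inputs from its predecessors instead.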
Core Components
| Component | Description |
|---|---|
| KeywordRouter | Routes input to the right pipeline based on modality + keywords |
| GraphPlanner | Builds DAG execution plans (demux → model → validate → mux) |
| GraphExecutor | Walks DAG in topological order, passes data between nodes |
| ModelScheduler | Hot/warm/cold loading with TTL eviction and LRU |
| ValidationPipeline | Multi-pass validation chain with required/optional passes |
| Thinking+ | Prompt control: off → fast → normal → deep → research |
| PluginRegistry | Register custom model implementations |
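To make "required/optional passes" concrete, here is a minimal sketch of a validation chain in which a failed required pass rejects the output and a failed optional pass only produces a warning. The names `ValidationPass` and `run_validation`, and the two example checks, are assumptions for illustration and do not describe OmniFF's actual `ValidationPipeline` interface.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class ValidationPass:
    name: str
    check: Callable[[str], bool]  # returns True when the output is acceptable
    required: bool = True


def run_validation(text: str, passes: List[ValidationPass]) -> Tuple[bool, List[str]]:
    """Run all passes; fail only when a required pass fails, otherwise warn."""
    ok = True
    messages: List[str] = []
    for p in passes:
        if p.check(text):
            continue
        if p.required:
            ok = False
            messages.append(f"error: {p.name}")
        else:
            messages.append(f"warning: {p.name}")
    return ok, messages


passes = [
    ValidationPass("non_empty", lambda t: bool(t.strip())),
    ValidationPass("max_length", lambda t: len(t) <= 20, required=False),
]
print(run_validation("Astana", passes))
print(run_validation("A very long answer that exceeds the limit", passes))
```

Running every pass, rather than stopping at the first failure, lets the caller report all problems at once; a pipeline that prefers early exit could return as soon as a required pass fails.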
API
Python SDK
from omniff.runtime.engine import OmniFFRuntime
runtime = OmniFFRuntime.from_yaml("omniff.yaml")
# Text → Text
result = runtime.run(input="Explain gravity", thinking="normal")
print(result.output_text)
# Image → Text
result = runtime.run(input="photo.jpg", prompt="Describe this")
print(result.output_text)
# Text → Image
result = runtime.run(input="A sunset", output_modality="image", output="sunset.png")
print(result.output_path)
HTTP API
# Start server
python -m omniff.api
# Text processing
curl -X POST http://localhost:8000/run \
-F "input_text=What is AI?" \
-F "thinking=fast"
# File processing
curl -X POST http://localhost:8000/run/file \
-F "file=@photo.jpg" \
-F "prompt=Describe this image"
# Health check
curl http://localhost:8000/health
CLI (FFmpeg-style)
omniff -i <input> [-p <prompt>] [-f <format>] [-o <output>] [--thinking <level>]
[--strength <0-1>] [--lang <code>] [--model <id>] [--seed <n>]
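The synopsis above maps naturally onto an argument parser. The following is a sketch of how these flags could be declared with `argparse`; it is not OmniFF's actual `cli.py`. The destination names, help strings, and the `unit_float` helper are assumptions, and the `--thinking` choices are taken from the Thinking+ levels listed under Core Components.

```python
import argparse


def unit_float(value: str) -> float:
    """Parse a float and reject values outside the 0-1 range."""
    number = float(value)
    if not 0.0 <= number <= 1.0:
        raise argparse.ArgumentTypeError("strength must be between 0 and 1")
    return number


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="omniff")
    parser.add_argument("-i", dest="input", required=True, help="input text or file path")
    parser.add_argument("-p", dest="prompt", help="instruction applied to the input")
    parser.add_argument("-f", dest="output_format", help="output format, e.g. image")
    parser.add_argument("-o", dest="output", help="output file path")
    parser.add_argument("--thinking", choices=["off", "fast", "normal", "deep", "research"])
    parser.add_argument("--strength", type=unit_float, help="edit strength in [0, 1]")
    parser.add_argument("--lang", help="language code, e.g. kk")
    parser.add_argument("--model", help="model identifier override")
    parser.add_argument("--seed", type=int, help="random seed")
    return parser


args = build_parser().parse_args(
    ["-i", "sketch.png", "-p", "Make it photorealistic", "-o", "result.png", "--strength", "0.6"]
)
print(args.input, args.output, args.strength)
```

Validating `--strength` in a custom `type` function means out-of-range values are rejected at parse time with a usage error, before any model is loaded.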
Project Structure
omniff/
├── python/omniff/          # Python SDK (saken-omniff)
│   ├── models/             # Model wrappers (LLM, VLM, ASR, ImageEdit, ...)
│   ├── runtime/            # Engine, config, result
│   ├── router/             # KeywordRouter
│   ├── graph/              # OmniGraph, executor, planner, loader
│   ├── scheduler/          # ModelScheduler (hot/warm/cold)
│   ├── validators/         # Text/Image validators, pipeline
│   ├── filters/            # Language detection
│   ├── nodes/              # Node registry
│   ├── api.py              # FastAPI HTTP server
│   ├── cli.py              # CLI entry point
│   ├── thinking.py         # Thinking+ controller
│   └── plugins.py          # Plugin model interface
├── crates/                 # Rust workspace
│   ├── omniff-core/        # Core types (OmniPacket, OmniFrame, OmniNode)
│   ├── omniff-graph/       # Graph types and planner trait
│   ├── omniff-runtime/     # Runtime traits (Router, Executor, Scheduler)
│   └── omniff-cli/         # Rust CLI binary
├── tests/python/           # 85 unit + 15 integration tests
├── graph_templates/        # YAML pipeline templates
├── omniff.yaml             # Runtime configuration
└── ARCHITECTURE.md         # Full architectural whitepaper
Configuration
# omniff.yaml
runtime:
  name: omniff
  version: "1.0"
router:
  type: keyword
experts:
  text_small:
    model_id: Qwen/Qwen3-4B
    loading_policy: warm
    ttl: 300
  vision:
    model_id: Qwen/Qwen2.5-VL-3B-Instruct
    loading_policy: warm
  asr:
    model_id: openai/whisper-large-v3
    loading_policy: cold
  image_edit:
    model_id: stabilityai/sdxl-turbo
    loading_policy: cold
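The `ttl` field, together with the "TTL eviction and LRU" description of `ModelScheduler`, suggests a cache of loaded models. The sketch below shows one way such a cache could work: models idle for longer than `ttl` seconds are dropped, and the least recently used model is evicted when capacity is reached. The class `TinyScheduler`, its `capacity` parameter, and the injectable `clock` are hypothetical, and the sketch does not model the hot/warm/cold policies.

```python
import time
from collections import OrderedDict


class TinyScheduler:
    """Keeps at most `capacity` loaded models; drops ones idle longer than `ttl`."""

    def __init__(self, loader, capacity=2, ttl=300.0, clock=time.monotonic):
        self._loader = loader          # callable: model_id -> loaded model
        self._capacity = capacity
        self._ttl = ttl
        self._clock = clock
        self._loaded = OrderedDict()   # model_id -> (model, last_used)

    def evict_expired(self):
        now = self._clock()
        for model_id, (_, last_used) in list(self._loaded.items()):
            if now - last_used > self._ttl:
                del self._loaded[model_id]

    def get(self, model_id):
        self.evict_expired()
        if model_id in self._loaded:
            model, _ = self._loaded.pop(model_id)
        else:
            if len(self._loaded) >= self._capacity:
                self._loaded.popitem(last=False)  # drop the least recently used
            model = self._loader(model_id)
        # Re-insert at the end so the most recently used entry is last.
        self._loaded[model_id] = (model, self._clock())
        return model

    def loaded_ids(self):
        return list(self._loaded)
```

Passing the clock in as a parameter keeps the eviction logic testable without sleeping; in normal use the default `time.monotonic` is enough.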
Testing
# All tests
PYTHONPATH=python python -m pytest tests/python/ -v
# Unit tests only
PYTHONPATH=python python -m pytest tests/python/unit/ -v
# Integration tests (requires GPU)
PYTHONPATH=python python -m pytest tests/python/integration/ -v
# Rust
cargo check --workspace
Tested on KazNU server (2× NVIDIA A10 22GB).
Naming Convention
| Surface | Identifier |
|---|---|
| CLI | omniff |
| Python package | saken-omniff |
| Rust crates | omniff-core, omniff-graph, omniff-runtime, omniff-cli |
| GitHub | stukenov/omniff |
| Hugging Face | stukenov/omniff-runtime |
License
Apache 2.0
Built by Saken Tukenov