---
license: apache-2.0
base_model:
- Yuanshi/OminiControl
---

**An all-in-one control framework for unified visual content creation using GenAI.**
Generate multi-view, diverse-scene, and task-specific high-resolution images from a single subject image, without fine-tuning.

---

## Overview

**ZenCtrl** is a comprehensive toolkit built to tackle core challenges in image generation:

- No fine-tuning needed: works from **a single subject image**
- Maintains **control over shape, pose, camera angle, context**
- Supports **high-resolution**, multi-scene generation
- Modular toolkit for preprocessing, control, editing, and post-processing tasks

ZenCtrl is based on OminiControl, enhanced with finer-grained control, consistent subject preservation, and improved, ready-to-use models. Our goal is to build an **agentic visual generation system** that can orchestrate image/video creation from **LLM-driven recipes.**

---

## GitHub code

https://github.com/FotographerAI/ZenCtrl/tree/main

---

## Toolkit Components (coming soon)

### Preprocessing
- Background removal
- Matting
- Reshaping
- Segmentation

### Control Models
- Shape (Canny, HED, Scribble, Depth)
- Pose (OpenPose, DensePose)
- Mask control
- Camera/View control

### Post-processing
- Deblurring
- Color fixing
- Natural blending

### Editing Models
- Inpainting (removal, masked editing, replacement)
- Outpainting
- Transformation / Motion
- Relighting

---

## Supported Tasks

- Background generation
- Controlled background generation
- Subject-consistent context-aware generation
- Object and subject placement (coming soon)
- In-context image/video generation (coming soon)
- Multi-object/subject merging & blending (coming soon)
- Video generation (coming soon)

---

## Target Use Cases

- Product photography
- Fashion & accessory try-on
- Virtual try-on (shoes, hats, glasses, etc.)
- People & portrait control
- Illustration, animation, and ad creatives

All of these tasks can be **mixed and layered**. ZenCtrl is designed to support real-world visual workflows with **agentic task composition**.

---

## News

- **2025-03-26**: First release. Model weights are available on Hugging Face!
- **Coming Soon**: Source code release, Quick Start guide, Example notebooks
- **Next**: Controlled fine-grained version on our platform and API (Pro version)
- **Future**: Video generation toolkit release

## Limitations

1. Models currently perform best with **objects**, and to some extent **humans**.
2. Resolution support is currently capped at **1024x1024** (higher quality coming soon).
3. Performance with **illustrations** is currently limited.
4. The models have **not yet been trained on large-scale or highly diverse datasets**. We plan to improve quality and variation by training on larger and more diverse datasets, especially for **illustration and stylized content**.
5. Video support and the full **agentic task pipeline** are still under development.

---

## To-do

- [x] Release early pretrained model weights for defined tasks
- [ ] Release additional task-specific models and modes
- [ ] Release open source code (coming soon)
- [ ] Release Quick Start guide and example notebooks
- [ ] Launch API access via our app and Baseten for easier deployment
- [ ] Release high-resolution models (1500×1500+)
- [ ] Enable full toolkit integration with agent API
- [ ] Add video generation module

---

## Join the Community

- [Discord](https://discord.com/invite/b9RuYQ3F8k): share ideas and feedback
- [Landing Page](https://fotographer.ai)
- [Try it now on Hugging Face Space (release on 2025/03/28 PST)](https://huggingface.co/fotographerai/zenctrl_tools/tree/main/weights)

---

## Community Collaboration

We hope to collaborate closely with the open-source community to make **ZenCtrl** a powerful and extensible toolkit for visual content creation.
Once the source code is released, we welcome contributions in training, expanding supported use cases, and developing new task-specific modules. Our vision is to make ZenCtrl the **standard framework** for agentic, high-quality image and video generation, built together, for everyone.
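
---

## Working within the resolution cap (illustrative sketch)

The 1024x1024 cap listed under Limitations suggests that larger inputs may need to be downscaled before use. The following is a minimal sketch of that arithmetic in Python, assuming the cap applies to both width and height and that preserving the aspect ratio is acceptable. `fit_within_cap` and `MAX_SIDE` are hypothetical names chosen for illustration and are not part of ZenCtrl.

```python
# Hypothetical helper, not part of ZenCtrl: compute a target size that
# fits inside the 1024x1024 cap mentioned in the Limitations section.
MAX_SIDE = 1024  # assumption: the cap applies to both width and height


def fit_within_cap(width: int, height: int, max_side: int = MAX_SIDE) -> tuple[int, int]:
    """Return (width, height) scaled down to fit inside max_side x max_side.

    The aspect ratio is preserved, and sizes already within the cap are
    returned unchanged (the function never upscales).
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Scale factor is at most 1.0, so small images are left alone.
    scale = min(1.0, max_side / width, max_side / height)
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    print(fit_within_cap(3000, 2000))
```

The returned size can then be passed to whichever image library is used for the actual resize. Whether ZenCtrl applies further constraints on input dimensions is not stated here, so treat this only as a starting point.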