The code on GitHub is just fake code????

#1
by sjmind - opened

You can choose to publish nothing, but you shouldn't tell a lie.

Can the person in charge come explain this? ziyuan?


inclusionAI org

Hi, that's an excellent question, and thank you for your close attention to our work!

You're right to point out that the current public interfaces for understanding and generation are separate. This was a deliberate choice for two primary reasons:

  1. Clear Evaluation: It allows the community to independently verify the model's performance on both tasks, which is a standard practice for benchmarking.
  2. Inference Pipelines: The two tasks currently have slightly different preprocessing needs during inference (e.g., mixed-resolution and classifier-free guidance for generation).

The key thing to emphasize is that this separation is only at the interface level, not within the model's core architecture.
That said, a unified interface for generation and understanding is essential to natively unify visual understanding and generation within a single autoregressive framework. We are actively working on it, and the corresponding code will be released in the coming days.

We’d love to have you involved in shaping the project—feel free to open issues, suggest features, or submit PRs so we can build this together!

Best regards,
Ming team

inclusionAI org

Hi, thank you again for your feedback!

We’re excited to share that we’ve now released the unified interface for image understanding, generation, and editing! This update allows seamless multimodal interactions within a single autoregressive framework, supporting flexible input types ("text" and "image"), mixed input orders, and multi-turn conversations via internal state management.

Key features:

  • Image generation: Use descriptive prompts with output_image_prefix to save generated images.
  • Image understanding: Include both "image" and "text" in the same message for joint reasoning.
  • Image editing: Chain multiple generate(..., for_edit=True) calls with unique output_image_prefix names.
  • Multi-turn interactions: Supported via the model’s internal state — call model.reset_inner_state() to reset when needed.
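To make the features above concrete, here is a minimal sketch of what a session might look like. Only the `build_message` helper is runnable as-is; the model calls in the comments reuse the names from this announcement (`generate`, `for_edit`, `output_image_prefix`, `reset_inner_state`), but the exact loading code and signatures are assumptions, so please check the README for the authoritative usage.

```python
# Hypothetical sketch of the unified interface described above.
# The message-assembly helper is concrete; the model calls are
# illustrative only and may differ from the released code.

def build_message(role, *parts):
    """Assemble one chat message; each part is ('text', str) or ('image', path)."""
    return {"role": role, "content": [{"type": t, t: v} for t, v in parts]}

# Image understanding: "image" and "text" in the same message for joint reasoning.
msg = build_message(
    "user",
    ("image", "cat.png"),
    ("text", "What breed is this cat?"),
)

# The calls below assume a loaded `model` (loading omitted):
#
#   out = model.generate(messages=[msg])                    # understanding
#   model.generate(messages=[gen_msg],
#                  output_image_prefix="scene")             # generation, saves scene*.png
#   model.generate(messages=[edit_msg], for_edit=True,
#                  output_image_prefix="scene_edit1")       # editing, chained calls
#   model.reset_inner_state()                               # start a fresh conversation
```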

You can find detailed usage examples in the updated README. We’d love for you to try it out and let us know what you think!
As always, we welcome your contributions — feel free to open issues, suggest improvements, or submit PRs. Let’s build this together!

Best regards,
Ming team

zyhuangnus changed discussion status to closed
