A Hybrid Approach for Closing the Sim2real Appearance Gap in Game Engine Synthetic Datasets
Abstract
A hybrid approach combining diffusion-based and image-to-image translation methods improves photorealism in synthetic datasets while maintaining semantic consistency.
Video game engines have been an important source for generating large volumes of visual synthetic datasets for training and evaluating computer vision algorithms that are to be deployed in the real world. While the visual fidelity of modern game engines has been significantly improved with technologies such as ray-tracing, a notable sim2real appearance gap between the synthetic and the real-world images still remains, which limits the utilization of synthetic datasets in real-world applications. In this letter, we investigate the ability of a state-of-the-art image generation and editing diffusion model (FLUX.2-4B Klein) to enhance the photorealism of synthetic datasets and compare its performance against a traditional image-to-image translation model (REGEN). Furthermore, we propose a hybrid approach that combines the strong geometry and material transformations of diffusion-based methods with the distribution-matching capabilities of image-to-image translation techniques. Through experiments, it is demonstrated that REGEN outperforms FLUX.2-4B Klein and that by combining both FLUX.2-4B Klein and REGEN models, better visual realism can be achieved compared to using each model individually, while maintaining semantic consistency. The code is available at: https://github.com/stefanos50/Hybrid-Sim2Real
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- HyPER-GAN: Hybrid Patch-Based Image-to-Image Translation for Real-Time Photorealism Enhancement (2026)
- Lighting-grounded Video Generation with Renderer-based Agent Reasoning (2026)
- RealMaster: Lifting Rendered Scenes into Photorealistic Video (2026)
- PAM: A Pose-Appearance-Motion Engine for Sim-to-Real HOI Video Generation (2026)
- Exploring the Role of Synthetic Data Augmentation in Controllable Human-Centric Video Generation (2026)
- Generative World Renderer (2026)
- Vanast: Virtual Try-On with Human Image Animation via Synthetic Triplet Supervision (2026)
Please give a thumbs up to this comment if you found it helpful!
If you want recommendations for any Paper on Hugging Face checkout this Space
You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend
Get this paper in your agent:
hf papers read 2605.02291 Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash Models citing this paper 0
No model linking this paper
Datasets citing this paper 0
No dataset linking this paper
Spaces citing this paper 0
No Space linking this paper
Collections including this paper 0
No Collection including this paper
