OSOR

OSOR is a one-step diffusion framework for effect-aware object removal. It removes target objects together with associated effects such as shadows, reflections, and residual traces, while requiring only a single denoising step at inference.

Model page: https://huggingface.co/QinmingZhou/OSOR

Model Summary

OSOR is trained with a two-phase curriculum:

Phase I trains one-step removal with hard latent blending and occupancy-guided discriminator supervision.
Phase II adds alpha prediction and trains with incomplete-mask conditioning, enabling the model to expand the effective removal region beyond conservative user masks.
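
The contrast between the two phases can be illustrated with a short NumPy sketch. The formulas used in the release are not stated here, so this is only an assumption about what hard latent blending and alpha-based blending might look like. The names hard_blend, alpha_blend, and erode_mask are hypothetical, and erosion is just one plausible way to simulate an incomplete, conservative user mask.

```python
import numpy as np


def hard_blend(pred_latent, src_latent, mask):
    """Hard blending (assumed form): use the prediction strictly inside the
    binarized mask and keep the source latent everywhere else."""
    m = (mask > 0.5).astype(pred_latent.dtype)
    return m * pred_latent + (1.0 - m) * src_latent


def alpha_blend(pred_latent, src_latent, alpha):
    """Soft blending (assumed form): an alpha map in [0, 1] decides, per
    position, how much of the prediction replaces the source."""
    a = np.clip(alpha, 0.0, 1.0).astype(pred_latent.dtype)
    return a * pred_latent + (1.0 - a) * src_latent


def erode_mask(mask, iterations=1):
    """Shrink a binary (H, W) mask by one pixel per iteration using a
    4-neighbourhood. Hypothetical stand-in for an incomplete user mask."""
    m = np.asarray(mask) > 0.5
    for _ in range(iterations):
        p = np.pad(m, 1, constant_values=False)
        # A pixel survives only if it and its four neighbours are all set.
        m = (p[1:-1, 1:-1] & p[:-2, 1:-1] & p[2:, 1:-1]
             & p[1:-1, :-2] & p[1:-1, 2:])
    return m.astype(np.float32)
```

With a (C, H, W) latent and an (H, W) mask, broadcasting applies the same mask to every channel. In this sketch, an alpha map that is nonzero outside the user mask is what would let the blended result change pixels beyond the masked region, for example over a shadow.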

The code release includes two backbone implementations:

OSOR-FLUX-Fill
OSOR-SDXL-Inpainting

Release Status

The model repository contains checkpoints for both released OSOR backbones:

osor-fluxfill/weights/fluxfill_phase1.pth
osor-fluxfill/weights/fluxfill_phase2.pth
osor-sdxlinpainting/weights/sdxlinpainting_phase1.pth
osor-sdxlinpainting/weights/sdxlinpainting_phase2.pth

Download the checkpoints with the Hugging Face CLI, one command per backbone:

hf download QinmingZhou/OSOR --include "osor-fluxfill/weights/*.pth" --local-dir .
hf download QinmingZhou/OSOR --include "osor-sdxlinpainting/weights/*.pth" --local-dir .
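
After downloading, it can help to confirm that the expected files are in place before loading them. The following is a minimal sketch that relies only on the checkpoint paths listed above. The helper name checkpoint_path and the short backbone keys "fluxfill" and "sdxlinpainting" are hypothetical and are not part of the OSOR code release.

```python
from pathlib import Path

# Relative checkpoint paths as listed in the model repository.
# The (backbone, phase) keys are hypothetical labels chosen for this sketch.
CHECKPOINTS = {
    ("fluxfill", 1): "osor-fluxfill/weights/fluxfill_phase1.pth",
    ("fluxfill", 2): "osor-fluxfill/weights/fluxfill_phase2.pth",
    ("sdxlinpainting", 1): "osor-sdxlinpainting/weights/sdxlinpainting_phase1.pth",
    ("sdxlinpainting", 2): "osor-sdxlinpainting/weights/sdxlinpainting_phase2.pth",
}


def checkpoint_path(backbone, phase, root="."):
    """Return the local path of a downloaded checkpoint.

    `root` should match the --local-dir used in the download command.
    Raises ValueError for an unknown backbone/phase pair and
    FileNotFoundError if the file has not been downloaded yet.
    """
    try:
        rel = CHECKPOINTS[(backbone, phase)]
    except KeyError:
        raise ValueError(f"unknown backbone/phase: {backbone!r}, {phase!r}") from None
    path = Path(root) / rel
    if not path.is_file():
        raise FileNotFoundError(f"{path} not found; run the download command first")
    return path
```

For example, checkpoint_path("fluxfill", 2) resolves the Phase II FLUX-Fill weights relative to the current directory, matching the --local-dir . used above.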

Intended Use

OSOR is intended for research on object removal, image inpainting, and mask-conditioned image editing. Given an object-present image and a user-provided mask, OSOR predicts a clean background with object-associated effects removed.

Inputs and Outputs

Inputs:

image: object-present input image.
mask: binary or soft removal mask.

Outputs:

image: object-removed image.
alpha map: Phase II implementations may also produce, or use internally, an alpha map for adaptive blending.
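
Because the mask may be binary or soft, a small normalization step before inference keeps both cases consistent. The sketch below is an assumption about reasonable preprocessing, not the release's actual pipeline. The function names prepare_mask and check_inputs are hypothetical, and the convention that 1 marks pixels to remove is assumed.

```python
import numpy as np


def prepare_mask(mask, binarize=False, threshold=0.5):
    """Convert a mask to float32 in [0, 1] with shape (H, W).

    Accepts uint8 masks (0..255), boolean masks, or float masks in [0, 1].
    Assumed convention: 1 marks pixels to remove, 0 marks pixels to keep.
    """
    m = np.asarray(mask)
    if m.ndim == 3:
        # Collapse a trailing channel axis (H, W, C) by taking the maximum.
        m = m.max(axis=-1)
    if m.ndim != 2:
        raise ValueError(f"expected a 2-D mask, got shape {m.shape}")
    if m.dtype == np.uint8:
        m = m.astype(np.float32) / 255.0
    else:
        m = np.clip(m.astype(np.float32), 0.0, 1.0)
    if binarize:
        # Turn a soft mask into a hard 0/1 mask.
        m = (m >= threshold).astype(np.float32)
    return m


def check_inputs(image, mask):
    """Verify that an (H, W, 3) image and an (H, W) mask share a spatial size."""
    if image.ndim != 3 or image.shape[2] != 3:
        raise ValueError(f"expected an (H, W, 3) image, got shape {image.shape}")
    if image.shape[:2] != mask.shape:
        raise ValueError(f"image {image.shape[:2]} and mask {mask.shape} differ in size")
```

Passing binarize=True yields a hard mask, while the default keeps soft values so that partial coverage at object boundaries is preserved.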

Training Data

OSOR is trained on CORNE, a SAVP-verified effect-aware object-removal dataset. Evaluation uses CORNE-Val, AnimeEraseBench, TextEraseBench, and additional paired-background object-removal benchmarks.
