---
license: mit
pipeline_tag: image-to-image
tags:
- pytorch
- computer-vision
- image-to-image
- super-resolution
- image-upscaling
- custom-code
---

# ImageAI-Upscale

`ImageAI-Upscale` is a custom PyTorch model for sparse pixel completion and canvas-based image upscaling that doubles both width and height.

The idea behind this model is simple:

1. Take an input image.
2. Expand the canvas to `2x width` and `2x height`.
3. Place each original pixel into the **bottom-left** position of a `2x2` block.
4. Leave the other 3 pixels empty (black).
5. Let the model fill the missing pixels.

This produces an output image with:

- `2x` width
- `2x` height
- `4x` total pixel count
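
The canvas-expansion steps above can be sketched in NumPy (the helper name is illustrative, not part of this repository):

```python
import numpy as np

def to_sparse_canvas(img: np.ndarray) -> np.ndarray:
    """Expand an (H, W, C) image to (2H, 2W, C), placing each original
    pixel at the bottom-left of its 2x2 block; the other 3 stay black."""
    h, w, c = img.shape
    canvas = np.zeros((2 * h, 2 * w, c), dtype=img.dtype)
    # Bottom-left of a 2x2 block = odd row, even column.
    canvas[1::2, 0::2, :] = img
    return canvas

img = np.arange(2 * 2 * 3, dtype=np.uint8).reshape(2, 2, 3)
sparse = to_sparse_canvas(img)   # shape (4, 4, 3)
```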

## What Is Included

This repository contains:

- `best.pt`: trained model checkpoint
- `sparse_unet_native_bc96.yaml`: model/training config
- `example_input.png`: sample input image
- `example_output.png`: sample output image

## Model Type

This is a **custom full-image sparse completion model**, not a standard Transformers or Diffusers model.

Architecture summary:

- custom PyTorch U-Net
- pixel-unshuffle based sparse representation
- trained to reconstruct dense RGB output from sparse structured input
- final inference runs on the **full image directly**, without tiling
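
The pixel-unshuffle representation can be illustrated with a small NumPy sketch that mirrors `torch.nn.functional.pixel_unshuffle` (a standalone illustration, not the repository's model code): folding the sparse `2x` canvas back to the original resolution packs each `2x2` block into four channels, and the original pixels land in a single channel while the other three are zero.

```python
import numpy as np

def pixel_unshuffle(x: np.ndarray, r: int = 2) -> np.ndarray:
    """NumPy equivalent of torch.nn.functional.pixel_unshuffle:
    (C, H*r, W*r) -> (C*r*r, H, W)."""
    c, hr, wr = x.shape
    h, w = hr // r, wr // r
    x = x.reshape(c, h, r, w, r)    # split each spatial dim into (blocks, offset)
    x = x.transpose(0, 2, 4, 1, 3)  # (C, r, r, H, W): block offsets become channels
    return x.reshape(c * r * r, h, w)

img = np.array([[1.0, 2.0], [3.0, 4.0]])
canvas = np.zeros((1, 4, 4))
canvas[0, 1::2, 0::2] = img          # sparse placement: bottom-left of each block
packed = pixel_unshuffle(canvas)     # shape (4, 2, 2)
# Channel 2 (block offset row=1, col=0) carries the original pixels;
# channels 0, 1 and 3 are all zero.
```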

## Training Summary

The model was trained on a PNG image dataset prepared from a larger original image collection.

Training pipeline summary:

- all source images were converted to PNG
- full-resolution images were used as the master dataset
- to make training more efficient, each full-resolution image was split into `16` parts
- sparse training pairs were created from those image tiles
- each `2x2` sparse block kept only the **bottom-left** pixel
- the other 3 pixels were set to black
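
A hedged sketch of how such sparse/master training pairs could be built (the helper names and the `4x4` tiling grid are assumptions; the card only states that each image was split into `16` parts):

```python
import numpy as np

def split_into_tiles(img: np.ndarray, grid: int = 4) -> list:
    """Split an (H, W, C) image into grid*grid tiles (16 when grid=4).
    Assumes H and W are divisible by grid."""
    h, w = img.shape[0] // grid, img.shape[1] // grid
    return [img[i * h:(i + 1) * h, j * w:(j + 1) * w]
            for i in range(grid) for j in range(grid)]

def make_training_pair(master: np.ndarray):
    """Zero out all but the bottom-left pixel of every 2x2 block to get
    the sparse input; the untouched tile is the dense target."""
    sparse = np.zeros_like(master)
    sparse[1::2, 0::2] = master[1::2, 0::2]
    return sparse, master

img = np.arange(8 * 8 * 3, dtype=np.float32).reshape(8, 8, 3)
tiles = split_into_tiles(img)                  # 16 tiles of shape (2, 2, 3)
sparse, target = make_training_pair(tiles[0])  # one SparsePNG -> MasterPNG pair
```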

The model was then trained to learn:

- `SparsePNG -> MasterPNG`

This means the model specifically learns how to restore this exact sparse pattern.

## Important Limitation

This model is **not** a general-purpose super-resolution model.

It works best when the input follows the same sparse structure used during training:

- each original pixel is placed into the bottom-left position of a `2x2` block
- the other three pixels in that block are black

Normal images should therefore be converted into this sparse canvas format before being fed to the model.
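
One way to guard a pipeline against accidentally feeding a dense image is a quick structural check (an illustrative helper, not part of this repository):

```python
import numpy as np

def follows_sparse_pattern(canvas: np.ndarray) -> bool:
    """True if only the bottom-left pixel of every 2x2 block may be
    non-black. Expects an (H, W, C) array with even H and W."""
    h, w = canvas.shape[:2]
    if h % 2 or w % 2:
        return False
    mask = np.ones((h, w), dtype=bool)
    mask[1::2, 0::2] = False       # allowed (original-pixel) positions
    return not canvas[mask].any()  # everything else must be black

valid = np.zeros((4, 4, 3), dtype=np.uint8)
valid[1::2, 0::2] = 255            # proper sparse placement
dense = np.full((4, 4, 3), 128, dtype=np.uint8)
```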

## Intended Use

This model is intended for:

- experimental image upscaling workflows
- sparse pixel reconstruction research
- custom image-to-image pipelines where the sparse sampling rule is fixed

## Example Usage

This repository stores only the model assets; the runtime is expected to come from the original local project code.

Example command:

```powershell
python -m imageai.upscale_cli `
    --input "D:\AI\ImageAI\Test.png" `
    --output "D:\AI\ImageAI\Test_upscaled.png" `
    --config "D:\AI\ImageAI\configs\sparse_unet_native_bc96.yaml" `
    --ckpt "D:\AI\ImageAI\checkpoints\sparse_unet_native_bc96\best.pt"
```

Or, if the CLI entrypoint is installed:

```powershell
imageai-upscale --input "input.png" --output "output.png"
```

## Notes

- trained with PyTorch
- designed around full-image inference
- developed as a custom research/project pipeline rather than a framework-native Hugging Face architecture

## License

MIT