Upload 5 files

Files changed (6)

.gitattributes +2 -0
README.md +117 -3
best.pt +3 -0
example_input.png +3 -0
example_output.png +3 -0
sparse_unet_native_bc96.yaml +60 -0

.gitattributes CHANGED

@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+example_input.png filter=lfs diff=lfs merge=lfs -text
+example_output.png filter=lfs diff=lfs merge=lfs -text

README.md CHANGED

@@ -1,3 +1,117 @@
----
-license: mit
----

+---
+license: mit
+pipeline_tag: image-to-image
+tags:
+  - pytorch
+  - computer-vision
+  - image-to-image
+  - super-resolution
+  - image-upscaling
+  - custom-code
+---
+# ImageAI-Upscale
+`ImageAI-Upscale` is a custom PyTorch model for sparse pixel completion and canvas-based image upscaling that doubles both width and height.
+The idea behind this model is simple:
+1. Take an input image.
+2. Expand the canvas to `2x width` and `2x height`.
+3. Place each original pixel into the **bottom-left** position of a `2x2` block.
+4. Leave the other 3 pixels empty (black).
+5. Let the model fill the missing pixels.
+This produces an output image with:
+- `2x` width
+- `2x` height
+- `4x` total pixel count
+## What Is Included
+This repository contains:
+- `best.pt`: trained model checkpoint
+- `sparse_unet_native_bc96.yaml`: model/training config
+- `example_input.png`: sample input image
+- `example_output.png`: sample output image
+## Model Type
+This is a **custom full-image sparse completion model**, not a standard Transformers or Diffusers model.
+Architecture summary:
+- custom PyTorch U-Net
+- pixel-unshuffle based sparse representation
+- trained to reconstruct dense RGB output from sparse structured input
+- final inference runs on the **full image directly**, without tiling
+## Training Summary
+The model was trained on a PNG image dataset prepared from a larger original image collection.
+Training pipeline summary:
+- all source images were converted to PNG
+- full-resolution images were used as the master dataset
+- to make training more efficient, each full-resolution image was split into `16` tiles
+- sparse training pairs were created from those image tiles
+- each `2x2` sparse block kept only the **bottom-left** pixel
+- the other 3 pixels were set to black
+The model was then trained to learn:
+- `SparsePNG -> MasterPNG`
+This means the model specifically learns how to restore this exact sparse pattern.
+## Important Limitation
+This model is **not** a general-purpose super-resolution model.
+It works best when the input follows the same sparse structure used during training:
+- each original pixel is placed into the bottom-left position of a `2x2` block
+- the other three pixels in that block are black
+To use an ordinary image, first convert it into this sparse canvas format.
+## Intended Use
+This model is intended for:
+- experimental image upscaling workflows
+- sparse pixel reconstruction research
+- custom image-to-image pipelines where the sparse sampling rule is fixed
+## Example Usage
+This repository stores only the model assets. Running inference requires the original local project code.
+Example command:
+```powershell
+python -m imageai.upscale_cli `
+  --input "D:\AI\ImageAI\Test.png" `
+  --output "D:\AI\ImageAI\Test_upscaled.png" `
+  --config "D:\AI\ImageAI\configs\sparse_unet_native_bc96.yaml" `
+  --ckpt "D:\AI\ImageAI\checkpoints\sparse_unet_native_bc96\best.pt"
+```
+Or, if the CLI entrypoint is installed:
+```powershell
+imageai-upscale --input "input.png" --output "output.png"
+```
+## Notes
+- trained with PyTorch
+- designed around full-image inference
+- developed as a custom research/project pipeline rather than a framework-native Hugging Face architecture
+## License
+MIT
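
The sparse canvas rule described in the README (expand to 2x width and height, keep each source pixel in the bottom-left cell of its `2x2` block, leave the rest black) can be sketched as below. This is a minimal illustration, not the project's actual preprocessing code: the function name `to_sparse_canvas` is hypothetical, plain Python lists are used to avoid dependencies, and row 0 is assumed to be the top of the image, so "bottom-left" means odd row, even column.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]  # rows of RGB pixels; row 0 is assumed to be the top row

BLACK: Pixel = (0, 0, 0)


def to_sparse_canvas(image: Image) -> Image:
    """Hypothetical helper: place each source pixel in the bottom-left cell of a 2x2 block."""
    height = len(image)
    width = len(image[0]) if height else 0
    # Start from an all-black canvas with twice the width and twice the height.
    canvas = [[BLACK for _ in range(2 * width)] for _ in range(2 * height)]
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            # The block for source pixel (y, x) covers rows 2y..2y+1 and
            # columns 2x..2x+1; its bottom-left cell is (2y + 1, 2x).
            canvas[2 * y + 1][2 * x] = pixel
    return canvas


if __name__ == "__main__":
    tiny = [
        [(255, 0, 0), (0, 255, 0)],
        [(0, 0, 255), (255, 255, 255)],
    ]
    for canvas_row in to_sparse_canvas(tiny):
        print(canvas_row)
```

A real pipeline would more likely do this with NumPy or PyTorch slicing; the loop form is used here only to make the index rule explicit.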

best.pt ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:be06f195b7800a111f63b860e99ad1a1bbba4dde4c877ec7379c326cf142f413
+size 208208813

example_input.png ADDED

Git LFS Details

SHA256: 75c8c2e2aad08d63dc9cc253d02113c43bb7779b9c16be3c41a10f90177a5547
Pointer size: 132 Bytes
Size of remote file: 5.31 MB

example_output.png ADDED

Git LFS Details

SHA256: babf74d31b7b84d8cb2fa23cfc451d205dcc76b96b5bd45ec5fc4b61d9a6abfe
Pointer size: 133 Bytes
Size of remote file: 16.8 MB

sparse_unet_native_bc96.yaml ADDED

@@ -0,0 +1,60 @@

+seed: 42
+paths:
+  root_dir: D:/AI/ImageAI
+  master_dir: D:/AI/ImageAI/MasterPNG
+  sparse_dir: D:/AI/ImageAI/SparsePNG
+  splits_dir: D:/AI/ImageAI/splits
+  runs_dir: D:/AI/ImageAI/runs
+  checkpoints_dir: D:/AI/ImageAI/checkpoints
+  outputs_dir: D:/AI/ImageAI/outputs
+data:
+  train_split: train_native_even.txt
+  val_split: val_native_even.txt
+  test_split: test_native_even.txt
+  sample_mode: native_image
+  patch_size: null
+  eval_patch_size: null
+  full_frame_size: null
+  full_frame_pad_mode: edge
+  train_batch_size: 1
+  val_batch_size: 1
+  num_workers: 12
+  pin_memory: true
+  persistent_workers: true
+  prefetch_factor: 2
+  train_derive_sparse_from_gt: false
+  val_derive_sparse_from_gt: false
+  train_horizontal_flip: false
+  train_vertical_flip: false
+model:
+  name: sparse_unet
+  in_channels: 16
+  out_channels: 12
+  base_channels: 96
+loss:
+  missing_weight: 1.0
+  known_weight: 0.05
+training:
+  device: cuda
+  amp: true
+  max_steps: 1000000
+  warmup_steps: 1000
+  validate_every: 1000
+  save_every: 1000
+  log_every: 50
+  learning_rate: 0.0002
+  min_learning_rate: 0.000001
+  weight_decay: 0.01
+  betas: [0.9, 0.99]
+  grad_clip_norm: 1.0
+  run_name: sparse_unet_native_bc96
+  resume: null
+inference:
+  tile_size: 512
+  tile_overlap: 64
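
The README mentions a pixel-unshuffle based representation, and the config sets `in_channels: 16` and `out_channels: 12`. One reading that fits these numbers, stated here as an assumption rather than something the commit confirms, is that a factor-2 pixel-unshuffle multiplies the channel count by 4, so a 4-channel input (RGB plus a known-pixel mask) becomes 16 channels and a 3-channel RGB target becomes 12. The sketch below shows the rearrangement in plain Python, using the same channel ordering as `torch.nn.PixelUnshuffle`; the function here is illustrative and is not taken from the project code.

```python
from typing import List

Tensor3 = List[List[List[float]]]  # indexed as [channel][row][column]


def pixel_unshuffle(x: Tensor3, r: int = 2) -> Tensor3:
    """Rearrange (C, H, W) into (C*r*r, H/r, W/r), matching torch.nn.PixelUnshuffle ordering."""
    channels, height, width = len(x), len(x[0]), len(x[0][0])
    if height % r or width % r:
        raise ValueError("height and width must be divisible by r")
    out: Tensor3 = []
    for c in range(channels):
        for dy in range(r):
            for dx in range(r):
                # Output channel c*r*r + dy*r + dx gathers the pixel at
                # offset (dy, dx) from every r x r block of input channel c.
                out.append([
                    [x[c][by * r + dy][bx * r + dx] for bx in range(width // r)]
                    for by in range(height // r)
                ])
    return out


if __name__ == "__main__":
    # Assumed layout: 3 RGB channels plus 1 mask channel on a 4x4 canvas.
    rgb_plus_mask = [[[0.0] * 4 for _ in range(4)] for _ in range(4)]
    print(len(pixel_unshuffle(rgb_plus_mask)))      # 4 channels * 4 offsets
    print(len(pixel_unshuffle(rgb_plus_mask[:3])))  # 3 channels * 4 offsets
```

Under this reading, offset `(dy, dx) = (1, 0)` is the bottom-left cell of each block, so that output channel carries the known pixels and the other three offsets carry the pixels the model must fill in.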