UgurI committed (verified)
Commit 02bd563 · 1 Parent(s): 7e5ea44

Upload 5 files

.gitattributes CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ example_input.png filter=lfs diff=lfs merge=lfs -text
+ example_output.png filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,3 +1,117 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ pipeline_tag: image-to-image
+ tags:
+ - pytorch
+ - computer-vision
+ - image-to-image
+ - super-resolution
+ - image-upscaling
+ - custom-code
+ ---
+
+ # ImageAI-Upscale
+
+ `ImageAI-Upscale` is a custom PyTorch model for sparse pixel completion and canvas-based image upscaling (2x per side).
+
+ The idea behind this model is simple:
+
+ 1. Take an input image.
+ 2. Expand the canvas to `2x width` and `2x height`.
+ 3. Place each original pixel in the **bottom-left** position of a `2x2` block.
+ 4. Leave the other 3 pixels empty (black).
+ 5. Let the model fill in the missing pixels.
+
+ This produces an output image with:
+
+ - `2x` width
+ - `2x` height
+ - `4x` total pixel count
+
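The canvas construction described in the steps above can be sketched in a few lines of NumPy. This is a minimal illustration of the layout, not the project's actual preprocessing code; the function name is a placeholder, and it assumes row-major image coordinates (so "bottom-left" of a `2x2` block is the higher row index, lower column index):

```python
import numpy as np

def make_sparse_canvas(img: np.ndarray) -> np.ndarray:
    """Expand an (H, W, 3) image to (2H, 2W, 3), placing each source pixel
    in the bottom-left cell of its 2x2 block and leaving the rest black."""
    h, w, c = img.shape
    canvas = np.zeros((2 * h, 2 * w, c), dtype=img.dtype)
    # The block for source pixel (r, c) covers rows [2r, 2r+1] and
    # columns [2c, 2c+1]; its bottom-left cell is (2r + 1, 2c).
    canvas[1::2, 0::2] = img
    return canvas
```

Reading and writing the PNGs can be done with any image library (e.g. Pillow's `Image.open` / `Image.fromarray`).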
+ ## What Is Included
+
+ This repository contains:
+
+ - `best.pt`: trained model checkpoint
+ - `sparse_unet_native_bc96.yaml`: model/training config
+ - `example_input.png`: sample input image
+ - `example_output.png`: sample output image
+
+ ## Model Type
+
+ This is a **custom full-image sparse completion model**, not a standard Transformers or Diffusers model.
+
+ Architecture summary:
+
+ - custom PyTorch U-Net
+ - pixel-unshuffle-based sparse representation
+ - trained to reconstruct a dense RGB output from sparse structured input
+ - final inference runs on the **full image directly**, without tiling
+
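The pixel-unshuffle step can be illustrated with a small NumPy sketch that mimics PyTorch's `torch.nn.functional.pixel_unshuffle`. If the input is the sparse RGB canvas plus a known-pixel mask (4 channels), unshuffling by a factor of 2 yields `4 * 2 * 2 = 16` channels, which would match `in_channels: 16` in the config, while `out_channels: 12` corresponds to `3 * 2 * 2` RGB channels that can be shuffled back to full resolution. The mask channel is an assumption made for illustration; the project's exact input layout may differ:

```python
import numpy as np

def pixel_unshuffle(x: np.ndarray, r: int) -> np.ndarray:
    """NumPy equivalent of torch's pixel_unshuffle:
    rearrange (C, H, W) into (C*r*r, H//r, W//r)."""
    c, h, w = x.shape
    x = x.reshape(c, h // r, r, w // r, r)
    return x.transpose(0, 2, 4, 1, 3).reshape(c * r * r, h // r, w // r)

# Sparse canvas (3 RGB channels) plus an assumed known-pixel mask channel.
sparse_rgb = np.zeros((3, 8, 8), dtype=np.float32)
mask = np.zeros((1, 8, 8), dtype=np.float32)
mask[:, 1::2, 0::2] = 1.0  # bottom-left of each 2x2 block is the known pixel
x = pixel_unshuffle(np.concatenate([sparse_rgb, mask]), 2)  # shape (16, 4, 4)
```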
+ ## Training Summary
+
+ The model was trained on a PNG image dataset prepared from a larger original image collection.
+
+ Training pipeline summary:
+
+ - all source images were converted to PNG
+ - full-resolution images were used as the master dataset
+ - to make training more efficient, each full-resolution image was split into `16` parts
+ - sparse training pairs were created from those image tiles
+ - each `2x2` sparse block kept only the **bottom-left** pixel
+ - the other 3 pixels were set to black
+
+ The model was then trained to learn the mapping:
+
+ - `SparsePNG -> MasterPNG`
+
+ This means the model specifically learns to restore this exact sparse pattern.
+
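Under the assumption that "16 parts" means a 4x4 grid of equal tiles (the repository does not spell out the split geometry), the tiling step might look like:

```python
import numpy as np

def split_into_tiles(img: np.ndarray, grid: int = 4) -> list:
    """Split an (H, W, ...) image into grid*grid equal tiles (16 by default).
    Assumes the dimensions divide evenly; the actual pipeline may pad or
    handle remainders differently."""
    h, w = img.shape[:2]
    th, tw = h // grid, w // grid
    return [img[r * th:(r + 1) * th, c * tw:(c + 1) * tw]
            for r in range(grid) for c in range(grid)]
```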
+ ## Important Limitation
+
+ This model is **not** a general-purpose super-resolution model.
+
+ It works best when the input follows the same sparse structure used during training:
+
+ - each original pixel is placed in the bottom-left position of a `2x2` block
+ - the other three pixels in that block are black
+
+ Normal images must first be converted into this sparse canvas format before being fed to the model.
+
+ ## Intended Use
+
+ This model is intended for:
+
+ - experimental image upscaling workflows
+ - sparse pixel reconstruction research
+ - custom image-to-image pipelines where the sparse sampling rule is fixed
+
+ ## Example Usage
+
+ This repository stores only the model assets. The runtime is expected to be the original local project code.
+
+ Example command:
+
+ ```powershell
+ python -m imageai.upscale_cli ^
+   --input "D:\AI\ImageAI\Test.png" ^
+   --output "D:\AI\ImageAI\Test_upscaled.png" ^
+   --config "D:\AI\ImageAI\configs\sparse_unet_native_bc96.yaml" ^
+   --ckpt "D:\AI\ImageAI\checkpoints\sparse_unet_native_bc96\best.pt"
+ ```
+
+ Or, if the CLI entrypoint is installed:
+
+ ```powershell
+ imageai-upscale --input "input.png" --output "output.png"
+ ```
+
+ ## Notes
+
+ - trained with PyTorch
+ - designed around full-image inference
+ - developed as a custom research/project pipeline rather than a framework-native Hugging Face architecture
+
+ ## License
+
+ MIT
best.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:be06f195b7800a111f63b860e99ad1a1bbba4dde4c877ec7379c326cf142f413
+ size 208208813
example_input.png ADDED

Git LFS Details

  • SHA256: 75c8c2e2aad08d63dc9cc253d02113c43bb7779b9c16be3c41a10f90177a5547
  • Pointer size: 132 Bytes
  • Size of remote file: 5.31 MB
example_output.png ADDED

Git LFS Details

  • SHA256: babf74d31b7b84d8cb2fa23cfc451d205dcc76b96b5bd45ec5fc4b61d9a6abfe
  • Pointer size: 133 Bytes
  • Size of remote file: 16.8 MB
sparse_unet_native_bc96.yaml ADDED
@@ -0,0 +1,60 @@
+ seed: 42
+
+ paths:
+   root_dir: D:/AI/ImageAI
+   master_dir: D:/AI/ImageAI/MasterPNG
+   sparse_dir: D:/AI/ImageAI/SparsePNG
+   splits_dir: D:/AI/ImageAI/splits
+   runs_dir: D:/AI/ImageAI/runs
+   checkpoints_dir: D:/AI/ImageAI/checkpoints
+   outputs_dir: D:/AI/ImageAI/outputs
+
+ data:
+   train_split: train_native_even.txt
+   val_split: val_native_even.txt
+   test_split: test_native_even.txt
+   sample_mode: native_image
+   patch_size: null
+   eval_patch_size: null
+   full_frame_size: null
+   full_frame_pad_mode: edge
+   train_batch_size: 1
+   val_batch_size: 1
+   num_workers: 12
+   pin_memory: true
+   persistent_workers: true
+   prefetch_factor: 2
+   train_derive_sparse_from_gt: false
+   val_derive_sparse_from_gt: false
+   train_horizontal_flip: false
+   train_vertical_flip: false
+
+ model:
+   name: sparse_unet
+   in_channels: 16
+   out_channels: 12
+   base_channels: 96
+
+ loss:
+   missing_weight: 1.0
+   known_weight: 0.05
+
+ training:
+   device: cuda
+   amp: true
+   max_steps: 1000000
+   warmup_steps: 1000
+   validate_every: 1000
+   save_every: 1000
+   log_every: 50
+   learning_rate: 0.0002
+   min_learning_rate: 0.000001
+   weight_decay: 0.01
+   betas: [0.9, 0.99]
+   grad_clip_norm: 1.0
+   run_name: sparse_unet_native_bc96
+   resume: null
+
+ inference:
+   tile_size: 512
+   tile_overlap: 64
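The `loss:` block in this config suggests a reconstruction loss that weights the missing pixels (the three black cells of each `2x2` block) far more heavily than the known ones. A minimal sketch of such a weighting, assuming a simple per-pixel L1 formulation (the project's actual loss code may differ):

```python
import numpy as np

def weighted_l1(pred: np.ndarray, target: np.ndarray, known_mask: np.ndarray,
                missing_weight: float = 1.0, known_weight: float = 0.05) -> float:
    """Per-pixel L1 loss where known (given) pixels get a small weight and
    missing (to-be-filled) pixels get full weight, mirroring the config's
    missing_weight / known_weight values."""
    w = np.where(known_mask, known_weight, missing_weight)
    return float((w * np.abs(pred - target)).mean())
```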