Ill-Ness committed · Commit f56ef93 · verified · 1 Parent(s): eee6388

Update README.md

Files changed (1)
  1. README.md +71 -43
README.md CHANGED
@@ -1,58 +1,67 @@
1
  ---
2
- license: apache-2.0
3
- language:
4
- - en
5
- pipeline_tag: image-to-image
6
  library_name: pytorch
7
  tags:
8
  - image-to-image
 
9
  - diffusion
10
  - pixel-diffusion-decoder
11
- - super-resolution
 
12
  - denoising
13
  - pytorch
14
  - safetensors
15
- - pie
16
  ---
17
 
18
- # PiEa: Pixel Diffusion Decoder
19
 
20
- `PiEa` is a compact image-to-image Pixel Diffusion Decoder. It is trained to reconstruct clean pixels from noisy pixels while conditioning on a degraded version of the same image. The model is designed for decoder-style image restoration, image refinement, and pixel-space reconstruction experiments.
21
 
22
- PiEa is not a text-to-image generator. It is a decoder component: it expects image-like conditioning and produces image-like denoising predictions.
23
 
24
- ## Model details
25
 
26
- | Property | Value |
27
- | --- | --- |
28
- | Model name | `PiEa` |
29
- | Developer | PiEa-ai |
30
- | Model type | Pixel diffusion decoder |
31
- | Input type | `PiE` |
32
- | Architecture | Compact U-Net diffusion decoder |
33
- | Parameters | 1,010,776,675 |
34
- | Resolution | 512 x 512 |
35
- | Base channels | 464 |
36
- | Channel multipliers | [1, 2, 4, 6] |
37
- | Dataset used | `['huggan/wikiart', 'huggan/smithsonian_butterflies_subset']` |
38
- | Optimizer | AdamW |
39
- | Precision | bfloat16 autocast |
40
 
41
- ## Throughput
42
 
43
- | Metric | Value |
44
- | --- | --- |
45
- | Training steps | 9,141 |
46
- | Image tokens processed | 2,396,258,304 |
47
- | Image tokens/sec | 1331229.59 |
48
- | Target image tokens/sec | 250,000 |
49
- | Final loss | 0.023287 |
50
 
51
- Image tokens are counted as processed spatial pixels: `batch_size * height * width` per optimization step.
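As a quick consistency check on the throughput table in the removed lines above, the per-step token count can be recovered from the totals. The sketch below assumes every optimization step processed the same number of pixels; the variable names are illustrative, and the batch size it derives is an inference from the table, not a value stated in the README.

```python
# Values copied from the throughput table above.
training_steps = 9_141
image_tokens_total = 2_396_258_304
image_tokens_per_sec = 1_331_229.59
height = width = 512  # resolution listed under "Model details"

# Tokens per step = batch_size * height * width (the README's definition).
tokens_per_step, remainder = divmod(image_tokens_total, training_steps)

# Implied batch size, assuming a constant batch at 512 x 512.
implied_batch_size = tokens_per_step / (height * width)

# Implied wall-clock duration of the run, in seconds.
implied_seconds = image_tokens_total / image_tokens_per_sec

print(tokens_per_step, remainder, implied_batch_size, round(implied_seconds))
```

Under those assumptions the totals divide evenly into one 512 x 512 image per step, and the reported rate corresponds to roughly 30 minutes of training.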
52
 
53
  ## Usage
54
 
55
- This checkpoint is saved as raw PyTorch/safetensors artifacts. Load the model with the architecture definition used in the training script or port the weights into a compatible pixel decoder implementation.
56
 
57
  ```python
58
  from safetensors.torch import load_file
@@ -60,19 +69,38 @@ from safetensors.torch import load_file
60
  state_dict = load_file("model.safetensors")
61
  ```
62
 
63
  ## Limitations
64
 
65
- - PiEa is an pixel decoder, not a full image generation system.
66
- - It is trained on image reconstruction and denoising, not prompt following.
67
- - Quality depends on the surrounding pipeline and conditioning signal.
68
 
69
  ## Citation
70
 
71
  ```bibtex
72
- @misc{piea2026pixeldecoder,
73
- title={PiEa: Pixel Diffusion Decoder},
74
- author={Ill Ness, JasonBruck},
75
- year={2026},
76
- url={https://huggingface.co/PiEa-ai/PiEa}
77
  }
78
- ```
 
1
  ---
2
  library_name: pytorch
3
  tags:
4
  - image-to-image
5
+ - super-resolution
6
  - diffusion
7
  - pixel-diffusion-decoder
8
+ - vae-decoder
9
+ - restoration
10
  - denoising
11
  - pytorch
12
  - safetensors
13
+ pipeline_tag: image-to-image
14
  ---
15
 
16
+ # PiEa - Pixel Diffusion Decoder
17
+
18
+ <p align="center">
19
+ <img src="eval_comparison_grid.png" alt="PiEa visual preview" width="100%">
20
+ </p>
21
+
22
+ PiEa is an in-house Pixel Diffusion Decoder developed by **Dl26**. It is designed as an image-to-image decoder that works directly in pixel space. The model receives a degraded visual condition and a noisy image-space input, then predicts the denoising signal used to recover a cleaner high-resolution image.
23
+
24
+ PiEa is built for restoration-style decoding, image refinement, and high-resolution reconstruction research. It is not a text-to-image model and it is not a wrapper around another released decoder. The checkpoint is a standalone image-to-image component intended for custom visual pipelines.
25
+
26
+ Unlike latent-only decoders, PiEa treats pixels as the reconstruction target. This makes the model useful for studying learned decoding behavior where the decoder itself performs conditional restoration instead of simply applying a deterministic upsampling stack. The current release focuses on 512px image reconstruction and pixel-space refinement.
27
+
28
+ ## Model Overview
29
+
30
+ The released checkpoint is a large PiEa decoder with approximately **1.01B parameters**. The architecture is custom and identified in the configuration as `PiEaPixelDiffusionDecoder`. The model uses `input_type: PiE` to mark the intended input family and distinguish it from other pixel decoder formats.
31
+
32
+ PiEa was expanded from an earlier smaller in-house checkpoint. During expansion, overlapping learned tensor regions were copied into the larger model so previously trained structure could be retained, while newly introduced channels were initialized and trained as additional capacity. This gives the larger checkpoint continuity with the previous training stage without claiming the architecture stayed identical.
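The expansion step described above can be illustrated with a minimal NumPy sketch. It assumes the overlapping region is the leading slice along every axis and that new entries start from a small random initialization; the README specifies neither, and `expand_tensor` is a hypothetical helper, not part of the PiEa codebase.

```python
import numpy as np


def expand_tensor(small, large_shape, init_std=0.02, seed=0):
    """Copy `small` into the leading corner of a freshly initialized tensor.

    Hypothetical helper: the overlap rule and the initialization are
    assumptions made for illustration only.
    """
    if len(large_shape) != small.ndim:
        raise ValueError("rank mismatch between old and new tensor")
    rng = np.random.default_rng(seed)
    # New capacity starts from small random values.
    large = rng.normal(0.0, init_std, size=large_shape).astype(small.dtype)
    # The overlap along each axis is the smaller of the two extents.
    overlap = tuple(slice(0, min(s, l)) for s, l in zip(small.shape, large_shape))
    large[overlap] = small[overlap]
    return large


# Example: widen a conv weight from 8 -> 12 output and 4 -> 6 input channels.
old_weight = np.ones((8, 4, 3, 3), dtype=np.float32)
new_weight = expand_tensor(old_weight, (12, 6, 3, 3))
```

Here the first 8 x 4 block of `new_weight` carries the old values unchanged, while the remaining channels hold the fresh initialization that further training would then adapt.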
33
+
34
 
35
+ ## Architecture
36
 
37
+ PiEa uses a convolutional U-Net-style pixel diffusion decoder. The model receives two image-space tensors: a noisy image and a degraded condition image. These tensors are concatenated channel-wise and processed through a downsampling path, residual blocks, a deeper mid-section, and an upsampling path that returns a noise prediction in RGB pixel space.
38
 
39
+ The decoder is conditioned by a continuous noise or timestep embedding. This embedding modulates the residual blocks and lets the same network learn behavior across different noise levels. The output is trained as an epsilon prediction, allowing a reconstruction pipeline to combine the noisy input and predicted noise into a cleaner image estimate.
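For reference, one common way to turn an epsilon prediction into a clean-image estimate is the variance-preserving (DDPM-style) parameterization. The README does not state which noise schedule PiEa was trained with, so the symbols below ($\bar{\alpha}_t$ for the cumulative signal level, $c$ for the degraded condition, $\hat{\epsilon}_\theta$ for the network output) are assumptions for illustration.

$$
x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon
\quad\Longrightarrow\quad
\hat{x}_0 = \frac{x_t - \sqrt{1-\bar{\alpha}_t}\,\hat{\epsilon}_\theta(x_t, c, t)}{\sqrt{\bar{\alpha}_t}}
$$

If the prediction were exact ($\hat{\epsilon}_\theta = \epsilon$), the second expression would recover $x_0$ exactly; in practice the estimate is refined over several noise levels.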
40
 
41
+ This design keeps the model centered on direct visual reconstruction. PiEa is not a classifier, language model, text encoder, prompt-following diffusion transformer, or general image generator. It is a dedicated pixel decoder for image-to-image workflows.
42
 
43
+ ## Data
44
 
45
+ PiEa was trained on a mixed real-image pool that included WikiArt-style imagery and additional natural-image reconstruction data. The training data was used for reconstruction and restoration behavior, with synthetic degradation and noise applied during training.
46
 
47
+ The degradation process is intentionally image-space based. Clean images are transformed into lower-detail or noisy conditions, and the model learns to recover structure, color, and high-frequency detail. This makes the checkpoint suitable for studying restoration-like decoding rather than text-conditioned generation.
48
+
49
+ ## Intended Use
50
+
51
+ PiEa is intended for:
52
+
53
+ - image-to-image restoration research
54
+ - pixel diffusion decoder experiments
55
+ - super-resolution-style reconstruction systems
56
+ - denoising and refinement pipelines
57
+ - visual decoder prototyping
58
+ - studying pixel-space alternatives to deterministic decoders
59
+
60
+ PiEa can be useful where a project needs a learned decoder that performs more than direct interpolation. It is especially relevant for experiments where the decoder is expected to recover visual texture and structure from imperfect image-space inputs.
61
 
62
  ## Usage
63
 
64
+ This repository contains checkpoint weights and configuration for the PiEa decoder. A compatible implementation should construct `PiEaPixelDiffusionDecoder` using `config.json`, then load `model.safetensors`.
65
 
66
  ```python
67
  from safetensors.torch import load_file
 
69
  state_dict = load_file("model.safetensors")
70
  ```
71
 
72
+ At a high level, a PiEa inference pipeline should:
73
+
74
+ 1. Prepare or receive a degraded image condition.
75
+ 2. Prepare a noisy pixel-space input.
76
+ 3. Provide a timestep or noise-level value.
77
+ 4. Run PiEa to predict image noise.
78
+ 5. Convert the predicted noise into a cleaner reconstruction.
79
+
80
+ The exact scheduler and denoising procedure depend on the surrounding system.
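The five steps above can be sketched as a single denoising step in NumPy. This is a minimal sketch under stated assumptions: a variance-preserving noise model, images scaled to [-1, 1], an `alpha_bar` signal level chosen by the caller, and a stand-in `dummy_decoder` in place of the real `PiEaPixelDiffusionDecoder`, whose call signature is not documented here.

```python
import numpy as np


def dummy_decoder(noisy, condition, alpha_bar):
    """Stand-in for the real network (hypothetical signature).

    Concatenates the inputs channel-wise as the README describes, then
    returns zeros shaped like the noisy input instead of a learned
    noise prediction.
    """
    stacked = np.concatenate([noisy, condition], axis=1)  # (N, 6, H, W)
    assert stacked.shape[1] == noisy.shape[1] + condition.shape[1]
    return np.zeros_like(noisy)


def denoise_step(decoder, noisy, condition, alpha_bar):
    """Steps 4-5: predict noise, then estimate the clean image."""
    eps_hat = decoder(noisy, condition, alpha_bar)
    # Invert x_t = sqrt(a) * x0 + sqrt(1 - a) * eps (assumed noise model).
    x0_hat = (noisy - np.sqrt(1.0 - alpha_bar) * eps_hat) / np.sqrt(alpha_bar)
    # Assumes images live in [-1, 1].
    return np.clip(x0_hat, -1.0, 1.0)


rng = np.random.default_rng(0)
clean = rng.uniform(-1.0, 1.0, size=(1, 3, 64, 64)).astype(np.float32)
condition = clean * 0.5                       # step 1: placeholder degradation
alpha_bar = 0.9                               # step 3: assumed noise level
noise = rng.standard_normal(clean.shape).astype(np.float32)
noisy = np.sqrt(alpha_bar) * clean + np.sqrt(1.0 - alpha_bar) * noise  # step 2

estimate = denoise_step(dummy_decoder, noisy, condition, alpha_bar)
```

A real pipeline would replace `dummy_decoder` with the loaded model and typically repeat the step across a schedule of noise levels rather than applying it once.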
81
+
82
+ ## Scope
83
+
84
+ PiEa is a component model. It should be treated as one part of a larger image pipeline, not as a complete application. It does not include a user interface, prompt processor, safety filter, text encoder, or production scheduler.
85
+
86
+ The checkpoint is best suited for researchers and developers who are comfortable integrating raw PyTorch/safetensors weights into custom image systems. It may also be useful as a reference point for experiments around learned pixel decoders, restoration modules, and super-resolution-style reconstruction.
87
+
88
  ## Limitations
89
 
90
+ - PiEa is not a full text-to-image model.
91
+ - PiEa does not follow text prompts.
92
+ - The model requires a compatible custom loader implementation.
93
+ - Output quality depends on the degradation process, noise schedule, and surrounding pipeline.
94
+ - The checkpoint may produce artifacts on image distributions far from its training data.
95
+ - It should be evaluated on target data before deployment.
96
 
97
  ## Citation
98
 
99
  ```bibtex
100
+ @misc{dl26_2026_piea,
101
+ title = {PiEa: Pixel Diffusion Decoder},
102
+ author = {Dl26},
103
+ year = {2026},
104
+ url = {https://huggingface.co/Dl26/PiEa}
105
  }
106
+ ```