Text-to-Image
Diffusers
Safetensors
English
ErnieImagePipeline
mistral
fp32
adamw
transformer
monte-carlo
dit
ernie
Instructions to use Felldude/ERNIE-Image with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Diffusers
How to use Felldude/ERNIE-Image with Diffusers:
pip install -U diffusers transformers accelerate
import torch
from diffusers import DiffusionPipeline

# switch to "mps" for apple devices
pipe = DiffusionPipeline.from_pretrained("Felldude/ERNIE-Image", dtype=torch.bfloat16, device_map="cuda")

prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
image = pipe(prompt).images[0]
- Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- Draw Things
- DiffusionBee
Update README.md
README.md
CHANGED
@@ -22,11 +22,10 @@ Fully trained in native FP32 precision.
 Optimization performed using standard AdamW.
 No Adam8bit, quantized optimizer states, or reduced-precision optimizer approximations were used during training.
 Intended to preserve numerical stability and high-fidelity gradient accumulation throughout all training phases.
+
 DIT Ernie Model
 Uses a Monte Carlo estimation approach to approximate FP32 behavior.
-
-Instead, stochastic estimation techniques are applied to emulate FP32 statistical characteristics while reducing computational overhead.
-This approach trades exact deterministic FP32 arithmetic for probabilistic approximation efficiency.
+
 Training Details
 Mistral LLM
 Precision
@@ -86,8 +85,6 @@ optimizer precision analysis
 numerical stability benchmarking
 transformer architecture experimentation
 Limitations
-Full FP32 training incurs substantial VRAM and compute costs.
-Monte Carlo FP32 approximation may not exactly reproduce deterministic FP32 outputs.
 Results can vary depending on:
 sampling strategy
 hardware backend
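The README says the DiT uses "a Monte Carlo estimation approach to approximate FP32 behavior" but does not say which technique is meant. As one possible reading only, the sketch below uses stochastic rounding to bfloat16, a common way to make low-precision accumulation match FP32 in expectation. This choice is an assumption, not the author's confirmed method, and every function name here is hypothetical. bfloat16 is emulated in pure Python by keeping the top 16 bits of an IEEE float32.

```python
import random
import struct


def f32_to_bits(x: float) -> int:
    """Return the IEEE-754 float32 bit pattern of x as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_f32(bits: int) -> float:
    """Interpret the low 32 bits of an int as a float32 value."""
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]


def round_nearest_bf16(x: float) -> float:
    """Deterministic round-to-nearest-even to an emulated bfloat16."""
    bits = f32_to_bits(x)
    bits += 0x7FFF + ((bits >> 16) & 1)
    return bits_to_f32(bits & 0xFFFF0000)


def round_stochastic_bf16(x: float, rng: random.Random) -> float:
    """Stochastic rounding (assumed technique): round away from zero with
    probability equal to the discarded fraction, so the result is unbiased."""
    bits = f32_to_bits(x) + rng.getrandbits(16)
    return bits_to_f32(bits & 0xFFFF0000)


def accumulate(round_fn, steps: int = 500, start: float = 1.0,
               update: float = 1e-3) -> float:
    """Apply `steps` small updates, rounding the weight after each one."""
    w = start
    for _ in range(steps):
        w = round_fn(w + update)
    return w


rng = random.Random(0)
fp32_reference = 1.0 + 500 * 1e-3  # what full-precision accumulation gives

nearest = accumulate(round_nearest_bf16)

# Monte Carlo estimate: average many independent stochastic runs.
trials = [accumulate(lambda v: round_stochastic_bf16(v, rng))
          for _ in range(200)]
stochastic_mean = sum(trials) / len(trials)

print("FP32 reference:          ", fp32_reference)
print("bf16 round-to-nearest:   ", nearest)
print("bf16 stochastic (mean):  ", stochastic_mean)
```

In this toy setup the round-to-nearest run stays at 1.0, because each 1e-3 update is smaller than half the bfloat16 spacing near 1.0 (2^-7, about 0.0078). The stochastic runs should average close to the FP32 reference of 1.5. Any single stochastic run still differs from the deterministic FP32 result, which fits the Limitations line removed by this commit.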