Felldude/ERNIE-Image

ErnieImagePipeline


Felldude committed 25 days ago

Commit ea1e135 (verified) · 1 parent: 589247e

Update README.md

Files changed (1)

README.md +2 -5


@@ -22,11 +22,10 @@ Fully trained in native FP32 precision.
 Optimization performed using standard AdamW.
 No Adam8bit, quantized optimizer states, or reduced-precision optimizer approximations were used during training.
 Intended to preserve numerical stability and high-fidelity gradient accumulation throughout all training phases.
 DIT Ernie Model
 Uses a Monte Carlo estimation approach to approximate FP32 behavior.
-The model does not operate as a strict full FP32 pipeline.
-Instead, stochastic estimation techniques are applied to emulate FP32 statistical characteristics while reducing computational overhead.
-This approach trades exact deterministic FP32 arithmetic for probabilistic approximation efficiency.
 Training Details
 Mistral LLM
 Precision
@@ -86,8 +85,6 @@ optimizer precision analysis
 numerical stability benchmarking
 transformer architecture experimentation
 Limitations
-Full FP32 training incurs substantial VRAM and compute costs.
-Monte Carlo FP32 approximation may not exactly reproduce deterministic FP32 outputs.
 Results can vary depending on:
 sampling strategy
 hardware backend
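
Note on the "Monte Carlo estimation approach" line kept by this commit: the README does not say which estimator is used, and the sentences that described it as a stochastic approximation were removed here. As a hedged illustration only, one common technique of this kind is stochastic rounding, where an FP32 value is rounded to a lower-precision grid at random so that the result equals the FP32 value in expectation. The sketch below assumes bfloat16 as the lower-precision format; the function names are hypothetical and are not taken from this repository.

```python
import random
import struct


def f32_bits(x: float) -> int:
    """Bit pattern of x after conversion to IEEE-754 binary32."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_f32(bits: int) -> float:
    """Inverse of f32_bits: interpret a 32-bit pattern as binary32."""
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]


def stochastic_round_bf16(x: float, rng: random.Random) -> float:
    """Round a finite FP32 value onto the bfloat16 grid (hypothetical helper).

    bfloat16 keeps the top 16 bits of the binary32 pattern. The discarded
    low 16 bits are treated as a fraction in [0, 1): the magnitude is
    rounded away from zero with that probability, otherwise truncated.
    The expected result therefore equals the FP32 input.
    Assumption: inputs are finite (NaN and infinity are not handled).
    """
    bits = f32_bits(x)
    low = bits & 0xFFFF          # discarded fraction, scaled by 2**16
    kept = bits & 0xFFFF0000     # truncation toward zero
    if rng.randrange(0x10000) < low:
        kept += 0x10000          # step to the next bfloat16 value
    return bits_f32(kept)


if __name__ == "__main__":
    rng = random.Random(0)
    x = 1.001
    samples = [stochastic_round_bf16(x, rng) for _ in range(20000)]
    # Each sample is one of the two neighbouring bfloat16 values;
    # their average approaches the FP32 value of x.
    print(sorted(set(samples)), sum(samples) / len(samples))
```

Deterministic round-to-nearest would map every such `x` to the same bfloat16 neighbour, so small updates can be lost entirely. The randomized version trades per-call determinism for an unbiased average, which matches the trade-off described in the removed README lines.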
