data-archetype committed on
Commit d036923 · verified · 1 Parent(s): 8fcd98c

Fix README.md benchmark methodology

Files changed (1)
  1. README.md +20 -10
README.md CHANGED
@@ -50,18 +50,28 @@ reconstructions, RGB error deltas, and latent PCA side by side, with FLUX.2 VAE
 included for comparison:
 [semdisdiffae_p32_v2 reconstruction viewer](https://huggingface.co/spaces/data-archetype/semdisdiffae_p32_v2-results).
 
-## Throughput
+## Encode Throughput
 
-Measured on an `NVIDIA GeForce RTX 5090` in `bfloat16`, with `5` warmup batches
-and `20` timed batches. Decode uses the default 1-step sampler with PDG
-disabled.
+Measured on an `NVIDIA GeForce RTX 5090` in `bfloat16`, averaging `20`
+repeated batched `encode()` calls after `5` warmup batches.
 
-| Operation | Resolution | Batch Size | Mean (ms/batch) | Images/s | Peak Allocated VRAM |
-|---|---:|---:|---:|---:|---:|
-| Encode | `256x256` | `128` | `12.57` | `10186.8` | `574 MiB` |
-| Decode | `256x256` | `128` | `98.93` | `1293.9` | `1042 MiB` |
-| Encode | `512x512` | `32` | `12.08` | `2649.9` | `579 MiB` |
-| Decode | `512x512` | `32` | `100.36` | `318.8` | `1042 MiB` |
+| Resolution | Batch Size | Mean (ms/batch) | ms/image | Images/s | Peak Allocated VRAM |
+|---:|---:|---:|---:|---:|---:|
+| `256x256` | `128` | `12.54` | `0.098` | `10206.3` | `567.8 MiB` |
+| `512x512` | `32` | `12.09` | `0.378` | `2647.2` | `563.8 MiB` |
+
+## Decode Latency
+
+Measured on the same `NVIDIA GeForce RTX 5090` in `bfloat16`. This is
+decode-only latency: images are encoded once, latents are cached, and timing is
+sequential batch-1 `decode()` over the cached latent set with the default 1-step
+sampler and PDG disabled.
+
+| Resolution | Batch Size | Images | Mean (ms/image) | Images/s | Peak Allocated VRAM |
+|---:|---:|---:|---:|---:|---:|
+| `512x512` | `1` | `20` | `5.11` | `195.6` | `340.8 MiB` |
+| `1024x1024` | `1` | `20` | `10.14` | `98.6` | `409.6 MiB` |
+| `2048x2048` | `1` | `20` | `53.86` | `18.6` | `720.9 MiB` |
 
 ## Latent Interface
 
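
The methodology described in the updated README (a few warmup batches, then a mean over repeated timed calls) can be sketched as a small harness. This is a minimal sketch under stated assumptions, not the script that produced the numbers in this commit: `benchmark`, `summarize`, `Timing`, and the `sync` hook are hypothetical names introduced for illustration. On a GPU the `sync` hook would need to block until queued work has finished (for example by passing `torch.cuda.synchronize`), otherwise only the kernel launch is timed.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Timing:
    mean_ms_per_batch: float
    ms_per_image: float
    images_per_s: float


def summarize(mean_ms_per_batch: float, batch_size: int) -> Timing:
    """Derive the per-image columns from a mean batch time in milliseconds."""
    return Timing(
        mean_ms_per_batch=mean_ms_per_batch,
        ms_per_image=mean_ms_per_batch / batch_size,
        images_per_s=1000.0 * batch_size / mean_ms_per_batch,
    )


def benchmark(
    fn: Callable[[], object],
    batch_size: int,
    warmup: int = 5,
    timed: int = 20,
    sync: Optional[Callable[[], None]] = None,
) -> Timing:
    """Run `fn` `warmup` times untimed, then average `timed` timed calls.

    `fn` is assumed to process one batch of `batch_size` images per call.
    `sync` is a hypothetical hook that waits for asynchronous work to finish.
    """
    wait = sync if sync is not None else (lambda: None)

    for _ in range(warmup):
        fn()
    wait()  # make sure warmup work is done before timing starts

    total_s = 0.0
    for _ in range(timed):
        start = time.perf_counter()
        fn()
        wait()  # include the full execution time, not just the launch
        total_s += time.perf_counter() - start

    return summarize(1000.0 * total_s / timed, batch_size)
```

For example, `summarize(12.54, 128)` gives roughly `0.098` ms/image and about `10207` images/s. The README reports `10206.3`, a small difference that would be expected if the published columns were derived from unrounded timings. For the batch-1 decode table, `batch_size=1` makes ms/image equal to the mean call time.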