YAML Metadata Warning:empty or missing yaml metadata in repo card
Check out the documentation for more information.
Model Card: GenomeOcean-4B (FP8)
Generated: 2026-05-09T18:08:14-0700
Architecture
| Parameter | Value |
|---|---|
| Architecture | MistralForCausalLM |
| Model Type | mistral |
| Vocab Size | 4096 |
| Hidden Size | 3072 |
| Num Hidden Layers | 24 |
| Num Attention Heads | 12 |
| Intermediate Size | 16384 |
| Max Position Embeddings | 32768 |
| RoPE Theta | 1000000.0 |
Quantization Method
- Format: FP8 (E4M3) per-channel weight-only quantization
- Scale DType: float32 per-channel scales
- Method: Post-training quantization (PTQ) with per-channel E4M3 weights
Perplexity Results
| Metric | Value |
|---|---|
| Original PPL (BF16) | 41542.0508 |
| Quantized PPL (FP8) | 41411.5552 |
| PPL Difference | -130.4957 |
| PPL Difference (%) | -0.31% |
Quality Assessment: Excellent - negligible quality loss
Weight Fidelity
| Metric | Value |
|---|---|
| Mean Cosine Similarity | 1.002967 |
| Min Cosine Similarity | 0.999493 |
| Mean Relative L2 Error | 0.026588 |
| Max Relative L2 Error | 0.026714 |
| Layers Compared | 169 |
Compression
| Metric | Value |
|---|---|
| Original Size | 8.5066 GB |
| Quantized Size | 4.2449 GB |
| Compression Ratio | 49.9% |
| Space Saved | 4.26 GB |
Summary
The GenomeOcean-4B model was quantized from BF16 to FP8. Perplexity changed by -0.31% (original: 41542.0508, quantized: 41411.5552). Mean weight cosine similarity is 1.0030. Compression ratio is 49.9% (saved 4.26 GB).
- Downloads last month
- 14
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support