rockylynnstein committed
Commit a8611c9 · verified · Parent: 56930c8

Update README.md

Files changed (1)
  1. README.md +0 -7
README.md CHANGED
@@ -137,7 +137,6 @@ pip install torch>=2.1.0 transformers>=4.40.0 accelerate compressed-tensors
  | **Base Model** | [microsoft/NextCoder-7B](https://huggingface.co/microsoft/NextCoder-7B) |
  | **Quantization Method** | FP8 E4M3 weight-only |
  | **Framework** | llm-compressor + compressed_tensors |
- | **Calibration Samples** | 2048 (8x industry standard) |
  | **Storage Size** | ~7GB (3 sharded safetensors) |
  | **VRAM (vLLM)** | ~7GB |
  | **VRAM (Transformers)** | ~14GB (decompressed to BF16) |
@@ -177,12 +176,6 @@ This model is sharded into 3 safetensors files (all required for inference):
  - `model-00002-of-00003.safetensors`
  - `model-00003-of-00003.safetensors`

- ## 🔬 Quality Assurance
-
- - **High-quality calibration:** 2048 diverse code samples (8x industry standard of 256)
- - **Validation:** Tested on code generation benchmarks
- - **Format:** Standard compressed_tensors for broad compatibility
-
  ## 📚 Original Model

  This quantization is based on [microsoft/NextCoder-7B](https://huggingface.co/microsoft/NextCoder-7B) by Microsoft.
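The README rows above describe FP8 E4M3 weight-only quantization (4 exponent bits, 3 mantissa bits, max finite value 448 in the e4m3fn variant). As a minimal pure-Python sketch of what rounding a weight to the nearest E4M3 value means — an illustration only, not the llm-compressor implementation, and ignoring the per-tensor scale factors a real quantizer applies first — one could write:

```python
import math

E4M3_MAX = 448.0            # largest finite value in FP8 E4M3 (e4m3fn variant)
E4M3_MIN_NORMAL = 2.0 ** -6  # smallest normal magnitude

def quantize_e4m3(x: float) -> float:
    """Round x to the nearest FP8 E4M3 representable value, saturating at +/-448."""
    if x == 0.0:
        return 0.0
    sign = math.copysign(1.0, x)
    mag = min(abs(x), E4M3_MAX)      # saturate at the format's max finite value
    if mag < E4M3_MIN_NORMAL:
        step = 2.0 ** -9             # subnormal range: fixed step of 2^-9
    else:
        e = math.floor(math.log2(mag))
        step = 2.0 ** (e - 3)        # 3 mantissa bits -> 8 steps per binade
    return sign * round(mag / step) * step
```

For example, 3.1 rounds to 3.0 and 500.0 saturates to 448.0; the coarse 3-bit mantissa is why calibration data matters when choosing scales.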
 