Ex0bit committed · Commit 2f26161 · verified · 1 Parent(s): ad2b339

Upload README.md with huggingface_hub

Files changed (1): README.md (+22 -1)
README.md CHANGED
@@ -1,10 +1,31 @@
+---
+language:
+- en
+license: apache-2.0
+library_name: transformers
+tags:
+- olmo
+- nvfp4
+- quantized
+- long-context
+- vllm
+- modelopt
+datasets:
+- allenai/c4
+base_model: allenai/Olmo-3-7B-Instruct
+pipeline_tag: text-generation
+model-index:
+- name: OLMo-3-7B-Instruct-NVFP4-1M
+  results: []
+---
+
 # OLMo-3-7B-Instruct-NVFP4-1M
 
 NVFP4 quantized version of [allenai/Olmo-3-7B-Instruct](https://huggingface.co/allenai/Olmo-3-7B-Instruct) with extended 1M token context support via linear RoPE scaling.
 
 ## Model Description
 
-This model is the NVFP4 (4-bit floating point) quantized version of OLMo-3-7B-Instruct, optimized for NVIDIA GPUs with Blackwell/Ada Lovelace architecture support. The quantization uses NVIDIA's ModelOpt library with two-level scaling: E4M3 FP8 per block plus FP32 global scale.
+This model is the NVFP4 (4-bit floating point) quantized version of OLMo-3-7B-Instruct, optimized for NVIDIA DGX Spark systems with Blackwell GB10 GPUs and Ada Lovelace architecture support. The quantization uses NVIDIA's ModelOpt library with two-level scaling: E4M3 FP8 per block plus FP32 global scale.
 
 ### Key Features
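The "1M token context via linear RoPE scaling" mechanism from the diff can be sketched in a few lines: positions fed into the rotary embedding are divided by a scaling factor, so a longer sequence maps onto the angle range the model saw in training. This is an illustrative NumPy sketch, not the model's actual code; the 64K base window and factor of 16 are assumptions for the example, not values read from the model config.

```python
import numpy as np

def rope_angles(position, dim, base=10000.0, scaling_factor=1.0):
    """Rotary-embedding angles for one position with linear RoPE scaling:
    the position index is divided by `scaling_factor`, stretching the
    usable context window by that factor."""
    # Standard RoPE inverse frequencies, one per pair of dimensions.
    inv_freq = 1.0 / base ** (np.arange(0, dim, 2) / dim)
    return (position / scaling_factor) * inv_freq

# Illustration: with an assumed 64K-token base window, a factor of 16
# maps position 1,048,576 (1M) onto the same angles as position 65,536.
orig = rope_angles(65536, dim=64)
scaled = rope_angles(1048576, dim=64, scaling_factor=16.0)
assert np.allclose(orig, scaled)
```

The trade-off of the linear variant is that all positions, short and long, are compressed by the same factor, which is why the quality at short contexts can degrade slightly relative to the unscaled base model.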
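The two-level scaling the README describes (an E4M3 FP8 per-block scale on top of an FP32 global scale, over the FP4 E2M1 value grid) can be simulated with fake quantization. This is an illustrative NumPy sketch, not ModelOpt's implementation: the 16-element block size and E2M1 grid match the NVFP4 format, but the rounding of the block scale itself to E4M3 is omitted for brevity.

```python
import numpy as np

# Representable magnitudes of the E2M1 (FP4) element format used by NVFP4.
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_nvfp4_block(x, global_scale=1.0):
    """Fake-quantize one 16-element block with NVFP4-style two-level scaling.
    In the real format the per-block scale is stored in E4M3 FP8; here it is
    kept in full precision for simplicity (assumption for illustration)."""
    # Per-block scale maps the block's max magnitude onto the top FP4 value (6.0).
    block_scale = np.max(np.abs(x)) / (6.0 * global_scale)
    if block_scale == 0.0:
        return np.zeros_like(x)
    scaled = x / (block_scale * global_scale)
    # Round each element's magnitude to the nearest FP4 grid point.
    idx = np.argmin(np.abs(np.abs(scaled)[:, None] - FP4_GRID[None, :]), axis=1)
    q = np.sign(scaled) * FP4_GRID[idx]
    # Dequantize: FP4 value * per-block scale * global scale.
    return q * block_scale * global_scale

x = np.random.randn(16).astype(np.float32)
xq = quantize_nvfp4_block(x)
```

Because the FP4 grid has only eight magnitudes, the per-block scale does most of the work: each 16-element block is renormalized so its largest value lands exactly on the grid, which keeps quantization error proportional to the block's local dynamic range.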