yajunvicky committed on
Commit
bec2c1e
·
verified ·
1 Parent(s): e040956

Initial model upload

Files changed (2)
  1. .DS_Store +0 -0
  2. README.md +4 -4
.DS_Store ADDED
Binary file (6.15 kB).
 
README.md CHANGED
@@ -27,7 +27,7 @@ We validate the execution of DeepSeek-R1 model with a Triton-based operator libr

  We use a variety of Triton-implemented operation kernels to run the DeepSeek-R1 model. These kernels come from two main sources:

- - Most Triton kernels are provided by FlagGems (https://github.com/FlagOpen/FlagGems). You can enable FlagGems kernels by setting the environment variable USE_FLAGGEMS.
+ - Most Triton kernels are provided by FlagGems (https://github.com/FlagOpen/FlagGems). You can enable FlagGems kernels by setting the environment variable USE_FLAGGEMS.

  - Also included are Triton kernels from vLLM, such as fused MoE.

@@ -38,10 +38,10 @@ We use a variety of Triton-implemented operation kernels to run the DeepSeek-R1
  | Basic Image | basic software environment that supports FlagOS model running | <IMAGE> |
  # Evaluation Results

- ## Benchmark Result
+ ## Benchmark Result

- | Metrics | DeepSeek-R1-H100-CUDA | DeepSeek-R1-FlagOS-metax |
- |:-------------------|--------------------------|-----------------------------|
+ | Metrics | DeepSeek-R1-H100-CUDA | DeepSeek-R1-FlagOS-metax |
+ |-------------------|--------------------------|-----------------------------|
  | cmmmu | 49.11 | 42.89 |
  | mmmu | 57.44 | 47.56 |
  | mmmu_pro_standard | 38.4 | 30.21 |