 
1. Comprehensive Integration:
   - Integrated with FlagScale (https://github.com/FlagOpen/FlagScale).
   - Open-source inference execution code, preconfigured with all necessary software and hardware settings.
   - Verified model files, available on Hugging Face ([Model Link](https://huggingface.co/FlagRelease/DeepSeek-R1-FlagOS-Nvidia-BF16)).
   - Pre-built Docker image for rapid deployment on NVIDIA H100.
2. High-Precision BF16 Checkpoints:
   - BF16 checkpoints dequantized from the official DeepSeek-R1 FP8 model to ensure enhanced inference accuracy and performance.
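The dequantization step can be sketched as follows — a minimal illustration, assuming DeepSeek-style block-wise FP8 quantization in which each weight tile carries one inverse scale (plain Python floats stand in for the FP8 tensor; the real conversion casts the result to bfloat16):

```python
def dequantize_blockwise(w_q, scale_inv, block=128):
    """Return w_q with each block x block tile multiplied by its inverse scale.

    w_q: 2-D list of quantized values (float stand-in for the FP8 tensor).
    scale_inv: 2-D list holding one inverse scale per tile.
    """
    rows, cols = len(w_q), len(w_q[0])
    return [
        [w_q[i][j] * scale_inv[i // block][j // block] for j in range(cols)]
        for i in range(rows)
    ]

# Toy example: a 4x4 "weight" split into 2x2 tiles, one scale per tile.
w_q = [[1.0] * 4 for _ in range(4)]
scale_inv = [[0.5, 2.0],
             [1.0, 4.0]]
w = dequantize_blockwise(w_q, scale_inv, block=2)
print(w[0][0], w[0][3], w[3][3])  # 0.5 2.0 4.0
```

The tile size, argument names, and helper itself are illustrative; only the block-wise scale-multiply idea is taken from the description above.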
 

# Bundle Download

|             | Usage                                                  | Nvidia |
| ----------- | ------------------------------------------------------ | ------ |
| Basic Image | Basic software environment that supports model running | `docker pull flagrelease-registry.cn-beijing.cr.aliyuncs.com/flagrelease/flagrelease:deepseek-flagos-nvidia` |
| Model       | Model weights and configuration files                  | https://www.modelscope.cn/models/FlagRelease/DeepSeek-R1-FlagOS-Nvidia-BF16 |

# Evaluation Results

## Benchmark Result

| Metrics               | DeepSeek-R1-H100-CUDA | DeepSeek-R1-H100-FlagOS |
| --------------------- | --------------------- | ----------------------- |
| GSM8K (EM)            | 95.75                 | 95.83                   |
| MMLU (Acc.)           | 85.34                 | 85.56                   |
| CEVAL                 | 89.00                 | 89.60                   |
| AIME 2024 (Pass@1)    | 76.67                 | 70.00                   |
| GPQA-Diamond (Pass@1) | 70.20                 | 71.21                   |
| MATH-500 (Pass@1)     | 93.20                 | 94.80                   |
| MMLU-Pro (Acc.)       | TBD                   | TBD                     |

# How to Run Locally

## 📌 Getting Started

### Download open-source weights

```bash
pip install modelscope
modelscope download --model <Model Name> --local_dir <Cache Path>
```
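After the download finishes, a quick sanity check helps before mounting the directory into the container. A minimal sketch, assuming the checkpoint follows the usual Hugging Face layout (a `config.json` plus one or more `*.safetensors` shards) — the helper name is hypothetical, not part of any tool above:

```python
import os

def check_checkpoint_dir(path):
    """Return True if the directory looks like a complete checkpoint.

    Assumes the usual Hugging Face layout: a config.json plus one or
    more *.safetensors weight shards.
    """
    files = os.listdir(path)
    has_config = "config.json" in files
    shards = [f for f in files if f.endswith(".safetensors")]
    return has_config and len(shards) > 0

# e.g. check_checkpoint_dir("<Cache Path>") before starting the container
```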
 

### Download the FlagOS image

```bash
docker pull <IMAGE>
```

### Start the inference service

```bash
# Create and enter the container
docker run -itd --name flagrelease_nv --privileged --gpus all --net=host --ipc=host --device=/dev/infiniband --shm-size 512g --ulimit memlock=-1 -v <CKPT_PATH>:<CKPT_PATH> flagrelease-registry.cn-beijing.cr.aliyuncs.com/flagrelease/flagrelease:deepseek-flagos-nvidia /bin/bash

docker exec -it flagrelease_nv /bin/bash

conda activate flagscale-inference
```

### Download and install FlagGems

```bash
git clone https://github.com/FlagOpen/FlagGems.git
cd FlagGems
pip install .
cd ../
```

### Modify the configuration

```bash
cd FlagScale/examples/deepseek_r1/conf
```

Edit `config_deepseek_r1.yaml` (set the hostfile and, optionally, `ssh_port`; if access between the containers is passwordless, the `docker` field can be removed):

```yaml
defaults:
  - _self_
  - serve: deepseek_r1
experiment:
  exp_name: deepseek_r1
  exp_dir: outputs/${experiment.exp_name}
  task:
    type: serve
  deploy:
    use_fs_serve: false
  runner:
    hostfile: examples/deepseek_r1/conf/hostfile.txt # set hostfile
    docker: flagrelease_nv # set docker
    ssh_port: 22
  envs:
    CUDA_DEVICE_MAX_CONNECTIONS: 1
  cmds:
    # GLOO_SOCKET_IFNAME must name the network interface (e.g., eth0, enp0s3)
    # on the subnet used for inter-machine communication; check interface
    # names and IP addresses with `ifconfig`.
    before_start: source /root/miniconda3/bin/activate flagscale-inference && export GLOO_SOCKET_IFNAME=bond0 && export USE_FLAGGEMS=1
action: run
hydra:
  run:
    dir: ${experiment.exp_dir}/hydra
```

Edit `hostfile.txt` (one `ip slots type=xxx` entry per node; `type` is optional):

```
# master node
x.x.x.x slots=8 type=gpu
# worker nodes
x.x.x.x slots=8 type=gpu
```

Edit `serve/deepseek_r1.yaml` to set the model parameters and server port:

```yaml
- serve_id: vllm_model
  engine: vllm
  engine_args:
    model: /models/deepseek_r1 # path to the DeepSeek-R1 weights
    tensor_parallel_size: 8
    pipeline_parallel_size: 4
    gpu_memory_utilization: 0.9
    max_model_len: 32768
    max_num_seqs: 256
    enforce_eager: true
    trust_remote_code: true
    enable_chunked_prefill: true
```

Then install FlagScale:

```bash
cd FlagScale/
pip install .
```

Note: for security reasons, the image ships without passwordless SSH. In multi-machine scenarios, configure passwordless access between the containers yourself (e.g., by adding each container's key to the other hosts).
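The hostfile and the parallelism settings in `deepseek_r1.yaml` have to agree: `tensor_parallel_size` (8) × `pipeline_parallel_size` (4) = 32 GPUs, i.e. four nodes with `slots=8`. That consistency check can be sketched in Python — the parser below is illustrative, not part of FlagScale, and assumes the `ip slots type` hostfile format described above:

```python
def parse_hostfile(text):
    """Parse 'ip slots=N type=xxx' lines; comments and blanks are skipped."""
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split()
        entry = {"ip": parts[0]}
        for kv in parts[1:]:
            key, _, value = kv.partition("=")
            entry[key] = value
        hosts.append(entry)
    return hosts

hostfile = """\
# master node
10.0.0.1 slots=8 type=gpu
# worker nodes
10.0.0.2 slots=8 type=gpu
10.0.0.3 slots=8 type=gpu
10.0.0.4 slots=8 type=gpu
"""
hosts = parse_hostfile(hostfile)
total_gpus = sum(int(h["slots"]) for h in hosts)
assert total_gpus == 8 * 4  # tensor_parallel_size * pipeline_parallel_size
print(len(hosts), total_gpus)  # 4 32
```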

### Serve

```bash
flagscale serve <Model>
```

To customize service parameters, users can run `flagscale serve <MODEL_NAME> <MODEL_CONFIG_YAML>`.
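Once serving starts, the deployment can be exercised over HTTP. A hedged sketch using only the standard library: it assumes the vLLM engine exposes its usual OpenAI-compatible `/v1/chat/completions` endpoint, and the port (9010 here) is a placeholder — substitute whatever server port is set in `serve/deepseek_r1.yaml`:

```python
import json
import urllib.request

# Assumption: the server port configured in serve/deepseek_r1.yaml.
BASE_URL = "http://localhost:9010"

def build_chat_request(prompt, model="deepseek_r1"):
    """Build an OpenAI-compatible chat-completion request for the service."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

# With the service running:
# resp = urllib.request.urlopen(build_chat_request("What is 2 + 2?"))
# print(json.load(resp)["choices"][0]["message"]["content"])
```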

# Contributing

We warmly welcome global developers to join us:

1. Submit Issues to report problems
2. Create Pull Requests to contribute code
3. Improve technical documentation
 

Scan the QR code below to join our WeChat group, then send "FlagRelease".

![WeChat](image/group.png)

# License