OPEA/DeepSeek-R1-int2-mixed-sym-inc


cicdatopea committed on Mar 11, 2025

Commit fced50c · verified · 1 Parent(s): d4dc034

Update README.md

Files changed (1)

README.md +24 -3

README.md CHANGED

@@ -457,12 +457,33 @@ Wait, but let me check if there's another angle. Maybe the question is testing s
 ~~~
-### Evaluate the model
-we have no enough resource to evaluate the model
 ### Generate the model
 5 x 80 GB GPUs and 1.4-1.6 TB of memory are required
 ~~~python

 ~~~
 ### Generate the model
+**1. Add metadata to the bf16 model** https://huggingface.co/opensourcerelease/DeepSeek-R1-bf16
+~~~python
+from safetensors import safe_open
+from safetensors.torch import save_file
+
+# Rewrite each of the 163 shards in place so its header carries {"format": "pt"}.
+for i in range(1, 164):
+    safetensors_path = f"model-{i:05d}-of-000163.safetensors"
+    print(safetensors_path)
+    # Load every tensor of the shard into memory before overwriting the file.
+    with safe_open(safetensors_path, framework="pt") as f:
+        tensors = {key: f.get_tensor(key) for key in f.keys()}
+    save_file(tensors, safetensors_path, metadata={"format": "pt"})
+~~~
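Step 1 rewrites every shard, so it can help to check first which shards actually lack the `format` entry. The sketch below is not part of the commit. It assumes the standard safetensors layout (an 8-byte little-endian header length followed by a JSON header with an optional `__metadata__` key) and uses only the standard library. The helper names `read_safetensors_metadata` and `needs_format_metadata` are hypothetical.

```python
import json
import struct


def read_safetensors_metadata(path):
    """Return the `__metadata__` dict from a safetensors header (empty if absent)."""
    with open(path, "rb") as f:
        # First 8 bytes: little-endian unsigned 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    return header.get("__metadata__", {})


def needs_format_metadata(path):
    """True if the shard lacks the {'format': 'pt'} entry that step 1 adds."""
    return read_safetensors_metadata(path).get("format") != "pt"
```

Calling `needs_format_metadata(safetensors_path)` inside the loop above would let the script skip shards that already carry the entry, since only the header is read rather than the tensor data.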
+**2. Remove `torch.no_grad`** from modeling_deepseek.py, because AutoRound's tuning needs gradients.
+https://github.com/intel/auto-round/blob/deepseekv3/modeling_deepseek.py
 5 x 80 GB GPUs and 1.4-1.6 TB of memory are required
 ~~~python