OpenTransformer committed on
Commit 5f88415 · verified · 1 Parent(s): efd139a

Add files using upload-large-folder tool

This view is limited to 50 files because it contains too many changes.
Files changed (50)
  1. .gitattributes +252 -0
  2. deepseek-r1-1.5b-gunary/model_layers_0_input_layernorm_weight.fp16 +0 -0
  3. deepseek-r1-1.5b-gunary/model_layers_0_post_attention_layernorm_weight.fp16 +0 -0
  4. deepseek-r1-1.5b-gunary/model_layers_0_self_attn_q_proj_bias.fp16 +0 -0
  5. deepseek-r1-1.5b-gunary/model_layers_0_self_attn_v_proj_weight.gscales +0 -0
  6. deepseek-r1-1.5b-gunary/model_layers_0_self_attn_v_proj_weight.sign +0 -0
  7. deepseek-r1-1.5b-gunary/model_layers_10_self_attn_k_proj_bias.fp16 +0 -0
  8. deepseek-r1-1.5b-gunary/model_layers_10_self_attn_v_proj_bias.fp16 +0 -0
  9. deepseek-r1-1.5b-gunary/model_layers_10_self_attn_v_proj_weight.gscales +0 -0
  10. deepseek-r1-1.5b-gunary/model_layers_11_input_layernorm_weight.fp16 +0 -0
  11. deepseek-r1-1.5b-gunary/model_layers_11_post_attention_layernorm_weight.fp16 +0 -0
  12. deepseek-r1-1.5b-gunary/model_layers_11_self_attn_k_proj_weight.gscales +0 -0
  13. deepseek-r1-1.5b-gunary/model_layers_11_self_attn_v_proj_weight.gscales +0 -0
  14. deepseek-r1-1.5b-gunary/model_layers_12_post_attention_layernorm_weight.fp16 +0 -0
  15. deepseek-r1-1.5b-gunary/model_layers_12_self_attn_q_proj_bias.fp16 +0 -0
  16. deepseek-r1-1.5b-gunary/model_layers_12_self_attn_v_proj_bias.fp16 +0 -0
  17. deepseek-r1-1.5b-gunary/model_layers_12_self_attn_v_proj_weight.sign +0 -0
  18. deepseek-r1-1.5b-gunary/model_layers_13_post_attention_layernorm_weight.fp16 +0 -0
  19. deepseek-r1-1.5b-gunary/model_layers_13_self_attn_k_proj_weight.gscales +0 -0
  20. deepseek-r1-1.5b-gunary/model_layers_13_self_attn_k_proj_weight.sign +0 -0
  21. deepseek-r1-1.5b-gunary/model_layers_13_self_attn_q_proj_bias.fp16 +0 -0
  22. deepseek-r1-1.5b-gunary/model_layers_13_self_attn_v_proj_bias.fp16 +0 -0
  23. deepseek-r1-1.5b-gunary/model_layers_14_input_layernorm_weight.fp16 +0 -0
  24. deepseek-r1-1.5b-gunary/model_layers_14_post_attention_layernorm_weight.fp16 +0 -0
  25. deepseek-r1-1.5b-gunary/model_layers_14_self_attn_k_proj_weight.sign +0 -0
  26. deepseek-r1-1.5b-gunary/model_layers_14_self_attn_v_proj_weight.gscales +0 -0
  27. deepseek-r1-1.5b-gunary/model_layers_14_self_attn_v_proj_weight.sign +0 -0
  28. deepseek-r1-1.5b-gunary/model_layers_15_post_attention_layernorm_weight.fp16 +0 -0
  29. deepseek-r1-1.5b-gunary/model_layers_15_self_attn_q_proj_bias.fp16 +0 -0
  30. deepseek-r1-1.5b-gunary/model_layers_15_self_attn_v_proj_bias.fp16 +0 -0
  31. deepseek-r1-1.5b-gunary/model_layers_15_self_attn_v_proj_weight.gscales +0 -0
  32. deepseek-r1-1.5b-gunary/model_layers_15_self_attn_v_proj_weight.sign +0 -0
  33. deepseek-r1-1.5b-gunary/model_layers_17_self_attn_k_proj_bias.fp16 +0 -0
  34. deepseek-r1-1.5b-gunary/model_layers_17_self_attn_k_proj_weight.gscales +0 -0
  35. deepseek-r1-1.5b-gunary/model_layers_17_self_attn_k_proj_weight.sign +0 -0
  36. deepseek-r1-1.5b-gunary/model_layers_17_self_attn_v_proj_bias.fp16 +0 -0
  37. deepseek-r1-1.5b-gunary/model_layers_18_post_attention_layernorm_weight.fp16 +0 -0
  38. deepseek-r1-1.5b-gunary/model_layers_18_self_attn_k_proj_bias.fp16 +0 -0
  39. deepseek-r1-1.5b-gunary/model_layers_18_self_attn_k_proj_weight.gscales +0 -0
  40. deepseek-r1-1.5b-gunary/model_layers_18_self_attn_v_proj_weight.gscales +0 -0
  41. deepseek-r1-1.5b-gunary/model_layers_19_post_attention_layernorm_weight.fp16 +0 -0
  42. deepseek-r1-1.5b-gunary/model_layers_19_self_attn_k_proj_bias.fp16 +0 -0
  43. deepseek-r1-1.5b-gunary/model_layers_19_self_attn_k_proj_weight.gscales +0 -0
  44. deepseek-r1-1.5b-gunary/model_layers_19_self_attn_q_proj_bias.fp16 +0 -0
  45. deepseek-r1-1.5b-gunary/model_layers_19_self_attn_v_proj_bias.fp16 +0 -0
  46. deepseek-r1-1.5b-gunary/model_layers_19_self_attn_v_proj_weight.gscales +0 -0
  47. deepseek-r1-1.5b-gunary/model_layers_19_self_attn_v_proj_weight.sign +0 -0
  48. deepseek-r1-1.5b-gunary/model_layers_1_input_layernorm_weight.fp16 +0 -0
  49. deepseek-r1-1.5b-gunary/model_layers_1_post_attention_layernorm_weight.fp16 +0 -0
  50. deepseek-r1-1.5b-gunary/model_layers_1_self_attn_k_proj_bias.fp16 +0 -0
.gitattributes CHANGED
@@ -33,3 +33,255 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
+ deepseek-r1-1.5b-ternary/model_layers_9_self_attn_q_proj_weight.pos filter=lfs diff=lfs merge=lfs -text
37
+ deepseek-r1-1.5b-ternary/model_layers_22_mlp_gate_proj_weight.neg filter=lfs diff=lfs merge=lfs -text
38
+ deepseek-r1-1.5b-ternary/model_layers_0_mlp_gate_proj_weight.pos filter=lfs diff=lfs merge=lfs -text
39
+ deepseek-r1-1.5b-ternary/model_layers_4_mlp_gate_proj_weight.neg filter=lfs diff=lfs merge=lfs -text
40
+ deepseek-r1-1.5b-ternary/model_layers_12_mlp_up_proj_weight.pos filter=lfs diff=lfs merge=lfs -text
41
+ deepseek-r1-1.5b-ternary/model_layers_20_self_attn_o_proj_weight.neg filter=lfs diff=lfs merge=lfs -text
42
+ deepseek-r1-1.5b-ternary/model_layers_15_self_attn_o_proj_weight.pos filter=lfs diff=lfs merge=lfs -text
43
+ deepseek-r1-1.5b-ternary/model_layers_1_mlp_gate_proj_weight.pos filter=lfs diff=lfs merge=lfs -text
44
+ deepseek-r1-1.5b-ternary/model_layers_9_mlp_down_proj_weight.pos filter=lfs diff=lfs merge=lfs -text
45
+ deepseek-r1-1.5b-ternary/model_layers_6_mlp_gate_proj_weight.neg filter=lfs diff=lfs merge=lfs -text
46
+ deepseek-r1-1.5b-ternary/model_layers_4_mlp_up_proj_weight.neg filter=lfs diff=lfs merge=lfs -text
47
+ deepseek-r1-1.5b-ternary/model_layers_22_mlp_down_proj_weight.pos filter=lfs diff=lfs merge=lfs -text
48
+ deepseek-r1-1.5b-ternary/model_layers_6_mlp_gate_proj_weight.pos filter=lfs diff=lfs merge=lfs -text
49
+ deepseek-r1-1.5b-ternary/model_layers_6_self_attn_q_proj_weight.neg filter=lfs diff=lfs merge=lfs -text
50
+ deepseek-r1-1.5b-ternary/model_layers_3_self_attn_q_proj_weight.neg filter=lfs diff=lfs merge=lfs -text
51
+ deepseek-r1-1.5b-ternary/model_layers_9_self_attn_o_proj_weight.pos filter=lfs diff=lfs merge=lfs -text
52
+ deepseek-r1-1.5b-ternary/model_layers_16_self_attn_q_proj_weight.pos filter=lfs diff=lfs merge=lfs -text
53
+ deepseek-r1-1.5b-ternary/model_layers_15_mlp_gate_proj_weight.neg filter=lfs diff=lfs merge=lfs -text
54
+ deepseek-r1-1.5b-ternary/model_layers_23_mlp_up_proj_weight.neg filter=lfs diff=lfs merge=lfs -text
55
+ deepseek-r1-1.5b-ternary/model_layers_1_mlp_down_proj_weight.neg filter=lfs diff=lfs merge=lfs -text
56
+ deepseek-r1-1.5b-packed/model_layers_20_self_attn_o_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
57
+ deepseek-r1-1.5b-packed/model_layers_15_self_attn_q_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
58
+ deepseek-r1-1.5b-packed/model_layers_24_self_attn_q_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
59
+ deepseek-r1-1.5b-packed/model_layers_2_self_attn_o_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
60
+ deepseek-r1-1.5b-packed/model_layers_22_self_attn_q_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
61
+ deepseek-r1-1.5b-packed/model_layers_12_self_attn_k_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
62
+ deepseek-r1-1.5b-packed/model_layers_12_mlp_gate_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
63
+ deepseek-r1-1.5b-packed/model_layers_21_self_attn_q_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
64
+ deepseek-r1-1.5b-packed/model_layers_18_mlp_gate_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
65
+ deepseek-r1-1.5b-packed/model_layers_10_self_attn_o_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
66
+ deepseek-r1-1.5b-packed/model_layers_16_mlp_gate_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
67
+ deepseek-r1-1.5b-packed/model_layers_12_self_attn_v_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
68
+ deepseek-r1-1.5b-packed/model_layers_1_mlp_down_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
69
+ deepseek-r1-1.5b-packed/model_layers_16_mlp_down_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
70
+ deepseek-r1-1.5b-packed/model_layers_18_mlp_up_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
71
+ deepseek-r1-1.5b-packed/model_layers_20_mlp_up_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
72
+ deepseek-r1-1.5b-packed/model_layers_12_self_attn_o_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
73
+ deepseek-r1-1.5b-packed/model_layers_7_mlp_gate_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
74
+ deepseek-r1-1.5b-packed/model_layers_11_self_attn_o_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
75
+ deepseek-r1-1.5b-packed/model_layers_12_mlp_down_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
76
+ deepseek-r1-1.5b-packed/model_layers_8_self_attn_q_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
77
+ deepseek-r1-1.5b-packed/model_layers_19_self_attn_k_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
78
+ deepseek-r1-1.5b-packed/model_layers_23_mlp_up_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
79
+ deepseek-r1-1.5b-packed/model_layers_2_self_attn_o_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
80
+ deepseek-r1-1.5b-packed/model_layers_7_self_attn_q_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
81
+ deepseek-r1-1.5b-packed/model_layers_0_self_attn_q_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
82
+ deepseek-r1-1.5b-packed/model_layers_11_self_attn_q_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
83
+ deepseek-r1-1.5b-packed/model_layers_6_mlp_gate_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
84
+ deepseek-r1-1.5b-packed/model_layers_10_mlp_up_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
85
+ deepseek-r1-1.5b-packed/model_layers_0_mlp_up_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
86
+ deepseek-r1-1.5b-packed/model_layers_16_self_attn_o_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
87
+ deepseek-r1-1.5b-packed/model_layers_23_mlp_down_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
88
+ deepseek-r1-1.5b-packed/model_layers_20_mlp_down_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
89
+ deepseek-r1-1.5b-packed/model_layers_14_self_attn_o_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
90
+ deepseek-r1-1.5b-packed/model_layers_14_self_attn_k_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
91
+ deepseek-r1-1.5b-packed/model_layers_12_self_attn_q_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
92
+ deepseek-r1-1.5b-packed/model_layers_25_mlp_up_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
93
+ deepseek-r1-1.5b-packed/model_layers_9_self_attn_o_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
94
+ deepseek-r1-1.5b-packed/model_layers_20_mlp_down_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
95
+ deepseek-r1-1.5b-packed/model_layers_26_self_attn_o_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
96
+ deepseek-r1-1.5b-packed/model_layers_5_mlp_up_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
97
+ deepseek-r1-1.5b-packed/model_layers_3_mlp_down_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
98
+ deepseek-r1-1.5b-packed/model_layers_4_mlp_up_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
99
+ deepseek-r1-1.5b-packed/model_layers_26_mlp_gate_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
100
+ deepseek-r1-1.5b-packed/model_layers_16_mlp_up_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
101
+ deepseek-r1-1.5b-packed/model_layers_9_self_attn_q_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
102
+ deepseek-r1-1.5b-packed/model_layers_25_self_attn_o_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
103
+ deepseek-r1-1.5b-packed/model_layers_10_mlp_gate_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
104
+ deepseek-r1-1.5b-packed/model_layers_24_mlp_up_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
105
+ deepseek-r1-1.5b-packed/model_layers_19_self_attn_o_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
106
+ deepseek-r1-1.5b-packed/model_layers_18_mlp_down_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
107
+ deepseek-r1-1.5b-packed/model_layers_21_self_attn_o_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
108
+ deepseek-r1-1.5b-packed/model_layers_15_mlp_gate_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
109
+ deepseek-r1-1.5b-packed/model_layers_0_mlp_down_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
110
+ deepseek-r1-1.5b-packed/model_layers_21_self_attn_q_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
111
+ deepseek-r1-1.5b-packed/model_layers_24_self_attn_v_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
112
+ deepseek-r1-1.5b-packed/model_layers_1_mlp_gate_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
113
+ deepseek-r1-1.5b-packed/model_layers_6_self_attn_o_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
114
+ deepseek-r1-1.5b-packed/model_layers_17_mlp_up_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
115
+ deepseek-r1-1.5b-packed/model_layers_23_self_attn_q_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
116
+ deepseek-r1-1.5b-packed/model_layers_3_self_attn_o_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
117
+ deepseek-r1-1.5b-packed/model_layers_22_self_attn_v_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
118
+ deepseek-r1-1.5b-packed/model_layers_22_mlp_gate_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
119
+ deepseek-r1-1.5b-packed/model_layers_3_mlp_gate_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
120
+ deepseek-r1-1.5b-packed/model_layers_24_self_attn_o_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
121
+ deepseek-r1-1.5b-packed/model_layers_3_self_attn_q_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
122
+ deepseek-r1-1.5b-packed/model_layers_14_mlp_down_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
123
+ deepseek-r1-1.5b-packed/model_layers_23_mlp_down_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
124
+ deepseek-r1-1.5b-packed/model_layers_26_mlp_gate_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
125
+ deepseek-r1-1.5b-packed/model_layers_2_mlp_up_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
126
+ deepseek-r1-1.5b-packed/model_layers_26_self_attn_o_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
127
+ deepseek-r1-1.5b-packed/model_layers_12_self_attn_o_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
128
+ deepseek-r1-1.5b-packed/model_layers_10_mlp_gate_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
129
+ deepseek-r1-1.5b-packed/model_layers_16_mlp_up_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
130
+ deepseek-r1-1.5b-packed/model_layers_11_mlp_down_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
131
+ deepseek-r1-1.5b-packed/model_layers_5_self_attn_k_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
132
+ deepseek-r1-1.5b-packed/model_layers_21_self_attn_v_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
133
+ deepseek-r1-1.5b-packed/model_layers_19_self_attn_q_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
134
+ deepseek-r1-1.5b-packed/model_layers_8_mlp_down_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
135
+ deepseek-r1-1.5b-packed/model_layers_16_self_attn_q_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
136
+ deepseek-r1-1.5b-packed/model_layers_17_self_attn_v_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
137
+ deepseek-r1-1.5b-packed/model_layers_9_self_attn_k_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
138
+ deepseek-r1-1.5b-packed/model_layers_11_mlp_up_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
139
+ deepseek-r1-1.5b-packed/model_layers_18_self_attn_q_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
140
+ deepseek-r1-1.5b-packed/model_layers_11_self_attn_k_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
141
+ deepseek-r1-1.5b-packed/model_layers_1_mlp_up_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
142
+ deepseek-r1-1.5b-packed/model_layers_24_mlp_down_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
143
+ deepseek-r1-1.5b-packed/model_layers_20_self_attn_o_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
144
+ deepseek-r1-1.5b-packed/model_layers_4_mlp_gate_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
145
+ deepseek-r1-1.5b-packed/model_layers_20_mlp_up_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
146
+ deepseek-r1-1.5b-packed/model_layers_16_mlp_gate_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
147
+ deepseek-r1-1.5b-packed/model_layers_5_self_attn_o_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
148
+ deepseek-r1-1.5b-packed/model_layers_7_self_attn_o_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
149
+ deepseek-r1-1.5b-packed/model_layers_1_self_attn_q_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
150
+ deepseek-r1-1.5b-packed/model_layers_3_self_attn_v_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
151
+ deepseek-r1-1.5b-packed/model_layers_24_self_attn_o_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
152
+ deepseek-r1-1.5b-packed/model_layers_14_mlp_down_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
153
+ deepseek-r1-1.5b-packed/model_layers_26_mlp_up_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
154
+ deepseek-r1-1.5b-packed/model_layers_13_mlp_up_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
155
+ deepseek-r1-1.5b-packed/model_layers_6_self_attn_q_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
156
+ deepseek-r1-1.5b-packed/model_layers_13_self_attn_v_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
157
+ deepseek-r1-1.5b-packed/model_layers_26_self_attn_q_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
158
+ deepseek-r1-1.5b-packed/model_layers_3_mlp_gate_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
159
+ deepseek-r1-1.5b-packed/model_layers_18_self_attn_o_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
160
+ deepseek-r1-1.5b-packed/model_layers_13_mlp_down_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
161
+ deepseek-r1-1.5b-packed/model_layers_6_mlp_gate_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
162
+ deepseek-r1-1.5b-packed/model_layers_10_self_attn_k_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
163
+ deepseek-r1-1.5b-packed/model_layers_2_self_attn_k_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
164
+ deepseek-r1-1.5b-packed/model_layers_14_self_attn_q_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
165
+ deepseek-r1-1.5b-packed/model_layers_19_self_attn_o_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
166
+ deepseek-r1-1.5b-packed/model_layers_19_mlp_down_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
167
+ deepseek-r1-1.5b-packed/model_layers_17_mlp_up_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
168
+ deepseek-r1-1.5b-packed/model_layers_11_self_attn_q_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
169
+ deepseek-r1-1.5b-packed/model_layers_23_self_attn_o_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
170
+ deepseek-r1-1.5b-packed/model_layers_19_self_attn_q_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
171
+ deepseek-r1-1.5b-packed/model_layers_0_self_attn_o_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
172
+ deepseek-r1-1.5b-packed/model_layers_5_mlp_gate_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
173
+ deepseek-r1-1.5b-packed/model_layers_8_mlp_gate_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
174
+ deepseek-r1-1.5b-packed/model_layers_9_mlp_gate_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
175
+ deepseek-r1-1.5b-packed/model_layers_16_self_attn_k_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
176
+ deepseek-r1-1.5b-packed/model_layers_10_self_attn_v_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
177
+ deepseek-r1-1.5b-packed/model_layers_16_self_attn_v_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
178
+ deepseek-r1-1.5b-packed/model_layers_25_mlp_gate_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
179
+ deepseek-r1-1.5b-packed/model_layers_24_mlp_gate_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
180
+ deepseek-r1-1.5b-packed/model_layers_20_self_attn_q_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
181
+ deepseek-r1-1.5b-packed/model_layers_9_mlp_up_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
182
+ deepseek-r1-1.5b-packed/model_layers_5_mlp_down_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
183
+ deepseek-r1-1.5b-packed/model_layers_10_mlp_down_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
184
+ deepseek-r1-1.5b-packed/model_layers_17_self_attn_k_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
185
+ deepseek-r1-1.5b-packed/model_layers_15_self_attn_k_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
186
+ deepseek-r1-1.5b-packed/model_layers_1_self_attn_k_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
187
+ deepseek-r1-1.5b-packed/model_layers_9_mlp_up_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
188
+ deepseek-r1-1.5b-packed/model_layers_25_self_attn_k_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
189
+ deepseek-r1-1.5b-packed/model_layers_9_mlp_gate_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
190
+ deepseek-r1-1.5b-packed/model_layers_17_mlp_down_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
191
+ deepseek-r1-1.5b-packed/model_layers_27_self_attn_v_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
192
+ deepseek-r1-1.5b-packed/model_layers_3_mlp_down_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
193
+ deepseek-r1-1.5b-packed/model_layers_0_mlp_down_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
194
+ deepseek-r1-1.5b-packed/model_layers_6_self_attn_q_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
195
+ deepseek-r1-1.5b-packed/model_layers_0_self_attn_v_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
196
+ deepseek-r1-1.5b-packed/model_layers_2_self_attn_v_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
197
+ deepseek-r1-1.5b-packed/model_layers_8_self_attn_k_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
198
+ deepseek-r1-1.5b-packed/model_layers_0_self_attn_k_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
199
+ deepseek-r1-1.5b-packed/model_layers_9_mlp_down_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
200
+ deepseek-r1-1.5b-packed/model_layers_20_mlp_gate_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
201
+ deepseek-r1-1.5b-packed/model_layers_4_self_attn_q_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
202
+ deepseek-r1-1.5b-packed/model_layers_13_mlp_gate_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
203
+ deepseek-r1-1.5b-packed/model_layers_17_self_attn_o_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
204
+ deepseek-r1-1.5b-packed/model_layers_21_mlp_up_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
205
+ deepseek-r1-1.5b-packed/model_layers_13_self_attn_q_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
206
+ deepseek-r1-1.5b-packed/model_layers_27_self_attn_q_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
207
+ deepseek-r1-1.5b-packed/model_layers_23_mlp_up_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
208
+ deepseek-r1-1.5b-packed/model_layers_18_self_attn_o_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
209
+ deepseek-r1-1.5b-packed/model_layers_7_self_attn_q_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
210
+ deepseek-r1-1.5b-packed/model_layers_11_mlp_down_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
211
+ deepseek-r1-1.5b-packed/model_layers_10_self_attn_q_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
212
+ deepseek-r1-1.5b-packed/model_layers_7_self_attn_k_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
213
+ deepseek-r1-1.5b-packed/model_layers_23_mlp_gate_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
214
+ deepseek-r1-1.5b-packed/model_layers_15_mlp_down_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
215
+ deepseek-r1-1.5b-packed/model_layers_9_mlp_down_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
216
+ deepseek-r1-1.5b-packed/model_layers_22_mlp_up_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
217
+ deepseek-r1-1.5b-packed/model_layers_8_mlp_up_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
218
+ deepseek-r1-1.5b-packed/model_layers_16_self_attn_q_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
219
+ deepseek-r1-1.5b-packed/model_layers_21_mlp_gate_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
220
+ deepseek-r1-1.5b-packed/model_layers_4_self_attn_q_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
221
+ deepseek-r1-1.5b-packed/model_layers_22_self_attn_o_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
222
+ deepseek-r1-1.5b-packed/model_layers_14_mlp_gate_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
223
+ deepseek-r1-1.5b-packed/model_layers_26_self_attn_q_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
224
+ deepseek-r1-1.5b-packed/model_layers_23_self_attn_k_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
225
+ deepseek-r1-1.5b-packed/model_layers_25_self_attn_o_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
226
+ deepseek-r1-1.5b-packed/model_layers_20_self_attn_q_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
227
+ deepseek-r1-1.5b-packed/model_layers_18_mlp_down_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
228
+ deepseek-r1-1.5b-packed/model_layers_20_self_attn_k_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
229
+ deepseek-r1-1.5b-packed/model_layers_19_self_attn_v_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
230
+ deepseek-r1-1.5b-packed/model_layers_18_mlp_gate_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
231
+ deepseek-r1-1.5b-packed/model_layers_18_self_attn_v_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
232
+ deepseek-r1-1.5b-packed/model_layers_8_self_attn_o_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
233
+ deepseek-r1-1.5b-packed/model_layers_19_mlp_gate_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
234
+ deepseek-r1-1.5b-packed/model_layers_13_self_attn_o_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
235
+ deepseek-r1-1.5b-packed/model_layers_5_mlp_down_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
236
+ deepseek-r1-1.5b-packed/model_layers_12_mlp_down_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
237
+ deepseek-r1-1.5b-packed/model_layers_12_mlp_up_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
238
+ deepseek-r1-1.5b-packed/model_layers_25_self_attn_q_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
239
+ deepseek-r1-1.5b-packed/model_layers_11_mlp_up_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
240
+ deepseek-r1-1.5b-packed/model_layers_15_self_attn_o_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
241
+ deepseek-r1-1.5b-packed/model_layers_17_mlp_gate_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
242
+ deepseek-r1-1.5b-packed/model_layers_25_mlp_gate_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
243
+ deepseek-r1-1.5b-packed/model_layers_22_mlp_up_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
244
+ deepseek-r1-1.5b-packed/model_layers_15_self_attn_q_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
245
+ deepseek-r1-1.5b-packed/model_layers_23_self_attn_q_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
246
+ deepseek-r1-1.5b-packed/model_layers_22_self_attn_k_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
247
+ deepseek-r1-1.5b-packed/model_layers_13_self_attn_k_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
248
+ deepseek-r1-1.5b-packed/model_layers_2_self_attn_q_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
249
+ deepseek-r1-1.5b-packed/model_layers_20_mlp_gate_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
250
+ deepseek-r1-1.5b-packed/model_layers_1_mlp_down_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
251
+ deepseek-r1-1.5b-packed/model_layers_12_mlp_up_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
252
+ deepseek-r1-1.5b-packed/model_layers_17_self_attn_q_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
253
+ deepseek-r1-1.5b-packed/model_layers_13_self_attn_o_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
254
+ deepseek-r1-1.5b-packed/model_layers_25_mlp_down_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
255
+ deepseek-r1-1.5b-packed/model_layers_7_mlp_down_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
256
+ deepseek-r1-1.5b-packed/model_layers_23_self_attn_o_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
257
+ deepseek-r1-1.5b-packed/model_layers_5_mlp_gate_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
258
+ deepseek-r1-1.5b-packed/model_layers_7_mlp_up_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
259
+ deepseek-r1-1.5b-packed/model_layers_27_mlp_up_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
260
+ deepseek-r1-1.5b-packed/model_layers_14_self_attn_v_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
261
+ deepseek-r1-1.5b-packed/model_layers_6_mlp_up_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
262
+ deepseek-r1-1.5b-packed/model_layers_15_self_attn_v_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
263
+ deepseek-r1-1.5b-packed/model_layers_7_self_attn_o_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
264
+ deepseek-r1-1.5b-packed/model_layers_1_self_attn_q_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
265
+ deepseek-r1-1.5b-packed/model_layers_24_mlp_up_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
266
+ deepseek-r1-1.5b-packed/model_layers_13_mlp_up_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
267
+ deepseek-r1-1.5b-packed/model_layers_10_self_attn_q_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
268
+ deepseek-r1-1.5b-packed/model_layers_6_self_attn_v_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
269
+ deepseek-r1-1.5b-packed/model_layers_15_mlp_up_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
270
+ deepseek-r1-1.5b-packed/model_layers_4_mlp_gate_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
271
+ deepseek-r1-1.5b-packed/model_layers_6_mlp_up_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
272
+ deepseek-r1-1.5b-packed/model_layers_25_mlp_down_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
273
+ deepseek-r1-1.5b-packed/model_layers_22_mlp_down_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
274
+ deepseek-r1-1.5b-packed/model_layers_24_self_attn_k_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
275
+ deepseek-r1-1.5b-packed/model_layers_17_self_attn_o_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
276
+ deepseek-r1-1.5b-packed/model_layers_26_mlp_down_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
277
+ deepseek-r1-1.5b-packed/model_layers_4_self_attn_o_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
278
+ deepseek-r1-1.5b-packed/model_layers_6_mlp_down_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
279
+ deepseek-r1-1.5b-packed/model_layers_6_mlp_down_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
280
+ deepseek-r1-1.5b-packed/model_layers_17_self_attn_q_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
281
+ deepseek-r1-1.5b-packed/model_layers_1_self_attn_v_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-1.5b-packed/model_layers_11_self_attn_v_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-1.5b-packed/model_layers_21_mlp_gate_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-1.5b-packed/model_layers_21_mlp_down_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-1.5b-packed/model_layers_6_self_attn_o_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-1.5b-packed/model_layers_1_self_attn_o_proj_weight.mags filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-1.5b-packed/model_layers_14_mlp_up_proj_weight.signs filter=lfs diff=lfs merge=lfs -text
deepseek-r1-1.5b-gunary/model_layers_0_input_layernorm_weight.fp16 ADDED
Binary file (3.07 kB).
 
deepseek-r1-1.5b-gunary/model_layers_0_post_attention_layernorm_weight.fp16 ADDED
Binary file (3.07 kB).
 
deepseek-r1-1.5b-gunary/model_layers_0_self_attn_q_proj_bias.fp16 ADDED
Binary file (3.07 kB).
 
deepseek-r1-1.5b-gunary/model_layers_0_self_attn_v_proj_weight.gscales ADDED
Binary file (49.2 kB).
 
deepseek-r1-1.5b-gunary/model_layers_0_self_attn_v_proj_weight.sign ADDED
Binary file (49.2 kB).
 
deepseek-r1-1.5b-gunary/model_layers_10_self_attn_k_proj_bias.fp16 ADDED
Binary file (512 Bytes).
 
deepseek-r1-1.5b-gunary/model_layers_10_self_attn_v_proj_bias.fp16 ADDED
Binary file (512 Bytes).
 
deepseek-r1-1.5b-gunary/model_layers_10_self_attn_v_proj_weight.gscales ADDED
Binary file (49.2 kB).
 
deepseek-r1-1.5b-gunary/model_layers_11_input_layernorm_weight.fp16 ADDED
Binary file (3.07 kB).
 
deepseek-r1-1.5b-gunary/model_layers_11_post_attention_layernorm_weight.fp16 ADDED
Binary file (3.07 kB).
 
deepseek-r1-1.5b-gunary/model_layers_11_self_attn_k_proj_weight.gscales ADDED
Binary file (49.2 kB).
 
deepseek-r1-1.5b-gunary/model_layers_11_self_attn_v_proj_weight.gscales ADDED
Binary file (49.2 kB).
 
deepseek-r1-1.5b-gunary/model_layers_12_post_attention_layernorm_weight.fp16 ADDED
Binary file (3.07 kB).
 
deepseek-r1-1.5b-gunary/model_layers_12_self_attn_q_proj_bias.fp16 ADDED
Binary file (3.07 kB).
 
deepseek-r1-1.5b-gunary/model_layers_12_self_attn_v_proj_bias.fp16 ADDED
Binary file (512 Bytes).
 
deepseek-r1-1.5b-gunary/model_layers_12_self_attn_v_proj_weight.sign ADDED
Binary file (49.2 kB).
 
deepseek-r1-1.5b-gunary/model_layers_13_post_attention_layernorm_weight.fp16 ADDED
Binary file (3.07 kB).
 
deepseek-r1-1.5b-gunary/model_layers_13_self_attn_k_proj_weight.gscales ADDED
Binary file (49.2 kB).
 
deepseek-r1-1.5b-gunary/model_layers_13_self_attn_k_proj_weight.sign ADDED
Binary file (49.2 kB).
 
deepseek-r1-1.5b-gunary/model_layers_13_self_attn_q_proj_bias.fp16 ADDED
Binary file (3.07 kB).
 
deepseek-r1-1.5b-gunary/model_layers_13_self_attn_v_proj_bias.fp16 ADDED
Binary file (512 Bytes).
 
deepseek-r1-1.5b-gunary/model_layers_14_input_layernorm_weight.fp16 ADDED
Binary file (3.07 kB).
 
deepseek-r1-1.5b-gunary/model_layers_14_post_attention_layernorm_weight.fp16 ADDED
Binary file (3.07 kB).
 
deepseek-r1-1.5b-gunary/model_layers_14_self_attn_k_proj_weight.sign ADDED
Binary file (49.2 kB).
 
deepseek-r1-1.5b-gunary/model_layers_14_self_attn_v_proj_weight.gscales ADDED
Binary file (49.2 kB).
 
deepseek-r1-1.5b-gunary/model_layers_14_self_attn_v_proj_weight.sign ADDED
Binary file (49.2 kB).
 
deepseek-r1-1.5b-gunary/model_layers_15_post_attention_layernorm_weight.fp16 ADDED
Binary file (3.07 kB).
 
deepseek-r1-1.5b-gunary/model_layers_15_self_attn_q_proj_bias.fp16 ADDED
Binary file (3.07 kB).
 
deepseek-r1-1.5b-gunary/model_layers_15_self_attn_v_proj_bias.fp16 ADDED
Binary file (512 Bytes).
 
deepseek-r1-1.5b-gunary/model_layers_15_self_attn_v_proj_weight.gscales ADDED
Binary file (49.2 kB).
 
deepseek-r1-1.5b-gunary/model_layers_15_self_attn_v_proj_weight.sign ADDED
Binary file (49.2 kB).
 
deepseek-r1-1.5b-gunary/model_layers_17_self_attn_k_proj_bias.fp16 ADDED
Binary file (512 Bytes).
 
deepseek-r1-1.5b-gunary/model_layers_17_self_attn_k_proj_weight.gscales ADDED
Binary file (49.2 kB).
 
deepseek-r1-1.5b-gunary/model_layers_17_self_attn_k_proj_weight.sign ADDED
Binary file (49.2 kB).
 
deepseek-r1-1.5b-gunary/model_layers_17_self_attn_v_proj_bias.fp16 ADDED
Binary file (512 Bytes).
 
deepseek-r1-1.5b-gunary/model_layers_18_post_attention_layernorm_weight.fp16 ADDED
Binary file (3.07 kB).
 
deepseek-r1-1.5b-gunary/model_layers_18_self_attn_k_proj_bias.fp16 ADDED
Binary file (512 Bytes).
 
deepseek-r1-1.5b-gunary/model_layers_18_self_attn_k_proj_weight.gscales ADDED
Binary file (49.2 kB).
 
deepseek-r1-1.5b-gunary/model_layers_18_self_attn_v_proj_weight.gscales ADDED
Binary file (49.2 kB).
 
deepseek-r1-1.5b-gunary/model_layers_19_post_attention_layernorm_weight.fp16 ADDED
Binary file (3.07 kB).
 
deepseek-r1-1.5b-gunary/model_layers_19_self_attn_k_proj_bias.fp16 ADDED
Binary file (512 Bytes).
 
deepseek-r1-1.5b-gunary/model_layers_19_self_attn_k_proj_weight.gscales ADDED
Binary file (49.2 kB).
 
deepseek-r1-1.5b-gunary/model_layers_19_self_attn_q_proj_bias.fp16 ADDED
Binary file (3.07 kB).
 
deepseek-r1-1.5b-gunary/model_layers_19_self_attn_v_proj_bias.fp16 ADDED
Binary file (512 Bytes).
 
deepseek-r1-1.5b-gunary/model_layers_19_self_attn_v_proj_weight.gscales ADDED
Binary file (49.2 kB).
 
deepseek-r1-1.5b-gunary/model_layers_19_self_attn_v_proj_weight.sign ADDED
Binary file (49.2 kB).
 
deepseek-r1-1.5b-gunary/model_layers_1_input_layernorm_weight.fp16 ADDED
Binary file (3.07 kB).
 
deepseek-r1-1.5b-gunary/model_layers_1_post_attention_layernorm_weight.fp16 ADDED
Binary file (3.07 kB).
 
deepseek-r1-1.5b-gunary/model_layers_1_self_attn_k_proj_bias.fp16 ADDED
Binary file (512 Bytes).