OpenTransformer committed on
Commit 1f839dd · verified · 1 Parent(s): 4cc3b76

Add files using upload-large-folder tool

This view is limited to 50 files because it contains too many changes.
Files changed (50)
  1. .gitattributes +720 -0
  2. deepseek-r1-1.5b-gunary/config.json +14 -0
  3. deepseek-r1-1.5b-gunary/lm_head_weight.fp16 +3 -0
  4. deepseek-r1-1.5b-gunary/model_embed_tokens_weight.fp16 +3 -0
  5. deepseek-r1-1.5b-gunary/model_layers_0_self_attn_k_proj_bias.fp16 +0 -0
  6. deepseek-r1-1.5b-gunary/model_layers_0_self_attn_k_proj_weight.sign +0 -0
  7. deepseek-r1-1.5b-gunary/model_layers_10_input_layernorm_weight.fp16 +0 -0
  8. deepseek-r1-1.5b-gunary/model_layers_10_self_attn_k_proj_weight.gscales +0 -0
  9. deepseek-r1-1.5b-gunary/model_layers_10_self_attn_q_proj_bias.fp16 +0 -0
  10. deepseek-r1-1.5b-gunary/model_layers_11_self_attn_k_proj_bias.fp16 +0 -0
  11. deepseek-r1-1.5b-gunary/model_layers_11_self_attn_k_proj_weight.sign +0 -0
  12. deepseek-r1-1.5b-gunary/model_layers_11_self_attn_q_proj_bias.fp16 +0 -0
  13. deepseek-r1-1.5b-gunary/model_layers_11_self_attn_v_proj_bias.fp16 +0 -0
  14. deepseek-r1-1.5b-gunary/model_layers_12_input_layernorm_weight.fp16 +0 -0
  15. deepseek-r1-1.5b-gunary/model_layers_12_self_attn_v_proj_weight.gscales +0 -0
  16. deepseek-r1-1.5b-gunary/model_layers_13_input_layernorm_weight.fp16 +0 -0
  17. deepseek-r1-1.5b-gunary/model_layers_13_self_attn_k_proj_bias.fp16 +0 -0
  18. deepseek-r1-1.5b-gunary/model_layers_13_self_attn_v_proj_weight.gscales +0 -0
  19. deepseek-r1-1.5b-gunary/model_layers_13_self_attn_v_proj_weight.sign +0 -0
  20. deepseek-r1-1.5b-gunary/model_layers_14_self_attn_k_proj_weight.gscales +0 -0
  21. deepseek-r1-1.5b-gunary/model_layers_14_self_attn_q_proj_bias.fp16 +0 -0
  22. deepseek-r1-1.5b-gunary/model_layers_14_self_attn_v_proj_bias.fp16 +0 -0
  23. deepseek-r1-1.5b-gunary/model_layers_15_self_attn_k_proj_weight.sign +0 -0
  24. deepseek-r1-1.5b-gunary/model_layers_16_self_attn_k_proj_weight.sign +0 -0
  25. deepseek-r1-1.5b-gunary/model_layers_16_self_attn_q_proj_bias.fp16 +0 -0
  26. deepseek-r1-1.5b-gunary/model_layers_16_self_attn_q_proj_weight.sign +3 -0
  27. deepseek-r1-1.5b-gunary/model_layers_16_self_attn_v_proj_bias.fp16 +0 -0
  28. deepseek-r1-1.5b-gunary/model_layers_16_self_attn_v_proj_weight.gscales +0 -0
  29. deepseek-r1-1.5b-gunary/model_layers_17_input_layernorm_weight.fp16 +0 -0
  30. deepseek-r1-1.5b-gunary/model_layers_17_post_attention_layernorm_weight.fp16 +0 -0
  31. deepseek-r1-1.5b-gunary/model_layers_17_self_attn_q_proj_bias.fp16 +0 -0
  32. deepseek-r1-1.5b-gunary/model_layers_17_self_attn_v_proj_weight.sign +0 -0
  33. deepseek-r1-1.5b-gunary/model_layers_18_input_layernorm_weight.fp16 +0 -0
  34. deepseek-r1-1.5b-gunary/model_layers_18_self_attn_k_proj_weight.sign +0 -0
  35. deepseek-r1-1.5b-gunary/model_layers_19_self_attn_k_proj_weight.sign +0 -0
  36. deepseek-r1-1.5b-gunary/model_layers_19_self_attn_q_proj_weight.planes +3 -0
  37. deepseek-r1-1.5b-gunary/model_layers_1_self_attn_q_proj_bias.fp16 +0 -0
  38. deepseek-r1-1.5b-gunary/model_layers_1_self_attn_v_proj_weight.sign +0 -0
  39. deepseek-r1-1.5b-gunary/model_layers_20_input_layernorm_weight.fp16 +0 -0
  40. deepseek-r1-1.5b-gunary/model_layers_20_post_attention_layernorm_weight.fp16 +0 -0
  41. deepseek-r1-1.5b-gunary/model_layers_20_self_attn_k_proj_bias.fp16 +0 -0
  42. deepseek-r1-1.5b-gunary/model_layers_20_self_attn_k_proj_weight.sign +0 -0
  43. deepseek-r1-1.5b-gunary/model_layers_21_post_attention_layernorm_weight.fp16 +0 -0
  44. deepseek-r1-1.5b-gunary/model_layers_21_self_attn_k_proj_weight.gscales +0 -0
  45. deepseek-r1-1.5b-gunary/model_layers_21_self_attn_v_proj_bias.fp16 +0 -0
  46. deepseek-r1-1.5b-gunary/model_layers_21_self_attn_v_proj_weight.gscales +0 -0
  47. deepseek-r1-1.5b-gunary/model_layers_22_self_attn_k_proj_weight.sign +0 -0
  48. deepseek-r1-1.5b-gunary/model_layers_22_self_attn_q_proj_bias.fp16 +0 -0
  49. deepseek-r1-1.5b-gunary/model_layers_22_self_attn_v_proj_weight.gscales +0 -0
  50. deepseek-r1-1.5b-gunary/model_layers_22_self_attn_v_proj_weight.sign +0 -0
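
The file names above appear to follow a regular pattern: a model directory, an optional `model_layers_<N>_` prefix, a tensor name, and a storage suffix (`.fp16`, `.sign`, `.planes`, `.gscales`). As a minimal sketch, assuming that pattern holds for every file in the commit, the names can be split into their parts as shown below. `TensorFile` and `parse_tensor_path` are hypothetical helpers written for illustration, not part of the repository, and the suffixes are treated as opaque labels because this commit view does not document what they encode.

```python
import re
from collections import Counter
from typing import NamedTuple, Optional

# Assumed naming pattern, inferred only from the file list in this commit view.
_LAYER_RE = re.compile(r"^model_layers_(\d+)_(.+)$")


class TensorFile(NamedTuple):
    """Parts of one path from the file list (hypothetical helper type)."""
    model_dir: str        # e.g. "deepseek-r1-1.5b-gunary"
    layer: Optional[int]  # None for tensors outside model_layers_<N>_
    tensor: str           # e.g. "self_attn_q_proj_weight"
    suffix: str           # e.g. "fp16", "sign", "planes", "gscales"


def parse_tensor_path(path: str) -> TensorFile:
    # Split on the last "/" first, because the directory name itself contains dots.
    model_dir, _, filename = path.rpartition("/")
    stem, _, suffix = filename.rpartition(".")
    match = _LAYER_RE.match(stem)
    if match:
        return TensorFile(model_dir, int(match.group(1)), match.group(2), suffix)
    return TensorFile(model_dir, None, stem, suffix)


# A few paths copied from the file list above.
paths = [
    "deepseek-r1-1.5b-gunary/lm_head_weight.fp16",
    "deepseek-r1-1.5b-gunary/model_layers_16_self_attn_q_proj_weight.sign",
    "deepseek-r1-1.5b-gunary/model_layers_19_self_attn_q_proj_weight.planes",
    "deepseek-r1-1.5b-gunary/model_layers_10_self_attn_k_proj_weight.gscales",
]
parsed = [parse_tensor_path(p) for p in paths]

# Tally how many files use each storage suffix.
suffix_counts = Counter(t.suffix for t in parsed)

for t in parsed:
    print(t)
```

Grouping the parsed entries by `layer` and `tensor` would show which suffixes accompany each weight; whether every weight is meant to have all of them cannot be determined from this truncated 50-file view.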
.gitattributes CHANGED
@@ -1797,3 +1797,723 @@ qwen3-4b-thinking-unary/model_layers_20_self_attn_q_proj_weight.planes filter=lf
1797
  qwen3-4b-thinking-unary/model_layers_31_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1798
  qwen3-4b-thinking-unary/model_layers_32_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1799
  qwen3-4b-thinking-unary/model_layers_4_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1800
+ qwen3-4b-thinking-unary/model_layers_8_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1801
+ qwen3-4b-thinking-unary/model_layers_6_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1802
+ qwen3-4b-thinking-unary/model_layers_11_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1803
+ qwen3-4b-thinking-unary/model_layers_3_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1804
+ qwen3-4b-thinking-unary/model_layers_7_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1805
+ qwen3-4b-thinking-unary/model_layers_16_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1806
+ qwen3-4b-thinking-unary/model_layers_11_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1807
+ qwen3-4b-thinking-unary/model_layers_8_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1808
+ qwen3-4b-thinking-unary/model_layers_20_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1809
+ qwen3-4b-thinking-unary/model_layers_7_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1810
+ qwen3-4b-thinking-unary/model_layers_7_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1811
+ qwen3-4b-thinking-unary/model_layers_19_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1812
+ qwen3-4b-thinking-unary/model_layers_7_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1813
+ qwen3-4b-thinking-unary/model_layers_32_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1814
+ qwen3-4b-thinking-unary/model_layers_17_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1815
+ qwen3-4b-thinking-unary/model_layers_21_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1816
+ qwen3-4b-thinking-unary/model_layers_34_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1817
+ qwen3-4b-thinking-unary/model_layers_26_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1818
+ qwen3-4b-thinking-unary/model_layers_22_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1819
+ qwen3-4b-thinking-unary/model_layers_18_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1820
+ qwen3-4b-thinking-unary/model_layers_9_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1821
+ qwen3-4b-thinking-unary/model_layers_19_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1822
+ qwen3-4b-thinking-unary/model_layers_4_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1823
+ qwen3-4b-thinking-unary/model_layers_9_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1824
+ qwen3-4b-thinking-unary/model_layers_8_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1825
+ qwen3-4b-thinking-unary/model_layers_21_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1826
+ qwen3-4b-thinking-unary/model_layers_28_self_attn_k_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1827
+ qwen3-4b-thinking-unary/model_layers_35_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1828
+ qwen3-4b-thinking-unary/model_layers_25_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1829
+ qwen3-4b-thinking-unary/model_layers_13_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1830
+ qwen3-4b-thinking-unary/model_layers_1_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1831
+ qwen3-4b-thinking-unary/model_layers_13_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1832
+ qwen3-4b-thinking-unary/model_layers_10_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1833
+ qwen3-4b-thinking-unary/model_layers_11_self_attn_k_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1834
+ qwen3-4b-thinking-unary/tokenizer.json filter=lfs diff=lfs merge=lfs -text
1835
+ qwen3-4b-thinking-unary/model_layers_14_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1836
+ qwen3-4b-thinking-unary/model_layers_24_self_attn_k_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1837
+ qwen3-4b-thinking-unary/model_layers_17_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1838
+ qwen3-4b-thinking-unary/model_layers_5_self_attn_k_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1839
+ qwen3-4b-thinking-unary/model_layers_24_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1840
+ qwen3-4b-thinking-unary/model_layers_9_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1841
+ qwen3-4b-thinking-unary/model_layers_17_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1842
+ qwen3-4b-thinking-unary/model_layers_8_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1843
+ qwen3-4b-thinking-unary/model_layers_24_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1844
+ qwen3-4b-thinking-unary/model_layers_12_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1845
+ qwen3-4b-thinking-unary/model_layers_12_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1846
+ qwen3-4b-thinking-unary/model_layers_23_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1847
+ qwen3-4b-thinking-unary/model_layers_1_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1848
+ qwen3-4b-thinking-unary/model_layers_3_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1849
+ qwen3-4b-thinking-unary/model_layers_23_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1850
+ qwen3-4b-thinking-unary/model_layers_22_self_attn_k_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1851
+ qwen3-4b-thinking-unary/model_layers_28_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1852
+ qwen3-4b-thinking-unary/model_layers_32_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1853
+ qwen3-4b-thinking-unary/model_layers_13_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1854
+ qwen3-4b-thinking-unary/model_layers_26_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1855
+ qwen3-4b-thinking-unary/model_layers_1_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1856
+ qwen3-4b-thinking-unary/model_layers_7_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1857
+ qwen3-4b-thinking-unary/model_layers_14_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1858
+ qwen3-4b-thinking-unary/model_layers_23_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1859
+ qwen3-4b-thinking-unary/model_layers_31_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1860
+ qwen3-4b-thinking-unary/model_layers_33_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1861
+ qwen3-4b-thinking-unary/model_layers_18_self_attn_k_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1862
+ qwen3-4b-thinking-unary/model_layers_35_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1863
+ qwen3-4b-thinking-unary/model_layers_23_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1864
+ qwen3-4b-thinking-unary/model_layers_2_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1865
+ qwen3-4b-thinking-unary/model_layers_16_self_attn_k_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1866
+ qwen3-4b-thinking-unary/model_layers_18_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1867
+ qwen3-4b-thinking-unary/model_layers_8_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1868
+ qwen3-4b-thinking-unary/model_layers_20_self_attn_k_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1869
+ deepseek-r1-1.5b-unary/model_embed_tokens_weight.fp16 filter=lfs diff=lfs merge=lfs -text
1870
+ qwen3-4b-thinking-unary/model_layers_6_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1871
+ qwen3-4b-thinking-unary/model_layers_23_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1872
+ qwen3-4b-thinking-unary/model_layers_15_self_attn_k_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1873
+ qwen3-4b-thinking-unary/model_layers_9_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1874
+ qwen3-4b-thinking-unary/model_layers_26_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1875
+ qwen3-4b-thinking-unary/model_layers_23_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1876
+ qwen3-4b-thinking-unary/model_layers_29_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1877
+ qwen3-4b-thinking-unary/model_layers_26_self_attn_k_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1878
+ qwen3-4b-thinking-unary/model_layers_22_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1879
+ qwen3-4b-thinking-unary/model_layers_21_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1880
+ qwen3-4b-thinking-unary/model_layers_10_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1881
+ qwen3-4b-thinking-unary/model_layers_9_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1882
+ qwen3-4b-thinking-unary/model_layers_20_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1883
+ qwen3-4b-thinking-unary/model_layers_32_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1884
+ qwen3-4b-thinking-unary/model_layers_30_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1885
+ qwen3-4b-thinking-unary/model_layers_31_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1886
+ qwen3-4b-thinking-unary/model_layers_18_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1887
+ qwen3-4b-thinking-unary/model_layers_32_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1888
+ qwen3-4b-thinking-unary/model_layers_21_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1889
+ qwen3-4b-thinking-unary/model_layers_35_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1890
+ qwen3-4b-thinking-unary/model_layers_25_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1891
+ qwen3-4b-thinking-unary/model_layers_25_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1892
+ qwen3-4b-thinking-unary/model_layers_34_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1893
+ qwen3-4b-thinking-unary/model_layers_11_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1894
+ qwen3-4b-thinking-unary/model_layers_6_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1895
+ qwen3-4b-thinking-unary/model_layers_19_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1896
+ qwen3-4b-thinking-unary/model_layers_1_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1897
+ qwen3-4b-thinking-unary/model_layers_0_self_attn_k_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1898
+ qwen3-4b-thinking-unary/model_layers_13_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1899
+ qwen3-4b-thinking-unary/model_layers_7_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1900
+ qwen3-4b-thinking-unary/model_layers_30_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1901
+ qwen3-4b-thinking-unary/model_layers_30_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1902
+ qwen3-4b-thinking-unary/model_layers_16_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1903
+ qwen3-4b-thinking-unary/model_layers_1_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1904
+ qwen3-4b-thinking-unary/model_layers_14_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1905
+ qwen3-4b-thinking-unary/model_layers_28_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1906
+ qwen3-4b-thinking-unary/model_layers_30_self_attn_k_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1907
+ qwen3-4b-thinking-unary/model_layers_26_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1908
+ qwen3-4b-thinking-unary/model_layers_5_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1909
+ qwen3-4b-thinking-unary/model_layers_0_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1910
+ qwen3-4b-thinking-unary/model_layers_7_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1911
+ qwen3-4b-thinking-unary/model_layers_9_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1912
+ qwen3-4b-thinking-unary/model_layers_29_self_attn_k_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1913
+ qwen3-4b-thinking-unary/model_layers_10_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1914
+ qwen3-4b-thinking-unary/model_layers_16_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1915
+ qwen3-4b-thinking-unary/model_layers_19_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1916
+ qwen3-4b-thinking-unary/model_layers_18_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1917
+ qwen3-4b-thinking-unary/model_layers_12_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1918
+ qwen3-4b-thinking-unary/model_layers_26_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1919
+ qwen3-4b-thinking-unary/model_layers_32_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1920
+ qwen3-4b-thinking-unary/model_layers_0_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1921
+ qwen3-4b-thinking-unary/model_layers_10_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1922
+ qwen3-4b-thinking-unary/model_layers_8_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1923
+ qwen3-4b-thinking-unary/model_layers_7_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1924
+ qwen3-4b-thinking-unary/model_layers_25_self_attn_k_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1925
+ qwen3-4b-thinking-unary/model_layers_6_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1926
+ qwen3-4b-thinking-unary/model_layers_24_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1927
+ qwen3-4b-thinking-unary/model_layers_29_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1928
+ qwen3-4b-thinking-unary/model_layers_15_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1929
+ qwen3-4b-thinking-unary/model_layers_35_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1930
+ qwen3-4b-thinking-unary/model_layers_16_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1931
+ qwen3-4b-thinking-unary/model_layers_6_self_attn_k_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1932
+ qwen3-4b-thinking-unary/model_layers_14_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1933
+ qwen3-4b-thinking-unary/model_layers_29_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1934
+ qwen3-4b-thinking-unary/model_layers_20_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1935
+ qwen3-4b-thinking-unary/model_layers_30_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1936
+ qwen3-4b-thinking-unary/model_layers_25_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1937
+ qwen3-4b-thinking-unary/model_layers_35_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1938
+ qwen3-4b-thinking-unary/model_layers_3_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1939
+ qwen3-4b-thinking-unary/model_layers_3_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1940
+ qwen3-4b-thinking-unary/model_layers_22_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1941
+ qwen3-4b-thinking-unary/model_layers_32_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1942
+ qwen3-4b-thinking-unary/model_layers_33_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1943
+ qwen3-4b-thinking-unary/model_layers_2_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1944
+ qwen3-4b-thinking-unary/model_layers_18_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1945
+ qwen3-4b-thinking-unary/model_layers_12_self_attn_k_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1946
+ qwen3-4b-thinking-unary/model_layers_25_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1947
+ qwen3-4b-thinking-unary/model_layers_28_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1948
+ qwen3-4b-thinking-unary/model_layers_10_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1949
+ qwen3-4b-thinking-unary/model_layers_9_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1950
+ qwen3-4b-thinking-unary/model_layers_21_self_attn_k_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1951
+ qwen3-4b-thinking-unary/model_layers_26_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1952
+ qwen3-4b-thinking-unary/model_layers_16_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1953
+ qwen3-4b-thinking-unary/model_layers_3_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1954
+ qwen3-4b-thinking-unary/model_layers_22_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1955
+ qwen3-4b-thinking-unary/model_layers_1_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1956
+ qwen3-4b-thinking-unary/model_layers_22_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1957
+ qwen3-4b-thinking-unary/model_layers_22_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1958
+ qwen3-4b-thinking-unary/model_layers_12_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1959
+ qwen3-4b-thinking-unary/model_layers_19_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1960
+ qwen3-4b-thinking-unary/model_layers_24_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1961
+ qwen3-4b-thinking-unary/model_layers_26_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1962
+ qwen3-4b-thinking-unary/model_layers_22_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1963
+ qwen3-4b-thinking-unary/model_layers_10_self_attn_k_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1964
+ qwen3-4b-thinking-unary/model_layers_13_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1965
+ qwen3-4b-thinking-unary/model_layers_0_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1966
+ qwen3-4b-thinking-unary/model_layers_15_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1967
+ qwen3-4b-thinking-unary/model_layers_21_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1968
+ qwen3-4b-thinking-unary/model_layers_5_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1969
+ qwen3-4b-thinking-unary/model_layers_26_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1970
+ qwen3-4b-thinking-unary/model_layers_18_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1971
+ qwen3-4b-thinking-unary/model_layers_12_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1972
+ qwen3-4b-thinking-unary/model_layers_11_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1973
+ qwen3-4b-thinking-unary/model_layers_30_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1974
+ qwen3-4b-thinking-unary/model_layers_11_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1975
+ qwen3-4b-thinking-unary/model_layers_7_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1976
+ qwen3-4b-thinking-unary/model_layers_4_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1977
+ qwen3-4b-thinking-unary/model_layers_2_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1978
+ qwen3-4b-thinking-unary/model_layers_14_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1979
+ qwen3-4b-thinking-unary/model_layers_20_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1980
+ qwen3-4b-thinking-unary/model_layers_27_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1981
+ qwen3-4b-thinking-unary/model_layers_19_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1982
+ qwen3-4b-thinking-unary/model_layers_17_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1983
+ qwen3-4b-thinking-unary/model_layers_26_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1984
+ qwen3-4b-thinking-unary/model_layers_14_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1985
+ qwen3-4b-thinking-unary/model_layers_15_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1986
+ qwen3-4b-thinking-unary/model_layers_19_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1987
+ qwen3-4b-thinking-unary/model_layers_29_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1988
+ qwen3-4b-thinking-unary/model_layers_6_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1989
+ qwen3-4b-thinking-unary/model_layers_35_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1990
+ qwen3-4b-thinking-unary/model_layers_9_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1991
+ qwen3-4b-thinking-unary/model_layers_10_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1992
+ qwen3-4b-thinking-unary/model_layers_2_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1993
+ qwen3-4b-thinking-unary/model_layers_31_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1994
+ qwen3-4b-thinking-unary/model_layers_5_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1995
+ qwen3-4b-thinking-unary/model_layers_4_self_attn_k_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
1996
+ qwen3-4b-thinking-unary/model_layers_28_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1997
+ qwen3-4b-thinking-unary/model_layers_35_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1998
+ qwen3-4b-thinking-unary/model_layers_0_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
1999
+ qwen3-4b-thinking-unary/model_layers_16_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2000
+ qwen3-4b-thinking-unary/model_layers_27_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2001
+ qwen3-4b-thinking-unary/model_layers_22_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2002
+ qwen3-4b-thinking-unary/model_layers_33_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2003
+ qwen3-4b-thinking-unary/model_layers_33_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2004
+ qwen3-4b-thinking-unary/model_layers_31_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2005
+ qwen3-4b-thinking-unary/model_layers_16_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2006
+ qwen3-4b-thinking-unary/model_layers_15_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2007
+ qwen3-4b-thinking-unary/model_layers_24_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2008
+ qwen3-4b-thinking-unary/model_layers_27_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2009
+ qwen3-4b-thinking-unary/model_layers_16_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2010
+ qwen3-4b-thinking-unary/model_layers_17_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2011
+ qwen3-4b-thinking-unary/model_layers_6_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2012
+ qwen3-4b-thinking-unary/model_layers_4_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2013
+ qwen3-4b-thinking-unary/model_layers_13_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2014
+ qwen3-4b-thinking-unary/model_layers_17_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2015
+ qwen3-4b-thinking-unary/model_layers_17_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2016
+ qwen3-4b-thinking-unary/model_layers_9_self_attn_k_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2017
+ qwen3-4b-thinking-unary/model_layers_6_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2018
+ qwen3-4b-thinking-unary/model_layers_9_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2019
+ qwen3-4b-thinking-unary/model_layers_8_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2020
+ qwen3-4b-thinking-unary/model_layers_13_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2021
+ qwen3-4b-thinking-unary/model_layers_28_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2022
+ qwen3-4b-thinking-unary/model_layers_30_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2023
+ qwen3-4b-thinking-unary/model_layers_4_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2024
+ qwen3-4b-thinking-unary/model_layers_19_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2025
+ qwen3-4b-thinking-unary/model_layers_10_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2026
+ qwen3-4b-thinking-unary/model_layers_15_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2027
+ qwen3-4b-thinking-unary/model_layers_4_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2028
+ qwen3-4b-thinking-unary/model_layers_5_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2029
+ qwen3-4b-thinking-unary/model_layers_11_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2030
+ qwen3-4b-thinking-unary/model_layers_21_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2031
+ qwen3-4b-thinking-unary/model_layers_12_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2032
+ qwen3-4b-thinking-unary/model_layers_18_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2033
+ qwen3-4b-thinking-unary/model_layers_25_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2034
+ qwen3-4b-thinking-unary/model_layers_9_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2035
+ qwen3-4b-thinking-unary/model_layers_3_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2036
+ qwen3-4b-thinking-unary/model_layers_0_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2037
+ qwen3-4b-thinking-unary/model_layers_1_self_attn_k_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2038
+ qwen3-4b-thinking-unary/model_layers_8_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2039
+ qwen3-4b-thinking-unary/model_layers_3_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2040
+ qwen3-4b-thinking-unary/model_layers_1_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2041
+ qwen3-4b-thinking-unary/model_layers_13_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2042
+ qwen3-4b-thinking-unary/model_layers_15_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2043
+ qwen3-4b-thinking-unary/model_layers_4_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2044
+ qwen3-4b-thinking-unary/model_layers_20_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2045
+ qwen3-4b-thinking-unary/model_layers_17_self_attn_k_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2046
+ qwen3-4b-thinking-unary/model_layers_5_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2047
+ qwen3-4b-thinking-unary/model_layers_20_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2048
+ qwen3-4b-thinking-unary/model_layers_34_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2049
+ qwen3-4b-thinking-unary/model_layers_24_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2050
+ qwen3-4b-thinking-unary/model_layers_24_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2051
+ qwen3-4b-thinking-unary/model_layers_20_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2052
+ qwen3-4b-thinking-unary/model_layers_21_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2053
+ qwen3-4b-thinking-unary/model_layers_13_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2054
+ qwen3-4b-thinking-unary/model_layers_16_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2055
+ qwen3-4b-thinking-unary/model_layers_22_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2056
+ qwen3-4b-thinking-unary/model_layers_25_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2057
+ qwen3-4b-thinking-unary/model_layers_30_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2058
+ qwen3-4b-thinking-unary/model_layers_29_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2059
+ qwen3-4b-thinking-unary/model_layers_15_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2060
+ qwen3-4b-thinking-unary/model_layers_22_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2061
+ qwen3-4b-thinking-unary/model_layers_26_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2062
+ qwen3-4b-thinking-unary/model_layers_34_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2063
+ qwen3-4b-thinking-unary/model_layers_29_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2064
+ qwen3-4b-thinking-unary/model_layers_13_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2065
+ qwen3-4b-thinking-unary/model_layers_33_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2066
+ qwen3-4b-thinking-unary/model_layers_35_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2067
+ qwen3-4b-thinking-unary/model_layers_16_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2068
+ qwen3-4b-thinking-unary/model_layers_24_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2069
+ qwen3-4b-thinking-unary/model_layers_33_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2070
+ qwen3-4b-thinking-unary/model_layers_23_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2071
+ qwen3-4b-thinking-unary/model_layers_9_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2072
+ qwen3-4b-thinking-unary/model_layers_29_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2073
+ qwen3-4b-thinking-unary/model_layers_0_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2074
+ qwen3-4b-thinking-unary/model_layers_27_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2075
+ qwen3-4b-thinking-unary/model_layers_35_self_attn_k_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2076
+ qwen3-4b-thinking-unary/model_layers_19_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2077
+ qwen3-4b-thinking-unary/model_layers_24_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2078
+ qwen3-4b-thinking-unary/model_layers_28_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2079
+ qwen3-4b-thinking-unary/model_layers_29_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2080
+ qwen3-4b-thinking-unary/model_layers_20_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2081
+ qwen3-4b-thinking-unary/model_layers_15_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2082
+ qwen3-4b-thinking-unary/model_layers_15_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2083
+ qwen3-4b-thinking-unary/model_layers_14_self_attn_k_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2084
+ qwen3-4b-thinking-unary/model_layers_2_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2085
+ qwen3-4b-thinking-unary/model_layers_6_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2086
+ qwen3-4b-thinking-unary/model_layers_4_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2087
+ qwen3-4b-thinking-unary/model_layers_1_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2088
+ qwen3-4b-thinking-unary/model_layers_15_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2089
+ qwen3-4b-thinking-unary/model_layers_10_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2090
+ qwen3-4b-thinking-unary/model_layers_2_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2091
+ qwen3-4b-thinking-unary/model_layers_27_self_attn_k_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2092
+ qwen3-4b-thinking-unary/model_layers_31_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2093
+ qwen3-4b-thinking-unary/model_layers_31_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2094
+ qwen3-4b-thinking-unary/model_layers_30_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2095
+ qwen3-4b-thinking-unary/model_layers_35_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2096
+ deepseek-r1-1.5b-gunary/model_layers_3_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2097
+ deepseek-r1-1.5b-gunary/model_layers_8_mlp_gate_proj_weight.gscales filter=lfs diff=lfs merge=lfs -text
2098
+ deepseek-r1-1.5b-gunary/model_layers_19_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2099
+ deepseek-r1-1.5b-gunary/model_layers_16_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2100
+ deepseek-r1-1.5b-gunary/model_layers_27_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2101
+ deepseek-r1-1.5b-gunary/model_layers_27_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2102
+ qwen3-4b-proper-unary/model_layers_1_mlp_gate_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2103
+ qwen3-4b-proper-unary/model_layers_0_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2104
+ qwen3-4b-proper-unary/model_layers_0_self_attn_o_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2105
+ qwen3-4b-proper-unary/model_layers_0_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2106
+ qwen3-4b-proper-unary/model_layers_1_self_attn_q_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2107
+ qwen3-4b-proper-unary/model_layers_12_self_attn_q_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2108
+ qwen3-4b-proper-unary/model_layers_10_self_attn_q_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2109
+ qwen3-4b-proper-unary/model_layers_11_mlp_gate_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2110
+ qwen3-4b-proper-unary/model_layers_0_mlp_up_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2111
+ qwen3-4b-proper-unary/model_layers_10_mlp_gate_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2112
+ deepseek-r1-1.5b-gunary/lm_head_weight.fp16 filter=lfs diff=lfs merge=lfs -text
2113
+ qwen3-4b-proper-unary/model_layers_13_mlp_down_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2114
+ qwen3-4b-proper-unary/model_layers_12_self_attn_k_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2115
+ qwen3-4b-proper-unary/tokenizer.json filter=lfs diff=lfs merge=lfs -text
2116
+ qwen3-4b-proper-unary/model_layers_10_self_attn_o_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2117
+ qwen3-4b-proper-unary/model_layers_0_self_attn_q_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2118
+ qwen3-4b-proper-unary/model_layers_13_mlp_down_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2119
+ qwen3-4b-proper-unary/model_layers_10_self_attn_v_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2120
+ qwen3-4b-proper-unary/model_layers_1_self_attn_q_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2121
+ qwen3-4b-proper-unary/model_layers_0_self_attn_o_proj_weight.slots filter=lfs diff=lfs merge=lfs -text
2122
+ qwen3-4b-proper-unary/model_layers_12_self_attn_v_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2123
+ qwen3-4b-proper-unary/model_layers_10_self_attn_o_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2124
+ qwen3-4b-proper-unary/model_layers_11_self_attn_o_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2125
+ qwen3-4b-proper-unary/model_layers_10_self_attn_q_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2126
+ qwen3-4b-proper-unary/model_layers_0_mlp_down_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2127
+ qwen3-4b-proper-unary/model_layers_1_mlp_up_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2128
+ qwen3-4b-proper-unary/model_layers_11_mlp_up_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2129
+ qwen3-4b-proper-unary/model_layers_0_self_attn_v_proj_weight.slots filter=lfs diff=lfs merge=lfs -text
2130
+ qwen3-4b-proper-unary/model_layers_0_self_attn_k_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2131
+ qwen3-4b-proper-unary/model_layers_12_mlp_up_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2132
+ deepseek-r1-1.5b-gunary/model_embed_tokens_weight.fp16 filter=lfs diff=lfs merge=lfs -text
2133
+ qwen3-4b-proper-unary/model_layers_0_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2134
+ qwen3-4b-proper-unary/model_layers_12_mlp_down_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2135
+ qwen3-4b-proper-unary/model_layers_0_self_attn_k_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2136
+ qwen3-4b-proper-unary/model_layers_12_mlp_down_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2137
+ qwen3-4b-proper-unary/model_layers_12_self_attn_o_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2138
+ qwen3-4b-proper-unary/model_layers_12_self_attn_o_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2139
+ qwen3-4b-proper-unary/model_layers_10_mlp_down_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2140
+ qwen3-4b-proper-unary/model_layers_0_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2141
+ qwen3-4b-proper-unary/model_layers_0_mlp_down_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2142
+ qwen3-4b-proper-unary/model_layers_12_self_attn_v_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2143
+ qwen3-4b-proper-unary/model_layers_0_self_attn_v_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2144
+ qwen3-4b-proper-unary/model_layers_12_mlp_gate_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2145
+ qwen3-4b-proper-unary/model_layers_11_self_attn_o_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2146
+ qwen3-4b-proper-unary/model_layers_1_self_attn_v_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2147
+ qwen3-4b-proper-unary/model_layers_1_mlp_gate_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2148
+ qwen3-4b-proper-unary/model_layers_0_self_attn_o_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2149
+ qwen3-4b-proper-unary/model_layers_1_mlp_down_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2150
+ qwen3-4b-proper-unary/model_layers_11_self_attn_q_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2151
+ qwen3-4b-proper-unary/model_layers_11_self_attn_v_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2152
+ qwen3-4b-proper-unary/model_layers_1_self_attn_o_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2153
+ qwen3-4b-proper-unary/model_layers_11_mlp_down_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2154
+ qwen3-4b-proper-unary/model_layers_10_mlp_up_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2155
+ qwen3-4b-proper-unary/model_layers_12_mlp_gate_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2156
+ qwen3-4b-proper-unary/model_layers_10_self_attn_k_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2157
+ qwen3-4b-proper-unary/model_layers_0_mlp_up_proj_weight.slots filter=lfs diff=lfs merge=lfs -text
2158
+ qwen3-4b-proper-unary/model_layers_11_mlp_down_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2159
+ qwen3-4b-proper-unary/model_layers_10_mlp_up_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2160
+ qwen3-4b-proper-unary/model_layers_0_self_attn_q_proj_weight.slots filter=lfs diff=lfs merge=lfs -text
2161
+ qwen3-4b-proper-unary/model_layers_11_self_attn_k_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2162
+ qwen3-4b-proper-unary/model_layers_12_self_attn_q_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2163
+ qwen3-4b-proper-unary/model_layers_0_mlp_gate_proj_weight.slots filter=lfs diff=lfs merge=lfs -text
2164
+ qwen3-4b-proper-unary/model_layers_12_self_attn_k_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2165
+ qwen3-4b-proper-unary/model_layers_0_mlp_down_proj_weight.slots filter=lfs diff=lfs merge=lfs -text
2166
+ qwen3-4b-proper-unary/model_layers_0_self_attn_k_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2167
+ qwen3-4b-proper-unary/model_layers_0_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2168
+ qwen3-4b-proper-unary/model_layers_11_mlp_up_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2169
+ qwen3-4b-proper-unary/model_layers_11_self_attn_v_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2170
+ qwen3-4b-proper-unary/model_layers_0_self_attn_k_proj_weight.slots filter=lfs diff=lfs merge=lfs -text
2171
+ qwen3-4b-proper-unary/model_layers_11_mlp_gate_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2172
+ qwen3-4b-proper-unary/model_layers_10_self_attn_v_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2173
+ qwen3-4b-proper-unary/model_layers_1_mlp_down_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2174
+ qwen3-4b-proper-unary/model_layers_1_self_attn_k_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2175
+ qwen3-4b-proper-unary/model_layers_10_mlp_gate_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2176
+ qwen3-4b-proper-unary/model_layers_1_self_attn_k_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2177
+ qwen3-4b-proper-unary/model_layers_12_mlp_up_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2178
+ qwen3-4b-proper-unary/model_layers_10_self_attn_k_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2179
+ qwen3-4b-proper-unary/model_layers_1_self_attn_o_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2180
+ qwen3-4b-proper-unary/model_layers_0_mlp_gate_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2181
+ qwen3-4b-proper-unary/model_layers_0_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2182
+ qwen3-4b-proper-unary/model_layers_1_self_attn_v_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2183
+ qwen3-4b-proper-unary/model_layers_10_mlp_down_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2184
+ qwen3-4b-proper-unary/model_layers_11_self_attn_k_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2185
+ qwen3-4b-proper-unary/model_layers_0_self_attn_q_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2186
+ qwen3-4b-proper-unary/model_layers_0_self_attn_v_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2187
+ qwen3-4b-proper-unary/model_layers_11_self_attn_q_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2188
+ qwen3-4b-proper-unary/model_layers_1_mlp_up_proj_weight.usign filter=lfs diff=lfs merge=lfs -text
2189
+ deepseek-r1-1.5b-unary/model_layers_11_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2190
+ deepseek-r1-1.5b-unary/model_layers_18_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2191
+ deepseek-r1-1.5b-unary/model_layers_6_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2192
+ deepseek-r1-1.5b-unary/model_layers_5_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2193
+ deepseek-r1-1.5b-unary/model_layers_25_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2194
+ deepseek-r1-1.5b-unary/model_layers_0_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2195
+ deepseek-r1-1.5b-unary/model_layers_17_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2196
+ deepseek-r1-1.5b-unary/model_layers_5_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2197
+ deepseek-r1-1.5b-unary/model_layers_18_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2198
+ deepseek-r1-1.5b-unary/model_layers_17_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2199
+ deepseek-r1-1.5b-unary/model_layers_25_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2200
+ deepseek-r1-1.5b-unary/model_layers_21_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2201
+ deepseek-r1-1.5b-unary/model_layers_2_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2202
+ deepseek-r1-1.5b-unary/model_layers_6_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2203
+ qwen3-4b-proper-unary/model_layers_0_mlp_up_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2204
+ deepseek-r1-1.5b-unary/model_layers_23_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2205
+ deepseek-r1-1.5b-unary/model_layers_14_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2206
+ deepseek-r1-1.5b-unary/model_layers_4_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2207
+ deepseek-r1-1.5b-unary/model_layers_0_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2208
+ deepseek-r1-1.5b-unary/model_layers_10_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2209
+ deepseek-r1-1.5b-unary/model_layers_24_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2210
+ deepseek-r1-1.5b-unary/model_layers_14_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2211
+ deepseek-r1-1.5b-unary/model_layers_0_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2212
+ deepseek-r1-1.5b-unary/model_layers_22_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2213
+ deepseek-r1-1.5b-unary/model_layers_20_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2214
+ deepseek-r1-1.5b-unary/model_layers_4_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2215
+ deepseek-r1-1.5b-unary/model_layers_6_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2216
+ deepseek-r1-1.5b-unary/model_layers_3_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2217
+ deepseek-r1-1.5b-unary/model_layers_11_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2218
+ deepseek-r1-1.5b-unary/model_layers_16_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2219
+ deepseek-r1-1.5b-unary/model_layers_20_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2220
+ deepseek-r1-1.5b-unary/model_layers_7_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2221
+ deepseek-r1-1.5b-unary/model_layers_8_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2222
+ qwen3-4b-proper-unary/model_layers_0_mlp_gate_proj_weight.uslots filter=lfs diff=lfs merge=lfs -text
2223
+ deepseek-r1-1.5b-unary/model_layers_7_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2224
+ deepseek-r1-1.5b-unary/model_layers_7_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2225
+ deepseek-r1-1.5b-unary/model_layers_19_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2226
+ deepseek-r1-1.5b-unary/model_layers_11_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2227
+ deepseek-r1-1.5b-unary/model_layers_7_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2228
+ deepseek-r1-1.5b-unary/model_layers_21_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2229
+ deepseek-r1-1.5b-unary/model_layers_26_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2230
+ deepseek-r1-1.5b-unary/model_layers_19_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2231
+ deepseek-r1-1.5b-unary/model_layers_17_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2232
+ deepseek-r1-1.5b-unary/model_layers_9_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2233
+ deepseek-r1-1.5b-unary/model_layers_18_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2234
+ deepseek-r1-1.5b-unary/model_layers_4_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2235
+ deepseek-r1-1.5b-unary/model_layers_9_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2236
+ deepseek-r1-1.5b-unary/model_layers_8_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2237
+ deepseek-r1-1.5b-unary/model_layers_21_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2238
+ deepseek-r1-1.5b-unary/model_layers_13_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2239
+ deepseek-r1-1.5b-unary/model_layers_13_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2240
+ deepseek-r1-1.5b-unary/model_layers_25_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2241
+ deepseek-r1-1.5b-unary/model_layers_1_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2242
+ deepseek-r1-1.5b-unary/model_layers_10_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2243
+ deepseek-r1-1.5b-unary/model_layers_14_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2244
+ deepseek-r1-1.5b-unary/model_layers_24_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2245
+ deepseek-r1-1.5b-unary/model_layers_17_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2246
+ deepseek-r1-1.5b-unary/model_layers_9_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2247
+ deepseek-r1-1.5b-unary/model_layers_8_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2248
+ deepseek-r1-1.5b-unary/model_layers_24_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2249
+ deepseek-r1-1.5b-unary/model_layers_12_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2250
+ deepseek-r1-1.5b-unary/model_layers_23_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2251
+ deepseek-r1-1.5b-unary/model_layers_12_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2252
+ deepseek-r1-1.5b-unary/model_layers_3_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2253
+ deepseek-r1-1.5b-unary/model_layers_23_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2254
+ deepseek-r1-1.5b-unary/model_layers_26_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2255
+ deepseek-r1-1.5b-unary/model_layers_1_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2256
+ deepseek-r1-1.5b-unary/model_layers_13_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2257
+ deepseek-r1-1.5b-unary/model_layers_14_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2258
+ deepseek-r1-1.5b-unary/model_layers_1_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2259
+ deepseek-r1-1.5b-unary/model_layers_23_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2260
+ deepseek-r1-1.5b-unary/model_layers_3_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2261
+ deepseek-r1-1.5b-unary/model_layers_7_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2262
+ deepseek-r1-1.5b-unary/model_layers_23_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2263
+ deepseek-r1-1.5b-unary/model_layers_2_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2264
+ deepseek-r1-1.5b-unary/model_layers_18_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2265
+ deepseek-r1-1.5b-unary/model_layers_23_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2266
+ deepseek-r1-1.5b-unary/model_layers_9_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2267
+ deepseek-r1-1.5b-unary/model_layers_23_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2268
+ deepseek-r1-1.5b-unary/model_layers_8_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2269
+ deepseek-r1-1.5b-unary/model_layers_21_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2270
+ deepseek-r1-1.5b-unary/model_layers_22_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2271
+ deepseek-r1-1.5b-unary/model_layers_10_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2272
+ deepseek-r1-1.5b-unary/model_layers_20_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2273
+ deepseek-r1-1.5b-unary/model_layers_21_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2274
+ deepseek-r1-1.5b-unary/model_layers_9_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2275
+ deepseek-r1-1.5b-unary/model_layers_18_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2276
+ deepseek-r1-1.5b-unary/model_layers_25_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2277
+ deepseek-r1-1.5b-unary/model_layers_25_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2278
+ deepseek-r1-1.5b-unary/model_layers_19_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2279
+ deepseek-r1-1.5b-unary/model_layers_11_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2280
+ deepseek-r1-1.5b-unary/model_layers_6_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2281
+ deepseek-r1-1.5b-unary/model_layers_1_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2282
+ deepseek-r1-1.5b-unary/model_layers_1_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2283
+ deepseek-r1-1.5b-unary/model_layers_14_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2284
+ deepseek-r1-1.5b-unary/model_layers_7_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2285
+ deepseek-r1-1.5b-unary/model_layers_16_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2286
+ deepseek-r1-1.5b-unary/model_layers_5_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2287
+ deepseek-r1-1.5b-unary/model_layers_26_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2288
+ deepseek-r1-1.5b-unary/model_layers_0_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2289
+ deepseek-r1-1.5b-unary/model_layers_10_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2290
+ deepseek-r1-1.5b-unary/model_layers_9_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2291
+ deepseek-r1-1.5b-unary/model_layers_16_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2292
+ deepseek-r1-1.5b-unary/model_layers_19_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2293
+ deepseek-r1-1.5b-unary/model_layers_26_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2294
+ deepseek-r1-1.5b-unary/model_layers_12_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2295
+ deepseek-r1-1.5b-unary/model_layers_10_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2296
+ deepseek-r1-1.5b-unary/model_layers_18_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2297
+ deepseek-r1-1.5b-unary/model_layers_7_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2298
+ deepseek-r1-1.5b-unary/model_layers_8_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2299
+ deepseek-r1-1.5b-unary/model_layers_6_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2300
+ deepseek-r1-1.5b-unary/model_layers_24_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2301
+ deepseek-r1-1.5b-unary/model_layers_16_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2302
+ deepseek-r1-1.5b-unary/model_layers_15_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2303
+ deepseek-r1-1.5b-unary/model_layers_0_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2304
+ deepseek-r1-1.5b-unary/model_layers_25_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2305
+ deepseek-r1-1.5b-unary/model_layers_3_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2306
+ deepseek-r1-1.5b-unary/model_layers_20_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2307
+ deepseek-r1-1.5b-unary/model_layers_3_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2308
+ deepseek-r1-1.5b-unary/model_layers_22_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2309
+ deepseek-r1-1.5b-unary/model_layers_14_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2310
+ deepseek-r1-1.5b-unary/model_layers_2_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2311
+ deepseek-r1-1.5b-unary/model_layers_9_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2312
+ deepseek-r1-1.5b-unary/model_layers_26_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2313
+ deepseek-r1-1.5b-unary/model_layers_16_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2314
+ deepseek-r1-1.5b-unary/model_layers_3_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2315
+ deepseek-r1-1.5b-unary/model_layers_22_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2316
+ deepseek-r1-1.5b-unary/model_layers_25_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2317
+ deepseek-r1-1.5b-unary/model_layers_1_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2318
+ deepseek-r1-1.5b-unary/model_layers_22_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2319
+ deepseek-r1-1.5b-unary/model_layers_12_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
2320
+ deepseek-r1-1.5b-unary/model_layers_22_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2321
+ deepseek-r1-1.5b-unary/model_layers_19_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
2322
+ deepseek-r1-1.5b-unary/model_layers_26_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-1.5b-unary/model_layers_24_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-1.5b-unary/model_layers_22_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-1.5b-unary/model_layers_13_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-1.5b-unary/model_layers_0_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-1.5b-unary/model_layers_15_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-1.5b-unary/model_layers_18_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-1.5b-unary/model_layers_5_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-1.5b-unary/model_layers_21_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-1.5b-unary/model_layers_12_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-1.5b-unary/model_layers_11_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-1.5b-unary/model_layers_7_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-1.5b-unary/model_layers_26_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-1.5b-unary/model_layers_11_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-1.5b-unary/model_layers_4_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-1.5b-unary/model_layers_2_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-1.5b-unary/model_layers_14_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-1.5b-unary/model_layers_20_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-1.5b-unary/model_layers_19_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-1.5b-unary/model_layers_27_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-1.5b-unary/model_layers_14_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-1.5b-unary/model_layers_26_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-1.5b-unary/model_layers_15_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-1.5b-unary/model_layers_17_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-1.5b-unary/model_layers_19_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-1.5b-unary/model_layers_10_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-1.5b-unary/model_layers_2_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-1.5b-unary/model_layers_5_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-1.5b-unary/model_layers_16_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_embed_tokens_weight.fp16 filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_17_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_31_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_14_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_1_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_20_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_2_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_30_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_5_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_30_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_7_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_35_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_33_self_attn_k_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_35_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_34_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_23_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_28_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_16_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_11_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_32_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_7_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_4_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_34_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_13_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_28_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_15_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_1_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_12_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_3_self_attn_k_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_20_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_8_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_8_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_33_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_33_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_24_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_30_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_0_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_23_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_12_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_25_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_13_self_attn_k_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_2_self_attn_k_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_24_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_2_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_33_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_26_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_14_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_18_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_3_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_12_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_5_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_11_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_28_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_29_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_31_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_20_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_17_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_0_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_14_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_12_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_6_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_35_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_33_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_10_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_34_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_32_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_8_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_35_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_14_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_31_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_12_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_21_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_6_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_12_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_15_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_27_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_16_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_31_self_attn_k_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_26_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_7_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_11_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_15_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_10_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_8_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_9_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_1_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_32_self_attn_k_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_19_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_21_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_4_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_32_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_10_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_9_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_13_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_21_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_5_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_4_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_13_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_27_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_10_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_11_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_4_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_19_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_29_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_18_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_23_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_28_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_26_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_32_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_28_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_6_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_27_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_18_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_27_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_17_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_34_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_31_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_18_self_attn_k_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_19_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_22_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_3_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_31_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_22_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_11_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_7_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_3_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_32_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_8_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_21_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_32_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_34_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_25_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_27_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_17_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_0_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_3_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_2_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_29_mlp_down_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_5_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_25_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_27_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_11_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_24_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_23_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_23_self_attn_k_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_5_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_14_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_2_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_33_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_34_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_12_self_attn_o_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_23_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_2_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_21_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_20_self_attn_v_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_34_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_1_self_attn_o_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_25_mlp_up_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_2_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_5_mlp_down_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_0_mlp_gate_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_3_mlp_up_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_7_self_attn_k_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_19_self_attn_q_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_30_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_27_self_attn_v_proj_weight.planes filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_16_self_attn_q_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-unary/model_layers_27_mlp_gate_proj_weight.sign filter=lfs diff=lfs merge=lfs -text
+ qwen3-4b-thinking-hf/tokenizer.json filter=lfs diff=lfs merge=lfs -text
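Each `.gitattributes` entry above pairs a path pattern with the attributes `filter=lfs diff=lfs merge=lfs -text`, which routes the file through Git LFS and marks it as non-text. As a rough illustration of how such a line breaks down, here is a minimal sketch; `parse_gitattributes_line` is a hypothetical helper, not part of Git or of this repository, and it ignores quoted patterns that contain spaces.

```python
def parse_gitattributes_line(line: str):
    """Split one .gitattributes line into (pattern, attributes).

    Hypothetical helper for illustration; quoted patterns are not handled.
    """
    parts = line.split()
    if not parts or parts[0].startswith("#"):
        return None  # blank line or comment
    pattern, attrs = parts[0], {}
    for token in parts[1:]:
        if token.startswith("-"):
            attrs[token[1:]] = False       # "-text" unsets the attribute
        elif token.startswith("!"):
            attrs[token[1:]] = None        # "!attr" leaves it unspecified
        elif "=" in token:
            key, value = token.split("=", 1)
            attrs[key] = value             # "filter=lfs" assigns a value
        else:
            attrs[token] = True            # bare name sets the attribute
    return pattern, attrs


pattern, attrs = parse_gitattributes_line(
    "qwen3-4b-thinking-hf/tokenizer.json filter=lfs diff=lfs merge=lfs -text"
)
```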
deepseek-r1-1.5b-gunary/config.json ADDED
@@ -0,0 +1,14 @@
+ {
+ "hidden_size": 1536,
+ "intermediate_size": 8960,
+ "num_attention_heads": 12,
+ "num_key_value_heads": 2,
+ "num_hidden_layers": 28,
+ "vocab_size": 151936,
+ "head_dim": 128,
+ "rope_theta": 1000000.0,
+ "rms_norm_eps": 1e-06,
+ "n_planes": 7,
+ "group_size": 32,
+ "quant_type": "unary_group"
+ }
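The config values can be cross-checked against the file sizes reported elsewhere in this diff. The sketch below assumes that a `.sign` file stores one bit per weight, that a `.planes` file stores `n_planes` bits per weight, and that `.fp16` files store two bytes per element. The repository does not state this layout, but the byte counts it produces match the LFS pointer sizes shown here (294912, 2064384 and 466747392). `expected_sizes` is a hypothetical helper, not the author's code.

```python
# Values copied from deepseek-r1-1.5b-gunary/config.json in this commit.
config = {
    "hidden_size": 1536,
    "num_attention_heads": 12,
    "num_key_value_heads": 2,
    "vocab_size": 151936,
    "head_dim": 128,
    "n_planes": 7,
    "group_size": 32,
}


def expected_sizes(out_features: int, in_features: int, cfg: dict) -> dict:
    """Byte sizes under the ASSUMED layout: 1 bit per weight per bit-plane."""
    n_weights = out_features * in_features
    return {
        "sign_bytes": n_weights // 8,
        "planes_bytes": cfg["n_planes"] * n_weights // 8,
        "n_groups": n_weights // cfg["group_size"],
    }


hidden = config["hidden_size"]
q_out = config["num_attention_heads"] * config["head_dim"]   # 1536
kv_out = config["num_key_value_heads"] * config["head_dim"]  # 256

q_proj = expected_sizes(q_out, hidden, config)
k_proj = expected_sizes(kv_out, hidden, config)
embed_fp16_bytes = config["vocab_size"] * hidden * 2  # 2 bytes per fp16 value
```

Under the same assumption, `k_proj` has 12288 groups of 32 weights. The 49.2 kB reported for each `k_proj` `.gscales` file would be consistent with four bytes per group, although the diff only shows a rounded size, so that remains a guess.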
deepseek-r1-1.5b-gunary/lm_head_weight.fp16 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:cca68befcc8201afc0eb54623dd20bd2af92acfe3cff767e6f8e6c0ddad2a397
+ size 466747392
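The three added lines are a Git LFS pointer: the repository stores this small text stub, while the actual weights live in LFS storage and are identified by the SHA-256 `oid` and the byte `size`. A minimal sketch of reading such a pointer follows; `parse_lfs_pointer` is a hypothetical helper, and it assumes the simple `key value` line format shown above with no extension keys.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its version, hash and size fields.

    Hypothetical helper; assumes one "key value" pair per line.
    """
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    hash_algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "oid": digest,
        "size": int(fields["size"]),  # size of the real object in bytes
    }


pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:cca68befcc8201afc0eb54623dd20bd2af92acfe3cff767e6f8e6c0ddad2a397\n"
    "size 466747392\n"
)
```

Comparing `pointer["size"]` with the size of a downloaded file is a cheap first check before hashing the whole object.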
deepseek-r1-1.5b-gunary/model_embed_tokens_weight.fp16 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e55610c68685326d482c594ff3bb16141e71a0d219fe729211562ab630953c6e
+ size 466747392
deepseek-r1-1.5b-gunary/model_layers_0_self_attn_k_proj_bias.fp16 ADDED
Binary file (512 Bytes). View file
 
deepseek-r1-1.5b-gunary/model_layers_0_self_attn_k_proj_weight.sign ADDED
Binary file (49.2 kB). View file
 
deepseek-r1-1.5b-gunary/model_layers_10_input_layernorm_weight.fp16 ADDED
Binary file (3.07 kB). View file
 
deepseek-r1-1.5b-gunary/model_layers_10_self_attn_k_proj_weight.gscales ADDED
Binary file (49.2 kB). View file
 
deepseek-r1-1.5b-gunary/model_layers_10_self_attn_q_proj_bias.fp16 ADDED
Binary file (3.07 kB). View file
 
deepseek-r1-1.5b-gunary/model_layers_11_self_attn_k_proj_bias.fp16 ADDED
Binary file (512 Bytes). View file
 
deepseek-r1-1.5b-gunary/model_layers_11_self_attn_k_proj_weight.sign ADDED
Binary file (49.2 kB). View file
 
deepseek-r1-1.5b-gunary/model_layers_11_self_attn_q_proj_bias.fp16 ADDED
Binary file (3.07 kB). View file
 
deepseek-r1-1.5b-gunary/model_layers_11_self_attn_v_proj_bias.fp16 ADDED
Binary file (512 Bytes). View file
 
deepseek-r1-1.5b-gunary/model_layers_12_input_layernorm_weight.fp16 ADDED
Binary file (3.07 kB). View file
 
deepseek-r1-1.5b-gunary/model_layers_12_self_attn_v_proj_weight.gscales ADDED
Binary file (49.2 kB). View file
 
deepseek-r1-1.5b-gunary/model_layers_13_input_layernorm_weight.fp16 ADDED
Binary file (3.07 kB). View file
 
deepseek-r1-1.5b-gunary/model_layers_13_self_attn_k_proj_bias.fp16 ADDED
Binary file (512 Bytes). View file
 
deepseek-r1-1.5b-gunary/model_layers_13_self_attn_v_proj_weight.gscales ADDED
Binary file (49.2 kB). View file
 
deepseek-r1-1.5b-gunary/model_layers_13_self_attn_v_proj_weight.sign ADDED
Binary file (49.2 kB). View file
 
deepseek-r1-1.5b-gunary/model_layers_14_self_attn_k_proj_weight.gscales ADDED
Binary file (49.2 kB). View file
 
deepseek-r1-1.5b-gunary/model_layers_14_self_attn_q_proj_bias.fp16 ADDED
Binary file (3.07 kB). View file
 
deepseek-r1-1.5b-gunary/model_layers_14_self_attn_v_proj_bias.fp16 ADDED
Binary file (512 Bytes). View file
 
deepseek-r1-1.5b-gunary/model_layers_15_self_attn_k_proj_weight.sign ADDED
Binary file (49.2 kB). View file
 
deepseek-r1-1.5b-gunary/model_layers_16_self_attn_k_proj_weight.sign ADDED
Binary file (49.2 kB). View file
 
deepseek-r1-1.5b-gunary/model_layers_16_self_attn_q_proj_bias.fp16 ADDED
Binary file (3.07 kB). View file
 
deepseek-r1-1.5b-gunary/model_layers_16_self_attn_q_proj_weight.sign ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e2b813fc3e9787f2d0bc03c73674ad124a10e24382e18b0b29acba670d9f8e33
+ size 294912
deepseek-r1-1.5b-gunary/model_layers_16_self_attn_v_proj_bias.fp16 ADDED
Binary file (512 Bytes). View file
 
deepseek-r1-1.5b-gunary/model_layers_16_self_attn_v_proj_weight.gscales ADDED
Binary file (49.2 kB). View file
 
deepseek-r1-1.5b-gunary/model_layers_17_input_layernorm_weight.fp16 ADDED
Binary file (3.07 kB). View file
 
deepseek-r1-1.5b-gunary/model_layers_17_post_attention_layernorm_weight.fp16 ADDED
Binary file (3.07 kB). View file
 
deepseek-r1-1.5b-gunary/model_layers_17_self_attn_q_proj_bias.fp16 ADDED
Binary file (3.07 kB). View file
 
deepseek-r1-1.5b-gunary/model_layers_17_self_attn_v_proj_weight.sign ADDED
Binary file (49.2 kB). View file
 
deepseek-r1-1.5b-gunary/model_layers_18_input_layernorm_weight.fp16 ADDED
Binary file (3.07 kB). View file
 
deepseek-r1-1.5b-gunary/model_layers_18_self_attn_k_proj_weight.sign ADDED
Binary file (49.2 kB). View file
 
deepseek-r1-1.5b-gunary/model_layers_19_self_attn_k_proj_weight.sign ADDED
Binary file (49.2 kB). View file
 
deepseek-r1-1.5b-gunary/model_layers_19_self_attn_q_proj_weight.planes ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a0b6dafcb3699083eab377e291ab7c7db974f7d4b5328d082e9d095667dc39d6
+ size 2064384
deepseek-r1-1.5b-gunary/model_layers_1_self_attn_q_proj_bias.fp16 ADDED
Binary file (3.07 kB). View file
 
deepseek-r1-1.5b-gunary/model_layers_1_self_attn_v_proj_weight.sign ADDED
Binary file (49.2 kB). View file
 
deepseek-r1-1.5b-gunary/model_layers_20_input_layernorm_weight.fp16 ADDED
Binary file (3.07 kB). View file
 
deepseek-r1-1.5b-gunary/model_layers_20_post_attention_layernorm_weight.fp16 ADDED
Binary file (3.07 kB). View file
 
deepseek-r1-1.5b-gunary/model_layers_20_self_attn_k_proj_bias.fp16 ADDED
Binary file (512 Bytes). View file
 
deepseek-r1-1.5b-gunary/model_layers_20_self_attn_k_proj_weight.sign ADDED
Binary file (49.2 kB). View file
 
deepseek-r1-1.5b-gunary/model_layers_21_post_attention_layernorm_weight.fp16 ADDED
Binary file (3.07 kB). View file
 
deepseek-r1-1.5b-gunary/model_layers_21_self_attn_k_proj_weight.gscales ADDED
Binary file (49.2 kB). View file
 
deepseek-r1-1.5b-gunary/model_layers_21_self_attn_v_proj_bias.fp16 ADDED
Binary file (512 Bytes). View file
 
deepseek-r1-1.5b-gunary/model_layers_21_self_attn_v_proj_weight.gscales ADDED
Binary file (49.2 kB). View file
 
deepseek-r1-1.5b-gunary/model_layers_22_self_attn_k_proj_weight.sign ADDED
Binary file (49.2 kB). View file
 
deepseek-r1-1.5b-gunary/model_layers_22_self_attn_q_proj_bias.fp16 ADDED
Binary file (3.07 kB). View file
 
deepseek-r1-1.5b-gunary/model_layers_22_self_attn_v_proj_weight.gscales ADDED
Binary file (49.2 kB). View file
 
deepseek-r1-1.5b-gunary/model_layers_22_self_attn_v_proj_weight.sign ADDED
Binary file (49.2 kB). View file