Sweaterdog committed on
Commit f997800 · verified · 1 Parent(s): 361291c

Update README.md

Files changed (1)
  1. README.md +3 -2
README.md CHANGED
@@ -41,8 +41,9 @@ Andy-3.5 also knows how to build / use !newAction to perform commands, it was tr

  # What models can I choose?

- There are going to be 2 *(maybe 3)* model sizes avaliable, Regular, Mini *(And possibly large)*
+ There are going to be 3 *(maybe 4)* model sizes avaliable, Regular, Small, and Mini *(possibly large)*
  * Regular is a 7B parameter model, tuned from [Deepseek-R1 Distilled](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B)
+ * Small is a 3B parameter model, tuned from [Qwen2.5 3B](Qwen/Qwen2.5-3B-Instruct)
  * Mini is a 1.5B parameter model, also tuned from [Deepseek-R1 Distilled](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B)
  * Large *(Might)* be a 32b parameter model, again tuned from [Deepseek-R1 Distilled](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B) *- This model may not exist,* ***ever***

@@ -70,7 +71,7 @@ The preview model of Andy-3.5, is Andy-3.5-teensy, a small model and tune with o
  I would not recommend Andy-3.5-teensy, I felt like making a joke, and a joke was made, *(The Andy-3.5-teensy model was a big hope, but it sucks, try out the q2_k model!)*


- When the full versions of Andy-3.5 and Andy-3.5-mini *(And possibly Andy-3.5-large)* release, they will both be trained on a context length of 32,000 to ensure proper usage during playing.
+ The full versions of Andy-3.5, Andy-3.5-small, Andy-3.5-mini *(And possibly Andy-3.5-large)* have been released, and they all have a context length of 32,000 to ensure proper usage during playing.

  # 🔥UPDATE🔥
