Sweaterdog/Andy-3.5

Sweaterdog committed on Feb 6, 2025

Commit f997800 · verified · 1 Parent(s): 361291c

Update README.md

Files changed (1)

README.md +3 -2

@@ -41,8 +41,9 @@ Andy-3.5 also knows how to build / use !newAction to perform commands, it was tr
 # What models can I choose?
-There are going to be 2 *(maybe 3)* model sizes available: Regular and Mini *(and possibly Large)*
 * Regular is a 7B parameter model, tuned from [Deepseek-R1 Distilled](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B)
 * Mini is a 1.5B parameter model, also tuned from [Deepseek-R1 Distilled](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B)
 * Large *(Might)* be a 32b parameter model, again tuned from [Deepseek-R1 Distilled](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B) *- This model may not exist,* ***ever***
@@ -70,7 +71,7 @@ The preview model of Andy-3.5, is Andy-3.5-teensy, a small model and tune with o
 I would not recommend Andy-3.5-teensy, I felt like making a joke, and a joke was made, *(The Andy-3.5-teensy model was a big hope, but it sucks, try out the q2_k model!)*
-When the full versions of Andy-3.5 and Andy-3.5-mini *(And possibly Andy-3.5-large)* release, they will both be trained on a context length of 32,000 to ensure proper usage during playing.
 # 🔥UPDATE🔥

 # What models can I choose?
+There are going to be 3 *(maybe 4)* model sizes available: Regular, Small, and Mini *(and possibly Large)*
 * Regular is a 7B parameter model, tuned from [Deepseek-R1 Distilled](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B)
+* Small is a 3B parameter model, tuned from [Qwen2.5 3B](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct)
 * Mini is a 1.5B parameter model, also tuned from [Deepseek-R1 Distilled](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B)
 * Large *(Might)* be a 32b parameter model, again tuned from [Deepseek-R1 Distilled](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B) *- This model may not exist,* ***ever***
 I would not recommend Andy-3.5-teensy, I felt like making a joke, and a joke was made, *(The Andy-3.5-teensy model was a big hope, but it sucks, try out the q2_k model!)*
+The full versions of Andy-3.5, Andy-3.5-small, and Andy-3.5-mini *(and possibly Andy-3.5-large)* have been released, and they all have a context length of 32,000 to ensure proper usage during play.
 # 🔥UPDATE🔥