---
base_model:
- unsloth/Qwen2.5-7B-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2
- trl
license: apache-2.0
language:
- en
datasets:
- Sweaterdog/Andy-v3.5-Beta
---

# Uploaded models

- **Developed by:** Sweaterdog
- **License:** apache-2.0
- **Finetuned from model:** unsloth/Qwen2.5-7B-bnb-4bit

The MindCraft LLM tuning CSV file can be found at [MindCraft-LLM](https://huggingface.co/datasets/Sweaterdog/Andy-v3.5-Beta); it can be tweaked as needed.
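
If you want to inspect or filter the dataset before tweaking it, here is a minimal sketch using the `datasets` library (the split name and the default loader are assumptions; adjust if the repo is laid out differently):

```python
from datasets import load_dataset

# Load the tuning data straight from the Hugging Face Hub.
# Assumes the repo resolves with the default CSV/auto loader.
ds = load_dataset("Sweaterdog/Andy-v3.5-Beta", split="train")

print(ds)     # column names and row count
print(ds[0])  # first example, e.g. to check prompt formatting
```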

# This is a very, very early-access Beta Model

This model is NOT a final version; it is a test of how well a model can perform with a small dataset. The dataset is also an experiment in how far smaller models can be improved by extremely high-quality examples that stay as close to real-world scenarios as possible.

This small dataset finally allows the model to code and to store history, though of course the crux of the dataset is the playing part.

The memory-storage examples are real, taken from in-game interactions.

The coding examples are artificial and were generated by GPT-o1, with the instruction to include reasoning and thinking in the code comments.

The playing examples are also artificial: they were written by me, a human, using prompts that focus on points where some models fail, such as mining.

This model should not be taken as a reflection of how smaller models play Minecraft. If it performs well, and better than Andy-v2-qwen, then yay! If not, I wasn't expecting it to be better (and neither should you!).

You are totally allowed to test the beta model.
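
If you want to try it, here is a minimal loading sketch with `transformers`. The repo id is a hypothetical placeholder (this card does not state the final id), so substitute the actual model id for this upload:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id for this beta upload; replace with the real one.
model_id = "Sweaterdog/Andy-3.5-Beta"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Qwen2.5-based models ship a chat template, so apply_chat_template should work.
messages = [{"role": "user", "content": "You spawn in a forest. What do you do first?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

out = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```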

I hope this model performs well for you!

**ALSO**

The models are going to change: I am adjusting the tuning hyperparameters to *(hopefully)* increase performance and decrease hallucinations.

*BTW, if you want to download this model, I suggest using llama.cpp to make a quantization of it. I would have done this during tuning, but I ran out of GPU time on Google Colab.*

Attempt 4 of Andy-3.5 is out: I pruned the dataset, keeping only the highest-quality prompts.