This is a fine-tune of Qwen3-4B-Instruct-2507 on a dataset made up entirely of Linux questions and answers, curated from several public datasets.
For now this is very basic. I'm just testing whether this will even work, so don't expect anything amazing!
Eventually I plan to build a Linux-expert model that can help beginners, but this model is not that.
The dataset consists mostly of Qwen3-235B-A22B and GPT-5 responses, plus a small number from GPT-4.
I'll see about releasing the dataset once it's actually decent.
Credit to the Qwen team for making the original model:
@misc{qwen3technicalreport,
  title={Qwen3 Technical Report},
  author={Qwen Team},
  year={2025},
  eprint={2505.09388},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2505.09388},
}