About:

This is a fine-tune of Qwen3-4B-Instruct-2507 on a dataset consisting entirely of Linux questions and answers curated from a bunch of random public datasets.

For now this is very basic, I'm just testing to see if this'll even work; don't expect anything amazing!

I plan on trying to create some sort of Linux-expert model that can help beginners - this model is not that...

The dataset mostly consists of Qwen3-235B-A22B and GPT-5 responses + a bit of GPT-4.

I'll see about releasing the dataset once it's actually decent.

-- Credit to the Qwen team for making the original model:

@misc{qwen3technicalreport,
      title={Qwen3 Technical Report}, 
      author={Qwen Team},
      year={2025},
      eprint={2505.09388},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2505.09388}, 
}

Downloads last month: 7

Safetensors

Model size

4B params

Tensor type

BF16

Model tree for fish-is-fishin/Qwen3-4B-2507-LinuxNerd-V1

Quantizations

1 model

Paper for fish-is-fishin/Qwen3-4B-2507-LinuxNerd-V1

Qwen3 Technical Report

Paper • 2505.09388 • Published May 14, 2025 • 339