---
base_model: unsloth/Qwen3-14B-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- qwen3
- gguf
license: apache-2.0
language:
- en
datasets:
- mugivara1/convex-reasoning-new-train
- mugivara1/convex-reasoning
---

# Qwisine 14B

<!-- Smaller mascot image -->

<div align="center">
<img src="https://i.imgur.com/JbfjKSy.png" alt="Qwisine Mascot" style="width:140px; margin-bottom:20px;" />
</div>

<!-- Banner with logo and evaluation chart -->

<div align="center">
<img src="https://i.imgur.com/AxRWCK3.png" alt="Qwisine Banner" style="width:100%; height:350px; object-fit: cover; margin-bottom:20px;" />
</div>

---

## Model details

| Field | Description |
| ----------------- | --------------------------------------------------------------------------------------------------------------- |
| **Base model**    | [Qwen‑3-14B](https://huggingface.co/Qwen/Qwen3-14B) (pre‑trained) |
| **Fine‑tuned by** | *Mugi* |
| **Task**          | Question‑Answering & Code Generation for the [Convex](https://convex.dev) TypeScript backend/database framework |
| **Language(s)**   | English (developer‑oriented) |
| **License**       | Apache‑2.0 |
| **Model name**    | **Qwisine** |

Qwisine is a specialised version of Qwen‑3 fine‑tuned on curated Convex documentation, synthetic code, and community Q&A. The model understands Convex‑specific concepts (data modelling, mutations, actions, idioms, client usage, etc.) and can generate code snippets or explain behaviour in plain English.

---

## Intended use & limitations

**Primary use‑case**

* Conversational assistant for developers building on Convex.
* Drafting and answering Convex‑oriented questions and tasks.
* Documentation chatbots or support assistants.

**Out‑of‑scope**

* Production‑critical decision making without human review.

---

## Dataset

* **Size**: 938 Q&A pairs
* **Source**: Convex official docs, example apps, public issues, community Discord, and synthetic edge‑cases.
* **Question types** (distilled)
  * `what_is` – factual look‑ups (no reasoning)
  * `why` – causal explanations (no reasoning)
  * `task` – recipe‑style how‑to (with reasoning)
  * `edge_case` – tricky or undocumented scenarios (with reasoning)
  * `v‑task` – verbose multi‑step tasks (with reasoning)

Reasoning‑bearing examples represent ~85 % of the dataset.
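The taxonomy above can be sketched as a small lookup table. The type names and reasoning flags come straight from this card; the dict layout itself is illustrative, not the dataset's real schema:

```python
# Illustrative sketch of the question-type taxonomy described above.
# Type names and reasoning flags are from this card; the dict layout
# itself is hypothetical, not the dataset's actual schema.
QUESTION_TYPES = {
    "what_is":   {"description": "factual look-ups",              "reasoning": False},
    "why":       {"description": "causal explanations",           "reasoning": False},
    "task":      {"description": "recipe-style how-to",           "reasoning": True},
    "edge_case": {"description": "tricky/undocumented scenarios", "reasoning": True},
    "v-task":    {"description": "verbose multi-step tasks",      "reasoning": True},
}

# Three of the five types carry reasoning traces; by example count they
# make up roughly 85 % of the 938 pairs.
reasoning_types = [name for name, meta in QUESTION_TYPES.items() if meta["reasoning"]]
print(reasoning_types)
```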

---

## Training procedure (details to be added later; I ran and experimented with MANY runs 😭😭😭😭)

* **Epochs**: TBD
* **Batch size**: TBD
* **LR / schedule**: TBD
* **Optimizer**: TBD

Fine‑tuning followed standard QLoRA with Unsloth. No additional RLHF was applied.

---

## Evaluation results

| Category | **Think** mode | **Non‑Think** mode |
| -------------- | -------------- | ------------------ |
| Fundamentals   | **75.05 %**    | 73.44 % |
| Data modelling | 82.82 %        | **87.36 %** |
| Queries        | **74.38 %**    | 74.19 % |
| Mutations      | 71.04 %        | **73.59 %** |
| Actions        | **63.05 %**    | 49.27 % |
| Idioms         | 75.06 %        | 75.06 % |
| Clients        | 69.84 %        | 69.84 % |
| **Average**    | **73.03 %**    | 71.82 % |

<sub>Bold marks the higher score in each row.</sub>
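The overall figures in the table are unweighted means over the seven categories; a quick check reproduces both averages:

```python
# Per-category scores from the evaluation table above.
think     = [75.05, 82.82, 74.38, 71.04, 63.05, 75.06, 69.84]
non_think = [73.44, 87.36, 74.19, 73.59, 49.27, 75.06, 69.84]

# Unweighted mean, rounded to two decimals like the table.
avg_think     = round(sum(think) / len(think), 2)          # 73.03
avg_non_think = round(sum(non_think) / len(non_think), 2)  # 71.82
print(avg_think, avg_non_think)
```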

---

### Think Mode

| Parameter | Value | Notes |
| ------------- | ----- | ------------------------------- |
| `temperature` | 0.6 | Reasoned answers with structure |
| `top_p` | 0.95 | Broader sampling pool |
| `top_k` | 20 | |
| `min_p` | 0 | |

### Non‑Think Mode

| Parameter | Value | Notes |
| ------------- | ----- | --------------------------------- |
| `temperature` | 0.7 | More diversity for simple prompts |
| `top_p` | 0.8 | Slightly tighter sampling |
| `top_k` | 20 | |
| `min_p` | 0 | |

<sub>*Adjust as needed for your deployment; these were used in LM Studio during evaluation.*</sub>
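For convenience, the two presets can be written down as plain config dicts; the key names follow common llama.cpp / LM Studio sampler settings and may need renaming for other runtimes:

```python
# Sampling presets from the tables above. Key names mirror common
# llama.cpp / LM Studio sampler settings; rename as your runtime requires.
THINK_MODE     = {"temperature": 0.6, "top_p": 0.95, "top_k": 20, "min_p": 0.0}
NON_THINK_MODE = {"temperature": 0.7, "top_p": 0.80, "top_k": 20, "min_p": 0.0}
print(THINK_MODE, NON_THINK_MODE)
```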

---

## How to run locally

```bash
# LM Studio: search for "Qwisine" in the models menu.

# Ollama: instructions will be added soon.

# llama.cpp: instructions will be added soon.
```

---

## Limitations & biases

* Training data is entirely Convex‑centred; the model may hallucinate on topics outside Convex.
* The dataset size is modest (938 samples); edge‑case coverage is still incomplete, as is handling of more complex prompts such as creating a project from scratch with multiple steps and instructions.

---

## Future work

*Not sure yet.*

---

## Citation

```bibtex
@misc{qwisine2025,
  title  = {Qwisine: A Qwen-3 model fine-tuned for Convex},
  author = {mugi},
  year   = {2025},
  url    = {https://huggingface.co/mugivara1/Qwisine},
}
```

---

## Acknowledgements

*(To be completed)*

Convex • Qwen‑3 • ...