Meet7 0.6B — Experimental

A continued fine-tune of Meet7 0.6B, trained at a lower learning rate on the same 600-sample dataset. It trades Meet7's sharp BoolQ spike for more even commonsense and reasoning gains, improving on the base model across all six benchmarks.

Benchmarks

0-shot evaluation; all scores are acc_norm (normalized accuracy).

Task	Qwen3-0.6B (Base)	Meet7 0.6B	Experimental	Experimental Δ vs Base (percentage points)
BoolQ	0.3798	0.5554	0.3991	+1.93
ARC Easy	0.3384	0.3952	0.3965	+5.81
ARC Challenge	0.2841	0.3285	0.3259	+4.18
HellaSwag	0.3981	0.4205	0.4265	+2.84
PIQA	0.6338	0.6583	0.6687	+3.49
Winogrande	0.5225	0.5201	0.5304	+0.79
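
The delta column is the absolute difference between the Experimental and base scores, expressed in percentage points rather than as a relative change. A minimal Python sketch of that arithmetic follows; the scores are copied from the table, while the names `SCORES` and `delta_pp` are illustrative and not part of any released tooling.

```python
# Scores copied from the benchmark table above (0-shot, acc_norm).
# Tuple order: (base, Meet7 0.6B, Experimental).
SCORES = {
    "BoolQ": (0.3798, 0.5554, 0.3991),
    "ARC Easy": (0.3384, 0.3952, 0.3965),
    "ARC Challenge": (0.2841, 0.3285, 0.3259),
    "HellaSwag": (0.3981, 0.4205, 0.4265),
    "PIQA": (0.6338, 0.6583, 0.6687),
    "Winogrande": (0.5225, 0.5201, 0.5304),
}


def delta_pp(base: float, tuned: float) -> float:
    """Absolute difference between two accuracies, in percentage points."""
    return round((tuned - base) * 100, 2)


if __name__ == "__main__":
    for task, (base, _meet7, experimental) in SCORES.items():
        print(f"{task:<14} {delta_pp(base, experimental):+.2f} pp")
```

Reading the deltas as percentage points matters: BoolQ moves from 0.3798 to 0.3991, which is 1.93 points, not a 1.93% relative improvement.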

What these measure

BoolQ — Reading comprehension and yes/no factual grounding
ARC Easy / Challenge — Grade-school science reasoning; Challenge is the retrieval-resistant subset
HellaSwag — Commonsense sentence completion
PIQA — Physical world intuition
Winogrande — Commonsense pronoun resolution

vs Meet7 0.6B

This model is more balanced than Meet7. It outperforms Meet7 on HellaSwag, PIQA, and Winogrande (the physical and commonsense intuition tasks) and is roughly level with it on ARC Easy and ARC Challenge, at the cost of Meet7's large BoolQ advantage (0.5554 vs 0.3991). If you need consistent commonsense reasoning, prefer this model. If yes/no QA is your primary use case, prefer Meet7.
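
To make the trade-off concrete, the sketch below lists the per-task gap between the two fine-tunes and the tasks where this model leads. It uses only the scores reported in the benchmark table; the names `PAIRS`, `gap_pp`, and `experimental_leads` are hypothetical helpers introduced for illustration.

```python
# (Meet7 0.6B, Experimental) acc_norm scores from the benchmark table.
PAIRS = {
    "BoolQ": (0.5554, 0.3991),
    "ARC Easy": (0.3952, 0.3965),
    "ARC Challenge": (0.3285, 0.3259),
    "HellaSwag": (0.4205, 0.4265),
    "PIQA": (0.6583, 0.6687),
    "Winogrande": (0.5201, 0.5304),
}


def gap_pp(meet7: float, experimental: float) -> float:
    """Experimental minus Meet7, in percentage points."""
    return round((experimental - meet7) * 100, 2)


def experimental_leads(pairs: dict) -> list:
    """Tasks where the Experimental score is strictly higher than Meet7's."""
    return [task for task, (meet7, exp) in pairs.items() if exp > meet7]


if __name__ == "__main__":
    for task, (meet7, exp) in PAIRS.items():
        print(f"{task:<14} {gap_pp(meet7, exp):+.2f} pp")
    print("Experimental leads on:", ", ".join(experimental_leads(PAIRS)))
```

On these numbers the Experimental model leads on four of the six tasks, but every lead is close to one percentage point or less, while Meet7's BoolQ lead is more than fifteen points. That asymmetry is why the recommendation depends on whether yes/no QA dominates your workload.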

Model Details


Developed by	Ma7ee7
License	Apache-2.0
Base model	Ma7ee7/Meet7_0.6b
Original base	unsloth/Qwen3-0.6B-unsloth-bnb-4bit
Training samples	600
Training	Continued LoRA fine-tune, lower LR