bananafish-522
Release: May 2025
Base Model: Qwen3
Type: Instruction-tuned GLT (General Language Transformer)
Overview
bananafish-522 is an update over bananafish-0517, trained for 5,000 steps on the 49,000-row ITP dataset. That is roughly one-fifth of a full epoch, i.e., only a partial pass over the data. To improve coverage, the dataset was shuffled so that samples from later parts were still seen during training.
What's Improved
- Stronger coherence
- Fewer output artifacts
- More consistent instruction-following
- Better formatting across responses
- More usable as a base for creative or chat fine-tuning
Remaining Issues
Post-response artifacts
Sometimes, the model appends:
- A single word in a foreign language
- A single Chinese character repeated until max tokens
These artifacts are much rarer than in bananafish-0517. Likely causes:
- Incomplete training (only 5,000 steps)
- Possibly improper EOS handling (generation continuing past <|im_end|>)
To address this, <|im_end|> was explicitly added as the EOS token in the generation config.
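For reference, a minimal sketch of that fix using the transformers GenerationConfig API; the repo id below is illustrative, not a confirmed Hub path:

```python
# Hedged sketch: register <|im_end|> as the EOS token in the saved
# generation config so decoding halts there. Repo id is illustrative.
from transformers import AutoTokenizer, GenerationConfig

repo_id = "bananafish-522"  # hypothetical Hub/local path
tokenizer = AutoTokenizer.from_pretrained(repo_id)
gen_config = GenerationConfig.from_pretrained(repo_id)

# Map <|im_end|> to its token id and set it as EOS.
gen_config.eos_token_id = tokenizer.convert_tokens_to_ids("<|im_end|>")
gen_config.save_pretrained(repo_id)  # rewrites generation_config.json
```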
High hallucination rate
The model hallucinates frequently, especially for anything dated 2024 or later. Always verify facts before use.
Purpose
bananafish-522 was built as a clean instruction-following base model without Qwen3's "thinking toggle." It aims to support creative writing and roleplay fine-tuning where chain-of-thought generation is unwanted or intrusive.
Training Details
- Dataset: ITP (Instruction Tuning Public)
- Size: 49,000 examples
- Steps: 5,000
- Coverage: ~1/5 of a full epoch
- Shuffled: Yes, so that later samples also appeared early in training (see the sketch after this list)
- Contents: Instructions, chat, Q&A, STEM, writing, etc.
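As an illustration, the shuffling step might look like this with the Hugging Face datasets library; the dataset identifier is hypothetical, since ITP's actual location is not given here:

```python
# Illustrative sketch: shuffle before a partial-epoch run so the 5,000
# training steps draw from all 49k rows, not just the first fifth.
from datasets import load_dataset

ds = load_dataset("itp-instruction-tuning-public", split="train")  # hypothetical id
ds = ds.shuffle(seed=42)  # without this, later rows would never be seen
```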
Compared to the older 10k-row NoRobots dataset, ITP is larger and more diverse, and it showed better alignment even with fewer total steps. The difference came from dataset quality, not scale.
Usage
Use in any transformer-compatible chat interface. Make sure to stop generation at <|im_end|>.
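For example, a minimal inference sketch with transformers that stops at <|im_end|>; the repo id is illustrative:

```python
# Hedged sketch: direct inference, halting decoding at <|im_end|>.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "bananafish-522"  # hypothetical Hub/local path
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the capital of France?"},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
output = model.generate(
    inputs,
    max_new_tokens=64,
    eos_token_id=tokenizer.convert_tokens_to_ids("<|im_end|>"),
)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```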
Chat Template
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
What's the capital of France?<|im_end|>
<|im_start|>assistant
Generation should stop at the next <|im_end|> token.
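Equivalently, the prompt above can be assembled by hand; a small sketch:

```python
# Sketch: build the ChatML-style prompt shown above by hand.
def build_prompt(system: str, user: str) -> str:
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"  # the model completes from here
    )

print(build_prompt("You are a helpful assistant.", "What's the capital of France?"))
```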
Notes
- This is still a proof of concept
- Only partially trained; fine-tuning for specific tasks is recommended
- A full version trained on all 49k rows with style tuning is planned