bananafish-522

Release: May 2025
Base Model: Qwen3
Type: Instruction-tuned GLT (General Language Transformer)


Overview

bananafish-522 is an update over bananafish-0517, trained for 5,000 steps on the 49,000-row ITP dataset. That amounts to roughly one-fifth of a full epoch, so only a partial pass over the data. To improve coverage, the dataset was shuffled before training, ensuring samples from later parts of the dataset were still seen.


What's Improved

  • Stronger coherence
  • Fewer output artifacts
  • More consistent instruction-following
  • Better formatting across responses
  • More usable as a base for creative or chat fine-tuning

Remaining Issues

  • Post-response artifacts
    Occasionally, the model appends:

    • A single word in a foreign language
    • A single Chinese character repeated until the max-token limit is hit

    These artifacts are much rarer than in bananafish-0517. Likely causes:

    • Incomplete training (only 5,000 steps)
    • Improper EOS handling (generating past <|im_end|>)

    To address this, <|im_end|> was explicitly set as the EOS token in the generation config (see the sketch after this list).

  • High hallucination rate
    Especially for anything dated 2024 or later. Always verify facts before use.
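
A minimal sketch of that EOS fix, assuming the checkpoint loads with the standard Hugging Face transformers API:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "marcuscedricridia/bananafish-522"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Register <|im_end|> as the EOS token so generation halts at the
# turn boundary instead of drifting into stray tokens.
im_end_id = tokenizer.convert_tokens_to_ids("<|im_end|>")
model.generation_config.eos_token_id = im_end_id
tokenizer.eos_token = "<|im_end|>"  # keep tokenizer and config in sync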


Purpose

bananafish-522 was built as a clean instruction-following base model without Qwen3's "thinking toggle." It aims to support creative writing and roleplay fine-tuning where chain-of-thought generation is unwanted or intrusive.
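
For context, stock Qwen3 exposes this toggle through its chat template's enable_thinking argument. A hedged illustration of the difference, assuming the standard Qwen3 tokenizer (loaded here for comparison only):

from transformers import AutoTokenizer

# Stock Qwen3 tokenizer, for comparison only.
qwen_tok = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B")

# With stock Qwen3, thinking is toggled per request via the template:
prompt = qwen_tok.apply_chat_template(
    [{"role": "user", "content": "Write a short scene."}],
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,  # Qwen3-specific flag
)
# bananafish-522 drops this switch; prompts are plain ChatML turns.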


Training Details

  • Dataset: ITP (Instruction Tuning Public)
  • Size: 49,000 examples
  • Steps: 5,000
  • Coverage: ~1/5 of a full epoch
  • Shuffled: Yes, so later samples are seen early in training (see the sketch at the end of this section)
  • Contents: Instructions, chat, Q&A, STEM, writing, etc.

Compared to the older 10k-row NoRobots dataset, ITP is larger, more diverse, and showed better alignment, even with fewer total steps. The difference came from dataset quality, not scale.
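
A minimal sketch of the shuffle-then-partial-epoch setup described above, using the datasets and transformers libraries (the data file path is illustrative; ITP's actual location is not specified here):

from datasets import load_dataset
from transformers import TrainingArguments

# Illustrative local path; substitute the real ITP data file.
dataset = load_dataset("json", data_files="itp.jsonl", split="train")

# Shuffle so rows from late in the file are seen even though training
# covers only ~1/5 of an epoch.
dataset = dataset.shuffle(seed=42)

# Cap training at 5,000 optimizer steps rather than a full epoch.
args = TrainingArguments(
    output_dir="bananafish-522",
    max_steps=5000,
)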


Usage

Use it in any transformers-compatible chat interface, and make sure to stop generation at <|im_end|>.

Chat Template

<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
What's the capital of France?<|im_end|>
<|im_start|>assistant

Generation should stop at the next <|im_end|> token.
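
A minimal end-to-end generation sketch with transformers, assuming the checkpoint follows the ChatML template shown above:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "marcuscedricridia/bananafish-522"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float32)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the capital of France?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

# Stop at <|im_end|> so the post-response artifacts noted above are cut off.
output_ids = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=tokenizer.convert_tokens_to_ids("<|im_end|>"),
)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))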


Notes

  • This is still a proof of concept
  • Only partially trained; fine-tuning for specific tasks is recommended
  • A full version trained on all 49k rows with style tuning is planned


Model Details

  • Parameters: 0.6B
  • Tensor type: F32
  • Format: Safetensors