metadata
base_model: unsloth/gemma-3n-e4b-it-unsloth-bnb-4bit
tags:
  - text-generation-inference
  - transformers
  - unsloth
  - gemma3n
license: apache-2.0
language:
  - en
datasets:
  - chimbiwide/RolePlay-NPC

Gemma3NPC-it-beta

A test model with less conservative training parameters

As mentioned in our original article, we used very conservative training parameters for Gemma3NPC.

Since then, we have wanted to test how the model performs with less conservative training parameters.

So we present Gemma3NPC-it-beta.

Check out our training notebook here


Training parameters compared to Gemma3NPC-it

| Parameter | Gemma3NPC-it | Gemma3NPC-it-beta |
|---|---|---|
| Learning Rate | 2e-5 | 2.5e-5 (+25%) |
| Warmup Steps | 800 | 100 |
| Gradient Clipping | 0.4 | 1.0 |
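
To give a rough feel for what these differences mean in practice, here is a minimal pure-Python sketch. It rests on two assumptions that this card does not state: that warmup is linear from zero to the peak learning rate, and that "gradient clipping" means clipping by global gradient norm. The dictionary keys and the helper names `warmup_lr` and `clip_scale` are illustrative only and are not taken from the training notebook.

```python
# Values from the comparison table; the key names are illustrative labels.
CONFIGS = {
    "Gemma3NPC-it": {"learning_rate": 2e-5, "warmup_steps": 800, "max_grad_norm": 0.4},
    "Gemma3NPC-it-beta": {"learning_rate": 2.5e-5, "warmup_steps": 100, "max_grad_norm": 1.0},
}


def warmup_lr(step, peak_lr, warmup_steps):
    """Learning rate at `step`, ASSUMING linear warmup from 0 to peak_lr.

    Any decay after warmup is deliberately left out of this sketch.
    """
    if step >= warmup_steps:
        return peak_lr
    return peak_lr * step / warmup_steps


def clip_scale(grad_norm, max_grad_norm):
    """Factor the gradients are multiplied by, ASSUMING norm-based clipping."""
    if grad_norm <= max_grad_norm:
        return 1.0
    return max_grad_norm / grad_norm


if __name__ == "__main__":
    for name, cfg in CONFIGS.items():
        lr = warmup_lr(100, cfg["learning_rate"], cfg["warmup_steps"])
        # 0.8 is an arbitrary example gradient norm, not a measured value.
        scale = clip_scale(0.8, cfg["max_grad_norm"])
        print(f"{name}: lr at step 100 = {lr:.2e}, clip scale = {scale:.2f}")
```

Under these assumptions, the beta configuration reaches its full learning rate by step 100, while the original is still at one eighth of its peak at that point. A gradient with norm 0.8 would also pass through unscaled in the beta run but be halved in the original.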

Here is a graph of the training loss per step, recorded every 10 steps:

chart