---
base_model: unsloth/gemma-3-1b-it
tags:
- text-generation-inference
- transformers
- unsloth
- gemma3_text
license: gemma
language:
- en
---
# Gemma3NPC-1b

**A new attempt at training Gemma3NPC.**

***TensorBoard data is available!***

---
|
It's been a while since the last Gemma3NPC model release; in the meantime we were working on some other models like [GemmaThink](https://huggingface.co/collections/chimbiwide/gemmathink).

Now we are back with the newest **Gemma3NPC-1b**, trained on our [RolePlay-NPCv2](https://huggingface.co/datasets/chimbiwide/RolePlay-NPCv2) dataset.

---
|
### Training Parameters

We trained this model as a rank-32 LoRA adapter for two epochs over `RolePlay-NPCv2` on an 80 GB A100 in Google Colab. For this run, we used a learning rate of `2e-5`, a total batch size of 8 with 4 gradient accumulation steps, a cosine learning rate scheduler with a 150-step warmup, and gradient clipping at 1.0.
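
For reference, the run configuration can be collected in one place. This is a minimal sketch, assuming TRL `SFTConfig` / PEFT `LoraConfig`-style key names; the exact names and any values not listed on this card are in the linked training notebook:

```python
# Hyperparameters from the run described above, grouped as plain dicts.
# Key names follow trl.SFTConfig / peft.LoraConfig conventions; see the
# training notebook for the exact setup.

lora_config = {
    "r": 32,  # LoRA adapter rank
}

training_config = {
    "num_train_epochs": 2,
    "learning_rate": 2e-5,
    "total_batch_size": 8,            # total batch size as reported above
    "gradient_accumulation_steps": 4,
    "lr_scheduler_type": "cosine",
    "warmup_steps": 150,
    "max_grad_norm": 1.0,             # gradient clipping
}
```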
|
Check out our training notebook [here](https://github.com/chimbiwide/Gemma3NPC/blob/main/Training/Gemma3NPC_1b.ipynb).

---
|
### Changes & Performance

With this new 1b model, we used much more aggressive training parameters and added some NSFW data to experiment with the results. We noticed a few really interesting responses:

- **There seems to be some sign of "reasoning"**





- **The model is less likely to break out of character**

This is something for users to explore for themselves; remember to provide a roleplaying prompt first!
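
One simple way to provide that roleplay prompt is to make it the opening turn of the conversation. A minimal sketch (the character card below is invented for illustration; pass the resulting messages to `tokenizer.apply_chat_template` as usual):

```python
def build_roleplay_messages(persona: str, opening_line: str) -> list[dict]:
    """Build a chat history that establishes the character before play begins.

    The persona goes into the first user turn, which works even with chat
    templates that have no dedicated system role.
    """
    return [
        {"role": "user",
         "content": f"Roleplay as the following character and stay in character:\n{persona}"},
        {"role": "assistant",
         "content": "Understood. I will stay in character."},
        {"role": "user", "content": opening_line},
    ]

# Hypothetical character card, for illustration only.
messages = build_roleplay_messages(
    persona="Mira, a wary blacksmith in a mountain village who distrusts strangers.",
    opening_line="Good morning! Could you repair this broken sword?",
)
```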
|
---

### Future Work

Now we will be focusing on further improving Gemma3NPC, not just through training parameters:

1. Better data (most of our data is old and needs an update), either collected or synthetically generated.
2. Better and new models: expanding beyond the Gemma3 model family, with a Qwen3-based model as our next goal.
3. Adding GRPO to the training loop.

These improvements serve our ultimate goal of creating a small agentic NPC model with good RP quality and tool calling for dynamic in-game interactions.

We also plan to create some sort of Unity game demo; it's on its way.
|