deltakitsune/squishy

Safetensors

English

llama


deltakitsune committed on Apr 16

Commit c88502e (verified) · 1 Parent(s): aa82703

Upload README.md with huggingface_hub

Files changed (1)

README.md +61 -22


@@ -1,32 +1,71 @@
-# TinyStories-150M-Starter
-This is a small decoder-only language model exported from the Kitsune Fine-Tuning Suite `Create` pipeline.
-## What This Export Is
-- Architecture family for tooling compatibility: `LlamaForCausalLM`
-- Original project type: manual-first small transformer scaffold
-- Export source checkpoint: `final.pt`
-## Notes
-This export is meant to behave like a normal Hugging Face causal LM checkpoint so it can be loaded by standard tooling.
-It was not originally trained as an off-the-shelf Llama release; it was mapped into a Llama-compatible layout for portability.
-## Training Context
-- Dataset ID: `roneneldan/TinyStories`
-- Dataset Path: `None`
-- Train Split: `train`
-- Max Steps: `20000`
-- Learning Rate: `0.0003`
-## Example
-```python
-from transformers import AutoModelForCausalLM, AutoTokenizer
-model_id = "hf_llama"
-tokenizer = AutoTokenizer.from_pretrained(model_id)
-model = AutoModelForCausalLM.from_pretrained(model_id)
-```

+### Squishy 150m
+![](/assets/fauxpaslife/squishy-150m-TinyStories/36c44ed6-cc72-47b3-b70c-9af2b33d3e08)
+Created and trained from scratch locally. Its framework sets up the architecture for future models and specialists. This is a **COMPLETION** model: you start a sentence and it finishes it. It works OK for single-turn prompts such as "Once upon a time", but it can go off track in multi-turn "conversations." Baby brain, baby memory, and this one is not chat-tuned.
+Temperature matters: set it too high and you will get an "Ow, bad Squishy!" <br><br>
+**0.8 Temp - Little Lily loved a mouse and then buried it alive.** <br>
+`Once upon a time, there was a little girl named Lily. She loved to play outside and explore the world around her. One day, she found a small hole in the ground. She looked inside and saw that there was a little mouse inside. Lily wanted to help the mouse, so she gently put it in her pocket and took it home. She made a little house for the mouse and gave it some food and water. The mouse was very happy and grateful. After a few days, the mouse's family was very busy. They had to work hard all day long. Lily wanted to help, but she was too little. So, she decided to bury the mouse in the ground. She dug a hole in the ground and carefully put the mouse inside. She covered the hole with dirt and patted it down. Lily felt happy that she could help the mouse and make the world a little better.`
+**0.7 Temp - Little Lily actually produced a coherent (if very childlike) response.** <br>
+`Once upon a time, there was a little girl named Lily. She loved to play outside in her backyard. One day, she saw a snake slithering around in the grass. She was scared and ran to her mommy. "Mommy, mommy! There's a snake in the grass!" Lily said. Her mommy came outside and saw the snake too. "Don't worry, Lily. The snake is harmless. It won't hurt you," her mommy said. Lily felt better and went back to playing. Later that day, Lily saw her friend Billy in her backyard. "Hi Billy!" Lily said. "Do you want to play with me?" Billy said yes and they played together. They had so much fun that Lily forgot about the snake. She was happy that she was brave enough to play with her friend.`
+**Prompt: _Once upon a time_**
+![image.png](/assets/fauxpaslife/squishy-150m-TinyStories/683e684e-c1f9-444c-a173-d9eefb217818)
+Created and trained on the TinyStories dataset using the **delta: kitsune : forge** tools.
+![](/assets/fauxpaslife/squishy-150m-TinyStories/f9ffe1d1-ebe5-4a3d-9f21-55ba1383e63a)
+  Kitsune TinyStories 150M
+  A small decoder-only transformer trained from scratch in the Kitsune Fine Tuning Suite
+  as a proof-of-concept consumer-hardware pipeline run. This model was trained on
+  TinyStories-style data to validate the end-to-end workflow: tokenizer training, raw
+  PyTorch training loop, checkpointing, sampling, Hugging Face export, GGUF export, and
+  Ollama deployment.
+  This is a small completion model, not a general-purpose assistant and not a chat-tuned
+  model. It works best for short story generation, simple continuations, and lightweight
+  creative experiments. Output quality is sensitive to sampling temperature; around 0.7
+  is a good default, while higher values may become unstable or surreal.
+  Recommended usage:
+  - short story prompts
+  - completion/generation tasks
+  - educational or pipeline demonstration use
+  - small-model experimentation
+  Not intended for:
+  - factual QA
+  - long-form reasoning
+  - reliable instruction following
+  - safety-critical use
+  Architecture:
+  - decoder-only transformer
+  - approximately 150M parameters
+  - modern small-model design with RoPE, RMSNorm, and SwiGLU
+  - trained locally on consumer GPU hardware
+  Suggested inference settings:
+  - temperature: 0.7
+  - top_p: 0.95
+  - keep prompts short and concrete
+  Lineage:
+  - created and trained in the Kitsune Fine Tuning Suite
+  - exported as a Hugging Face-compatible model and GGUF for local deployment
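
The suggested inference settings in the new README (temperature 0.7, top_p 0.95, short prompts) could be applied with standard `transformers` tooling roughly as follows. This is a minimal sketch, not the author's own script: the repo id `deltakitsune/squishy` is assumed from the page header, the `max_new_tokens` cap is a hypothetical value not given in the model card, and the helper names `suggested_generation_kwargs` and `complete` are made up for illustration.

```python
"""Sketch: sampling from the exported checkpoint with the suggested settings."""

# Assumption: repo id taken from the page header; adjust if the model
# is published under a different name.
MODEL_ID = "deltakitsune/squishy"


def suggested_generation_kwargs(max_new_tokens=200):
    """Return the sampling settings suggested in the model card."""
    return {
        "do_sample": True,    # sampling must be on for temperature/top_p to take effect
        "temperature": 0.7,   # model card default
        "top_p": 0.95,        # model card default
        "max_new_tokens": max_new_tokens,  # hypothetical cap, not from the model card
    }


def complete(prompt, model_id=MODEL_ID):
    """Continue `prompt` with the completion model and return the decoded text."""
    # Imported lazily so the settings helper works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(prompt, return_tensors="pt")
    # Pass only the tensors generate() needs, in case the tokenizer
    # also returns fields the model does not accept.
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        **suggested_generation_kwargs(),
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the checkpoint on first use):
# print(complete("Once upon a time"))
```

Because this is a completion model rather than a chat model, the prompt is passed as plain text to be continued, with no chat template applied.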