deltakitsune committed on
Commit
c88502e
·
verified ·
1 Parent(s): aa82703

Upload README.md with huggingface_hub

  1. README.md +61 -22
README.md CHANGED
@@ -1,32 +1,71 @@
1
- # TinyStories-150M-Starter
2
 
3
- This is a small decoder-only language model exported from the Kitsune Fine-Tuning Suite `Create` pipeline.
4
 
5
- ## What This Export Is
6
 
7
- - Architecture family for tooling compatibility: `LlamaForCausalLM`
8
- - Original project type: manual-first small transformer scaffold
9
- - Export source checkpoint: `final.pt`
10
 
11
- ## Notes
12
 
13
- This export is meant to behave like a normal Hugging Face causal LM checkpoint so it can be loaded by standard tooling.
14
- It was not originally trained as an off-the-shelf Llama release; it was mapped into a Llama-compatible layout for portability.
15
 
16
- ## Training Context
17
 
18
- - Dataset ID: `roneneldan/TinyStories`
19
- - Dataset Path: `None`
20
- - Train Split: `train`
21
- - Max Steps: `20000`
22
- - Learning Rate: `0.0003`
23
 
24
- ## Example
25
 
26
- ```python
27
- from transformers import AutoModelForCausalLM, AutoTokenizer
28
 
29
- model_id = "hf_llama"
30
- tokenizer = AutoTokenizer.from_pretrained(model_id)
31
- model = AutoModelForCausalLM.from_pretrained(model_id)
32
- ```
1
+ ### Squishy 150m
2
 
3
+ ![](/assets/fauxpaslife/squishy-150m-TinyStories/36c44ed6-cc72-47b3-b70c-9af2b33d3e08)
4
 
 
5
 
 
 
 
6
 
7
+ Created and trained from "scratch" locally. Its framework sets up the architecture for future models and specialists. This is a **COMPLETION** model: you start a sentence and it finishes it. It works OK for single turns such as "Once upon a time", but it can go off track in multi-turn "conversations." Baby brain, baby memory, and this one is not chat-tuned.
8
 
9
+ Temperature matters: set it too high and you will get an "Ow, bad Squishy!" moment. <br><br>
 
10
 
11
+ **0.8 Temp - Little Lily loved a mouse and then buried it alive.** <br>
12
 
13
+ `Once upon a time, there was a little girl named Lily. She loved to play outside and explore the world around her. One day, she found a small hole in the ground. She looked inside and saw that there was a little mouse inside. Lily wanted to help the mouse, so she gently put it in her pocket and took it home. She made a little house for the mouse and gave it some food and water. The mouse was very happy and grateful. After a few days, the mouse's family was very busy. They had to work hard all day long. Lily wanted to help, but she was too little. So, she decided to bury the mouse in the ground. She dug a hole in the ground and carefully put the mouse inside. She covered the hole with dirt and patted it down. Lily felt happy that she could help the mouse and make the world a little better.`
 
 
 
 
14
 
15
+ **0.7 Temp - Little Lily actually produced a coherent (if very childlike) story.** <br>
16
 
17
+ `Once upon a time, there was a little girl named Lily. She loved to play outside in her backyard. One day, she saw a snake slithering around in the grass. She was scared and ran to her mommy. "Mommy, mommy! There's a snake in the grass!" Lily said. Her mommy came outside and saw the snake too. "Don't worry, Lily. The snake is harmless. It won't hurt you," her mommy said. Lily felt better and went back to playing. Later that day, Lily saw her friend Billy in her backyard. "Hi Billy!" Lily said. "Do you want to play with me?" Billy said yes and they played together. They had so much fun that Lily forgot about the snake. She was happy that she was brave enough to play with her friend.`
 
18
 
19
+ **Prompt: _Once upon a time_**
20
+
21
+ ![image.png](/assets/fauxpaslife/squishy-150m-TinyStories/683e684e-c1f9-444c-a173-d9eefb217818)
22
+
23
+
24
+ Created and trained on the TinyStories dataset with the **delta: kitsune : forge** tools.
25
+
26
+ ![](/assets/fauxpaslife/squishy-150m-TinyStories/f9ffe1d1-ebe5-4a3d-9f21-55ba1383e63a)
27
+
28
+ Kitsune TinyStories 150M
29
+
30
+ A small decoder-only transformer trained from scratch in the Kitsune Fine Tuning Suite
31
+ as a proof-of-concept consumer-hardware pipeline run. This model was trained on
32
+ TinyStories-style data to validate the end-to-end workflow: tokenizer training, raw
33
+ PyTorch training loop, checkpointing, sampling, Hugging Face export, GGUF export, and
34
+ Ollama deployment.
35
+
36
+ This is a small completion model, not a general-purpose assistant and not a chat-tuned
37
+ model. It works best for short story generation, simple continuations, and lightweight
38
+ creative experiments. Output quality is sensitive to sampling temperature; around 0.7
39
+ is a good default, while higher values may become unstable or surreal.
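
To make the temperature sensitivity concrete, here is a minimal sketch of how temperature reshapes a next-token distribution. It assumes the usual scheme of dividing logits by the temperature before the softmax; the logit values are made up for illustration and do not come from this model.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Return sampling probabilities after dividing logits by `temperature`."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for three candidate tokens (illustrative only).
logits = [2.0, 1.0, 0.0]

for t in (0.7, 0.8, 1.2):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")
```

Lower temperatures concentrate probability on the top-ranked token, while higher temperatures give unlikely tokens more weight. On a model this small, that extra weight on unlikely tokens is a plausible reason stories drift at higher settings.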
40
+
41
+ Recommended usage:
42
+
43
+ - short story prompts
44
+ - completion/generation tasks
45
+ - educational or pipeline demonstration use
46
+ - small-model experimentation
47
+
48
+ Not intended for:
49
+
50
+ - factual QA
51
+ - long-form reasoning
52
+ - reliable instruction following
53
+ - safety-critical use
54
+
55
+ Architecture:
56
+
57
+ - decoder-only transformer
58
+ - approximately 150M parameters
59
+ - modern small-model design with RoPE, RMSNorm, and SwiGLU
60
+ - trained locally on consumer GPU hardware
61
+
62
+ Suggested inference settings:
63
+
64
+ - temperature: 0.7
65
+ - top_p: 0.95
66
+ - keep prompts short and concrete
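
The suggested settings can be passed to the standard `transformers` generation API, as in the sketch below. It assumes the Hugging Face export loads through `AutoModelForCausalLM` and `AutoTokenizer`. The `complete` helper, the `max_new_tokens` value, and the repo id in the usage comment (inferred from the asset paths in this card) are illustrative assumptions, not part of the original card.

```python
# Sampling settings from the card; max_new_tokens is an assumed value.
GENERATION_KWARGS = {
    "do_sample": True,      # sampling must be enabled for temperature/top_p to apply
    "temperature": 0.7,
    "top_p": 0.95,
    "max_new_tokens": 200,  # assumption, not stated in the card
}


def complete(prompt, model_id, **overrides):
    """Continue `prompt` with the causal LM stored at `model_id` (hypothetical helper)."""
    # Imported lazily so the settings above can be inspected without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(prompt, return_tensors="pt")
    settings = {**GENERATION_KWARGS, **overrides}
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        **settings,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (repo id is an assumption inferred from the asset paths):
# print(complete("Once upon a time", "fauxpaslife/squishy-150m-TinyStories"))
```

Keeping the defaults in one dictionary makes it easy to compare runs, for example `complete(prompt, model_id, temperature=0.8)` against the 0.7 default.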
67
+
68
+ Lineage:
69
+
70
+ - created and trained in the Kitsune Fine Tuning Suite
71
+ - exported as a Hugging Face-compatible model and GGUF for local deployment