FlameF0X committed on
Commit 2209baa · verified · 1 Parent(s): 798dbed

Update README.md

Files changed (1): README.md (+17 −50)
README.md CHANGED
@@ -25,63 +25,30 @@ The model is **intentionally experimental** — it’s not aligned, fact-checked
 ## Training Details

- * **Dataset:** ~45,830 characters (a curated text corpus repeated for exposure)
- * **Vocabulary:** 34 characters (all lowercased)
- * **Sequence length:** 128
- * **Training iterations:** 2,000
- * **Batch size:** 2
- * **Optimizer:** AdamW, learning rate 3e-4
- * **Model parameters:** 711,106
 * **Performance notes:** Each iteration takes roughly 400–500 ms; 100 iterations take ~45 s on average. Loss steadily decreased from 3.53 to 2.15 over training.

- **Example generation (iteration 1200):**
-
- ```
- Prompt: "The quick"
- Generated: the quick efehn. dethe cans the fice the fpeens antary of eathetint, an thadat hitimes the and cow thig, and
- ```
-
- These outputs capture the **chaotic creativity** of a character-level model: a mixture of readable words, invented forms, and surprising sequences.
-
- ---
-
- ## Intended Uses
-
- * **Character-level text generation experiments**
- * **Research and education:** studying lightweight language models and sequence learning
- * **Creative exploration:** generating quirky text or procedural content for games, demos, or artistic projects
-
- > ⚠️ i3-tiny is experimental and **not intended for production or high-stakes applications**. Text may be repetitive, nonsensical, or inconsistent.
-
- ---
-
- ## Limitations
-
- * Small vocabulary and character-level modeling limit natural language fluency
- * Outputs are **highly experimental** and not fact-checked
- * Generated sequences can be repetitive, garbled, or unpredictable
- * Not aligned or safety-checked
-
- ---
-
- ## Model Weights
-
- * Stored in `pytorch_model.bin` (or `model.safetensors`)
- * Compatible with PyTorch and Hugging Face Transformers
- * Requires `modeling_i3.py` and `config.json` to instantiate
-
- ---
-
- ## Usage Example
-
- ```python
- # not available
- ```
-
- ---
-
- ## Citation
-
- If you use i3-tiny for research or experimentation, please cite this repository and acknowledge it as an experimental character-level model.
 
 ## Training Details

+ * **Dataset:** ~45,830 characters (a curated text corpus repeated for exposure)
+ * **Vocabulary:** 34 characters (all lowercased)
+ * **Sequence length:** 128
+ * **Training iterations:** 2,000
+ * **Batch size:** 2
+ * **Optimizer:** AdamW, learning rate 3e-4
+ * **Model parameters:** 711,106
 * **Performance notes:** Each iteration takes roughly 400–500 ms; 100 iterations take ~45 s on average. Loss steadily decreased from 3.53 to 2.15 over training.
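As a sanity check on those timing figures, the per-iteration and per-100-iteration numbers are mutually consistent and imply roughly 15 minutes of total training. This is an illustrative calculation derived from the README's reported values, not anything from the repository itself:

```python
# Illustrative arithmetic only, based on the figures reported above.
ms_per_iter = 450            # midpoint of the reported 400-500 ms range
iters = 2_000                # total training iterations

per_100_s = 100 * ms_per_iter / 1000         # seconds per 100 iterations
total_min = iters * ms_per_iter / 1000 / 60  # total wall-clock minutes

print(per_100_s)   # 45.0 -> matches the ~45 s per 100 iterations reported
print(total_min)   # 15.0 -> about 15 minutes of training in total
```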
 
+ ### Training Analysis
+
+ The charts below illustrate the model's performance over the 2,000 training iterations.
+
+ The **Training Loss Over Iterations** plot shows a clear learning trend, with the 50-iteration moving average (red line) confirming a steady decrease in cross-entropy loss from ~3.5 to ~2.1. The **Training Time Performance** plot shows a consistent block time per 100 iterations, yielding a nearly linear increase in cumulative training time and demonstrating stable, predictable execution.
+
+ ![image](https://cdn-uploads.huggingface.co/production/uploads/6615494716917dfdc645c44e/Z0r9xl1cY5KZo3ztnmS7Z.png)
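The 50-iteration moving average mentioned above can be reproduced with a simple trailing window. A minimal sketch, assuming a plain sliding-window average (the repository's actual plotting code is not shown in this diff, and the toy loss curve below is a stand-in for the real one):

```python
# Hypothetical sketch of the smoothing behind the loss plot: a trailing
# moving average over a fixed window of iterations.
def moving_average(values, window=50):
    """Return the trailing moving average of `values` over `window` items."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]   # drop the value leaving the window
        out.append(running / min(i + 1, window))
    return out

# Toy linear loss curve decaying from 3.5 toward 2.1, like the README reports.
losses = [3.5 - 1.4 * i / 1999 for i in range(2000)]
smoothed = moving_average(losses, window=50)
print(round(smoothed[-1], 2))   # ~2.12, near the final reported loss
```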
 
 
 
 
 
 
+ **Example generation (iteration 1200):**
+
+ ```
+ Prompt: "The quick"
+ Generated: the quick efehn. dethe cans the fice the fpeens antary of eathetint, an thadat hitimes the and cow thig, and
+ ```
+
+ These outputs capture the **chaotic creativity** of a character-level model: a mixture of readable words, invented forms, and surprising sequences.
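As context for the character-level setup described above, here is a minimal, hypothetical sketch of how a lowercased character vocabulary like i3-tiny's 34-symbol one could be built and used to encode and decode text. It is illustrative only: the repository's actual tokenization code does not appear in this diff, and the pangram corpus below is a stand-in that yields 28 characters rather than 34.

```python
# Hypothetical sketch: building a lowercased character vocabulary and
# round-tripping text through it. Not the repository's actual code.

def build_vocab(corpus: str):
    """Map each unique lowercased character to an integer id (and back)."""
    chars = sorted(set(corpus.lower()))
    stoi = {ch: i for i, ch in enumerate(chars)}
    itos = {i: ch for ch, i in stoi.items()}
    return stoi, itos

def encode(text: str, stoi: dict[str, int]) -> list[int]:
    """Lowercase the text and drop characters outside the vocabulary."""
    return [stoi[ch] for ch in text.lower() if ch in stoi]

def decode(ids: list[int], itos: dict[int, str]) -> str:
    return "".join(itos[i] for i in ids)

# Stand-in corpus: 26 letters + space + period = 28 distinct characters.
corpus = "the quick brown fox jumps over the lazy dog."
stoi, itos = build_vocab(corpus)
print(len(stoi))                                # 28
print(decode(encode("The Quick", stoi), itos))  # "the quick"
```

Lowercasing on both sides mirrors the model card's note that all 34 vocabulary characters are lowercased, which is why the sample generation renders the prompt "The quick" as "the quick".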