Update README.md
README.md
## Training Details

* **Dataset:** ~45,830 characters (a curated text corpus repeated for exposure)
* **Vocabulary:** 34 characters (all lowercased)
* **Sequence length:** 128
* **Training iterations:** 2,000
* **Batch size:** 2
* **Optimizer:** AdamW, learning rate 3e-4
* **Model parameters:** 711,106
* **Performance notes:** Each iteration takes roughly 400–500 ms; 100 iterations take ~45 s on average. Loss steadily decreased from 3.53 to 2.15 over training.

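
Wired to the hyperparameters listed above, the snippet below is a minimal training-loop sketch. The stand-in corpus, vocabulary handling, and `CharLM` class are illustrative assumptions only; the actual i3-tiny architecture lives in `modeling_i3.py`.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hyperparameters from the list above; everything else is illustrative.
SEQ_LEN, BATCH_SIZE, ITERS, LR = 128, 2, 2_000, 3e-4

# Stand-in corpus (the real one is ~45,830 lowercased characters, 34 symbols).
text = ("the quick brown fox jumps over the lazy dog. " * 300).lower()
chars = sorted(set(text))
stoi = {c: i for i, c in enumerate(chars)}
data = torch.tensor([stoi[c] for c in text], dtype=torch.long)

class CharLM(nn.Module):
    """Tiny stand-in model; the real architecture is defined in modeling_i3.py."""
    def __init__(self, vocab_size, dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, idx):
        hidden, _ = self.rnn(self.embed(idx))
        return self.head(hidden)

model = CharLM(len(chars))
optimizer = torch.optim.AdamW(model.parameters(), lr=LR)
loss_fn = nn.CrossEntropyLoss()

for step in range(ITERS):
    # Sample BATCH_SIZE random windows of SEQ_LEN characters and their shifted targets.
    starts = torch.randint(0, len(data) - SEQ_LEN - 1, (BATCH_SIZE,))
    x = torch.stack([data[s : s + SEQ_LEN] for s in starts])
    y = torch.stack([data[s + 1 : s + SEQ_LEN + 1] for s in starts])

    logits = model(x)
    loss = loss_fn(logits.reshape(-1, len(chars)), y.reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if step % 100 == 0:
        print(f"iter {step:4d}  loss {loss.item():.2f}")
```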
### Training Analysis

The charts below illustrate the model's performance over the 2,000 training iterations.

The **Training Loss Over Iterations** plot shows a clear learning trend, with the 50-iteration moving average (red line) confirming a steady decrease in cross-entropy loss from ~3.5 to ~2.1. The **Training Time Performance** plot shows a consistent block time per 100 iterations, resulting in a nearly linear increase in cumulative training time, demonstrating stable and predictable training execution.



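
As a rough illustration of how such charts can be produced, the sketch below computes a 50-iteration moving average and a cumulative-time curve with NumPy and Matplotlib. The `losses` and `block_times` arrays are placeholders standing in for the values actually logged during training.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder logs: the real values come from the training run itself.
losses = np.linspace(3.5, 2.1, 2000) + np.random.normal(0, 0.1, 2000)
block_times = np.full(20, 45.0)  # ~45 s for each block of 100 iterations

# 50-iteration moving average of the loss (the red line in the first chart).
window = 50
moving_avg = np.convolve(losses, np.ones(window) / window, mode="valid")

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

ax1.plot(losses, alpha=0.4, label="per-iteration loss")
ax1.plot(range(window - 1, len(losses)), moving_avg, color="red", label="50-iteration moving average")
ax1.set(title="Training Loss Over Iterations", xlabel="iteration", ylabel="cross-entropy loss")
ax1.legend()

ax2.plot(np.arange(1, len(block_times) + 1) * 100, np.cumsum(block_times))
ax2.set(title="Training Time Performance", xlabel="iteration", ylabel="cumulative seconds")

plt.tight_layout()
plt.show()
```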
**Example generation (iteration 1200):**

```
Prompt: "The quick"
Generated: the quick efehn. dethe cans the fice the fpeens antary of eathetint, an thadat hitimes the and cow thig, and
```

These outputs capture the **chaotic creativity** of a character-level model: a mixture of readable words, invented forms, and surprising sequences.
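
As a generic illustration of how a prompt such as "The quick" is extended one character at a time, the loop below sketches temperature-based multinomial sampling. The vocabulary and `DummyModel` are placeholders, not the repository's actual generation code.

```python
import torch

# Placeholder vocabulary; the real 34-character vocabulary ships with the model.
chars = sorted(set("abcdefghijklmnopqrstuvwxyz .,'\""))
stoi = {c: i for i, c in enumerate(chars)}
itos = {i: c for c, i in stoi.items()}

class DummyModel(torch.nn.Module):
    """Stand-in next-character model that returns random logits."""
    def forward(self, idx):
        return torch.randn(idx.shape[0], idx.shape[1], len(chars))

@torch.no_grad()
def generate(model, prompt, max_new_chars=100, temperature=1.0):
    # Encode the prompt, then repeatedly sample the next character.
    idx = torch.tensor([[stoi[c] for c in prompt.lower()]])
    for _ in range(max_new_chars):
        logits = model(idx)[:, -1, :] / temperature
        probs = torch.softmax(logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)
        idx = torch.cat([idx, next_id], dim=1)
    return "".join(itos[i] for i in idx[0].tolist())

print(generate(DummyModel(), "The quick"))
```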
---

## Intended Uses

* **Character-level text generation experiments**
* **Research and education:** studying lightweight language models and sequence learning
* **Creative exploration:** generating quirky text or procedural content for games, demos, or artistic projects

> ⚠️ i3-tiny is experimental and **not intended for production or high-stakes applications**. Text may be repetitive, nonsensical, or inconsistent.

## Limitations

* Outputs are **highly experimental** and not fact-checked
* Generated sequences can be repetitive, garbled, or unpredictable
* Not aligned or safety-checked

---
* Stored in `pytorch_model.bin` (or `model.safetensors`)
* Compatible with PyTorch and Hugging Face Transformers
* Requires `modeling_i3.py` and `config.json` to instantiate

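
One way to instantiate the model from these files is sketched below. `I3Model` and its constructor signature are assumptions standing in for whatever `modeling_i3.py` actually exports, so check that file for the real class name.

```python
import json
import torch

# Hypothetical loading sketch: the class name and constructor are assumptions,
# not a documented API. See modeling_i3.py and config.json in this repository.
from modeling_i3 import I3Model

with open("config.json") as f:
    config = json.load(f)

model = I3Model(config)  # assumed to accept the parsed config
state_dict = torch.load("pytorch_model.bin", map_location="cpu")
# For the safetensors variant:
#   from safetensors.torch import load_file
#   state_dict = load_file("model.safetensors")
model.load_state_dict(state_dict)
model.eval()
```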
---
```
# not available
```