---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
library_name: transformers
tags:
- i3-architecture
---

# i3-tiny

**i3-tiny** is a compact, efficient character-level language model designed for experimentation and exploration in text generation. Despite its small size, it packs a surprising punch for creative and research-oriented tasks, generating sequences that are quirky, unpredictable, and full of “human-like” character-level errors.

---

## Model Overview

i3-tiny is trained to predict the next character in a sequence, making it ideal for **character-level language modeling**, **creative text generation**, and **research on lightweight, efficient models**. Its small footprint allows rapid experimentation, even on modest hardware, and it provides a playground for studying how models learn patterns in sequences of characters.

The model is **intentionally experimental** — it’s not aligned, fact-checked, or polished. Instead, it showcases how a compact architecture can capture patterns in text, learn from repetition, and generate outputs that are sometimes surprisingly coherent, sometimes hilariously garbled.
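
Because the model works at the character level, its "tokenizer" is just a mapping between individual characters and integer IDs. The exact 34-character vocabulary is not listed in this card, so the sketch below assumes an illustrative one (lowercase letters plus whitespace and basic punctuation) to show the encode/decode round trip:

```python
# Minimal sketch of character-level encoding/decoding. The 34 characters
# below are an illustrative assumption, not the published vocabulary.
chars = sorted(set("abcdefghijklmnopqrstuvwxyz .,'!?-\n"))
assert len(chars) == 34

stoi = {ch: i for i, ch in enumerate(chars)}  # character -> integer ID
itos = {i: ch for ch, i in stoi.items()}      # integer ID -> character

def encode(text):
    """Lowercase the text and map each known character to its ID."""
    return [stoi[ch] for ch in text.lower() if ch in stoi]

def decode(ids):
    """Map a sequence of IDs back to a string."""
    return "".join(itos[i] for i in ids)

print(decode(encode("The quick")))  # -> "the quick"
```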

---

## Training Details

* **Dataset:** ~45,830 characters (a curated text corpus repeated to improve exposure)
* **Vocabulary:** 34 characters (all lowercased)
* **Sequence length:** 128
* **Training iterations:** 2,000
* **Batch size:** 2
* **Optimizer:** AdamW, learning rate 3e-4
* **Model parameters:** 711,106
* **Performance notes:** Each iteration takes roughly 400–500 ms; 100 iterations take ~45 s on average. Loss steadily decreased from 3.53 to 2.15 over training.
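
The training code is not included in this repository. As a rough illustration, the sketch below wires the hyperparameters above into a standard PyTorch next-character training loop; `TinyCharModel` is a hypothetical stand-in, not the actual i3 architecture:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab_size, seq_len, batch_size = 34, 128, 2  # values from the list above

class TinyCharModel(nn.Module):
    """Hypothetical stand-in; the real i3 architecture is not shown here."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, 64)
        self.head = nn.Linear(64, vocab_size)

    def forward(self, x):                # x: (batch, seq_len) of char IDs
        return self.head(self.embed(x))  # logits: (batch, seq_len, vocab_size)

model = TinyCharModel()
data = torch.randint(vocab_size, (45_830,))  # placeholder for the encoded corpus
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

for step in range(2_000):
    # Sample random 128-character windows; the target is the input shifted by one.
    ix = torch.randint(len(data) - seq_len - 1, (batch_size,))
    x = torch.stack([data[i : i + seq_len] for i in ix])
    y = torch.stack([data[i + 1 : i + seq_len + 1] for i in ix])

    logits = model(x)
    loss = F.cross_entropy(logits.view(-1, vocab_size), y.view(-1))

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```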

**Example generation (iteration 1200):**

```
Prompt: "The quick"
Generated: the quick efehn. dethe cans the fice the fpeens antary of eathetint, an thadat hitimes the and cow thig, and
```

These outputs capture the **chaotic creativity** of a character-level model: a mixture of readable words, invented forms, and surprising sequences.
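
Samples like the one above come from plain autoregressive decoding: encode the prompt, predict a distribution over the next character, sample from it, append, and repeat. A minimal sketch, reusing the hypothetical helpers and stand-in model from the sections above:

```python
import torch

@torch.no_grad()
def generate(model, prompt, max_new_chars=100, temperature=1.0):
    """Sample character by character from a next-character model."""
    ids = torch.tensor([encode(prompt)])             # (1, prompt_len)
    for _ in range(max_new_chars):
        logits = model(ids[:, -128:])                # cap context at seq_len
        probs = torch.softmax(logits[0, -1] / temperature, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)
        ids = torch.cat([ids, next_id.view(1, 1)], dim=1)
    return decode(ids[0].tolist())

print(generate(model, "The quick"))
```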

---

## Intended Uses

* **Character-level text generation experiments**
* **Research and education**: studying lightweight language models, sequence learning, and text modeling
* **Creative exploration**: generating quirky text or procedural content for games, demos, or artistic projects

> ⚠️ i3-tiny is experimental and **not intended for production or high-stakes applications**. Text may be repetitive, nonsensical, or inconsistent.

---

## Limitations

* Small vocabulary and character-level modeling limit natural language fluency
* Outputs are **highly experimental** and not fact-checked
* Generated sequences can be repetitive or unexpectedly garbled
* Not aligned or safety-checked

---

## Model Weights

* Stored in `model.bin`
* Compatible with PyTorch
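
The exact loading code depends on the i3 model class, which is not bundled in this card. A minimal sketch, assuming `model.bin` holds a plain `state_dict` saved with `torch.save`, reusing the hypothetical `TinyCharModel` from above:

```python
import torch

# Assumption: model.bin is a plain state_dict, not a pickled module.
state_dict = torch.load("model.bin", map_location="cpu")

model = TinyCharModel()            # stand-in for the real i3 module
model.load_state_dict(state_dict)  # succeeds only if shapes match the real model
model.eval()
```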