FlameF0X committed on
Commit
3fd1fa2
·
verified ·
1 Parent(s): 9a47333

Update README.md

  1. README.md +55 -28
README.md CHANGED
@@ -1,16 +1,17 @@
1
- ---
2
- license: apache-2.0
3
- language:
4
- - en
5
- pipeline_tag: text-generation
6
- library_name: transformers
7
- tags:
8
- - i3-arhitecture
9
- ---
 
10
 
11
  # i3-tiny
12
 
13
- **i3-tiny** is a compact, efficient character-level language model designed for experimentation and exploration in text generation. Despite its small size, it packs a surprising punch for creative and research-oriented tasks, generating sequences that are quirky, unpredictable, and full of “human-like” character-level errors.
14
 
15
  ---
16
 
@@ -18,27 +19,29 @@ tags:
18
 
19
  i3-tiny is trained to predict the next character in a sequence, making it ideal for **character-level language modeling**, **creative text generation**, and **research on lightweight, efficient models**. Its small footprint allows rapid experimentation, even on modest hardware, and it provides a playground for studying how models learn patterns in sequences of characters.
20
 
21
- The model is **intentionally experimental** — it’s not aligned, fact-checked, or polished. Instead, it showcases how a compact architecture can capture patterns in text, learn from repetition, and generate outputs that are sometimes surprisingly coherent, sometimes hilariously garbled.
22
 
23
  ---
24
 
25
  ## Training Details
26
 
27
- * **Dataset:** ~45,830 characters (a curated text corpus repeated to improve exposure)
28
- * **Vocabulary:** 34 characters (all lowercased)
29
- * **Sequence length:** 128
30
- * **Training iterations:** 2,000
31
- * **Batch size:** 2
32
- * **Optimizer:** AdamW, learning rate 3e-4
33
- * **Model parameters:** 711,106
34
  * **Performance notes:** Each iteration takes roughly 400–500 ms; 100 iterations take ~45 s on average. Loss steadily decreased from 3.53 to 2.15 over training.
35
 
36
  **Example generation (iteration 1200):**
37
 
38
  ```
 
39
  Prompt: "The quick"
40
  Generated: the quick efehn. dethe cans the fice the fpeens antary of eathetint, an thadat hitimes the and cow thig, and
41
- ```
 
42
 
43
  These outputs capture the **chaotic creativity** of a character-level model: a mixture of readable words, invented forms, and surprising sequences.
44
 
@@ -46,9 +49,9 @@ These outputs capture the **chaotic creativity** of a character-level model: a m
46
 
47
  ## Intended Uses
48
 
49
- * **Character-level text generation experiments**
50
- * **Research and education**: studying lightweight language models, sequence learning, and text modeling
51
- * **Creative exploration**: generating quirky text or procedural content for games, demos, or artistic projects
52
 
53
  > ⚠️ i3-tiny is experimental and **not intended for production or high-stakes applications**. Text may be repetitive, nonsensical, or inconsistent.
54
 
@@ -56,14 +59,38 @@ These outputs capture the **chaotic creativity** of a character-level model: a m
56
 
57
  ## Limitations
58
 
59
- * Small vocabulary and character-level modeling limit natural language fluency
60
- * Outputs are **highly experimental** and not fact-checked
61
- * Generated sequences can be repetitive or unexpectedly garbled
62
- * Not aligned or safety-checked
63
 
64
  ---
65
 
66
  ## Model Weights
67
 
68
- * Stored in `model.bin`
69
- * Compatible with PyTorch

1
+ ---
2
+ license: apache-2.0
3
+ language:
4
+ - en
5
+ pipeline_tag: text-generation
6
+ library_name: transformers
7
+ tags:
8
+ - i3-architecture
9
+ - custom_code
10
+ ---
11
 
12
  # i3-tiny
13
 
14
+ **i3-tiny** is a compact, efficient character-level language model designed for experimentation and exploration in text generation. Despite its small size, it can generate sequences that are quirky, unpredictable, and full of “human-like” character-level errors.
15
 
16
  ---
17
 
 
19
 
20
  i3-tiny is trained to predict the next character in a sequence, making it ideal for **character-level language modeling**, **creative text generation**, and **research on lightweight, efficient models**. Its small footprint allows rapid experimentation, even on modest hardware, and it provides a playground for studying how models learn patterns in sequences of characters.
21
 
22
+ The model is **intentionally experimental** — it’s not aligned, fact-checked, or polished. Outputs may be coherent, partially readable, or amusingly garbled.
23
 
24
  ---
25
 
26
  ## Training Details
27
 
28
+ * **Dataset:** ~45,830 characters (a curated text corpus repeated for exposure)
29
+ * **Vocabulary:** 34 characters (all lowercased)
30
+ * **Sequence length:** 128
31
+ * **Training iterations:** 2,000
32
+ * **Batch size:** 2
33
+ * **Optimizer:** AdamW, learning rate 3e-4
34
+ * **Model parameters:** 711,106
35
  * **Performance notes:** Each iteration takes roughly 400–500 ms; 100 iterations take ~45 s on average. Loss steadily decreased from 3.53 to 2.15 over training.
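
As a rough sanity check on the Training Details figures, the arithmetic below works out how much text the run consumes and how the reported losses compare with a uniform guess. It is a sketch under stated assumptions, not the author's training script: it assumes each iteration draws `batch_size` windows of `seq_len` characters, that ~45,830 is the size of the text being sampled, and that the loss is mean cross-entropy in nats (the PyTorch default). All variable names are illustrative.

```python
import math

# Figures quoted in the Training Details list.
dataset_chars = 45_830
vocab_size = 34
seq_len = 128
batch_size = 2
iterations = 2_000
start_loss, end_loss = 3.53, 2.15

# Characters consumed per step and over the whole run
# (assumes one batch of `batch_size` windows per iteration).
chars_per_step = seq_len * batch_size
total_chars = chars_per_step * iterations
passes_over_corpus = total_chars / dataset_chars

# Cross-entropy (in nats) of a uniform guess over the vocabulary.
uniform_loss = math.log(vocab_size)

# Perplexity implied by the final loss, assuming the loss is in nats.
final_perplexity = math.exp(end_loss)

print(f"characters per step:   {chars_per_step}")
print(f"characters in the run: {total_chars}")
print(f"passes over corpus:    {passes_over_corpus:.2f}")
print(f"uniform-guess loss:    {uniform_loss:.3f} (reported start: {start_loss})")
print(f"final perplexity:      {final_perplexity:.2f}")
```

Under those assumptions the run sees 512,000 characters, roughly 11 passes over the corpus. The starting loss of 3.53 matches ln(34) ≈ 3.526, which is what a uniform guess over 34 characters would score, and the final loss of 2.15 corresponds to a perplexity of about 8.6.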
36
 
37
  **Example generation (iteration 1200):**
38
 
39
  ```
40
+
41
  Prompt: "The quick"
42
  Generated: the quick efehn. dethe cans the fice the fpeens antary of eathetint, an thadat hitimes the and cow thig, and
43
+
44
+ ```
45
 
46
  These outputs capture the **chaotic creativity** of a character-level model: a mixture of readable words, invented forms, and surprising sequences.
47
 
 
49
 
50
  ## Intended Uses
51
 
52
+ * **Character-level text generation experiments**
53
+ * **Research and education:** studying lightweight language models and sequence learning
54
+ * **Creative exploration:** generating quirky text or procedural content for games, demos, or artistic projects
55
 
56
  > ⚠️ i3-tiny is experimental and **not intended for production or high-stakes applications**. Text may be repetitive, nonsensical, or inconsistent.
57
 
 
59
 
60
  ## Limitations
61
 
62
+ * Small vocabulary and character-level modeling limit natural language fluency
63
+ * Outputs are **highly experimental** and not fact-checked
64
+ * Generated sequences can be repetitive, garbled, or unpredictable
65
+ * Not aligned or safety-checked
66
 
67
  ---
68
 
69
  ## Model Weights
70
 
71
+ * Stored in `pytorch_model.bin` (or `model.safetensors`)
72
+ * Compatible with PyTorch and Hugging Face Transformers
73
+ * Requires `modeling_i3.py` and `config.json` to instantiate
74
+
75
+ ---
76
+
77
+ ## Usage Example
78
+
79
+ ```python
80
+ from modeling_i3 import i3, i3Config
81
+ import torch
82
+
83
+ config = i3Config.from_pretrained("i3-hf-model")
84
+ model = i3.from_pretrained("i3-hf-model", config=config)
85
+
86
+ prompt = "Hello"
87
+ input_ids = torch.tensor([list(range(len(prompt)))])  # placeholder IDs 0..len(prompt)-1; replace with your dataset's character encoding
88
+ generated_ids = model.model.generate(input_ids, max_new_tokens=100, temperature=0.8, top_k=20)
89
+ print(generated_ids) # decode using your dataset method
90
+ ```
91
+
92
+ ---
93
+
94
+ ## Citation
95
+
96
+ If you use i3-tiny for research or experimentation, please cite this repository and acknowledge it as an experimental character-level model.
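
The usage example leaves prompt encoding and output decoding to the reader. The sketch below shows one common way to do this for a lowercased character-level vocabulary. It is hypothetical: `build_vocab`, `encode`, `decode`, and the sample `corpus` are illustrative names and data, not part of `modeling_i3.py`, and the real model only produces sensible output if the mapping matches the 34-character vocabulary it was trained with.

```python
def build_vocab(text: str) -> tuple[dict[str, int], dict[int, str]]:
    """Map each distinct lowercased character to an integer ID and back."""
    chars = sorted(set(text.lower()))
    stoi = {ch: i for i, ch in enumerate(chars)}
    itos = {i: ch for ch, i in stoi.items()}
    return stoi, itos


def encode(prompt: str, stoi: dict[str, int]) -> list[int]:
    """Lowercase the prompt and drop characters outside the vocabulary."""
    return [stoi[ch] for ch in prompt.lower() if ch in stoi]


def decode(ids: list[int], itos: dict[int, str]) -> str:
    """Turn a list of IDs back into text."""
    return "".join(itos[i] for i in ids)


# Stand-in corpus; the actual training text is not part of this README.
corpus = "the quick brown fox jumps over the lazy dog."
stoi, itos = build_vocab(corpus)

ids = encode("The quick", stoi)
print(ids)
print(decode(ids, itos))
```

To feed the result to the model, the ID list would be wrapped as a batch of one, for example `torch.tensor([ids])`, and the generated IDs passed through `decode` after converting them back to a plain list.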