LisaMegaWatts committed
Commit 7795248 · verified · 1 Parent(s): 0fc7537

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +28 -41
README.md CHANGED
@@ -1,52 +1,39 @@
 ---
-language: en
-license: mit
-library_name: custom
+language:
+- en
+library_name: julia
+pipeline_tag: text-generation
 tags:
-- gpt
-- character-level
-- transformer
-- from-scratch
-- ancient-scripts
-- classical-texts
+- character-level
+- philosophy
+- mathematics
+- julia
+- scalar-autograd
+- pure-julia
+- scriptio-continua
+- reduced-vocab
 datasets:
-- custom
-pipeline_tag: text-generation
+- LisaMegaWatts/juliagpt-data
 ---
 
 # JuliaGPT
 
-An optimized character-level GPT in Julia for training on ancient scripts and classical texts. Evolution of [MicroJulia](https://github.com/DavinciDreams/micro-julia).
-
-[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/DavinciDreams/JuliaGPT/blob/main/juliagpt.ipynb)
-
-## Roadmap
-
-Starting from MicroJulia's minimal scalar-autograd GPT, optimizing toward:
-
-- Array-based autograd for 100-1000x speedup
-- Multi-layer transformers with GELU activations
-- Learnable RMSNorm, gradient clipping, cosine LR schedule
-- Ancient script support (Greek, Latin, Cuneiform, etc.)
-- Flexible vocabulary configuration per script
-- Batched training and proper attention masking
-
-## Current Architecture
-
-- Custom autograd engine in pure Julia
-- Transformer with multi-head attention
-- Character-level tokenization
-- Adam optimizer with LR decay
-- W&B logging + HuggingFace Hub integration
-
-## Quick Start
-
-1. Click "Open in Colab" above
-2. Add Colab secrets: `HF_TOKEN`, `WANDB_KEY`, `HF_REPO`
-3. Run the Python login cell, install Julia, switch to Julia 1.10
-4. Run all cells
-
-## Related
-
-- [micro-julia](https://github.com/DavinciDreams/micro-julia) - Original minimal implementation
-- [text-pipeline](https://github.com/DavinciDreams/text-pipeline) - Text-processing pipeline for training data
+An experimental character-level GPT in pure Julia exploring minimal vocabularies inspired by ancient Greek *scriptio continua*. Built with scalar autograd and no external ML dependencies.
+
+## Architecture
+- 1 transformer layer, 4 attention heads
+- n_embd=16, block_size=256
+- RMSNorm, ReLU, KV cache for causal masking
+- Adam optimizer with linear LR decay
+- ~5K parameters
+
+## Vocabulary
+28 characters (a-z, space, period) plus a BOS token, for a 29-symbol vocabulary. Numerals are converted to words; all punctuation except the period is removed.
+
+## Training
+- **Dataset:** Aristotle's Rhetoric + Euclid's Elements (8,461 chunks)
+- **Current checkpoint:** step 650, val_loss=2.3414
+
+## Links
+- [Training data](https://huggingface.co/datasets/LisaMegaWatts/juliagpt-data)
+- [Source code](https://github.com/DavinciDreams/JuliaGPT)
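The reduced vocabulary described in the new README can be sketched in a few lines of Julia. This is an illustrative sketch, not the repository's actual code: the names (`CHARS`, `STOI`, `encode`, …) and the numeral-to-word step being omitted are assumptions.

```julia
# Sketch of the 29-symbol vocabulary described above (illustrative, not the repo's code).
# 26 letters + space + period = 28 characters, plus a BOS token = 29 symbols.
const CHARS = vcat(collect('a':'z'), [' ', '.'])
const BOS = 29                                        # BOS takes the last index
const STOI = Dict(c => i for (i, c) in enumerate(CHARS))
const ITOS = Dict(i => c for (i, c) in enumerate(CHARS))

# Normalize text roughly as the README describes: lowercase, drop all
# punctuation except the period (numeral-to-word conversion is omitted here).
normalize(s) = filter(c -> haskey(STOI, c), lowercase(s))

encode(s)   = vcat(BOS, [STOI[c] for c in normalize(s)])
decode(ids) = join(ITOS[i] for i in ids if i != BOS)
```

With a vocabulary this small, the token embedding table is only 29 × 16 entries, which is consistent with the ~5K total parameter count listed above.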