Update README.md
README.md
CHANGED
@@ -31,16 +31,46 @@ This model is designed for scalable training, long-context understanding, and ef
 
 ## Project Structure
 
-
-
-├──
-
-
-├──
-├──
-├──
-
+```bash
+MEM_TRANSFORMER/
+├── configs/
+│   └── config.json          # Model + training hyperparameters
+│
+├── data/
+│   ├── edu_fineweb/         # Token-sharded training data
+│   │   ├── train_000001.npy
+│   │   ├── train_000002.npy
+│   │   └── test_000001.npy
+│   ├── hellaswag/
+│   │   └── hellaswag_val.jsonl
+│   └── fineweb.py           # Sharding logic with memory-aligned sequence control
+│
+├── model_core/
+│   ├── __init__.py
+│   ├── attention.py         # Grouped Query Attention, KNN & XL attention logic; Rotary Positional Encoding implementation
+│   ├── model.py             # Transformer model with memory and RoPE support
+│   ├── dataloader.py        # Memory-aware DataLoader
+│   └── training.py          # train_memgpt function
+│
+├── scripts/
+│   ├── train.py             # Training script (DDP-compatible)
+│   ├── evaluate.py          # Evaluation on benchmarks
+│   └── generate.py          # Text generation from trained model
+│
+├── evaluation/
+│   ├── __init__.py
+│   ├── hellaswag.py         # HellaSwag data loader
+│   └── val_hellaswag.py     # Evaluation logic with loss-based scoring
+│
+├── logs/
+│   ├── log.txt              # Training logs
+│   └── model_*.pt           # Checkpoints
+│
+├── .gitignore
+├── README.md
+└── requirements.txt
 
+```
 
 ---
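
The tree lists token shards named `train_000001.npy`, `train_000002.npy`, and `test_000001.npy` under `data/edu_fineweb/`. As a rough illustration of how such shards might be grouped by split before loading, here is a minimal sketch. The naming pattern (`<split>_<6-digit index>.npy`) is inferred from those three filenames only, and `group_shards` is a hypothetical helper, not a function from this repository.

```python
import re
from collections import defaultdict

# Assumed naming scheme, inferred from the tree: <split>_<6-digit index>.npy
SHARD_RE = re.compile(r"^(train|test)_(\d{6})\.npy$")


def group_shards(filenames):
    """Return {split: [names sorted by shard index]}, ignoring other files."""
    groups = defaultdict(list)
    for name in filenames:
        match = SHARD_RE.match(name)
        if match:
            groups[match.group(1)].append((int(match.group(2)), name))
    # Sort each split by its numeric index and drop the index again.
    return {split: [n for _, n in sorted(items)] for split, items in groups.items()}


# Hypothetical directory listing, given out of order on purpose.
shards = group_shards(
    ["train_000002.npy", "test_000001.npy", "train_000001.npy", "notes.txt"]
)
print(shards["train"])
print(shards["test"])
```

Sorting by the parsed integer rather than by the raw string keeps the order stable even if the zero-padding width ever changes.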
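
The entry for `evaluation/val_hellaswag.py` mentions "loss-based scoring" without further detail. One common way to score a multiple-choice benchmark such as HellaSwag with a language model is to average the per-token loss over each candidate ending and pick the ending with the lowest average. The sketch below assumes that scheme. The function names and the loss values are hypothetical and are not taken from this repository.

```python
def masked_mean_loss(token_losses, completion_mask):
    """Average the per-token losses over positions where the mask is 1."""
    picked = [loss for loss, keep in zip(token_losses, completion_mask) if keep]
    if not picked:
        raise ValueError("completion mask selects no tokens")
    return sum(picked) / len(picked)


def pick_ending(candidates):
    """Return the index of the candidate with the lowest mean completion loss.

    candidates: list of (token_losses, completion_mask) pairs, one per ending.
    """
    scores = [masked_mean_loss(losses, mask) for losses, mask in candidates]
    return min(range(len(scores)), key=scores.__getitem__)


# Hypothetical per-token losses for four endings of one example.
# Mask 0 marks shared context tokens, mask 1 marks the candidate ending.
candidates = [
    ([2.0, 1.5, 3.0, 2.5], [0, 0, 1, 1]),          # mean 2.75
    ([2.0, 1.5, 1.0, 0.5], [0, 0, 1, 1]),          # mean 0.75
    ([2.0, 1.5, 4.0, 4.0], [0, 0, 1, 1]),          # mean 4.0
    ([2.0, 1.5, 2.0, 1.0, 3.0], [0, 0, 1, 1, 1]),  # mean 2.0
]
predicted = pick_ending(candidates)
print(predicted)  # 1
```

Averaging rather than summing keeps longer endings from being penalized simply for containing more tokens.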