---
license: apache-2.0
language:
- en
---
# 🧠 Fine-Tuned Nano LLM  
This repository contains a small LLM fine-tuned with **LoRA / QLoRA** (low-rank adaptation).

---

## 📌 Model Overview
- **Base Model:** Nano LLM  
- **Fine-tuning Method:** LoRA / QLoRA  
- **Dataset:** Custom synthetic + curated dataset  
- **Task:** Text generation  
- **Framework:** PyTorch + Hugging Face Transformers  
- **Training Environment:** Google Colab Free Tier  

---

## 🧪 Training Details
- Mixed precision (fp16 compute / nf4 quantized weights)
- Gradient checkpointing enabled to reduce memory usage
- LoRA rank and target modules chosen to fit low-memory hardware
- Trained from a `DatasetDict` with separate train/validation splits
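The LoRA method mentioned above freezes the base weight matrix W and trains only two small matrices B (d×r) and A (r×k), adding their scaled product to W. The sketch below shows the standard LoRA update formula in NumPy; the shapes and values are illustrative, not taken from this repo.

```python
import numpy as np

def lora_update(W, A, B, alpha):
    """Apply a LoRA update: W' = W + (alpha / r) * B @ A.

    W: frozen base weight, shape (d, k)
    B: trained projection, shape (d, r)
    A: trained projection, shape (r, k)
    """
    r = A.shape[0]
    return W + (alpha / r) * (B @ A)

rng = np.random.default_rng(0)
d, k, r = 8, 8, 2              # rank r is much smaller than d and k
W = rng.standard_normal((d, k))
A = rng.standard_normal((r, k))
B = np.zeros((d, r))           # B is zero-initialized, so W' == W at the start

W_prime = lora_update(W, A, B, alpha=16)
print(np.allclose(W_prime, W))  # → True: zero-init B leaves W unchanged
```

Because only A and B are trained, the adapter stores d·r + r·k parameters per matrix instead of d·k, which is why the adapter files in this repo are so small.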

---

## 📂 Files Included
This repo contains:
- `config.json`
- `adapter_config.json`
- `adapter_model.bin`
- `tokenizer.json`
- `tokenizer.model`
- `special_tokens_map.json`
- `generation_config.json`
- `training_args.bin`

---

## 🚀 How to Load the Model

```python
from modeling_tinygpt import TinyGPT
import torch
import json

# Rebuild the architecture from the saved config
with open("config.json") as f:
    cfg = json.load(f)

model = TinyGPT(
    vocab_size=cfg["vocab_size"],
    d_model=cfg["d_model"],
    n_heads=cfg["n_heads"],
    n_layers=cfg["n_layers"],
    d_ff=cfg["d_ff"],
    max_seq_len=cfg["max_seq_len"],
)

# Load the trained weights on CPU and switch to inference mode
state = torch.load("pytorch_model.bin", map_location="cpu")
model.load_state_dict(state)
model.eval()
```
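Once the model is in eval mode, text generation is a loop: score the next token, append the best one, repeat. The helper below is a framework-agnostic sketch of greedy decoding; `next_token_scores` is a hypothetical stand-in for a forward pass (e.g. `model(input_ids)` logits), not a function shipped in this repo.

```python
def greedy_decode(next_token_scores, prompt, max_new_tokens, eos_id=None):
    """Repeatedly append the highest-scoring next token.

    next_token_scores: callable mapping a token-id list to a list of
    per-vocabulary scores (a stand-in for the model's logits).
    """
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        scores = next_token_scores(tokens)
        next_id = max(range(len(scores)), key=scores.__getitem__)
        tokens.append(next_id)
        if next_id == eos_id:
            break
    return tokens

# Toy "model" over a 5-token vocabulary: always prefers (last_token + 1) % 5
demo = lambda toks: [1.0 if i == (toks[-1] + 1) % 5 else 0.0 for i in range(5)]
print(greedy_decode(demo, [0], 4))  # → [0, 1, 2, 3, 4]
```

For real use you would replace the toy scorer with a wrapper around the loaded `TinyGPT` model and decode the resulting ids with the tokenizer.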

---

## 📈 Benchmark & Notes
This model is intended for experimentation, education, and small-scale inference. Because it uses a custom architecture, pass `trust_remote_code=True` when loading it through the `transformers` Auto classes, or import `modeling_tinygpt` directly as shown above.

---