---
license: mit
tags:
- pytorch
- nanogpt
- instruction-tuning
- sft
- slm
- from-scratch
---
# nanoGPT SLM Instruct -- 123.85 Million Parameters
Instruction-tuned Small Language Model (SLM), trained from scratch: pretrained on 133 classic English fiction books, then supervised fine-tuned (SFT) on Alpaca-format instructions.
## Quick Start
### Option 1: Run directly (downloads model + runs 5 examples)
```bash
pip install torch tiktoken huggingface_hub
python nanogpt_slm_instruct_inference.py
```
### Option 2: Import and use `ask()` in your own code
```python
# Import loads the model automatically (one-time download from HuggingFace)
from nanogpt_slm_instruct_inference import ask
# The first execution also prints 5 predefined examples with model responses
# Simple question
print(ask("What is the capital of France?"))
print()
# With input context
print(ask(
    instruction="Summarize the following text.",
    input_text="Machine learning enables systems to learn from data rather than being explicitly programmed."
))
print()
# Control generation
print(ask(
    "Write a short poem about the ocean.",
    temperature=1.0,  # higher = more creative
    top_k=100,        # wider sampling pool
    max_tokens=150    # longer output
))
print()
```
### Option 3: Load weights manually
```python
from huggingface_hub import hf_hub_download
import torch, tiktoken
# Download the fine-tuned weights from the Hugging Face Hub (cached after the first call)
repo_id = "nishantup/nanogpt-slm-instruct"
filename = "nanogpt_slm_instruct.pth"
model_path = hf_hub_download(repo_id=repo_id, filename=filename)
# Build model (full architecture in nanogpt_slm_instruct_inference.py)
from nanogpt_slm_instruct_inference import GPT, GPTConfig, generate, format_input
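config = GPTConfig()  # default configuration defined in the inference script
model = GPT(config)
# Load the state dict onto the CPU; move the model to a GPU afterwards if one is available
model.load_state_dict(torch.load(model_path, map_location="cpu"))
model.eval()  # inference mode (disables dropout)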
```
## Model Details
| Attribute | Value |
|:---|:---|
| Parameters | 123,849,984 (~123.85M) |
| Architecture | nanoGPT (12 layers, 12 heads, 768 dim) |
| Context length | 256 tokens |
| Tokenizer | tiktoken GPT-2 BPE (50,257 tokens) |
| Fine-tuning | Supervised (Alpaca format) |
| Framework | PyTorch |
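
The parameter count in the table can be cross-checked from the listed dimensions. The sketch below assumes a GPT-2-style layout that this card does not spell out: bias terms on every linear layer and LayerNorm, a 4x MLP expansion, learned position embeddings, and an output head that shares weights with the token embedding. `count_params` is a hypothetical helper, not part of the inference script.

```python
# Sketch: count the parameters of a GPT-2-style model from the table's dimensions.
# Assumptions (not confirmed by this card): biases everywhere, 4x MLP expansion,
# learned position embeddings, output head tied to the token embedding.

def count_params(vocab_size: int, block_size: int, n_layer: int, n_embd: int) -> int:
    tok_emb = vocab_size * n_embd                    # token embedding (shared with the head)
    pos_emb = block_size * n_embd                    # learned position embedding
    layer_norm = 2 * n_embd                          # scale + bias
    attn = (n_embd * 3 * n_embd + 3 * n_embd)        # fused QKV projection
    attn += (n_embd * n_embd + n_embd)               # attention output projection
    mlp = (n_embd * 4 * n_embd + 4 * n_embd)         # MLP up-projection
    mlp += (4 * n_embd * n_embd + n_embd)            # MLP down-projection
    block = 2 * layer_norm + attn + mlp              # one transformer block
    return tok_emb + pos_emb + n_layer * block + layer_norm  # plus final LayerNorm

total = count_params(vocab_size=50257, block_size=256, n_layer=12, n_embd=768)
print(f"{total:,}")  # → 123,849,984
```

The number of attention heads does not change the total, because the heads only split the 768-dimensional projections.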
## Prompt Format
```
Below is an instruction that describes a task.
### Instruction:
{instruction}
### Response:
```
With optional input:
```
Below is an instruction that describes a task, paired with further context.
### Instruction:
{instruction}
### Input:
{input}
### Response:
```
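
The two templates above can be assembled with a small helper. This is a minimal sketch: `build_prompt` is a hypothetical name, and the exact line breaks are an assumption taken from the templates as displayed here. The inference script ships its own `format_input`, which remains the reference.

```python
def build_prompt(instruction: str, input_text: str = "") -> str:
    """Hypothetical helper mirroring the templates above (not the script's format_input)."""
    if input_text:
        # Variant with additional context in an "### Input:" section
        return (
            "Below is an instruction that describes a task, paired with further context.\n"
            f"### Instruction:\n{instruction}\n"
            f"### Input:\n{input_text}\n"
            "### Response:\n"
        )
    # Instruction-only variant
    return (
        "Below is an instruction that describes a task.\n"
        f"### Instruction:\n{instruction}\n"
        "### Response:\n"
    )

print(build_prompt("Summarize the following text.", "Machine learning learns from data."))
```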
## Files
| File | Description |
|:---|:---|
| `nanogpt_slm_instruct.pth` | SFT fine-tuned weights |
| `nanogpt_slm_instruct_inference.py` | Standalone inference script -- import and call `ask()` |
| `config.json` | Model configuration |
## `ask()` API Reference
```python
ask(instruction, input_text="", max_tokens=256, temperature=0.7, top_k=40)
```
| Parameter | Default | Description |
|:---|:---|:---|
| `instruction` | (required) | The task instruction |
| `input_text` | `""` | Optional additional context |
| `max_tokens` | `256` | Maximum tokens to generate |
| `temperature` | `0.7` | Sampling temperature: 0.0 = greedy, 0.7 = balanced, 1.5 = creative |
| `top_k` | `40` | Sample only from the k most likely tokens (None = no filtering) |
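
To illustrate how `temperature` and `top_k` interact, here is a standard-library sketch of temperature-scaled top-k sampling over a list of logits. It shows the general technique and is not the script's actual `generate` code. `sample_next_token` is a hypothetical name, and treating a temperature of 0.0 as a plain argmax is an assumption based on the table above.

```python
import math
import random

def sample_next_token(logits, temperature=0.7, top_k=40, rng=random):
    """Hypothetical sketch: pick a token index from raw logits."""
    if temperature == 0.0:
        # Greedy decoding: always take the most likely token
        return max(range(len(logits)), key=lambda i: logits[i])
    # Rank token indices from most to least likely
    candidates = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    if top_k is not None:
        candidates = candidates[:top_k]  # keep only the k best tokens
    # Temperature scaling: values below 1 sharpen the distribution, above 1 flatten it
    scaled = [logits[i] / temperature for i in candidates]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]  # unnormalized, numerically stable softmax
    return rng.choices(candidates, weights=weights, k=1)[0]

print(sample_next_token([0.1, 2.0, -1.0, 0.5], temperature=0.7, top_k=2))
```

With `top_k=2` the sketch can only return index 1 or 3, the two highest logits in this toy example.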