---
license: apache-2.0
language:
- en
---
|
|
# 🧠 Fine-Tuned Nano LLM
|
|
This repository contains a small LLM fine-tuned with **LoRA / QLoRA**.
|
|
|
|
|
---
|
|
|
|
|
## 📌 Model Overview
|
|
- **Base Model:** Nano LLM
- **Fine-tuning Method:** LoRA / QLoRA (see the configuration sketch after this list)
- **Dataset:** Custom synthetic + curated dataset
- **Task:** Text generation
- **Framework:** PyTorch + Hugging Face Transformers
- **Training Environment:** Google Colab (free tier)
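The exact LoRA hyperparameters are not published in this card, so the following is a minimal sketch of a typical PEFT setup for a small causal LM. The base checkpoint (`gpt2`), rank, alpha, dropout, and target modules are illustrative assumptions, not the settings used for this model.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# "gpt2" is a stand-in; the actual Nano LLM base is not identified here.
base_model = AutoModelForCausalLM.from_pretrained("gpt2")

# Illustrative hyperparameters -- the values actually used for this
# checkpoint are not documented in this repo.
lora_config = LoraConfig(
    r=8,                        # low rank keeps the adapter small
    lora_alpha=16,              # scaling applied to the adapter updates
    lora_dropout=0.05,
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    bias="none",
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only a small fraction is trainable
```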
|
|
|
|
|
---
|
|
|
|
|
## 🧪 Training Details
|
|
- Mixed-precision training (fp16 compute with 4-bit NF4 quantization; sketched below)
- Gradient checkpointing enabled to reduce activation memory
- LoRA rank and target modules tuned to fit low-memory hardware
- Data prepared as a Hugging Face `DatasetDict` with separate train/validation splits
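As a rough illustration of the setup described above, here is a minimal QLoRA-style configuration combining `bitsandbytes` NF4 quantization, fp16 compute, and gradient checkpointing. The base model name and every numeric value are placeholder assumptions, not the recorded training configuration.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments

# NF4 4-bit quantization with fp16 compute, as in QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

# "gpt2" is a placeholder; substitute the actual base checkpoint.
model = AutoModelForCausalLM.from_pretrained("gpt2", quantization_config=bnb_config)
model.gradient_checkpointing_enable()  # trade compute for activation memory

# Representative low-memory training arguments (values are illustrative).
args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,  # simulate a larger batch on small GPUs
    fp16=True,
    num_train_epochs=1,
)
```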
|
|
|
|
|
---
|
|
|
|
|
## 📂 Files Included
|
|
This repo contains:
|
|
- `config.json`
- `adapter_config.json`
- `adapter_model.bin` (LoRA adapter weights; see the loading sketch after this list)
- `tokenizer.json`
- `tokenizer.model`
- `special_tokens_map.json`
- `generation_config.json`
- `training_args.bin`
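Since the repo ships LoRA adapter files rather than a merged checkpoint, one common way to use them is to attach the adapter to the base model with `peft`. This is a sketch only; the base checkpoint identifier is a placeholder assumption.

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder base checkpoint -- replace with the actual Nano LLM base.
base = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Apply the LoRA adapter stored in this repo (adapter_config.json +
# adapter_model.bin) on top of the frozen base weights.
model = PeftModel.from_pretrained(base, ".")
model.eval()

# Optionally merge the adapter into the base weights for faster inference:
# model = model.merge_and_unload()
```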
|
|
|
|
|
---
|
|
|
|
|
## How to Load the Model

The snippet below rebuilds the custom `TinyGPT` architecture from `config.json` and loads the trained weights directly:
|
|
|
|
|
```python
import json

import torch

from modeling_tinygpt import TinyGPT

# Read the architecture hyperparameters saved with the checkpoint.
with open("config.json") as f:
    cfg = json.load(f)

# Rebuild the model with the same dimensions it was trained with.
model = TinyGPT(
    vocab_size=cfg["vocab_size"],
    d_model=cfg["d_model"],
    n_heads=cfg["n_heads"],
    n_layers=cfg["n_layers"],
    d_ff=cfg["d_ff"],
    max_seq_len=cfg["max_seq_len"],
)

# Load the fine-tuned weights and switch to inference mode.
# (Adjust the filename if the checkpoint is stored under a different
# name; the file list above shows adapter_model.bin, for example.)
state = torch.load("pytorch_model.bin", map_location="cpu")
model.load_state_dict(state)
model.eval()
```
|
|
|
|
|
---
|
|
|
|
|
## 📊 Benchmark & Notes
|
|
This model is intended for experimentation, education, and small-scale inference.

It uses a custom architecture, so when loading through `transformers`, pass `trust_remote_code=True`.
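For example, a minimal `transformers` load along these lines should work, assuming the repo ships its custom modeling code; the repo id shown is a placeholder, not this model's actual hub identifier.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "your-username/fine-tuned-nano-llm"  # placeholder repo id

# trust_remote_code=True lets transformers import the custom TinyGPT
# modeling code shipped with the repository.
tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)

inputs = tokenizer("Once upon a time", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```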
|
|
|
|
|
---