---
license: apache-2.0
language:
- en
---

# 🧠 Fine-Tuned Nano LLM

This repository contains a fine-tuned version of a small LLM, trained with **LoRA / QLoRA**.

---

## 📌 Model Overview

- **Base Model:** Nano LLM
- **Fine-tuning Method:** LoRA / QLoRA
- **Dataset:** Custom synthetic + curated dataset
- **Task:** Text generation
- **Framework:** PyTorch + Hugging Face Transformers
- **Training Environment:** Google Colab Free Tier

---

## 🧪 Training Details

- Mixed precision (fp16 / nf4)
- Gradient checkpointing enabled
- LoRA rank and hyperparameters chosen to fit small-hardware constraints
- Trained on a `DatasetDict` with explicit splits

An illustrative configuration sketch appears under *Example: Reconstructing the LoRA / QLoRA Setup* at the end of this card.

---

## 📂 Files Included

This repo contains:

- `config.json`
- `adapter_config.json`
- `adapter_model.bin`
- `tokenizer.json`
- `tokenizer.model`
- `special_tokens_map.json`
- `generation_config.json`
- `training_args.bin`

---

## How to Load the Model

```python
import json

import torch
from modeling_tinygpt import TinyGPT

# Rebuild the architecture from the saved configuration.
with open("config.json") as f:
    cfg = json.load(f)

model = TinyGPT(
    vocab_size=cfg["vocab_size"],
    d_model=cfg["d_model"],
    n_heads=cfg["n_heads"],
    n_layers=cfg["n_layers"],
    d_ff=cfg["d_ff"],
    max_seq_len=cfg["max_seq_len"],
)

# Load the trained weights and switch to inference mode.
state = torch.load("pytorch_model.bin", map_location="cpu")
model.load_state_dict(state)
model.eval()
```

---

## 📈 Benchmark & Notes

This model is intended for experimentation, education, and small-scale inference.

It uses a custom architecture, so when loading through the `transformers` Auto classes you must pass `trust_remote_code=True`.

---
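## Example: Loading via `transformers`

As an alternative to the manual loading above, the `trust_remote_code=True` path mentioned in the notes can look like the sketch below. This assumes the repository's `config.json` carries the `auto_map` entries pointing at `modeling_tinygpt`, which this card does not confirm; the `repo_id` is a hypothetical placeholder.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id; substitute this repository's actual Hub path.
repo_id = "your-username/fine-tuned-nano-llm"

tokenizer = AutoTokenizer.from_pretrained(repo_id)

# trust_remote_code is required because the architecture is custom.
model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)
model.eval()
```

---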
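## Example: Generating Text

Once the model is instantiated as in *How to Load the Model*, a minimal greedy-decoding loop might look like this. It is a sketch, not part of the released code: it assumes `TinyGPT`'s forward pass takes a `(batch, seq)` tensor of token ids and returns raw logits of shape `(batch, seq, vocab)`, and it builds the tokenizer from the shipped `tokenizer.json`.

```python
import torch
from transformers import PreTrainedTokenizerFast

# Build a fast tokenizer from the file included in this repo.
tokenizer = PreTrainedTokenizerFast(tokenizer_file="tokenizer.json")

prompt = "Once upon a time"
input_ids = tokenizer.encode(prompt, return_tensors="pt")

# `model` is the TinyGPT instance from the loading snippet above.
with torch.no_grad():
    for _ in range(50):  # generate up to 50 new tokens
        logits = model(input_ids)  # assumed shape: (batch, seq, vocab)
        next_id = logits[:, -1, :].argmax(dim=-1, keepdim=True)
        input_ids = torch.cat([input_ids, next_id], dim=-1)

print(tokenizer.decode(input_ids[0].tolist()))
```

---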
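## Example: Reconstructing the LoRA / QLoRA Setup

The Training Details section lists nf4 quantization, gradient checkpointing, and a tuned LoRA rank, but does not publish the exact values or the training script. The sketch below shows how such a setup is commonly assembled with `peft` and `transformers`; the base checkpoint id, LoRA hyperparameters, and target module names are all placeholders, not the values used to train this model.

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# NF4 4-bit quantization with fp16 compute, matching the Training Details bullets.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# Hypothetical base checkpoint; the card does not publish the actual repo id.
base = AutoModelForCausalLM.from_pretrained(
    "your-username/nano-llm-base",
    quantization_config=bnb_config,
)
base.gradient_checkpointing_enable()  # trade compute for memory, as in training

# Placeholder LoRA hyperparameters; the tuned values are not published.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # hypothetical module names
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
```

---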