---
license: mit
tags:
- language-model
- instruction-tuning
- lora
- tinyllama
- text-generation
---
# TinyLlama-1.1B-Chat LoRA Fine-Tuned Model
![LoRA Diagram](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/peft/lora_diagram.png)
## Table of Contents
- [Model Overview](#overview)
- [Key Features](#key-features)
- [Installation](#installation)
## Overview
This repository contains a LoRA (Low-Rank Adaptation) fine-tuned version of the `TinyLlama/TinyLlama-1.1B-Chat-v0.6` model, optimized for instruction-following and question-answering tasks. The model has been adapted using Parameter-Efficient Fine-Tuning (PEFT) techniques to specialize in conversational AI applications while maintaining the base model's general capabilities.
### Model Architecture
- **Base Model**: TinyLlama-1.1B-Chat (Transformer-based)
- **Layers**: 22
- **Attention Heads**: 32
- **Hidden Size**: 2048
- **Context Length**: 2048 tokens (limited to 256 during fine-tuning)
- **Vocab Size**: 32,000
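The listed numbers imply a few derived quantities. A quick sketch (values copied from the list above; this is back-of-the-envelope arithmetic, not values read from the checkpoint):

```python
# Architecture figures from the list above
hidden_size = 2048
num_heads = 32
vocab_size = 32_000

# Dimension of each attention head
head_dim = hidden_size // num_heads
print(head_dim)  # 64

# Size of the token embedding table (vocab_size x hidden_size)
embedding_params = vocab_size * hidden_size
print(embedding_params)  # 65536000
```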
## Key Features
- πŸš€ **Parameter-Efficient Fine-Tuning**: Only 0.39% of parameters (4.2M) trained
- πŸ’Ύ **Memory Optimization**: 8-bit quantization via BitsAndBytes
- ⚑ **Fast Inference**: Optimized for conversational response times
- πŸ€– **Instruction-Tuned**: Specialized for Q&A and instructional tasks
- πŸ”§ **Modular Design**: Easy to adapt for different use cases
- πŸ“¦ **Hugging Face Integration**: Fully compatible with Transformers ecosystem
## Installation
### Prerequisites
- Python 3.8+
- PyTorch 2.0+ (with CUDA 11.7+ if GPU acceleration is desired)
- NVIDIA GPU (recommended for training and inference)
### Package Installation
```bash
pip install torch transformers peft accelerate bitsandbytes pandas datasets
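# Optional sanity check (a sketch, not part of the package docs): report which
# of the required packages are importable in the current environment.
python - <<'EOF'
import importlib.util
for name in ["torch", "transformers", "peft", "accelerate",
             "bitsandbytes", "pandas", "datasets"]:
    print(name, "installed" if importlib.util.find_spec(name) else "missing")
EOF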