Tiny-GPT2 Text Generation Project Documentation
=============================================
This project enables students to run, fine-tune, and experiment with the `sshleifer/tiny-gpt2`
model locally on a laptop with 8GB or 16GB RAM, using CPU (GPU optional). The goal is to provide
hands-on experience with AI model workflows, including downloading, fine-tuning, and deploying a
text generation model via a web interface. This document covers all steps to set up and run the
project, with credits to the original model and organization.
---
1. Project Overview
The project uses the `sshleifer/tiny-gpt2` model, a lightweight version of GPT-2, for text generation.
It includes scripts to:
- Download model weights from Hugging Face.
- Test the model with a sample prompt.
- Fine-tune the model on a custom dataset.
- Deploy a web app to generate text interactively.
The setup is optimized for low-memory systems (8GB RAM) and defaults to CPU execution, but includes
instructions for GPU users.
---
2. Prerequisites
- Hardware: Laptop with at least 8GB RAM (16GB recommended). GPU (e.g., NVIDIA GTX) is optional;
scripts default to CPU.
- Operating System: Windows, macOS, or Linux.
- Software:
- Python 3.10.9 (recommended) or 3.9.10. Download from https://www.python.org/downloads/.
- Visual Studio Code (VS Code) for development (optional but recommended). Download from
https://code.visualstudio.com/.
- Hugging Face Account: Required to download model weights.
---
3. Step-by-Step Setup Instructions
3.1. Obtain a Hugging Face Token
1. Go to https://huggingface.co/ and sign up or log in.
2. Navigate to https://huggingface.co/settings/tokens.
3. Click "New token", select "Read" or "Write" access, and copy the token
(e.g., hf_XXXXXXXXXXXXXXXXXXXXXXXXXX).
4. Store the token securely; you’ll use it in the download script.
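One way to keep the token out of your source files is to read it from an environment variable. A small sketch of this approach (the variable name `HF_TOKEN` is a convention chosen here, not something the project's scripts require):

```python
# Read the Hugging Face token from an environment variable instead of
# hardcoding it, so the token never ends up in version control.
import os

def get_hf_token() -> str:
    token = os.environ.get("HF_TOKEN")
    if not token:
        raise RuntimeError("Set the HF_TOKEN environment variable first.")
    return token
```

You could then call `login(token=get_hf_token())` in `download_model.py` instead of pasting the token into the script.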
3.2. Install Python
1. Download Python 3.10.9 from https://www.python.org/downloads/release/python-3109/.
2. Install Python, ensuring "Add Python to PATH" is checked.
3. Verify installation in a terminal:
```
python --version
```
Expected output: Python 3.10.9
3.3. Set Up a Virtual Environment
1. Open a terminal in your project folder (e.g., C:\Users\YourName\Documents\project).
2. Create a virtual environment:
```
python -m venv venv
```
3. Activate the virtual environment:
- Windows: `venv\Scripts\activate`
- macOS/Linux: `source venv/bin/activate`
4. Confirm activation (you’ll see `(venv)` in the terminal prompt).
3.4. Install Dependencies
1. In the activated virtual environment, create a file named `requirements.txt` with the following
content:
```
torch==2.3.0
transformers==4.38.2
huggingface_hub==0.22.2
datasets==2.21.0
numpy==1.26.4
matplotlib==3.8.3
flask==3.0.3
```
2. Install the libraries:
```
pip install -r requirements.txt
```
3. For GPU users (optional):
- Uninstall CPU PyTorch: `pip uninstall torch -y`
- Install GPU PyTorch: `pip install torch==2.3.0+cu121 --index-url https://download.pytorch.org/whl/cu121`
- Verify CUDA: `python -c "import torch; print(torch.cuda.is_available())"` (should print `True`).
Note: The scripts default to CPU, so this step is only needed if you want GPU acceleration.
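The CPU default in the project's scripts works by hiding all GPUs from PyTorch via an environment variable, which must be set before `torch` is imported. A minimal sketch of the mechanism:

```python
# Force CPU execution: setting CUDA_VISIBLE_DEVICES to an empty string before
# importing torch makes torch.cuda.is_available() return False, so the scripts
# run on CPU even on machines that have a GPU.
import os

os.environ["CUDA_VISIBLE_DEVICES"] = ""  # must happen before `import torch`

# import torch  # imported afterwards, torch now sees no CUDA devices
```

This is the line that GPU users are told to remove in section 4 ("Notes for Students") to re-enable their GPU.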
3.5. Download Model Weights
1. Create a folder named `dalle` (or any name) for the project.
2. Copy the `download_model.py` script from the repository (or create it):
```
from transformers import AutoModelForCausalLM, AutoTokenizer
from huggingface_hub import login
import os
hf_token = "YOUR_HUGGINGFACE_TOKEN" # Replace with your token
login(token=hf_token)
model_name = "sshleifer/tiny-gpt2"
save_directory = "./tiny-gpt2-model"
os.makedirs(save_directory, exist_ok=True)
model = AutoModelForCausalLM.from_pretrained(model_name, cache_dir=save_directory)
tokenizer = AutoTokenizer.from_pretrained(model_name, cache_dir=save_directory)
print(f"Model and tokenizer downloaded to {save_directory}")
```
3. Replace `YOUR_HUGGINGFACE_TOKEN` with your Hugging Face token.
4. Run the script:
```
python download_model.py
```
5. Verify the model files in
`tiny-gpt2-model/models--sshleifer--tiny-gpt2/snapshots/5f91d94bd9cd7190a9f3216ff93cd1dd95f2c7be`
(contains `config.json`, `pytorch_model.bin`, `vocab.json`, `merges.txt`).
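The verification in step 5 can be automated. A small sketch that searches every snapshot folder under the Hugging Face cache layout for the expected files (the function name `missing_files` is made up for this example):

```python
# Check that a Hugging Face cache directory contains the expected
# tiny-gpt2 files, without hardcoding the snapshot hash.
from pathlib import Path

EXPECTED = {"config.json", "pytorch_model.bin", "vocab.json", "merges.txt"}

def missing_files(save_directory: str) -> set[str]:
    """Return the expected files not found in any snapshot under the cache."""
    root = Path(save_directory) / "models--sshleifer--tiny-gpt2" / "snapshots"
    if not root.exists():
        return set(EXPECTED)
    found = {p.name for snap in root.glob("*") for p in snap.iterdir()}
    return EXPECTED - found
```

Running `missing_files("./tiny-gpt2-model")` after the download should return an empty set; any names it returns are files the download failed to fetch.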
3.6. Test the Model
1. Copy the `test_model.py` script from the repository to the `dalle` folder.
2. Run the script:
```
python test_model.py
```
3. Expected output: generated text that continues the prompt "Once upon a time". Because the model is so
small, the output may be only semi-coherent.
3.7. Fine-Tune the Model
1. Create a `fine_tune` folder inside `dalle`:
```
mkdir fine_tune
cd fine_tune
```
2. Create a dataset file `sample_data.txt` (or use your own text). Example content:
```
Once upon a time, there was a brave knight who explored a magical forest.
The forest was filled with mystical creatures and ancient ruins.
The knight discovered a hidden treasure guarded by a wise dragon.
With courage and wisdom, the knight befriended the dragon and shared the treasure with the village.
```
3. Copy the `fine_tune_model.py` script from the repository to `fine_tune`.
4. Run the script:
```
python fine_tune_model.py
```
5. The script fine-tunes the model, saves it to `fine_tuned_model`, and generates a `loss_plot.png`
showing training loss.
6. Verify `fine_tuned_model` contains model files and check `loss_plot.png`.
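Under the hood, language-model fine-tuning scripts typically split the training text into tokens and regroup them into fixed-length blocks. A deliberately simplified, standard-library-only sketch of that preprocessing idea (whitespace splitting stands in for the real GPT-2 tokenizer, and `make_blocks` is a name invented for this example):

```python
# Simplified picture of how fine-tuning data is prepared: the text is split
# into tokens and regrouped into fixed-size blocks. Whitespace splitting
# stands in for the real GPT-2 tokenizer here.
def make_blocks(text: str, block_size: int = 8) -> list[list[str]]:
    tokens = text.split()  # stand-in for tokenizer(text)["input_ids"]
    # Drop the tail that doesn't fill a whole block, as LM preprocessing
    # commonly does.
    n = (len(tokens) // block_size) * block_size
    return [tokens[i:i + block_size] for i in range(0, n, block_size)]

sample = "Once upon a time , there was a brave knight who explored a magical forest ."
blocks = make_blocks(sample, block_size=4)  # four blocks of four tokens each
```

Reducing the block size (the real script's `max_length`) shrinks each training example, which is why section 5 suggests lowering it on 8GB RAM systems.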
3.8. Run the Web App
1. In the `fine_tune` folder, copy `app.py` and create a `templates` folder with `index.html` from the
repository.
2. Run the web app:
```
python app.py
```
3. Open a browser and go to `http://127.0.0.1:5000`.
4. Enter a prompt (e.g., "Once upon a time") and click "Generate Text" to see the output from the
fine-tuned model.
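The web app's core loop is simple: receive a prompt over HTTP, run generation, and return text. A standard-library-only sketch of that flow (the real `app.py` uses Flask, and `generate_text` below is a placeholder standing in for the fine-tuned model):

```python
# Minimal sketch of the request/response flow behind the web app, using only
# the standard library. The real app uses Flask and calls the fine-tuned
# model; generate_text here is a placeholder.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def generate_text(prompt: str) -> str:
    # Placeholder: the real app would call model.generate() here.
    return prompt + " ... (model output would continue here)"

class GenerateHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"text": generate_text(payload["prompt"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the console quiet

def run(port: int = 5000):
    HTTPServer(("127.0.0.1", port), GenerateHandler).serve_forever()
```

Flask hides the socket handling and routing behind decorators, but the shape of the exchange (prompt in, generated text out) is the same.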
---
4. Notes for Students
- Model Limitations: `tiny-gpt2` is a small model, so generated text may not be highly coherent. For
better results, consider larger models like `gpt2` (requires more memory or GPU).
- Memory Management: On 8GB RAM systems, close other applications to free memory. The scripts use a
small batch size to minimize memory usage.
- GPU Support: Scripts default to CPU for compatibility. To use an NVIDIA GPU:
- Install `torch==2.3.0+cu121` (see step 3.4).
- Remove `os.environ["CUDA_VISIBLE_DEVICES"] = ""` from `fine_tune_model.py` and `app.py`.
- Change `use_cpu=True` to `use_cpu=False` in `fine_tune_model.py`.
- Experimentation: Try different prompts, datasets, or fine-tuning parameters (e.g., `num_train_epochs`,
`learning_rate`) to explore AI model behavior.
---
5. Troubleshooting
- Library Conflicts: Use the exact versions in `requirements.txt` to avoid issues.
- File Not Found: Ensure model files are in the correct path
(`tiny-gpt2-model/models--sshleifer--tiny-gpt2/snapshots/5f91d94bd9cd7190a9f3216ff93cd1dd95f2c7be`).
- Memory Errors: Reduce `max_length` in `fine_tune_model.py` (e.g., from 128 to 64) for 8GB RAM systems.
- Hugging Face Token Issues: Verify your token has "Read" or "Write" access at
https://huggingface.co/settings/tokens.
---
6. Credits and Attribution
- Original Model: `sshleifer/tiny-gpt2`, a tiny GPT-2 variant created by Sam Shleifer.
Available at https://huggingface.co/sshleifer/tiny-gpt2.
- Organization: Hugging Face, Inc. (https://huggingface.co/) provides the model weights, `transformers`
library, and `huggingface_hub` for model access.
- Project Creator: Remiai3 (GitHub/Hugging Face username). This project was developed to facilitate AI
learning and experimentation for students.
- AI Assistance: Grok 3, created by xAI (https://x.ai/), assisted in generating and debugging the code,
ensuring compatibility for low-resource systems.
---
7. Next Steps for Students
- Experiment with different datasets in `sample_data.txt` to fine-tune the model for specific tasks
(e.g., storytelling, dialogue).
- Modify `fine_tune_model.py` parameters (e.g., `learning_rate`, `num_train_epochs`) to study their
impact.
- Enhance `index.html` or `app.py` to add features like multiple prompt inputs or generation options.
- Explore larger models on Hugging Face (e.g., `gpt2-medium`) if you have a GPU or more RAM.
For questions or issues, contact Remiai3 via Hugging Face or check the repository for updates.