
GPT-2 Fine-Tuning on Custom Dataset

📌 Overview

This project fine-tunes GPT-2 on text extracted from EU_ACT.pdf. Training starts from the pre-trained GPT-2 weights (transfer learning), and the resulting model is uploaded to the Hugging Face Hub for deployment.

🚀 Features

  • Uses a pre-trained GPT-2 model (Transfer Learning)
  • Processes text data from a PDF
  • Tokenizes and fine-tunes the model
  • Uploads the trained model to Hugging Face
  • Automatically disables Weights & Biases logging
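The PDF-processing step in the list above could be sketched as follows. The README does not say which PDF library fine_tune_gpt2.py uses, so pypdf, the helper names, and the 1,000-character chunk size are all assumptions made for illustration. pypdf is not part of the install command in this README and would need to be installed separately.

```python
import re


def extract_pdf_text(pdf_path):
    """Return the text of every page in the PDF, joined by newlines.

    Assumption: uses pypdf (pip install pypdf); the original script may
    rely on a different library.
    """
    from pypdf import PdfReader  # imported lazily so split_into_chunks works without it

    reader = PdfReader(pdf_path)
    # extract_text() can yield nothing for image-only pages, hence the fallback
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def split_into_chunks(text, max_chars=1000):
    """Collapse whitespace and split text into chunks on word boundaries.

    Chunks are at most max_chars long, except that a single word longer
    than max_chars is kept whole.
    """
    words = re.sub(r"\s+", " ", text).strip().split(" ")
    chunks, current = [], ""
    for word in words:
        if not word:
            continue
        candidate = word if not current else current + " " + word
        if len(candidate) > max_chars and current:
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Under these assumptions, split_into_chunks(extract_pdf_text("EU_ACT.pdf")) would give a list of text chunks ready for tokenization.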

📂 Project Structure

|-- fine_tune_gpt2.py   # Main script for training
|-- EU_ACT.pdf          # Custom dataset (input PDF)
|-- README.md           # Documentation
|-- image.png           # Hugging Face Metadata UI (optional)

🔧 Installation

Run the following command to install required libraries:

pip install transformers datasets torch tokenizers accelerate huggingface_hub

🔑 Hugging Face Authentication

Replace 'your-api-key-here' with your actual Hugging Face token:

from huggingface_hub import login
login(token='your-api-key-here')
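Hard-coding a token in a script makes it easy to leak through version control. A minimal alternative, assuming the token has been exported in an environment variable beforehand (the variable name HF_TOKEN and both helper names are choices made for this sketch, not something the original script defines):

```python
import os


def read_hf_token(env_var="HF_TOKEN"):
    """Return the token stored in env_var, or raise if it is missing or empty."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Hugging Face token"
        )
    return token


def login_from_env(env_var="HF_TOKEN"):
    """Log in to the Hugging Face Hub with the token read from the environment."""
    from huggingface_hub import login  # same login() call as in the snippet above

    login(token=read_hf_token(env_var))
```

Calling login_from_env() at the top of the training script would then replace the hard-coded login call.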

🏋️‍♂️ Training the Model

Run the script to start training:

python fine_tune_gpt2.py
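The contents of fine_tune_gpt2.py are not shown in this README. As a rough sketch only, a script of this kind commonly follows the pattern below using the transformers Trainer API. The function names, hyperparameters, and output directory are assumptions, not the author's actual settings.

```python
import os

# Turn off Weights & Biases logging before transformers is imported
os.environ["WANDB_DISABLED"] = "true"


def build_training_kwargs(output_dir="gpt2-finetuned", epochs=3, batch_size=2):
    """Keyword arguments for transformers.TrainingArguments (illustrative values)."""
    return {
        "output_dir": output_dir,
        "num_train_epochs": epochs,
        "per_device_train_batch_size": batch_size,
        "save_strategy": "epoch",
        "report_to": "none",  # no external experiment loggers
    }


def fine_tune(texts, base_model="gpt2", max_length=512, **overrides):
    """Fine-tune base_model on a list of text chunks and return the Trainer."""
    from datasets import Dataset
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        DataCollatorForLanguageModeling,
        Trainer,
        TrainingArguments,
    )

    tokenizer = AutoTokenizer.from_pretrained(base_model)
    tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
    model = AutoModelForCausalLM.from_pretrained(base_model)

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, max_length=max_length)

    dataset = Dataset.from_dict({"text": texts})
    dataset = dataset.map(tokenize, batched=True, remove_columns=["text"])

    trainer = Trainer(
        model=model,
        args=TrainingArguments(**build_training_kwargs(**overrides)),
        train_dataset=dataset,
        # mlm=False selects causal (next-token) language modelling
        data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False),
    )
    trainer.train()
    trainer.save_model()  # writes the model to output_dir
    tokenizer.save_pretrained(trainer.args.output_dir)
    return trainer
```

Passing report_to="none" and setting WANDB_DISABLED are two common ways to keep Weights & Biases out of a training run; which one the original script uses is not stated.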

📤 Uploading the Model

After training, the model is automatically uploaded to the Hugging Face Hub. Before running the script, make sure these two variables are set:

hf_username = "your_hf_username"  # Replace with your HF username
repo_name = "gpt2-Transfer-euact"
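A sketch of how those two values might be combined for the upload, assuming the script pushes with the push_to_hub methods from transformers. The README does not confirm which upload call it uses, and the helper names and the model_dir argument are hypothetical.

```python
def build_repo_id(hf_username, repo_name):
    """Join the username and repository name into the 'user/repo' form the Hub expects."""
    hf_username, repo_name = hf_username.strip(), repo_name.strip()
    if not hf_username or not repo_name:
        raise ValueError("Both hf_username and repo_name must be non-empty")
    return f"{hf_username}/{repo_name}"


def upload_model(model_dir, hf_username, repo_name):
    """Load a saved model and tokenizer from model_dir and push both to the Hub."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo_id = build_repo_id(hf_username, repo_name)
    AutoModelForCausalLM.from_pretrained(model_dir).push_to_hub(repo_id)
    AutoTokenizer.from_pretrained(model_dir).push_to_hub(repo_id)
    return repo_id
```

The upload requires a token with write access, so it only works after a successful login.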

Pipeline

from transformers import pipeline

# Load fine-tuned model from Hugging Face Hub
repo_name = "sssdddwd/gpt2-Transfer-euact"  # Replace with your actual repo
model_pipeline = pipeline("text-generation", model=repo_name)

# Generate text
prompt = "The European AI Act aims to"
output = model_pipeline(prompt, max_length=100, num_return_sequences=1)

# Print output
print(output[0]["generated_text"])
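By default the text-generation pipeline returns the prompt followed by the continuation. A small sketch, assuming only the newly generated text is wanted: it passes return_full_text=False and caps the continuation length with max_new_tokens. The helper name and the default of 60 tokens are illustrative.

```python
def generate_continuation(generator, prompt, max_new_tokens=60):
    """Return only the text generated after the prompt.

    generator is any callable with the text-generation pipeline's interface,
    for example pipeline("text-generation", model=repo_name).
    """
    outputs = generator(
        prompt,
        max_new_tokens=max_new_tokens,  # limits generated tokens, excluding the prompt
        num_return_sequences=1,
        return_full_text=False,  # drop the prompt from the returned text
    )
    return outputs[0]["generated_text"].strip()
```

With the pipeline object from the example above, this would be called as generate_continuation(model_pipeline, prompt).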

📊 Metadata UI Reference

[Image: Hugging Face Metadata UI (image.png)]

📌 License

To add a license, set the license field in the model card metadata on Hugging Face.

📬 Contact

For issues or improvements, reach out via GitHub or Hugging Face discussions.