# GPT-2 Fine-Tuning on Custom Dataset
## Overview

This project fine-tunes a GPT-2 model on a custom dataset extracted from `EU_ACT.pdf`. The model is trained via transfer learning and uploaded to the Hugging Face Hub for deployment.
## Features
- Uses a pre-trained GPT-2 model (Transfer Learning)
- Processes text data from a PDF
- Tokenizes and fine-tunes the model
- Uploads the trained model to Hugging Face
- Automatically disables Weights & Biases logging
## Project Structure

```
├── fine_tune_gpt2.py   # Main script for training
├── EU_ACT.pdf          # Custom dataset (input PDF)
├── README.md           # Documentation
└── image.png           # Hugging Face metadata UI screenshot (optional)
```
## Installation

Run the following command to install the required libraries:

```shell
pip install transformers datasets torch tokenizers accelerate huggingface_hub
```
## Hugging Face Authentication

Log in with your Hugging Face token, replacing `your-api-key-here` with your actual token:

```python
from huggingface_hub import login

login(token="your-api-key-here")
```

Alternatively, run `huggingface-cli login` in a terminal.
## Training the Model

Run the script to start training:

```shell
python fine_tune_gpt2.py
```
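The actual contents of `fine_tune_gpt2.py` are not shown here; the following is a hypothetical sketch of the pipeline the README describes (extract PDF text, tokenize, fine-tune, disable Weights & Biases logging). The helper name `chunk_text`, the output directory, and all hyperparameters are illustrative assumptions, not the project's exact settings.

```python
# Hypothetical sketch of fine_tune_gpt2.py -- names and settings are illustrative.
import re


def chunk_text(text: str, block_size: int = 512) -> list[str]:
    """Normalize whitespace and split text into fixed-size character chunks."""
    text = re.sub(r"\s+", " ", text).strip()
    return [text[i:i + block_size] for i in range(0, len(text), block_size)]


def main() -> None:
    # Heavy imports kept local so chunk_text stays importable without them.
    from pypdf import PdfReader
    from datasets import Dataset
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)

    # 1. Extract raw text from the PDF dataset.
    raw = " ".join(page.extract_text() or "" for page in PdfReader("EU_ACT.pdf").pages)
    ds = Dataset.from_dict({"text": chunk_text(raw)})

    # 2. Tokenize; GPT-2 has no pad token, so reuse EOS.
    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    tokenizer.pad_token = tokenizer.eos_token
    ds = ds.map(lambda b: tokenizer(b["text"], truncation=True, max_length=512),
                batched=True, remove_columns=["text"])

    # 3. Fine-tune with a causal-LM collator (labels are the shifted inputs).
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="./gpt2-finetuned",
                               num_train_epochs=3,
                               per_device_train_batch_size=2,
                               report_to="none"),  # disables W&B logging
        train_dataset=ds,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()
    trainer.save_model("./gpt2-finetuned")
    tokenizer.save_pretrained("./gpt2-finetuned")


if __name__ == "__main__":
    main()
```

Keeping the training code behind the `__main__` guard means the chunking helper can be imported and tested without pulling in `torch` or downloading model weights.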
## Uploading the Model

After training, the model is automatically uploaded to the Hugging Face Hub. Make sure these variables are set in the script:

```python
hf_username = "your_hf_username"  # Replace with your HF username
repo_name = "gpt2-Transfer-euact"
```
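The upload itself is not shown in this README; a minimal sketch of how the two variables above could feed into a `push_to_hub` call is given below. The `repo_id` helper, the `./gpt2-finetuned` directory, and the function names are assumptions for illustration, not the project's actual code.

```python
# Hypothetical sketch of the upload step -- directory and names are illustrative.


def repo_id(hf_username: str, repo_name: str) -> str:
    """Build the full Hub repo id, e.g. 'user/gpt2-Transfer-euact'."""
    return f"{hf_username}/{repo_name}"


def upload(model_dir: str = "./gpt2-finetuned",
           hf_username: str = "your_hf_username",
           repo_name: str = "gpt2-Transfer-euact") -> None:
    # Import locally so repo_id stays dependency-free.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    target = repo_id(hf_username, repo_name)
    # Push both the model weights and the tokenizer files to the same repo.
    AutoModelForCausalLM.from_pretrained(model_dir).push_to_hub(target)
    AutoTokenizer.from_pretrained(model_dir).push_to_hub(target)
```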
## Pipeline

Load the fine-tuned model from the Hub and generate text:

```python
from transformers import pipeline

# Load the fine-tuned model from the Hugging Face Hub
repo_name = "sssdddwd/gpt2-Transfer-euact"  # Replace with your actual repo
model_pipeline = pipeline("text-generation", model=repo_name)

# Generate text
prompt = "The European AI Act aims to"
output = model_pipeline(prompt, max_length=100, num_return_sequences=1)

# Print output
print(output[0]["generated_text"])
```
## Metadata UI Reference

See `image.png` for a screenshot of the Hugging Face metadata UI.
## License

You can add a license by editing the license field in the model card metadata on Hugging Face.
## Contact

For issues or improvements, reach out via GitHub or Hugging Face discussions.