---
{}
---
# GPT-2 Fine-Tuning on Custom Dataset

## Overview
This project fine-tunes a pre-trained **GPT-2** model on text extracted from `EU_ACT.pdf`. It relies on **transfer learning** rather than training from scratch, and it uploads the fine-tuned model to the **Hugging Face Hub** for deployment.

## Features
- **Uses a pre-trained GPT-2 model** (Transfer Learning)
- **Processes text data from a PDF**
- **Tokenizes and fine-tunes the model**
- **Uploads the trained model to Hugging Face**
- **Automatically disables Weights & Biases logging**

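The PDF-processing step listed above usually needs some cleanup before tokenization. The following is a minimal sketch of that idea, not the code in `fine_tune_gpt2.py`: the helper names `clean_pdf_text` and `chunk_words`, the chunk size, and the sample string are all assumptions made for illustration. It takes text that a PDF reader has already extracted, normalizes it, and splits it into fixed-size pieces that could later be tokenized.

```python
import re


def clean_pdf_text(raw: str) -> str:
    """Normalize raw text extracted from a PDF (hypothetical helper)."""
    # Rejoin words split by a hyphen at a line break. This also merges
    # genuine hyphenated compounds that happen to break across lines.
    text = re.sub(r"-\s*\n\s*", "", raw)
    # Collapse newlines, tabs, and repeated spaces into single spaces.
    text = re.sub(r"\s+", " ", text)
    return text.strip()


def chunk_words(text: str, words_per_chunk: int = 200) -> list[str]:
    """Split text into consecutive chunks of at most words_per_chunk words."""
    if words_per_chunk <= 0:
        raise ValueError("words_per_chunk must be positive")
    words = text.split()
    return [
        " ".join(words[i:i + words_per_chunk])
        for i in range(0, len(words), words_per_chunk)
    ]


# Made-up sample text, not taken from EU_ACT.pdf.
sample = "The regula-\ntion lays down   harmonised\nrules."
print(chunk_words(clean_pdf_text(sample), words_per_chunk=3))
```

Splitting by word count is only a rough proxy for GPT-2 token count; a real script would more likely split on token IDs after running the tokenizer.
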
## Project Structure
```
|-- fine_tune_gpt2.py   # Main script for training
|-- EU_ACT.pdf          # Custom dataset (input PDF)
|-- README.md           # Documentation
|-- image.png           # Hugging Face Metadata UI (optional)
```

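## Installation
Run the following command to install the required libraries:
```bash
pip install transformers datasets torch tokenizers accelerate huggingface_hub
```
This list does not include a PDF-parsing library. If `fine_tune_gpt2.py` imports one to read `EU_ACT.pdf`, install it as well.
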
## Hugging Face Authentication
Replace `'your-api-key-here'` with your Hugging Face access token. Avoid committing a real token to version control.
```python
from huggingface_hub import login
login(token='your-api-key-here')
```

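One way to keep the token out of the source file is to read it from an environment variable. The sketch below assumes the token is stored in a variable named `HF_TOKEN`; the helper names `get_hf_token` and `login_from_env` are hypothetical and not part of the project script. Only `login(token=...)` comes from `huggingface_hub`.

```python
import os


def get_hf_token(env_var: str = "HF_TOKEN") -> str:
    """Return the Hugging Face token stored in an environment variable."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Hugging Face token."
        )
    return token


def login_from_env(env_var: str = "HF_TOKEN") -> None:
    """Log in to the Hub with the token read from the environment."""
    # Imported lazily so get_hf_token() works without huggingface_hub installed.
    from huggingface_hub import login

    login(token=get_hf_token(env_var))
```

With the variable exported in the shell, calling `login_from_env()` at the top of the training script would replace the hard-coded `login(token=...)` line.
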
## Training the Model
Run the script to start fine-tuning:
```bash
python fine_tune_gpt2.py
```

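For orientation, here is a hedged sketch of how a script like this might switch off Weights & Biases and configure training. It is not the actual content of `fine_tune_gpt2.py`: the function name `build_training_kwargs`, the output directory, and every hyperparameter value are illustrative assumptions. The keys are real `transformers.TrainingArguments` parameters, so the dictionary could be passed as `TrainingArguments(**build_training_kwargs())`.

```python
import os

# Turn off the Weights & Biases integration before any Trainer is created.
os.environ["WANDB_DISABLED"] = "true"


def build_training_kwargs(output_dir: str = "./gpt2-finetuned") -> dict:
    """Keyword arguments intended for transformers.TrainingArguments."""
    return {
        "output_dir": output_dir,          # where checkpoints are written
        "num_train_epochs": 3,             # illustrative value
        "per_device_train_batch_size": 4,  # illustrative value
        "logging_steps": 50,               # illustrative value
        "save_total_limit": 2,             # keep only the two newest checkpoints
        "report_to": "none",               # no W&B or other external loggers
    }
```

Setting `report_to="none"` is the explicit way to disable external loggers. The environment variable is an older switch that is included here as a second safeguard.
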
## Uploading the Model
After training, the script automatically uploads the model to the **Hugging Face Hub**. Before running it, make sure these variables are set:
```python
hf_username = "your_hf_username"  # Replace with your HF username
repo_name = "gpt2-Transfer-euact"
```

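The two variables combine into a Hub repository ID of the form `username/repo_name`. The sketch below shows one way that could be done. `build_repo_id` and `push_model` are hypothetical helpers, not functions from the project script, while `push_to_hub` is the method that `transformers` models and tokenizers provide.

```python
def build_repo_id(hf_username: str, repo_name: str) -> str:
    """Join a username and a repository name into a Hub repo ID."""
    hf_username, repo_name = hf_username.strip(), repo_name.strip()
    if not hf_username or not repo_name:
        raise ValueError("Both hf_username and repo_name must be non-empty.")
    if "/" in hf_username or "/" in repo_name:
        raise ValueError("Neither part may contain '/'.")
    return f"{hf_username}/{repo_name}"


def push_model(model, tokenizer, hf_username: str, repo_name: str) -> str:
    """Upload a fine-tuned model and its tokenizer to the same repository."""
    repo_id = build_repo_id(hf_username, repo_name)
    model.push_to_hub(repo_id)      # uploads weights and config
    tokenizer.push_to_hub(repo_id)  # uploads vocabulary and tokenizer config
    return repo_id
```

Uploading the tokenizer alongside the model lets `pipeline("text-generation", model=repo_id)` load both from one repository.
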
## Pipeline
```python
from transformers import pipeline

# Load fine-tuned model from Hugging Face Hub
repo_name = "sssdddwd/gpt2-Transfer-euact"  # Replace with your actual repo
model_pipeline = pipeline("text-generation", model=repo_name)

# Generate text
prompt = "The European AI Act aims to"
output = model_pipeline(prompt, max_length=100, num_return_sequences=1)

# Print output
print(output[0]["generated_text"])
```

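By default, the `generated_text` returned by a text-generation pipeline starts with the prompt itself. If only the continuation is wanted, a small helper can strip it. `strip_prompt` below is a hypothetical convenience function, and the example string in the comment is made up rather than real model output.

```python
def strip_prompt(generated_text: str, prompt: str) -> str:
    """Return only the continuation when the output starts with the prompt."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].lstrip()
    return generated_text


# Usage with the pipeline output from above:
#   continuation = strip_prompt(output[0]["generated_text"], prompt)
# Made-up example: strip_prompt("The European AI Act aims to regulate AI.",
#                               "The European AI Act aims to")
```
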
## Metadata UI Reference


## License
No license is set yet. To add one, set the `license` field in the Hugging Face metadata block at the top of this README, or edit it through the metadata UI shown above.

## Contact
To report issues or suggest improvements, reach out via GitHub or Hugging Face discussions.