---
language:
- is
---

# Icelandic-English Translation with GPT-2
## Overview

This project implements a language translation model using GPT-2, capable of translating between Icelandic and English. The pipeline includes data preprocessing, model training, evaluation, and an interactive user interface for translations.
## Features

- **Text Preprocessing:** Tokenization and padding for uniform input size.
- **Model Training:** Fine-tuning a GPT-2 model on paired Icelandic-English sentences.
- **Evaluation:** Perplexity-based validation of model performance.
- **Interactive Interface:** An easy-to-use widget for real-time translations.
## Installation

### Prerequisites

Ensure you have the following installed:

- Python (>= 3.8)
- PyTorch
- Transformers library by Hugging Face
- ipywidgets (for the translation interface)
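If you are assembling `requirements.txt` yourself, the prerequisites above correspond to a minimal file along these lines (the version pins are illustrative assumptions, not taken from the project):

```text
torch>=1.13
transformers>=4.25
ipywidgets>=8.0
```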
### Steps

1. Clone the repository:

   ```bash
   git clone <repository_url>
   cd <repository_name>
   ```

2. Install the required libraries:

   ```bash
   pip install -r requirements.txt
   ```

3. Ensure GPU availability for faster training (optional but recommended).
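Step 3 can be verified with a quick check; training simply runs (more slowly) on CPU when no GPU is visible:

```python
import torch

# True when a CUDA-capable GPU and a matching driver are visible to PyTorch.
print(torch.cuda.is_available())

# Typical device-selection line a training script would use.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)
```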
## Usage

### Training the Model

1. Prepare your dataset with English-Icelandic sentence pairs.

2. Run the script to preprocess the data and train the model:

   ```bash
   python train_model.py
   ```

3. The trained model and tokenizer will be saved in the `./trained_gpt2` directory.
### Evaluating the Model

Evaluate the trained model using validation data:

```bash
python evaluate_model.py
```

The script computes perplexity to measure model performance.
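Perplexity is the exponential of the mean cross-entropy loss, so the metric itself is a one-liner; in `evaluate_model.py` the mean loss would come from something like `model(input_ids, labels=input_ids).loss` averaged over the validation set (an assumption about the script, not quoted from it):

```python
import math

def perplexity(mean_cross_entropy: float) -> float:
    """Perplexity is exp(mean negative log-likelihood), in nats."""
    return math.exp(mean_cross_entropy)

# Lower is better: a loss of 0 means the model is certain of every token.
print(perplexity(0.0))               # 1.0
print(round(perplexity(2.0), 3))     # 7.389
```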
### Running the Interactive Interface

1. Launch Jupyter Notebook or JupyterLab.

2. Open the file `interactive_translation.ipynb`.

3. Enter a sentence in English or Icelandic and view the translation in real time.
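A minimal sketch of such a widget, assuming ipywidgets inside the notebook; `translate` is a hypothetical stand-in for whatever translation function the notebook actually defines:

```python
import ipywidgets as widgets
from IPython.display import display

def translate(text: str) -> str:
    # Placeholder: the notebook would call the trained model here.
    return f"(translation of {text!r})"

# continuous_update=False -> value updates when the user presses Enter or
# leaves the field, not on every keystroke.
box = widgets.Text(description="Sentence:", continuous_update=False)
output = widgets.Output()

def handle(change):
    with output:
        output.clear_output()
        print(translate(change["new"]))

box.observe(handle, names="value")
display(box, output)
```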
## File Structure

- `train_model.py`: Data preprocessing, model training, and saving.
- `evaluate_model.py`: Evaluates model performance using perplexity.
- `interactive_translation.ipynb`: Interactive interface for testing translations.
- `requirements.txt`: List of required Python packages.
- `trained_gpt2/`: Directory for the trained model and tokenizer.
## Key Parameters

- **Max Length:** Maximum token length for inputs (default: 128).
- **Learning Rate:** .
- **Batch Size:** 4 (both training and validation).
- **Epochs:** 10.
- **Beam Search:** Used for generating translations, with a beam size of 5.
## Future Improvements

- Expand the dataset to include additional language pairs.
- Optimize the model for faster inference.
- Integrate the application into a web-based interface.

## Acknowledgements

- Hugging Face for providing the GPT-2 model and libraries.
- PyTorch for enabling seamless implementation and training.

## License

This project is licensed under the MIT License. See the LICENSE file for details.