Spaces:

pkraman06
/

PyTorch_Transformer_model

pkraman06

Update README.md

bf1a105 verified 17 days ago

	---
	title: PyTorch_Transformer_model
	emoji: 🌐
	colorFrom: blue
	colorTo: indigo
	sdk: streamlit
	sdk_version: 1.35.0
	app_file: app.py
	pinned: false
	---
	# 🌐 Seq2Seq Transformer English-to-Spanish Translator

	An interactive web application, deployed on Hugging Face Spaces, that demonstrates a Sequence-to-Sequence (Seq2Seq) Transformer model trained from scratch in PyTorch and served via Streamlit.

	The application builds word-level source and target vocabularies from the training dataset, pads variable-length sequences, trains a multi-head attention Transformer, and translates input in real time through auto-regressive decoding.
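
The vocabulary and padding steps can be sketched in plain Python as follows. This is a minimal illustration, not the code in `app.py`: the `<PAD>` and `<UNK>` tokens, the lowercase whitespace tokenizer, and the helper names `build_vocab`, `encode`, and `pad_batch` are all assumptions made for the example.

```python
from collections import Counter

# <SOS>/<EOS> come from this README; <PAD>/<UNK> are assumed for the sketch.
PAD, SOS, EOS, UNK = "<PAD>", "<SOS>", "<EOS>", "<UNK>"
SPECIALS = [PAD, SOS, EOS, UNK]


def build_vocab(sentences, min_freq=1):
    """Map each lowercased, whitespace-separated word to an integer index."""
    counts = Counter(word for s in sentences for word in s.lower().split())
    vocab = {tok: i for i, tok in enumerate(SPECIALS)}
    for word, freq in sorted(counts.items()):  # sorted for a reproducible order
        if freq >= min_freq:
            vocab[word] = len(vocab)
    return vocab


def encode(sentence, vocab):
    """Wrap a sentence in <SOS>/<EOS>; unknown words map to <UNK>."""
    ids = [vocab.get(w, vocab[UNK]) for w in sentence.lower().split()]
    return [vocab[SOS]] + ids + [vocab[EOS]]


def pad_batch(batch, pad_id):
    """Right-pad every sequence to the length of the longest one."""
    max_len = max(len(seq) for seq in batch)
    return [seq + [pad_id] * (max_len - len(seq)) for seq in batch]


src_vocab = build_vocab(["hello world", "good morning world"])
batch = pad_batch(
    [encode("hello world", src_vocab), encode("good morning world", src_vocab)],
    src_vocab[PAD],
)
```

In a PyTorch pipeline the padded lists would typically be converted to a tensor, and the padding index reused to build the padding masks passed to the model.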

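Auto-regressive decoding generates the translation one token at a time, feeding each prediction back in as input for the next step. The sketch below assumes greedy selection (always taking the highest-scoring token), which this README does not specify. `next_token_scores` is a hypothetical callable standing in for the model's forward pass, and `toy_scores` is a scripted stand-in so the example runs without a trained model.

```python
def greedy_decode(next_token_scores, src_ids, sos_id, eos_id, max_len=20):
    """Generate target ids one at a time, always taking the best-scoring token.

    next_token_scores is a hypothetical stand-in for the model: it takes
    (src_ids, tgt_prefix) and returns one score per target-vocabulary entry
    for the next position.
    """
    tgt = [sos_id]
    for _ in range(max_len):
        scores = next_token_scores(src_ids, tgt)
        next_id = max(range(len(scores)), key=scores.__getitem__)
        if next_id == eos_id:
            break  # the model signalled the end of the sentence
        tgt.append(next_id)
    return tgt[1:]  # drop the leading <SOS>


def toy_scores(src_ids, tgt_prefix):
    """Scripted toy model: emits ids 3, 4, then <EOS> (id 2), ignoring the source."""
    script = [3, 4, 2]
    wanted = script[min(len(tgt_prefix) - 1, len(script) - 1)]
    return [1.0 if i == wanted else 0.0 for i in range(5)]


translation_ids = greedy_decode(toy_scores, src_ids=[1, 5, 7, 2], sos_id=1, eos_id=2)
```

The `max_len` cap matters in practice: a model that never predicts `<EOS>` would otherwise loop forever.
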
	---

	## 🚀 Features

	* Custom Tokenization Pipeline: Processes raw text data, builds independent source and target vocabularies, and converts sentences into index tensors delimited by `<SOS>` and `<EOS>` tokens.
	* Transformer Trained from Scratch: Combines sinusoidal positional encodings with PyTorch's built-in multi-head attention Transformer (`nn.Transformer`), using a causal mask (`tgt_mask`) so the decoder cannot attend to future target tokens during training.
	* On-the-Fly Training: Fits the model to the provided dataset on first load and caches the trained weights, so subsequent translations run without retraining.
	* Streamlit Web Interface: A simple text-input interface where users enter an English sentence and see its Spanish translation.
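
The sinusoidal positional encodings and the causal mask named in the list above can be illustrated without any dependencies. This is a sketch using nested Python lists in place of tensors; the function names `sinusoidal_encoding` and `causal_mask` are hypothetical and not taken from `app.py`.

```python
import math


def sinusoidal_encoding(max_len, d_model):
    """Return a max_len x d_model table: sine on even columns, cosine on odd ones."""
    table = [[0.0] * d_model for _ in range(max_len)]
    for pos in range(max_len):
        for i in range(0, d_model, 2):
            # Each sin/cos column pair shares one frequency, falling with i.
            angle = pos / (10000 ** (i / d_model))
            table[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                table[pos][i + 1] = math.cos(angle)
    return table


def causal_mask(size):
    """Return a size x size additive mask: 0.0 where attention is allowed,
    -inf where a position would attend to a later one."""
    return [
        [0.0 if col <= row else float("-inf") for col in range(size)]
        for row in range(size)
    ]


pe = sinusoidal_encoding(max_len=4, d_model=6)
mask = causal_mask(3)
```

Row `r` of the mask lets target position `r` attend only to positions `0..r`. In PyTorch, `nn.Transformer.generate_square_subsequent_mask` produces a mask of the same shape and meaning as a float tensor, suitable for passing as `tgt_mask`.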

	---

	## 📂 Project Structure

	To run properly on Hugging Face Spaces, ensure your repository contains the following files in the root directory:

	```text
	├── app.py # Main Streamlit web application & PyTorch script
	├── requirements.txt # Python runtime dependencies
	├── data.csv # Custom English/Spanish translation dataset
	└── README.md # Project documentation (this file)