
	---
	license: mit
	language:
	- it
	- en
	pipeline_tag: translation
	---

	# OratioAI
	Sequence-to-sequence language translation, implementing the methods outlined in "Attention Is All You Need".

	1. Input Tokenization:
	The source and target sentences are tokenized using custom WordPiece tokenizers. Token IDs are mapped to embeddings by the InputEmbeddings module and scaled by the square root of the model dimension, as in the paper.
	2. Positional Encoding:
	Positional information is added to token embeddings using a fixed sinusoidal encoding strategy.
	3. Encoding Phase:
	The encoder processes the source sequence, transforming token embeddings into contextual representations using stacked EncoderBlock modules.
	4. Decoding Phase:
	The decoder generates target tokens autoregressively, attending to the previously generated tokens and to the encoder outputs. The cross-attention layers align each target position with the relevant source tokens.
	5. Projection:
	Final decoder outputs are projected into the target vocabulary space for token prediction.
	6. Output Generation:
	Decoding uses either beam search or greedy search to produce the final translated sentence.
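
Steps 1, 2, and 6 above can be sketched in plain Python. This is a minimal, framework-free illustration and not the project's code: the names `sinusoidal_encoding`, `embed`, `greedy_decode`, and `toy_scores` are hypothetical, the scaling is assumed to be multiplication by the square root of the model dimension (as in the paper), and `toy_scores` is a stand-in for the real decoder and projection layer.

```python
import math


def sinusoidal_encoding(seq_len, d_model):
    """Fixed sinusoidal positional encodings (d_model is assumed to be even)."""
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            # Each (sin, cos) pair shares one wavelength; i is the even index.
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            pe[pos][i + 1] = math.cos(angle)
    return pe


def embed(token_ids, table, d_model):
    """Look up embeddings, scale by sqrt(d_model), and add positional encodings."""
    scale = math.sqrt(d_model)
    pe = sinusoidal_encoding(len(token_ids), d_model)
    return [
        [table[tok][j] * scale + pe[pos][j] for j in range(d_model)]
        for pos, tok in enumerate(token_ids)
    ]


def greedy_decode(next_token_scores, sos_id, eos_id, max_len):
    """Greedy search: append the highest-scoring token until EOS or max_len."""
    output = [sos_id]
    while len(output) < max_len:
        scores = next_token_scores(output)  # one score per vocabulary id
        best = max(range(len(scores)), key=scores.__getitem__)
        output.append(best)
        if best == eos_id:
            break
    return output


# Hypothetical stand-in for decoder + projection: it always prefers the next
# token of a fixed target sequence, ending with EOS (id 1).
TARGET = [2, 3, 1]


def toy_scores(prefix):
    scores = [0.0] * 4
    scores[TARGET[len(prefix) - 1]] = 1.0
    return scores


print(greedy_decode(toy_scores, sos_id=0, eos_id=1, max_len=10))
```

Greedy search keeps only the single best token at each step. Beam search would instead keep the top few partial translations and extend each of them, trading extra computation for a wider search.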




	| Resource | Description |
	|-----------------------------------|----------------------------------------------------------|
	| [Training Space](https://huggingface.co/spaces/torinriley/OratioAI) | Hugging Face Space for training and testing the model. |
	| [GitHub Source Code](https://github.com/torinriley/OratioAI) | Source code repository for the translation project. |
	| [Attention Is All You Need](https://arxiv.org/pdf/1706.03762) | Original paper on the Transformer architecture, published by Google. |

	| Dataset | Description |
	|-----------------------------------|----------------------------------------------------------|
	| [Europarl (en-it, v8)](https://opus.nlpl.eu/Europarl/en&it/v8/Europarl) | English-Italian parallel corpus used for main model training. |