---
library_name: transformers
license: apache-2.0
datasets:
- allmalab/DOLLMA
- allmalab/aze-books
language:
- az
base_model:
- openai-community/gpt2
pipeline_tag: text-generation
---
|
|
|
|
|
# Azerbaijani GPT-2 Model
|
|
|
|
|
The model is based on the GPT-2 architecture and was trained on Azerbaijani text. It is one of the first foundational models designed to generate and understand Azerbaijani-language content. As an autoregressive transformer decoder, the model generates text token by token, predicting each next token from the preceding context.
|
|
|
|
|
- **Developed by:** aLLMA Lab


- **Funded by:** PRODATA LLC
|
|
- **Model type:** Decoder-only foundational LLM |
|
|
- **Language:** Azerbaijani |
|
|
|
|
|
## Uses |
|
|
|
|
|
The model can be used directly for text generation, sentence completion, and next-token prediction by providing an input prompt. It can also be fine-tuned on an Azerbaijani instruction dataset to build an interactive question-answering model.
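The token-by-token generation described above can be sketched in plain Python. The bigram table and Azerbaijani tokens below are toy stand-ins for the model's learned next-token distribution, not part of the actual model, which scores the full context window with a transformer decoder:

```python
# Toy sketch of autoregressive (token-by-token) generation.
# The bigram "model" below is a hypothetical stand-in for the learned
# next-token distribution; the real model conditions on the whole context.

def greedy_generate(prompt, next_token_table, max_new_tokens=5):
    """Repeatedly append the most likely next token to the sequence."""
    tokens = prompt.split()
    for _ in range(max_new_tokens):
        candidates = next_token_table.get(tokens[-1])
        if not candidates:  # no known continuation: stop early
            break
        # Greedy decoding: pick the highest-probability next token.
        tokens.append(max(candidates, key=candidates.get))
    return " ".join(tokens)

# Hypothetical next-token probabilities (token -> {next_token: prob}).
table = {
    "salam": {"dünya": 0.9, "necəsən": 0.1},
    "dünya": {"!": 1.0},
}
print(greedy_generate("salam", table))  # → salam dünya !
```

Real inference would replace the table lookup with a forward pass of the model, but the loop structure — extend the sequence one token at a time — is the same.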
|
|
|
|
|
## Training Details |
|
|
|
|
|
The key training hyperparameters:

```python
context_window = 1024
stride = 512

lr = 1e-3
warmup_steps = 10000
weight_decay = 0.1
adam_beta1 = 0.9
adam_beta2 = 0.999
batch_size = 512
max_steps = 178000
```
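A `context_window` of 1024 with a `stride` of 512 means consecutive training chunks overlap by half, so no token boundary is seen without context. A minimal sketch of that chunking, with small illustrative numbers in place of the real values:

```python
def chunk_with_stride(token_ids, context_window=1024, stride=512):
    """Split a token sequence into overlapping training windows.

    Each chunk starts `stride` tokens after the previous one, so with
    stride < context_window adjacent chunks share half their context.
    """
    chunks = []
    for start in range(0, len(token_ids), stride):
        chunks.append(token_ids[start:start + context_window])
        if start + context_window >= len(token_ids):
            break  # final window already covers the end of the sequence
    return chunks

# Illustrative numbers: window of 8, stride of 4, over 12 token ids.
tokens = list(range(12))
print(chunk_with_stride(tokens, context_window=8, stride=4))
# → [[0, 1, 2, 3, 4, 5, 6, 7], [4, 5, 6, 7, 8, 9, 10, 11]]
```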