---
language:
- en
- ru
license: other
tags:
- causal-lm
- gpt
- transformer
- chat
model_type: gpt
datasets:
- wikitext
---

# ViorikaLM-CHAT
Experimental test model (~200M parameters), trained on WikiText data.
Supports English. This is my first model.
## Description
ViorikaLM-CHAT is a small experimental GPT model designed for text generation and dialogue tasks.
The main goal of this project is to test the full pipeline: training, saving, and uploading models to the Hugging Face Hub.
## Model Details

- Architecture: GPT (simplified GPT-1/2 style)
- Model size: ~200M parameters
- Languages: English
- License: other
## Training Details

- Dataset: `wiki.train.tokens` (WikiText format)
- Hardware: NVIDIA GTX 1070 (8GB VRAM)
- Epochs: 2
- Batch size: 8
- Optimizer: Adam, lr = 3e-4
- Max sequence length: 128 tokens
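The preprocessing script is not published, but a typical way to turn a long WikiText token stream into fixed-length training examples matching the settings above (128-token sequences, batches of 8) looks roughly like this sketch; the function names are illustrative, not from this repo:

```python
def chunk_tokens(token_ids, seq_len=128):
    """Split a long token stream into fixed-length training sequences;
    any trailing remainder shorter than seq_len is dropped."""
    return [token_ids[i:i + seq_len]
            for i in range(0, len(token_ids) - seq_len + 1, seq_len)]

def batched(sequences, batch_size=8):
    """Group sequences into training batches (last batch may be short)."""
    return [sequences[i:i + batch_size]
            for i in range(0, len(sequences), batch_size)]

# Stand-in for the token ids of wiki.train.tokens
stream = list(range(1000))
seqs = chunk_tokens(stream)   # fixed 128-token sequences
batches = batched(seqs)       # batches of up to 8 sequences
```

Each batch of token-id sequences would then be fed to the model with the inputs shifted by one position as targets, as is standard for causal LM training.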
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ViorikaAI/ViorikaLM-CHAT"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("Hello! How are you?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
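The call above uses greedy decoding (the `generate` default), which can make a small model loop or repeat itself. Sampling parameters often give livelier output; the values below are illustrative, not tuned for this model:

```python
# Illustrative sampling settings for transformers' generate();
# these values are assumptions, not recommendations from the authors.
gen_kwargs = dict(
    max_new_tokens=50,
    do_sample=True,          # sample instead of greedy decoding
    temperature=0.8,         # soften the token distribution
    top_p=0.9,               # nucleus sampling
    repetition_penalty=1.2,  # discourage repeated phrases
)
# outputs = model.generate(**inputs, **gen_kwargs)
```

All of these are standard keyword arguments accepted by `model.generate` in the `transformers` library.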