---
language:
- en
- ru
license: other
tags:
- causal-lm
- gpt
- transformer
- chat
model_type: gpt
datasets:
- wikitext
---

# ViorikaLM-CHAT
Experimental test model (~200M parameters), trained on WikiText data.
Supports English. This is my first model.
## Description
ViorikaLM-CHAT is a small experimental GPT model designed for text generation and dialogue tasks.
The main goal of this project is to test the full pipeline: training, saving, and uploading models to the Hugging Face Hub.
## Model Details

- Architecture: GPT (simplified GPT-1/2 style)
- Model size: ~200M parameters
- Languages: English
- License: other
## Training Details

- Dataset: `wiki.train.tokens` (WikiText format)
- Hardware: NVIDIA GTX 1070 (8GB VRAM)
- Epochs: 2
- Batch size: 8
- Optimizer: Adam, lr = 3e-4
- Max sequence length: 128 tokens
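The preprocessing script is not published, but a typical way to turn a long WikiText token stream into fixed-length training examples matching the settings above (128-token sequences, batches of 8) looks roughly like this sketch; the function names are illustrative, not from this repo:

```python
def chunk_tokens(token_ids, seq_len=128):
    """Split a long token stream into fixed-length training sequences;
    any trailing remainder shorter than seq_len is dropped."""
    return [token_ids[i:i + seq_len]
            for i in range(0, len(token_ids) - seq_len + 1, seq_len)]

def batched(sequences, batch_size=8):
    """Group sequences into training batches (last batch may be short)."""
    return [sequences[i:i + batch_size]
            for i in range(0, len(sequences), batch_size)]

# Stand-in for the token ids of wiki.train.tokens
stream = list(range(1000))
seqs = chunk_tokens(stream)   # fixed 128-token sequences
batches = batched(seqs)       # batches of up to 8 sequences
```

Each batch of token-id sequences would then be fed to the model with the inputs shifted by one position as targets, as is standard for causal LM training.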
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ViorikaAI/ViorikaLM-CHAT"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("Hello! How are you?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
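The call above uses greedy decoding (the `generate` default), which can make a small model loop or repeat itself. Sampling parameters often give livelier output; the values below are illustrative, not tuned for this model:

```python
# Illustrative sampling settings for transformers' generate();
# these values are assumptions, not recommendations from the authors.
gen_kwargs = dict(
    max_new_tokens=50,
    do_sample=True,          # sample instead of greedy decoding
    temperature=0.8,         # soften the token distribution
    top_p=0.9,               # nucleus sampling
    repetition_penalty=1.2,  # discourage repeated phrases
)
# outputs = model.generate(**inputs, **gen_kwargs)
```

All of these are standard keyword arguments accepted by `model.generate` in the `transformers` library.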