---
language:
  - en
  - ru
license: other
tags:
  - causal-lm
  - gpt
  - transformer
  - chat
model_type: gpt
datasets:
  - wikitext
---

# ViorikaLM-CHAT

🚧 Experimental test model (~200M parameters), trained on WikiText data.
Supports English 🇬🇧. This is my first model.

## 📖 Description

ViorikaLM-CHAT is a small experimental GPT model designed for text generation and dialogue tasks.
The main goal of this project is to test the full pipeline: training, saving, and uploading models to the Hugging Face Hub.

βš™οΈ Model Details

- Architecture: GPT (simplified GPT-1/2 style)
- Model size: ~200M parameters
- Languages: English
- License: ❌ none specified (`other` in the model card metadata)

πŸ‹οΈ Training Details

- Dataset: `wiki.train.tokens` (WikiText format)
- Hardware: NVIDIA GTX 1070 (8 GB VRAM)
- Epochs: 2
- Batch size: 8
- Optimizer: Adam, lr = 3e-4
- Max sequence length: 128 tokens
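The 128-token training sequences can be prepared by concatenating the tokenized WikiText stream and splitting it into fixed-length chunks. A minimal sketch of that step, assuming the simplest common scheme (trailing tokens that do not fill a full chunk are dropped; the integer stream below is a dummy stand-in for real token IDs):

```python
def chunk_tokens(token_ids, seq_len=128):
    """Split a flat token stream into fixed-length training sequences.

    Trailing tokens that do not fill a full chunk are dropped,
    the simplest common choice for LM pretraining.
    """
    n_full = len(token_ids) // seq_len
    return [token_ids[i * seq_len:(i + 1) * seq_len] for i in range(n_full)]

# Dummy token stream standing in for the tokenized wiki.train.tokens file
stream = list(range(300))
chunks = chunk_tokens(stream, seq_len=128)
print(len(chunks))     # → 2 full sequences (the last 44 tokens are dropped)
print(len(chunks[0]))  # → 128 tokens per sequence
```

Each chunk then becomes one training example; with batch size 8, every optimizer step sees 8 × 128 tokens.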

## 🚀 Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ViorikaAI/ViorikaLM-CHAT"

# Download the tokenizer and model weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Encode a prompt and generate up to 50 new tokens
inputs = tokenizer("Hello! How are you?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
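With no sampling arguments, `generate` defaults to greedy decoding: at each step it appends the highest-scoring next token and stops after `max_new_tokens`. A toy sketch of that loop, where `toy_logits` is a hypothetical stand-in for the model's forward pass (not part of the real API):

```python
def toy_logits(tokens):
    """Hypothetical stand-in for a model forward pass.

    Deterministically favors (last_token + 1) mod vocab_size so the
    toy loop produces a predictable sequence.
    """
    vocab_size = 10
    scores = [0.0] * vocab_size
    scores[(tokens[-1] + 1) % vocab_size] = 1.0
    return scores

def greedy_generate(prompt_tokens, max_new_tokens=5):
    """Greedy decoding: repeatedly append the argmax next token."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        scores = toy_logits(tokens)
        next_token = max(range(len(scores)), key=scores.__getitem__)  # argmax
        tokens.append(next_token)
    return tokens

print(greedy_generate([3], max_new_tokens=5))  # → [3, 4, 5, 6, 7, 8]
```

For more varied chat-style output, the real `generate` call accepts sampling options such as `do_sample=True` and `temperature` instead of the greedy default.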