---
language: ru
license: mit
tags:
- tiny-model
- russian
- alphagpt
- nano-gpt
- experimental
- transformers
datasets:
- prostochel097/ru_qa_dialog
widget:
- text: Привет
  example_title: Greeting
- text: Санкт
  example_title: Cities
library_name: transformers
---

# AlphaGPT-Photon

An ultra-compact Russian-language model built on the GPT-2 architecture.

## Technical specifications

| Parameter | Value |
|-----------|-------|
| **Architecture** | GPT2-nano |
| **Parameters** | 4,634 |
| **Model size** | ~18.1 KB |
| **Vocabulary** | 500 tokens |
| **Context** | 32 tokens |
| **Hidden size** | 8 |
| **Layers** | 1 |
| **Attention heads** | 1 |
| **Activation** | gelu_new |
| **Trained on** | 53 dialogues |
| **Training epochs** | 500 |

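The table above maps onto a tiny GPT-2 configuration. A minimal sketch of reconstructing it with the `transformers` library, assuming the card's values correspond one-to-one to standard `GPT2Config` fields (the model's actual `config.json` may differ; note that a stock GPT-2 block with these sizes gives a parameter count somewhat different from the 4,634 listed above):

```python
from transformers import GPT2Config, GPT2LMHeadModel

# Hypothetical reconstruction from the table; not the published config.json.
config = GPT2Config(
    vocab_size=500,          # Vocabulary: 500 tokens
    n_positions=32,          # Context: 32 tokens
    n_embd=8,                # Hidden size: 8
    n_layer=1,               # Layers: 1
    n_head=1,                # Attention heads: 1
    activation_function="gelu_new",
)
model = GPT2LMHeadModel(config)

# Count trainable parameters (embeddings are tied to the LM head).
n_params = sum(p.numel() for p in model.parameters())
print(f"Parameters: {n_params}")
```

The count is dominated by the token embedding matrix (500 × 8 = 4,000 weights), which is why shrinking the vocabulary is the main lever for a model this small.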
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the model and tokenizer
model_name = "prostochel097/alphagpt-ultramini"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Generate text
prompt = "Привет"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=20,
        temperature=0.8,
        do_sample=True,
        pad_token_id=tokenizer.pad_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )

generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(f"Generated: {generated_text}")
```
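The `temperature=0.8` passed to `generate` divides the logits before the softmax, so values below 1 sharpen the sampling distribution and values above 1 flatten it. A standalone illustration (not part of the model card's code), using a made-up logit vector:

```python
import torch

# Temperature rescales logits before softmax:
# t < 1 concentrates probability on the top token, t > 1 spreads it out.
logits = torch.tensor([2.0, 1.0, 0.0])
for t in (0.8, 1.0, 2.0):
    probs = torch.softmax(logits / t, dim=-1)
    print(f"t={t}: {probs.tolist()}")
```

For a model this small, low temperatures keep output closer to the memorized training dialogues, while higher ones quickly degrade into noise.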