唐诗生成模型 / Tang Poetry Generator
A character-level Transformer (GPT-like) trained on 全唐诗 (Complete Tang Poetry, ~57,000 poems).
Model Description
| Property | Value |
|---|---|
| Architecture | 6-layer Transformer decoder |
| Attention | 6-head MultiheadAttention |
| Embedding dim | 384 |
| Parameters | 17.5M |
| Context length | 128 characters |
| Vocab size | 8,879 (Chinese chars + punctuation) |
| Training data | ~3.9M chars from 全唐诗 (simplified Chinese) |
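
The parameter count in the table can be roughly cross-checked from the other rows. The sketch below is a back-of-the-envelope estimate, not the author's code. It assumes a 4x feed-forward expansion, learned positional embeddings, biases on all linear layers, two LayerNorms per block plus a final one, and an output head that is not tied to the token embedding. None of these details are stated above, and `estimate_params` is a hypothetical helper.

```python
def estimate_params(vocab=8879, d=384, layers=6, ctx=128, ffn_mult=4, tied_head=False):
    """Rough parameter count for a GPT-style decoder (all layout choices are assumptions)."""
    tok_emb = vocab * d                      # token embedding table
    pos_emb = ctx * d                        # learned positional embedding (assumed)
    attn = 4 * d * d + 4 * d                 # Q, K, V and output projections, with biases
    ffn = 2 * ffn_mult * d * d + ffn_mult * d + d  # two linear layers, with biases
    norms = 2 * 2 * d                        # two LayerNorms per block (weight + bias)
    block = attn + ffn + norms
    final_norm = 2 * d
    head = 0 if tied_head else vocab * d + vocab   # untied output projection (assumed)
    return tok_emb + pos_emb + layers * block + final_norm + head


total = estimate_params()
print(f"{total / 1e6:.1f}M parameters")
```

Under these assumptions the estimate comes out close to the 17.5M listed above. With a tied output head it would be noticeably smaller, which suggests the head is stored separately, though the card does not confirm this.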
Quick Start
pip install torch zhconv
from inference import load_model
gen = load_model("poetry_model.pt")
print(gen("床前明月光", max_new_tokens=100))
Files
| File | Description |
|---|---|
| poetry_model.pt | Model checkpoint (weights, optimizer, config, tokenizer) |
| config.json | Hyperparameters |
| tokenizer.json | Character vocabulary (stoi / itos) |
| inference.py | Standalone inference script |
| train_poetry.py | Full training script |
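
To illustrate what a character vocabulary with `stoi` / `itos` mappings does, here is a minimal sketch that builds one in memory from a short string. The exact layout of `tokenizer.json` is not documented here, so the helpers `build_vocab`, `encode`, and `decode` are hypothetical and do not read that file.

```python
def build_vocab(text):
    """Map each distinct character to an integer id and back."""
    chars = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(chars)}   # string -> index
    itos = {i: ch for ch, i in stoi.items()}       # index -> string
    return stoi, itos


def encode(s, stoi):
    """Turn a string into a list of character ids."""
    return [stoi[ch] for ch in s]


def decode(ids, itos):
    """Turn a list of character ids back into a string."""
    return "".join(itos[i] for i in ids)


stoi, itos = build_vocab("床前明月光,疑是地上霜。")
ids = encode("明月光", stoi)
print(ids, decode(ids, itos))
```

A character outside the vocabulary raises `KeyError` in this sketch; how the real tokenizer handles unknown characters is not specified above.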
Training
Trained on Apple MPS (M4 MacBook Pro) with:
- Optimizer: AdamW (lr=5e-4)
- Schedule: 200-step linear warmup → cosine decay
- Gradient clipping: max_norm=1.0
- Dropout: 0.25
- Best validation loss: 4.51 (perplexity ~91)
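
The schedule and the loss-to-perplexity relation listed above can be sketched in a few lines. This is an illustration under assumptions, not the code from `train_poetry.py`: the total step count (`total_steps=10_000`) and the floor learning rate (`min_lr=0.0`) are not given in this card and are placeholders, and `lr_at` and `perplexity` are hypothetical helpers.

```python
import math


def lr_at(step, base_lr=5e-4, warmup=200, total_steps=10_000, min_lr=0.0):
    """Linear warmup for `warmup` steps, then cosine decay to `min_lr`.

    total_steps and min_lr are assumed values, not taken from the model card.
    """
    if step < warmup:
        return base_lr * (step + 1) / warmup
    progress = min(1.0, (step - warmup) / max(1, total_steps - warmup))
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def perplexity(cross_entropy_nats):
    """Perplexity is exp(loss) when the loss is mean cross-entropy in nats."""
    return math.exp(cross_entropy_nats)


for step in (0, 199, 200, 5_000, 10_000):
    print(step, f"{lr_at(step):.2e}")
print(f"perplexity at loss 4.51: {perplexity(4.51):.1f}")
```

With a per-character loss of 4.51 nats, `exp(4.51)` is about 91, which matches the perplexity reported above.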
Example Output
Seed: 大漠孤烟直
Generated:
大漠孤烟直,秋霜未落残。
后期殊灼得,前事有堆劳。
恩势已云懦,幽深不可逃。
王程未去绝,左右空踟蹰。
门径微月色,庭盘晓云初。
License
MIT