Character-Based Language Model (GRU)
A character-level text generation model trained on ~2,700 business news articles.
Generates text one character at a time using a GRU recurrent neural network.
Model Architecture
| Component | Details |
|---|---|
| Embedding | 106 chars -> 128d vectors |
| GRU | 512 units, dropout=0.2 |
| Dense | 106 output classes |
| Total params | 1,054,058 |
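The total can be reproduced from the layer sizes in the table. The sketch below assumes the Keras default GRU layout (`reset_after=True`, which keeps two bias vectors per weight block) and a Dense layer with a bias term. Neither is stated in this card, but under those assumptions the sum matches the reported figure.

```python
# Parameter count from the table, assuming Keras defaults
# (GRU with reset_after=True, Dense with bias). These are assumptions,
# not details confirmed by the model card.
vocab_size = 106
embed_dim = 128
gru_units = 512

embedding_params = vocab_size * embed_dim

# Three weight blocks (update gate, reset gate, candidate state), each with
# input weights, recurrent weights, and two bias vectors.
gru_params = 3 * (embed_dim * gru_units + gru_units * gru_units + 2 * gru_units)

dense_params = gru_units * vocab_size + vocab_size

total_params = embedding_params + gru_params + dense_params
print(embedding_params, gru_params, dense_params, total_params)
# 13568 986112 54378 1054058
```

Almost all of the parameters sit in the GRU layer; the embedding and output layers together account for under 7% of the total.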
Usage
import tensorflow as tf
import json
from huggingface_hub import hf_hub_download
# Download the trained model and its character vocabulary from the Hub.
model_path = hf_hub_download(repo_id="ismail99/char-based-language-model", filename="model.keras")
vocab_path = hf_hub_download(repo_id="ismail99/char-based-language-model", filename="vocab.json")
with open(vocab_path) as f:
    vocab = json.load(f)
# Lookup layers that map characters to integer ids and back again.
get_ids = tf.keras.layers.StringLookup(vocabulary=vocab, mask_token=None)
get_chars = tf.keras.layers.StringLookup(
    vocabulary=get_ids.get_vocabulary(), invert=True, mask_token=None
)
# Load the saved Keras model.
model = tf.keras.models.load_model(model_path)
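The snippet above stops once the model is loaded. Generating text one character at a time then means repeatedly sampling the next character id from the model's output and appending it to the sequence. Below is a minimal sketch of that loop in plain Python, not the author's generation code. `sample_next_id`, `generate_ids`, and `next_logits_fn` are hypothetical names, and a toy function stands in for the model because this card does not document the model's input shape or whether it returns logits or probabilities.

```python
import math
import random


def sample_next_id(logits, temperature=1.0, rng=random):
    """Sample an index from unnormalized logits using temperature scaling."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def generate_ids(next_logits_fn, seed_ids, n_steps, temperature=1.0, rng=random):
    """Extend seed_ids by n_steps ids, sampling one id per step.

    next_logits_fn is a hypothetical callable: it receives the ids generated
    so far and returns a list of logits for the next character.
    """
    ids = list(seed_ids)
    for _ in range(n_steps):
        ids.append(sample_next_id(next_logits_fn(ids), temperature, rng))
    return ids


def toy_next_logits(ids):
    """Toy stand-in for the model: strongly prefers (last id + 1) mod 4."""
    return [10.0 if i == (ids[-1] + 1) % 4 else 0.0 for i in range(4)]


generated = generate_ids(
    toy_next_logits, seed_ids=[0], n_steps=8, temperature=0.5, rng=random.Random(0)
)
print(generated)
```

To use this with the loaded model, `next_logits_fn` would wrap a call to `model` on the ids produced by `get_ids`, and `get_chars` would turn the sampled ids back into text. The exact tensor shapes depend on how the model was built, so treat that wiring as something to confirm against the saved model. Lower temperatures make the output more repetitive and conservative; higher ones make it more varied and error-prone.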
Training Details
- Dataset: 2,692 business news articles (4,483,812 characters after cleaning)
- Sequence length: 100
- Epochs: 30 (with early stopping, patience=5)
- Optimizer: Adam
- Validation split: 10%
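A sequence length of 100 typically means each training example pairs 100 input characters with the same window shifted one character to the right as the target. The sketch below illustrates that setup on a toy sequence. It assumes non-overlapping windows and a validation split taken from the end of the data; this card does not say how the windows or the split were actually produced, and `make_examples` and `split_train_val` are hypothetical helper names.

```python
def make_examples(ids, seq_length):
    """Split ids into non-overlapping (input, target) pairs shifted by one."""
    chunk = seq_length + 1  # one extra id so the target can be shifted
    examples = []
    for start in range(0, len(ids) - chunk + 1, chunk):
        window = ids[start:start + chunk]
        examples.append((window[:-1], window[1:]))
    return examples


def split_train_val(examples, val_fraction):
    """Hold out the last val_fraction of the examples for validation."""
    n_val = int(len(examples) * val_fraction)
    cut = len(examples) - n_val
    return examples[:cut], examples[cut:]


toy_ids = list(range(55))                         # stand-in for an encoded corpus
examples = make_examples(toy_ids, seq_length=4)   # the card uses 100
train, val = split_train_val(examples, val_fraction=0.1)

print(examples[0])
# ([0, 1, 2, 3], [1, 2, 3, 4])
print(len(examples), len(train), len(val))
# 11 10 1
```

Each target is the input shifted by one position, so the model is trained to predict the next character at every step of the window.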
Limitations
This is a small character-level model trained on a narrow domain (business news).
It produces plausible-looking, news-style text, but the content is not factually accurate.
Seeds from outside that domain will produce lower-quality output.
license: mit
language:
- en
pipeline_tag: text-generation