# MIRAS Language Model

A character-level language model trained on Shakespeare using the MIRAS (Memory-Integrated Recurrent Attention System) architecture.

## Model Details

- **Embedding dimension**: 384
- **Layers**: 4
- **Block size**: 128
- **Memory type**: deep
- **Attentional bias**: l2
- **Retention**: l2
- **Vocabulary size**: 65

## Installation

```bash
pip install torch huggingface_hub
```

## Usage

### Quick Start

```python
from huggingface_hub import hf_hub_download
import torch

# Download the model files
for f in ["modeling_miras.py", "model.pt", "config.json"]:
    hf_hub_download(repo_id="av-codes/miras-shakespeare", filename=f, local_dir="./miras")

# Import and load
import sys
sys.path.insert(0, "./miras")
from modeling_miras import load_miras_model

model, encode, decode, config = load_miras_model("./miras")
model.eval()

# Generate text
context = torch.zeros((1, 1), dtype=torch.long)
output = model.generate(context, max_new_tokens=200, temperature=0.8)
print(decode(output[0].tolist()))
```

### Using the Helper Function

```python
import torch
from modeling_miras import load_miras_model

# Load directly from the Hub
model, encode, decode, config = load_miras_model("av-codes/miras-shakespeare")

# Generate
context = torch.zeros((1, 1), dtype=torch.long)
generated = model.generate(context, max_new_tokens=100)
print(decode(generated[0].tolist()))
```

## Files

- `model.pt` - Model weights and architecture config
- `config.json` - Full configuration, including the vocabulary
- `modeling_miras.py` - Complete model architecture code

## Training

Trained for 5000 iterations on the TinyShakespeare dataset.

## Architecture

MIRAS uses a novel memory-based attention mechanism with three configurable components:

- **Memory type**: `linear` (matrix memory) or `deep` (MLP memory)
- **Attentional bias**: `l2`, `lp`, or `huber` loss functions
- **Retention**: `l2`, `kl`, or `elastic` weight update rules
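
To give a feel for how these pieces interact, here is a minimal, hypothetical sketch of a single memory update with the `linear` memory type, an `l2` attentional bias, and `l2` retention: the memory matrix `M` takes one gradient step toward mapping a key `k` to a value `v`, with an l2 penalty discouraging large weights. The function name, signature, and hyperparameters (`lr`, `retain`) are illustrative and are not part of this repository's API; the actual implementation lives in `modeling_miras.py`.

```python
import torch

def memory_update(M, k, v, lr=0.1, retain=0.01):
    """One gradient step on the l2 attentional bias ||M k - v||^2,
    with an l2 retention penalty on the memory weights.
    Illustrative only -- not the repo's actual update rule."""
    M = M.clone().requires_grad_(True)
    loss = ((M @ k - v) ** 2).sum() + retain * (M ** 2).sum()
    loss.backward()
    with torch.no_grad():
        return M - lr * M.grad

torch.manual_seed(0)
d = 4
M = torch.zeros(d, d)          # empty memory
k = torch.randn(d)
k = k / k.norm()               # unit key, so one step cannot overshoot
v = torch.randn(d)

M = memory_update(M, k, v)     # memory moves toward mapping k -> v
# The l2 read-out error ||M k - v||^2 is now smaller than it was
# for the empty memory (which was simply ||v||^2).
```

The `deep` memory type replaces the matrix `M` with a small MLP updated the same way, and `huber`/`lp` swap the squared-error term for their respective losses.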