metadata
license: apache-2.0
language:
  - en
tags:
  - pytorch
  - longnet
  - character-level
  - shakespeare
  - text-generation
pipeline_tag: text-generation

LongNet-Char-Shakespeare

A character-level LongNet language model trained on the full Tiny Shakespeare dataset (~1.1M characters).

Model Details

  • Architecture: LongNet (Transformer with dilated attention); the design scales in principle to context lengths of up to 1B tokens, far beyond the length used for this model
  • Base model: None; custom implementation written from scratch (no dependency on the transformers library)
  • Parameters: ~6.3M
  • Context length used in training: 8192 tokens (character-level)
  • Training data: Tiny Shakespeare (a ~1.1M-character collection of text from Shakespeare's plays)
  • Tokenizer: Character-level (65 tokens)
  • Training steps: [insert your final step count, e.g. 5000+]
  • Hardware: Single GPU (NVIDIA RTX-class or equivalent, 4 GB VRAM)
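
To give a feel for what "dilated attention" means in the architecture line above, here is a minimal sketch of the sparsification pattern: the sequence is cut into fixed-length segments, and only every r-th position inside each segment takes part in attention. The function name `dilated_indices` and the toy values (16 positions, segment length 8, dilation 2) are assumptions chosen for illustration; they are not the settings of this model, which are not stated here. The sketch also ignores causal masking and the fact that LongNet combines several segment-length/dilation pairs.

```python
def dilated_indices(seq_len, segment_len, dilation):
    """Return, per segment, the positions kept after dilation.

    The sequence is cut into consecutive segments of `segment_len`
    positions, and every `dilation`-th position inside each segment is
    kept. Attention is then computed only among the kept positions of
    the same segment.
    """
    segments = []
    for start in range(0, seq_len, segment_len):
        end = min(start + segment_len, seq_len)
        segments.append(list(range(start, end, dilation)))
    return segments


# Toy example (illustrative values, not this model's configuration):
# 16 positions, segments of 8, keep every 2nd position.
pattern = dilated_indices(seq_len=16, segment_len=8, dilation=2)
print(pattern)  # [[0, 2, 4, 6], [8, 10, 12, 14]]

# Cost intuition: dense attention scores seq_len**2 pairs, while the
# dilated pattern scores only the pairs inside each sparsified segment.
dense_pairs = 16 * 16
sparse_pairs = sum(len(seg) ** 2 for seg in pattern)
print(dense_pairs, sparse_pairs)  # 256 32
```

With longer segments paired with larger dilation rates, the number of scored pairs grows roughly linearly with sequence length, which is what makes very long contexts feasible in principle.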

Usage

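# LongNetLM comes from the custom longnet_model module, not from the transformers library
from longnet_model import LongNetLM

# Replace "your-username" with the account that hosts the checkpoint
model = LongNetLM.from_pretrained("your-username/longnet-char-shakespeare")
model.eval()  # switch to inference mode (disables dropout)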
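
The snippet above only loads the model; the generation interface of `LongNetLM` is not documented here. As a rough sketch of how character-level sampling usually works, the example below uses a hypothetical `next_logits(ids)` callable as a stand-in for the model's forward pass. The helper names (`build_vocab`, `sample_from_logits`, `generate`) and the toy corpus are assumptions for illustration only, and the real model uses a 65-character vocabulary rather than the tiny one built here.

```python
import math
import random


def build_vocab(text):
    """Map each distinct character to an integer id and back."""
    chars = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(chars)}
    itos = {i: ch for ch, i in stoi.items()}
    return stoi, itos


def sample_from_logits(logits, temperature=1.0, rng=random):
    """Draw one token id from softmax(logits / temperature)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def generate(next_logits, prompt_ids, max_new_tokens, temperature=1.0, rng=random):
    """Append sampled ids one at a time, feeding the growing sequence back in."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        ids.append(sample_from_logits(next_logits(ids), temperature, rng))
    return ids


# Toy vocabulary built from a short string (the real vocabulary has 65 characters).
corpus = "to be or not to be"
stoi, itos = build_vocab(corpus)


def toy_next_logits(ids):
    """Hypothetical stand-in for the model: favours the id after the last one."""
    target = (ids[-1] + 1) % len(stoi)
    return [10.0 if i == target else 0.0 for i in range(len(stoi))]


prompt = [stoi[ch] for ch in "to"]
out = generate(toy_next_logits, prompt, max_new_tokens=5, rng=random.Random(0))
print("".join(itos[i] for i in out))
```

To use this with the real model, `toy_next_logits` would be replaced by a function that runs the loaded model on the current ids and returns the logits for the last position; how to do that depends on the actual `LongNetLM` forward signature.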