
LongNet-Char-Shakespeare

A character-level LongNet language model trained on the full Tiny Shakespeare dataset (~1.1M characters).

Model Details

  • Architecture: LongNet (dilated-attention Transformer) – the architecture theoretically scales to a context length of 1B tokens
  • Base model: Custom from-scratch implementation (no transformers library dependency)
  • Parameters: ~6.3M
  • Context length used in training: 8192 tokens (character-level)
  • Training data: Tiny Shakespeare (full text of Shakespeare's plays)
  • Tokenizer: Character-level (65 tokens)
  • Training steps: [insert your final step count, e.g. 5000+]
  • Hardware: Single GPU with ~4 GB VRAM (RTX-class or equivalent)
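
Dilated attention keeps cost manageable at long context by splitting the sequence into segments and letting each position attend only to every r-th token within its segment. The sketch below illustrates the resulting index pattern only; the segment length and dilation values are made-up examples, and the real LongNet mixes several (segment, dilation) pairs across heads.

```python
def dilated_attention_indices(seq_len, segment_len, dilation):
    """Return, per segment, the positions that participate in attention
    when every `dilation`-th token is kept (LongNet-style sparsification).
    Illustrative index pattern only, not the full attention computation."""
    out = []
    for start in range(0, seq_len, segment_len):
        end = min(start + segment_len, seq_len)
        # Within each segment, keep every `dilation`-th position
        out.append(list(range(start, end))[::dilation])
    return out

# With a 16-token sequence, segments of 8, and dilation 2, each
# segment attends over 4 positions instead of 8
parts = dilated_attention_indices(16, 8, 2)
```

Because each segment's attention is quadratic only in `segment_len / dilation`, total cost grows roughly linearly with sequence length, which is what makes the 8192-token training context feasible on a small GPU.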

Usage

# Load the from-scratch implementation shipped with this repository
from longnet_model import LongNetLM

# Download the weights from the Hub and switch to inference mode
model = LongNetLM.from_pretrained("your-username/longnet-char-shakespeare")
model.eval()
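
Since the model is character-level with a 65-symbol vocabulary, input text must be mapped to character IDs before being fed in. A minimal sketch of such a tokenizer is below; the exact vocabulary ordering used in training may differ (here it is built from a short sample string rather than the full corpus), so treat this as illustrative.

```python
# Hypothetical character-level tokenizer; in training, the vocabulary
# is built from the full Tiny Shakespeare text (65 unique characters)
sample = "First Citizen:\nBefore we proceed any further, hear me speak."
vocab = sorted(set(sample))
stoi = {ch: i for i, ch in enumerate(vocab)}  # char -> id
itos = {i: ch for ch, i in stoi.items()}      # id -> char

def encode(s):
    return [stoi[c] for c in s]

def decode(ids):
    return "".join(itos[i] for i in ids)

ids = encode("First")
assert decode(ids) == "First"  # round-trip is lossless
```

The encoded ID list can then be turned into a tensor and passed to the model for next-character prediction.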