Braille256-v4: 8-dot Grade Infinity Universal Braille Model

The first generative language model trained natively on 8-dot Braille Unicode (U+2800-U+28FF), enabling 256 base patterns for richer multimodal computation.

Why 8-dot Braille?

Feature 6-dot (v3) 8-dot (v4)
Base patterns 64 256
Byte encoding Partial Full
Computer Braille Limited Native
Compression potential Lower Higher
Multimodal computation Basic Rich

8-dot Braille enables direct byte-to-Braille mapping, making it ideal for:

  • Binary data encoding in Braille
  • Computer Braille compatibility
  • Multi-layer computation with richer pattern space
  • Multimodal AI with tactile output

Model Details

Metric Value
Parameters 29.9M
Vocabulary 8192 (Unigram)
Base Patterns 256 (8-dot)
Training Steps 15,000
Training Time 3h 37m (MPS)
Corpus 163M chars (7 languages)
Tokens using dots 7/8 913

Architecture

Hidden Size: 512
Layers: 8
Attention Heads: 8
Max Sequence Length: 1024
Tokenizer: SentencePiece Unigram (8192 vocab)

Learned Contractions (8-dot)

Token Braille Meaning
6 ⠞⠓⠑ the
10 ⠁⠝⠙ and
11 ⠕⠋ of
12 ⠞⠕ to
18 ⠞⠓⠁⠞ that
13 8-dot pattern

Usage

import torch
from train_grade_infinity_8dot import Braille8DotModel, Braille8DotConfig, Braille8DotTokenizer

# Load model
model_dir = "ryanscottbarrett/braille256-v4"
config = Braille8DotConfig.from_pretrained(model_dir)
model = Braille8DotModel(config)
model.load_state_dict(torch.load(f"{model_dir}/pytorch_model.bin", map_location="cpu"))
model.eval()

# Load tokenizer
tokenizer = Braille8DotTokenizer(f"{model_dir}/tokenizer.model")

# Generate
prompt = "⠞⠓⠑⠀"
input_ids = torch.tensor([tokenizer.encode(prompt)])
output = model.generate(input_ids, max_length=50, temperature=0.8)
print(tokenizer.decode(output[0].tolist()))

Byte-to-Braille Encoding

8-dot Braille enables direct binary encoding:

def byte_to_braille(byte: int) -> str:
    return chr(0x2800 + byte)

def bytes_to_braille(data: bytes) -> str:
    return ''.join(byte_to_braille(b) for b in data)

# Encode any binary data as Braille!
text = "Hello"
braille = bytes_to_braille(text.encode('utf-8'))
print(braille)  # ⡈⡥⡬⡬⡯

Research Goals

This model advances the Grade Infinity Braille research:

  1. 256-pattern vocabulary: Full 8-dot Braille utilization
  2. Multimodal computation: Richer pattern space for AI
  3. Universal encoding: Any data → Braille → AI processing
  4. Accessibility-first AI: Native Braille computation

Comparison with v3

Model Patterns Vocab Params Use Case
v3 (6-dot) 64 4096 27.8M Literary Braille
v4 (8-dot) 256 8192 29.9M Computer/Multimodal

Citation

@misc{braille256v4,
  author = {Ryan Barrett},
  title = {Braille256-v4: 8-dot Grade Infinity Universal Braille Model},
  year = {2024},
  publisher = {HuggingFace},
  url = {https://huggingface.co/ryanscottbarrett/braille256-v4}
}

License

MIT

Downloads last month
13
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support