Braille256-v1

A language model trained exclusively on Unicode Braille characters (U+2800-U+28FF), demonstrating emergent discovery of Braille contraction patterns.

Model Description

Braille256 is a transformer-based language model with a pure 256-token Braille vocabulary. Unlike traditional language models that use subword tokenization, Braille256 treats each Braille cell as a single token, enabling the model to learn structural patterns inherent to the Braille writing system.

Key Features

  • Pure Braille Vocabulary: 256 Braille characters + 5 special tokens
  • Dot-Pattern Embeddings: Custom initialization based on Braille dot patterns
  • Emergent Contractions: Model independently discovers patterns similar to Grade-2 Braille
  • Lightweight: ~5M parameters, runs on CPU
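The dot-pattern embedding initialization relies on a property of the Unicode Braille block: the low eight bits of each codepoint's offset from U+2800 encode which of the eight dots are raised (dot n corresponds to bit n−1). A minimal sketch of extracting that pattern — the 8-dimensional binary vector is an assumed seed for the embeddings, not the model's exact scheme:

```python
def dot_vector(cell: str) -> list[int]:
    """Return the 8-dot pattern of a Braille cell as a binary vector.

    Bit n-1 of (codepoint - 0x2800) is set iff dot n is raised.
    """
    offset = ord(cell) - 0x2800
    if not 0 <= offset <= 0xFF:
        raise ValueError(f"not a Braille cell: {cell!r}")
    return [(offset >> bit) & 1 for bit in range(8)]

# ⠞ ("t") raises dots 2, 3, 4, 5:
print(dot_vector("⠞"))  # → [0, 1, 1, 1, 1, 0, 0, 0]
```

Seeding each token's embedding from its dot vector (e.g. by tiling or projecting the 8 bits into the 256-dimensional hidden space) gives geometrically similar cells similar starting embeddings.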

Architecture

| Parameter           | Value     |
|---------------------|-----------|
| Parameters          | 4,940,544 |
| Vocabulary size     | 261       |
| Hidden size         | 256       |
| Layers              | 6         |
| Attention heads     | 4         |
| Max sequence length | 256       |
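The vocabulary size of 261 is the 256 Braille cells plus the 5 special tokens. One plausible id layout — an assumption; the special-token names and ordering may differ in the actual tokenizer — reserves the low ids for specials and maps each cell by its codepoint offset:

```python
# Assumed id layout: 5 special tokens first, then the 256 cells in codepoint order.
SPECIALS = ["<pad>", "<bos>", "<eos>", "<unk>", "<mask>"]  # names are assumptions
NUM_SPECIALS = len(SPECIALS)

def cell_to_id(cell: str) -> int:
    """Map a Braille cell (U+2800-U+28FF) to a token id after the specials."""
    offset = ord(cell) - 0x2800
    if not 0 <= offset <= 0xFF:
        raise ValueError(f"not a Braille cell: {cell!r}")
    return NUM_SPECIALS + offset

def id_to_token(token_id: int) -> str:
    """Inverse mapping: special-token name or Braille cell."""
    if token_id < NUM_SPECIALS:
        return SPECIALS[token_id]
    return chr(0x2800 + token_id - NUM_SPECIALS)

print(cell_to_id("⠀"), id_to_token(260))  # blank cell -> 5; last id -> ⣿
```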

Emergent Patterns

The model learned to recognize common letter combinations that mirror official Grade-2 Braille contractions:

| Pattern | Meaning | Learned Frequency |
|---------|---------|-------------------|
| ⠞⠓     | th      | High              |
| ⠞⠓⠑   | the     | High              |
| ⠊⠎     | is      | High              |
| ⠋⠕⠗   | for     | High              |
| ⠺⠊⠞⠓ | with    | Medium            |
| ⠁⠝⠙   | and     | Medium            |
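The contractions above mirror Grade-2 Braille, while the training corpus itself would consist of uncontracted (Grade-1) cells. Grade-1 letters follow the classic decade pattern: k–t add dot 3 to a–j, and u–z (except w) add dots 3 and 6. A hedged sketch of a Grade-1 letter encoder built from that pattern — not the project's actual conversion script:

```python
# Dot sets for the first decade, a-j (dot n corresponds to bit n-1 of the codepoint offset).
_DECADE = {
    "a": {1}, "b": {1, 2}, "c": {1, 4}, "d": {1, 4, 5}, "e": {1, 5},
    "f": {1, 2, 4}, "g": {1, 2, 4, 5}, "h": {1, 2, 5}, "i": {2, 4}, "j": {2, 4, 5},
}

def _cell(dots: set[int]) -> str:
    """Build the Braille cell with the given raised dots."""
    return chr(0x2800 + sum(1 << (d - 1) for d in dots))

LETTER_TO_CELL = {}
for i, (letter, dots) in enumerate(_DECADE.items()):
    LETTER_TO_CELL[letter] = _cell(dots)                       # a-j
    LETTER_TO_CELL[chr(ord(letter) + 10)] = _cell(dots | {3})  # k-t: add dot 3
for src, dst in zip("abcde", "uvxyz"):                         # u-z except w: add dots 3 and 6
    LETTER_TO_CELL[dst] = _cell(_DECADE[src] | {3, 6})
LETTER_TO_CELL["w"] = _cell(_DECADE["j"] | {6})                # w predates the pattern
LETTER_TO_CELL[" "] = "\u2800"                                 # blank cell as space

def to_grade1(text: str) -> str:
    """Convert lowercase English letters and spaces to Grade-1 Braille cells."""
    return "".join(LETTER_TO_CELL[ch] for ch in text.lower())

print(to_grade1("the"))  # → ⠞⠓⠑
```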

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer (trust_remote_code is needed for the custom architecture)
model = AutoModelForCausalLM.from_pretrained("your-username/braille256-v1", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("your-username/braille256-v1", trust_remote_code=True)

# Generate text
prompt = "⠞⠓⠑"  # "the" in Braille
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_length=50)
print(tokenizer.decode(outputs[0]))

Direct Usage

import torch
from braille256_model import Braille256Model, Braille256Config
from braille256_tokenizer import Braille256Tokenizer

model = Braille256Model.from_pretrained("path/to/model")
tokenizer = Braille256Tokenizer.from_pretrained("path/to/model")

# Encode Braille text
text = "⠞⠓⠑⠀⠟⠥⠊⠉⠅⠀⠃⠗⠕⠺⠝⠀⠋⠕⠭"
tokens = tokenizer.encode(text)

# Generate
output = model.generate(torch.tensor([tokens]), max_length=100)
print(tokenizer.decode(output[0].tolist()))

Training

The model was trained with:

  • A sample corpus of English text converted to Braille
  • 2,000 training steps
  • Batch size 8, learning rate 5e-4
  • ~55 minutes on CPU

Training Loss Curve

  • Initial loss: 3.76
  • Final loss: 0.0022

Limitations

  • Trained on limited corpus (sample texts only)
  • May generate repetitive patterns
  • Not suitable for production accessibility applications without further training
  • English-only training data

Intended Use

This model is intended for:

  • Research into emergent linguistic patterns
  • Exploring Braille representation learning
  • Educational demonstrations of language model training
  • Foundation for larger Braille-native models

Citation

@misc{braille256,
  title={Braille256: A Language Model That Rediscovers Braille Contractions},
  author={Your Name},
  year={2024},
  howpublished={HuggingFace Hub}
}

License

MIT License
