# Braille256-v1
A language model trained exclusively on Unicode Braille characters (U+2800-U+28FF), demonstrating emergent discovery of Braille contraction patterns.
## Model Description
Braille256 is a transformer-based language model whose vocabulary consists of the 256 Unicode Braille cells plus a handful of special tokens. Unlike traditional language models that use subword tokenization, Braille256 treats each Braille cell as a single token, enabling the model to learn structural patterns inherent to the Braille writing system.
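Because every Braille cell is a single token, encoding can be as simple as subtracting the U+2800 base codepoint. The snippet below is a minimal sketch of that idea (ignoring special tokens); the released Braille256Tokenizer may use a different id layout.

```python
# Sketch: one token per Braille cell, assuming ids mirror the U+2800 codepoint offset.
# The actual Braille256Tokenizer mapping may differ (e.g., special tokens come first).
BRAILLE_BASE = 0x2800

def encode_cells(text: str) -> list[int]:
    """Map each Braille cell to an integer in [0, 255]."""
    return [ord(ch) - BRAILLE_BASE for ch in text if 0 <= ord(ch) - BRAILLE_BASE < 256]

def decode_cells(ids: list[int]) -> str:
    """Map cell ids back to Braille characters."""
    return "".join(chr(BRAILLE_BASE + i) for i in ids)

print(encode_cells("⠞⠓⠑"))      # [30, 19, 17] under this assumed mapping
print(decode_cells([30, 19, 17]))  # "⠞⠓⠑"
```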
## Key Features
- Pure Braille Vocabulary: 256 Braille characters + 5 special tokens
- Dot-Pattern Embeddings: Custom initialization based on Braille dot patterns (see the sketch after this list)
- Emergent Contractions: Model independently discovers patterns similar to Grade-2 Braille
- Lightweight: ~5M parameters, runs on CPU
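The dot-pattern initialization is not specified in detail in this card. The sketch below shows one way such an initialization could look, seeding the first 8 dimensions of each cell's embedding with its dot pattern (bit *i* of the U+2800 codepoint offset corresponds to dot *i*+1); the index layout for the 5 special tokens is an assumption.

```python
import torch

def dot_pattern_init(vocab_size: int = 261, hidden_size: int = 256,
                     num_special: int = 5) -> torch.Tensor:
    """Illustrative embedding init: write each cell's 8-bit dot pattern into the
    first 8 dimensions of its embedding row, leaving the rest small and random.
    The model's actual initialization scheme may differ."""
    emb = torch.randn(vocab_size, hidden_size) * 0.02
    for offset in range(256):                          # Braille cells U+2800..U+28FF
        bits = [(offset >> d) & 1 for d in range(8)]   # dot 1..8 as bits 0..7
        emb[num_special + offset, :8] = torch.tensor(bits, dtype=torch.float)
    return emb

embeddings = dot_pattern_init()
print(embeddings.shape)  # torch.Size([261, 256])
```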
## Architecture
| Parameter | Value |
|---|---|
| Parameters | 4,940,544 |
| Vocabulary | 261 |
| Hidden Size | 256 |
| Layers | 6 |
| Attention Heads | 4 |
| Max Sequence Length | 256 |
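For reference, a hypothetical configuration object mirroring the table above; the real Braille256Config field names are not documented in this card and may differ.

```python
from dataclasses import dataclass

@dataclass
class Braille256ConfigSketch:
    """Assumed field names; values taken from the architecture table."""
    vocab_size: int = 261              # 256 Braille cells + 5 special tokens
    hidden_size: int = 256
    num_hidden_layers: int = 6
    num_attention_heads: int = 4
    max_position_embeddings: int = 256

print(Braille256ConfigSketch())
```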
## Emergent Patterns
The model learned to recognize common letter combinations that mirror official Grade-2 Braille contractions:
| Pattern | Meaning | Learned Frequency |
|---|---|---|
| ⠞⠓ | th | High |
| ⠞⠓⠑ | the | High |
| ⠊⠎ | is | High |
| ⠋⠕⠗ | for | High |
| ⠺⠊⠞⠓ | with | Medium |
| ⠁⠝⠙ | and | Medium |
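The card does not define how "learned frequency" was measured. One simple probe is to count cell n-grams in generated samples, as in the sketch below; the hard-coded strings stand in for actual model output.

```python
from collections import Counter

def ngram_counts(samples: list[str], n: int = 2) -> Counter:
    """Count Braille cell n-grams across a list of generated strings."""
    counts = Counter()
    for text in samples:
        counts.update(text[i:i + n] for i in range(len(text) - n + 1))
    return counts

# In practice `samples` would be decoded model generations.
samples = ["⠞⠓⠑⠀⠋⠕⠗", "⠞⠓⠑⠀⠊⠎"]
print(ngram_counts(samples, n=2).most_common(3))
```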
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer (a causal-LM head is needed for .generate();
# a custom architecture like this may also require trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("your-username/braille256-v1")
tokenizer = AutoTokenizer.from_pretrained("your-username/braille256-v1")

# Generate text
prompt = "⠞⠓⠑"  # "the" in Braille
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_length=50)
print(tokenizer.decode(outputs[0]))
```
### Direct Usage

```python
import torch

from braille256_model import Braille256Model, Braille256Config
from braille256_tokenizer import Braille256Tokenizer

model = Braille256Model.from_pretrained("path/to/model")
tokenizer = Braille256Tokenizer.from_pretrained("path/to/model")

# Encode Braille text
text = "⠞⠓⠑⠀⠟⠥⠊⠉⠅⠀⠃⠗⠕⠺⠝⠀⠋⠕⠭"
tokens = tokenizer.encode(text)

# Generate from the encoded prompt
output = model.generate(torch.tensor([tokens]), max_length=100)
print(tokenizer.decode(output[0].tolist()))
```
## Training
The model was trained on:
- Sample corpus of English text converted to Braille
- 2,000 training steps
- Batch size 8, learning rate 5e-4
- ~55 minutes on CPU
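A minimal sketch of that recipe, assuming `model` is the causal LM loaded as in the Usage section (returning an HF-style output with a `.loss`) and `token_ids` is a pre-encoded `(num_examples, seq_len)` tensor of Braille token ids; the actual training script is not included in this card.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stated hyperparameters: batch size 8, learning rate 5e-4, 2,000 steps.
dataset = TensorDataset(token_ids)
loader = DataLoader(dataset, batch_size=8, shuffle=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4)

step, max_steps = 0, 2000
while step < max_steps:
    for (batch,) in loader:
        outputs = model(input_ids=batch, labels=batch)  # next-token prediction
        outputs.loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        step += 1
        if step >= max_steps:
            break
```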
### Training Loss Curve
- Initial loss: 3.76
- Final loss: 0.0022
## Limitations
- Trained on limited corpus (sample texts only)
- May generate repetitive patterns
- Not suitable for production accessibility applications without further training
- English-only training data
## Intended Use
This model is intended for:
- Research into emergent linguistic patterns
- Exploring Braille representation learning
- Educational demonstrations of language model training
- Foundation for larger Braille-native models
## Citation

```bibtex
@misc{braille256,
  title        = {Braille256: A Language Model That Rediscovers Braille Contractions},
  author       = {Your Name},
  year         = {2024},
  howpublished = {HuggingFace Hub}
}
```
## License
MIT License