---
language: en
license: mit
tags:
- bitmar
- multimodal
- babylm
- cross-modal
datasets:
- babylm_multimodal
metrics:
- bleu
- cross_modal_similarity
---

# BitMar 100M Token Model

This model was trained under a 100-million-token budget as part of the BabyLM challenge.

## Training Details

- Total tokens (budget): 100,000,000
- Epochs completed: 1
- Tokens processed: 99,686,013
- Cross-modal similarity: 0.3418

## Model Architecture

- Text encoder: 4 layers, hidden size 128
- Vision encoder: DINOv2 features compressed to 128 dimensions
- Episodic memory: 32 slots

## Usage

```python
from transformers import AutoModel, AutoTokenizer

# Load the model and tokenizer from the Hugging Face Hub.
# If the repository ships custom modeling code, you may need to pass
# trust_remote_code=True to from_pretrained.
model = AutoModel.from_pretrained("euhidaman/bitmar-attention-multimodal")
tokenizer = AutoTokenizer.from_pretrained("euhidaman/bitmar-attention-multimodal")
```

## Training Status

- **Status**: In Progress (Epoch 1)
- **Tokens Processed**: 99,686,013
- **Best Cross-modal Similarity**: 0.3418
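
## Cross-modal Similarity (illustrative sketch)

This card does not state how the cross-modal similarity score is computed. The sketch below shows one common way such a metric could be defined, assuming it is the mean cosine similarity between paired text and image embeddings in the shared 128-dimensional space. The function names `cosine_similarity` and `mean_cross_modal_similarity` are hypothetical and are not part of the BitMar codebase; the toy vectors are two-dimensional for readability.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Convention chosen for this sketch: a zero vector has similarity 0.
        return 0.0
    return dot / (norm_a * norm_b)


def mean_cross_modal_similarity(
    text_embs: Sequence[Sequence[float]],
    image_embs: Sequence[Sequence[float]],
) -> float:
    """Average cosine similarity over paired (text, image) embeddings.

    Assumption: pair i of text_embs corresponds to pair i of image_embs.
    """
    if not text_embs or len(text_embs) != len(image_embs):
        raise ValueError("need the same non-zero number of text and image embeddings")
    total = sum(cosine_similarity(t, v) for t, v in zip(text_embs, image_embs))
    return total / len(text_embs)


# Toy example: the first pair is perfectly aligned, the second is orthogonal.
text = [[1.0, 0.0], [0.0, 1.0]]
image = [[1.0, 0.0], [1.0, 0.0]]
score = mean_cross_modal_similarity(text, image)
print(f"mean cross-modal similarity: {score:.4f}")
```

Under this assumed definition, scores lie in [-1, 1], and a value such as 0.3418 would mean that paired text and image embeddings are, on average, moderately aligned.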