---
language:
- en
tags:
- ocr
- vision
- image-to-text
- metanthropic
- bulbul
- sovereign
license: other
base_model: metanthropic/BulBul-OCR
pipeline_tag: image-text-to-text
---

# 🦅 Metanthropic BulBul-OCR

**BulBul-OCR** is a sovereign, high-efficiency Optical Character Recognition (OCR) model engineered by **Metanthropic**. It is a 0.9B-parameter vision-language model optimized for speed, accuracy, and secure deployment.

---

## 🔒 Sovereign Encryption

This model is distributed in the **.mguf** format (**Metanthropic Unified Format**). The weights are encrypted with AES-256-GCM to protect intellectual property and restrict use to authorized parties.

- **Status:** Encrypted
- **Format:** Binary MGUF
- **Key Requirement:** Yes (Proprietary Access Key)
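
AES-GCM with a 256-bit key needs exactly 32 bytes of key material. The loader in the Usage section passes the access key through `bytes.fromhex`, which suggests the key is issued as a 64-character hex string; that encoding is an assumption here, not a documented guarantee. A minimal sketch of a pre-flight check, where `parse_access_key` is a hypothetical helper name:

```python
def parse_access_key(hex_key: str) -> bytes:
    """Decode a hex-encoded access key and verify it is 32 bytes (256 bits).

    Assumption: the key is issued as a hex string, as the loader's use of
    bytes.fromhex suggests. This helper is illustrative and is not part of
    any Metanthropic tooling.
    """
    try:
        key = bytes.fromhex(hex_key.strip())
    except ValueError as exc:
        raise ValueError("access key is not valid hexadecimal") from exc
    if len(key) != 32:
        raise ValueError(f"expected 32 key bytes for AES-256, got {len(key)}")
    return key
```

Validating the key up front separates a malformed key (wrong length or encoding) from a wrong key, which only surfaces later as an authentication failure during decryption.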

---

## 🧠 Model Details

- **Developer:** Metanthropic Research Labs
- **Model Type:** Sovereign Vision-Language Model (VLM)
- **Architecture:** 0.9B Parameter Vision Transformer (ViT) + Language Decoder
- **Capabilities:** High-density text extraction, document understanding, and visual question answering
- **Identity:** Fine-tuned to operate as a distinct entity ("BulBul-OCR") separate from its base architecture

---

## 💻 Usage

The encrypted `.mguf` file cannot be loaded directly with standard Hugging Face libraries (`transformers`). It requires the **Metanthropic Loader** procedure below, which decrypts the file with your access key and then hands the result to `transformers`.

### Python Implementation

```python
import os
from huggingface_hub import hf_hub_download
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from transformers import AutoModelForImageTextToText, AutoProcessor

# 1. Configuration
REPO_ID = "metanthropic/BulBul-OCR"
FILENAME = "bulbul-ocr-v1.mguf"
SECRET_KEY = "YOUR_ACCESS_KEY_HERE"  # Provided by Metanthropic Admin

# 2. Download Encrypted Asset
file_path = hf_hub_download(repo_id=REPO_ID, filename=FILENAME)

# 3. Secure Decryption (In-Memory)
# AES-256-GCM needs a 32-byte key, so SECRET_KEY must be 64 hex characters.
key_bytes = bytes.fromhex(SECRET_KEY)
aesgcm = AESGCM(key_bytes)

with open(file_path, "rb") as f:
    nonce = f.read(12)
    header_len = int.from_bytes(f.read(4), 'little')
    encrypted_header = f.read(header_len)
    rest_of_body = f.read()

# Decrypt Header
decrypted_header = aesgcm.decrypt(nonce, encrypted_header, None)

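# 4. Load Model
# Note: this example writes the decrypted weights to disk. In production, use a
# temporary directory that is deleted after loading, or stream directly.
# from_pretrained also expects the model's config.json in the same directory.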
os.makedirs("temp_load", exist_ok=True)
with open("temp_load/model.safetensors", "wb") as f:
    f.write(decrypted_header)
    f.write(rest_of_body)

print("✅ Model Decrypted. Loading into VRAM...")
model = AutoModelForImageTextToText.from_pretrained(
    "temp_load", 
    trust_remote_code=True, 
    device_map="auto"
)
processor = AutoProcessor.from_pretrained(REPO_ID, trust_remote_code=True)

# 5. Run Inference
from PIL import Image

# Load your image
image = Image.open("document.png")

# Process and generate
inputs = processor(images=image, return_tensors="pt").to(model.device)
generated_ids = model.generate(**inputs, max_new_tokens=512)
result = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]

print(result)
```
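
The byte layout the loader reads can be written as a small parser. The layout used here (12-byte nonce, 4-byte little-endian header length, encrypted header, then the remaining body) is inferred solely from the script above and is not an official `.mguf` specification; `split_mguf` and `MgufParts` are hypothetical names.

```python
from typing import NamedTuple

NONCE_LEN = 12     # bytes the loader script reads first
LENGTH_FIELD = 4   # little-endian length of the encrypted header


class MgufParts(NamedTuple):
    nonce: bytes
    encrypted_header: bytes
    body: bytes


def split_mguf(blob: bytes) -> MgufParts:
    """Split raw file bytes into nonce, encrypted header, and body."""
    prefix = NONCE_LEN + LENGTH_FIELD
    if len(blob) < prefix:
        raise ValueError("file too short to contain nonce and header length")
    nonce = blob[:NONCE_LEN]
    header_len = int.from_bytes(blob[NONCE_LEN:prefix], "little")
    end = prefix + header_len
    if end > len(blob):
        raise ValueError("declared header length exceeds file size")
    return MgufParts(nonce, blob[prefix:end], blob[end:])
```

The explicit length checks turn a truncated download into a clear error rather than an opaque `InvalidTag` exception during decryption.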

### Installation Requirements

```bash
# accelerate is required by transformers when device_map="auto" is used
pip install transformers huggingface_hub cryptography pillow torch accelerate
```
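
Import names do not always match package names: `pillow` is imported as `PIL`. A minimal sketch for checking that the modules the loader script imports are available before downloading anything; `missing_modules` is a hypothetical helper:

```python
import importlib.util

# Import names used by the loader script (the pillow package installs as PIL).
REQUIRED_MODULES = ["transformers", "huggingface_hub", "cryptography", "PIL", "torch"]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("missing:", ", ".join(missing) if missing else "none")
```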

---

## 📊 Performance Benchmarks

| Dataset  | Accuracy | Throughput (images/sec) |
|----------|----------|-------------------------|
| SROIE    | 94.2%    | 12.5                    |
| FUNSD    | 91.8%    | 10.3                    |
| RVL-CDIP | 89.7%    | 15.2                    |
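
Throughput converts to average per-image latency as 1000 / (images per second). This assumes images are processed one at a time; with batching, the latency a caller observes per image can differ. A short sketch using the figures from the table:

```python
# Throughput figures copied from the benchmark table (images per second).
THROUGHPUT = {"SROIE": 12.5, "FUNSD": 10.3, "RVL-CDIP": 15.2}


def latency_ms(images_per_second: float) -> float:
    """Average milliseconds per image for a given throughput."""
    if images_per_second <= 0:
        raise ValueError("throughput must be positive")
    return 1000.0 / images_per_second


for name, ips in THROUGHPUT.items():
    print(f"{name}: {latency_ms(ips):.1f} ms/image")
```

This works out to roughly 80 ms, 97 ms, and 66 ms per image for SROIE, FUNSD, and RVL-CDIP respectively.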

---

## 🚀 Key Features

- **High-Speed Inference:** Optimized for real-time OCR applications
- **Multi-Language Support:** Primarily English, with an architecture that can be extended to other languages
- **Document Understanding:** Goes beyond OCR to capture layout and structure
- **Sovereign Architecture:** Encrypted weights ensure IP protection
- **Low Resource Requirements:** Runs efficiently on consumer-grade GPUs

---

## 🔧 System Requirements

- **Minimum:** 
  - GPU: 4GB VRAM (NVIDIA GTX 1650 or equivalent)
  - RAM: 8GB
  - Storage: 2GB

- **Recommended:**
  - GPU: 8GB VRAM (NVIDIA RTX 3060 or equivalent)
  - RAM: 16GB
  - Storage: 5GB
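
These figures can be sanity-checked with a back-of-the-envelope estimate of the weight footprint: parameter count times bytes per parameter. The precision of the shipped weights is not stated in this card, so the dtypes below are assumptions, and activations and framework overhead add to the total at inference time.

```python
PARAMS = 0.9e9  # 0.9B parameters, as stated in Model Details

# Assumed storage precisions; the card does not say which one is used.
BYTES_PER_PARAM = {"float32": 4, "float16/bfloat16": 2, "int8": 1}


def weight_footprint_gb(params: float, bytes_per_param: int) -> float:
    """Approximate size of the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


for dtype, size in BYTES_PER_PARAM.items():
    print(f"{dtype}: {weight_footprint_gb(PARAMS, size):.1f} GB")
```

At 16-bit precision the weights come to about 1.8 GB, which fits within the 2GB minimum storage figure; 32-bit weights would need about 3.6 GB.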

---

## ⚠️ License & Restrictions

This is a proprietary model released by Metanthropic.

- **Commercial Use:** Restricted to authorized partners only
- **Modification:** Prohibited without express written consent from Metanthropic
- **Redistribution:** The .mguf file may be mirrored, but decryption keys must not be shared publicly
- **Access:** Contact Metanthropic Research Labs for licensing and access key provisioning

---

## 📞 Contact & Support

- **Email:** support@metanthropic.ai
- **Documentation:** https://docs.metanthropic.ai/bulbul-ocr
- **License Inquiries:** licensing@metanthropic.ai

---

## 📜 Citation

If you use BulBul-OCR in your research, please cite:

```bibtex
@misc{bulbul-ocr-2024,
  title={BulBul-OCR: A Sovereign Vision-Language Model for Optical Character Recognition},
  author={Metanthropic Research Labs},
  year={2024},
  publisher={Metanthropic},
  howpublished={\url{https://huggingface.co/metanthropic/BulBul-OCR}}
}
```

---

**Engineered by Metanthropic. Powered by Sovereign Intelligence.**