---
language:
- en
tags:
- ocr
- vision
- image-to-text
- metanthropic
- bulbul
- sovereign
license: other
base_model: metanthropic/BulBul-OCR
pipeline_tag: image-text-to-text
---

# 🦅 Metanthropic BulBul-OCR

**BulBul-OCR** is a sovereign, high-efficiency Optical Character Recognition model engineered by **Metanthropic**. It is a 0.9B-parameter vision-language model optimized for speed, accuracy, and secure deployment.

---

## 🔒 Sovereign Encryption

This model is distributed in the **.mguf (Metanthropic Unified Format)**. The weights are encrypted with AES-256-GCM to protect intellectual property and restrict use to authorized parties.

- **Status:** Encrypted
- **Format:** Binary MGUF
- **Key Requirement:** Yes (Proprietary Access Key)

---

## 🧠 Model Details

- **Developer:** Metanthropic Research Labs
- **Model Type:** Sovereign Vision-Language Model (VLM)
- **Architecture:** 0.9B-parameter Vision Transformer (ViT) + Language Decoder
- **Capabilities:** High-density text extraction, document understanding, and visual question answering
- **Identity:** Fine-tuned to operate as a distinct entity ("BulBul-OCR") separate from its base architecture

---

## 💻 Usage

The encrypted `.mguf` file cannot be loaded directly with standard Hugging Face libraries (`transformers`). It requires the proprietary **Metanthropic Loader** to decrypt the weights before they are handed to `transformers`.

### Python Implementation

```python
import os

from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from huggingface_hub import hf_hub_download
from PIL import Image
from transformers import AutoConfig, AutoModelForImageTextToText, AutoProcessor

# 1. Configuration
REPO_ID = "metanthropic/BulBul-OCR"
FILENAME = "bulbul-ocr-v1.mguf"
SECRET_KEY = "YOUR_ACCESS_KEY_HERE"  # Hex string provided by Metanthropic Admin

# 2. Download Encrypted Asset
file_path = hf_hub_download(repo_id=REPO_ID, filename=FILENAME)

# 3. Secure Decryption (In-Memory)
key_bytes = bytes.fromhex(SECRET_KEY)  # AES-256 needs 32 bytes (64 hex characters)
aesgcm = AESGCM(key_bytes)

with open(file_path, "rb") as f:
    nonce = f.read(12)                                # 12-byte GCM nonce
    header_len = int.from_bytes(f.read(4), "little")  # little-endian header length
    encrypted_header = f.read(header_len)
    rest_of_body = f.read()

# Decrypt Header (raises cryptography.exceptions.InvalidTag on a wrong key)
decrypted_header = aesgcm.decrypt(nonce, encrypted_header, None)

# 4. Load Model
# (Note: this writes the decrypted weights to disk. In production, use a
# temporary file that is deleted after loading, or stream directly.)
os.makedirs("temp_load", exist_ok=True)
with open("temp_load/model.safetensors", "wb") as f:
    f.write(decrypted_header)
    f.write(rest_of_body)

print("✅ Model Decrypted. Loading into VRAM...")

# temp_load holds only the weights, so the config is fetched from the Hub
# (assumption: the repository also hosts config.json).
config = AutoConfig.from_pretrained(REPO_ID, trust_remote_code=True)
model = AutoModelForImageTextToText.from_pretrained(
    "temp_load",
    config=config,
    trust_remote_code=True,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(REPO_ID, trust_remote_code=True)

# 5. Run Inference
# Load your image
image = Image.open("document.png")

# Process and generate
inputs = processor(images=image, return_tensors="pt").to(model.device)
generated_ids = model.generate(**inputs, max_new_tokens=512)
result = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(result)
```

### Installation Requirements

`accelerate` is included because `device_map="auto"` depends on it.

```bash
pip install transformers accelerate huggingface_hub cryptography pillow torch
```

---

## 📊 Performance Benchmarks

| Dataset  | Accuracy | Speed (images/sec) |
|----------|----------|--------------------|
| SROIE    | 94.2%    | 12.5               |
| FUNSD    | 91.8%    | 10.3               |
| RVL-CDIP | 89.7%    | 15.2               |

---

## 🚀 Key Features

- **High-Speed Inference:** Optimized for real-time OCR applications
- **Multi-Language Support:** Primary focus on English, with an architecture that can be extended to other languages
- **Document Understanding:** Goes beyond OCR to interpret layout and structure
- **Sovereign Architecture:** Encrypted weights ensure IP protection
- **Low Resource Requirements:** Runs efficiently on consumer-grade GPUs

---

## 🔧 System Requirements

- **Minimum:**
  - GPU: 4GB VRAM (NVIDIA GTX 1650 or equivalent)
  - RAM: 8GB
  - Storage: 2GB
- **Recommended:**
  - GPU: 8GB VRAM (NVIDIA RTX 3060 or equivalent)
  - RAM: 16GB
  - Storage: 5GB

---

## ⚠️ License & Restrictions

This is a proprietary model released by Metanthropic.

- **Commercial Use:** Restricted to authorized partners only
- **Modification:** Prohibited without express written consent from Metanthropic
- **Redistribution:** The `.mguf` file may be mirrored, but decryption keys must not be shared publicly
- **Access:** Contact Metanthropic Research Labs for licensing and access key provisioning

---

## 📞 Contact & Support

- **Email:** support@metanthropic.ai
- **Documentation:** https://docs.metanthropic.ai/bulbul-ocr
- **License Inquiries:** licensing@metanthropic.ai

---

## 📜 Citation

If you use BulBul-OCR in your research, please cite:

```bibtex
@misc{bulbul-ocr-2024,
  title={BulBul-OCR: A Sovereign Vision-Language Model for Optical Character Recognition},
  author={Metanthropic Research Labs},
  year={2024},
  publisher={Metanthropic},
  howpublished={\url{https://huggingface.co/metanthropic/BulBul-OCR}}
}
```

---

**Engineered by Metanthropic. Powered by Sovereign Intelligence.**
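
---

## 🧩 Appendix: MGUF Layout Sketch

The loading script in the Usage section reads the `.mguf` file as a 12-byte nonce, a 4-byte little-endian header length, the encrypted header, and then the remaining body. The sketch below splits raw bytes along those boundaries and checks the access key length before any decryption is attempted. It is a minimal illustration inferred from that script, not the official Metanthropic Loader: `MgufParts`, `split_mguf`, and `parse_key` are hypothetical names, and the 32-byte key check is an assumption that follows from the stated use of 256-bit AES.

```python
import struct
from typing import NamedTuple

NONCE_LEN = 12  # bytes the loading script reads as the GCM nonce
LEN_FIELD = 4   # bytes holding the little-endian header length


class MgufParts(NamedTuple):  # hypothetical name, for illustration only
    nonce: bytes
    encrypted_header: bytes
    body: bytes


def split_mguf(blob: bytes) -> MgufParts:
    """Split raw file bytes into the three fields the loading script reads."""
    prefix = NONCE_LEN + LEN_FIELD
    if len(blob) < prefix:
        raise ValueError("file too short to hold a nonce and a length field")
    nonce = blob[:NONCE_LEN]
    # "<I" is an unsigned 32-bit little-endian integer.
    (header_len,) = struct.unpack("<I", blob[NONCE_LEN:prefix])
    header_end = prefix + header_len
    if len(blob) < header_end:
        raise ValueError("declared header length exceeds file size")
    return MgufParts(nonce, blob[prefix:header_end], blob[header_end:])


def parse_key(hex_key: str) -> bytes:
    """Decode a hex access key and check it is 32 bytes, as AES-256 requires."""
    key = bytes.fromhex(hex_key)
    if len(key) != 32:
        raise ValueError(f"expected 32 key bytes, got {len(key)}")
    return key


if __name__ == "__main__":
    # Synthetic blob: 12-byte nonce, length field of 5, 5 header bytes, 4 body bytes.
    blob = b"N" * 12 + (5).to_bytes(4, "little") + b"HHHHH" + b"BODY"
    parts = split_mguf(blob)
    print(len(parts.nonce), parts.encrypted_header, parts.body)
```

Validating the length field and the key size up front turns a truncated download or a mistyped key into a clear `ValueError` instead of an opaque failure later in the loading pipeline.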