---
license: cc-by-nc-4.0
language:
- en
pipeline_tag: image-text-to-text
tags:
- vision
- multimodal
- reasoning
base_model: tbd
---

# Asch 0.1

An experimental image-text-to-text model by OceanirAI.

## What is this?

Asch 0.1 is an image-text-to-text model: you give it an image and a text prompt, and it generates a text response based on what it sees. Think of it as a vision-language model that can look at images and answer questions about them, describe what is happening, or help you understand visual content.

## Model Overview

Asch is a compact, efficient vision-language model designed for advanced reasoning and multimodal understanding.

### Key Features

- Hybrid Reasoning: structured thinking traces for multi-step decisions
- Perceptive Tool Calling: a focus system with zoom and crop capabilities
- Structured Outputs: reliable JSON generation
- Advanced OCR: text recognition in challenging conditions
- UI Understanding: optimized for desktop and mobile interfaces
- Edge-Optimized: efficient architecture for resource-constrained devices

## Model Details

- Model Type: Vision-Language Model (Image-Text-to-Text)
- Parameters: ~2B
- Architecture: Transformer-based hybrid model
- License: CC-BY-NC-4.0
- Developed by: OceanirAI

## Usage

Coming soon; the model is still under development.

## Contact

- Organization: OceanirAI
- GitHub: github.com/Oceanir
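
## Expected input format (sketch)

Since the model is not yet released, the exact loading code is unknown. As a rough sketch, image-text-to-text models on the Hub typically accept the chat-style message format used by the Transformers `image-text-to-text` pipeline; the model ID `OceanirAI/asch-0.1` below is a placeholder assumption, not a published checkpoint.

```python
# Hypothetical usage sketch -- model weights are not yet available.
# Once released, loading would likely look something like:
#
#   from transformers import pipeline
#   pipe = pipeline("image-text-to-text", model="OceanirAI/asch-0.1")  # placeholder ID
#
# Chat-style message format commonly used by image-text-to-text pipelines:
# each user turn carries a list of content parts (images and text).
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/screenshot.png"},
            {"type": "text", "text": "What button should I click to log in?"},
        ],
    }
]

# Hypothetical inference call once the model is published:
#   result = pipe(text=messages, max_new_tokens=128)
```

The nested `content` list lets a single turn interleave multiple images with text, which matters for UI-understanding tasks where a question may reference several screenshots.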