---
license: cc-by-nc-4.0
language:
- en
pipeline_tag: image-text-to-text
tags:
- vision
- multimodal
- reasoning
base_model: tbd
---
# Asch 0.1

An experimental image-text-to-text model by OceanirAI.
## What is this?

Asch 0.1 is an image-text-to-text model: you give it an image and a text prompt, and it generates a text response based on what it sees. Think of it as a vision-language model that can answer questions about an image, describe what is happening in it, or help you understand visual content.
## Model Overview

Asch 0.1 is a compact, efficient vision-language model designed for advanced reasoning and multimodal understanding.

### Key Features

- Hybrid Reasoning: Structured thinking traces for multi-step decisions
- Perceptive Tool Calling: Focus system with zoom and crop capabilities
- Structured Outputs: Reliable JSON generation (see the parsing sketch after this list)
- Advanced OCR: Text recognition in challenging conditions
- UI Understanding: Optimized for desktop and mobile interfaces
- Edge-Optimized: Efficient architecture for resource-constrained devices
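Details of the structured-output interface have not been published yet, so the snippet below is only a sketch of how one might request JSON from a vision-language model and parse the reply defensively. The prompt wording and the `parse_json_reply` helper are illustrative assumptions, not Asch's documented API.

```python
import json

# Hypothetical prompt: ask for a JSON-only answer (not Asch's documented format).
prompt = (
    "Describe the main object in the image. "
    'Respond with JSON only, e.g. {"object": "...", "color": "..."}.'
)

def parse_json_reply(reply: str) -> dict:
    """Extract and parse the first JSON object found in a model reply."""
    start = reply.find("{")
    end = reply.rfind("}")
    if start == -1 or end <= start:
        raise ValueError(f"no JSON object found in reply: {reply!r}")
    return json.loads(reply[start : end + 1])

# Works even if the model wraps the JSON in extra prose:
print(parse_json_reply('Sure! {"object": "bicycle", "color": "red"}'))
```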
## Model Details

- Model Type: Vision-Language Model (Image-Text-to-Text)
- Parameters: ~2B
- Architecture: Transformer-based hybrid model
- License: CC-BY-NC-4.0
- Developed by: OceanirAI
## Usage

Coming soon: the model is still under development.
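Until official instructions are published, the sketch below shows how image-text-to-text models are commonly invoked through the Hugging Face `transformers` pipeline. The repo id `OceanirAI/asch-0.1` is a placeholder assumption and will not resolve until weights are released.

```python
from transformers import pipeline

# Placeholder repo id; Asch 0.1 weights are not released yet.
pipe = pipeline("image-text-to-text", model="OceanirAI/asch-0.1")

# Chat-style input: one user turn containing an image and a question.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/screenshot.png"},
            {"type": "text", "text": "What does this screen show?"},
        ],
    }
]

out = pipe(text=messages, max_new_tokens=128)
print(out[0]["generated_text"])
```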
## Contact

- Organization: OceanirAI
- GitHub: github.com/Oceanir