Asch 0.1

An experimental image-text-to-text model by OceanirAI.

What is this?

Asch 0.1 is an image-text-to-text model: you give it an image and a text prompt, and it generates a text response based on what it sees. Think of it as a vision-language model that can look at images and answer questions about them, describe what's happening, or help you understand visual content.

Model Overview

Asch 0.1 is a compact, efficient vision-language model designed for advanced reasoning and multimodal understanding.

Key Features

  • Hybrid Reasoning: Structured thinking traces for multi-step decisions
  • Perceptive Tool Calling: Focus system with zoom and crop capabilities
  • Structured Outputs: Reliable JSON generation
  • Advanced OCR: Text recognition in challenging conditions
  • UI Understanding: Optimized for desktop and mobile interfaces
  • Edge-Optimized: Efficient architecture for resource-constrained devices
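Since the model is not yet released, the exact response format is unknown; as an illustration of what "Structured Outputs: Reliable JSON generation" typically enables, here is a minimal sketch of parsing and validating a JSON response. The `raw` payload and its `objects`/`ocr` fields are made up for the example, not part of Asch's documented schema.

```python
import json

# Hypothetical payload standing in for an Asch response; the schema
# ("objects" and "ocr" fields) is an assumption for illustration only.
raw = '{"objects": [{"label": "button", "text": "Submit"}], "ocr": "Submit"}'

def parse_response(raw: str) -> dict:
    """Parse a JSON model response and check for an expected field."""
    data = json.loads(raw)
    if "objects" not in data:
        raise ValueError("response missing 'objects' field")
    return data

result = parse_response(raw)
print(result["ocr"])  # → Submit
```

A structured-output model lets you skip fragile regex extraction and go straight to `json.loads`, with validation as a safety net for malformed generations.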

Model Details

  • Model Type: Vision-Language Model (Image-Text-to-Text)
  • Parameters: ~2B
  • Architecture: Transformer-based hybrid model
  • License: CC-BY-NC-4.0
  • Developed by: OceanirAI

Usage

Coming soon: the model is still under development.
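Until the weights are published, no official usage code exists. As a sketch of what inference might look like, the following builds a chat-style image+text prompt in the message format commonly used by Hugging Face image-text-to-text models; the repository id `OceanirAI/Asch-0.1` and the commented-out pipeline call are assumptions, not a documented API.

```python
def build_messages(image_url: str, question: str) -> list:
    """Build a chat-style prompt pairing one image with a text question,
    following the common Hugging Face image-text-to-text convention."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "url": image_url},
                {"type": "text", "text": question},
            ],
        }
    ]

messages = build_messages(
    "https://example.com/screenshot.png",
    "What does this screen show?",
)

# Once released, inference would plausibly look something like this
# (hypothetical -- the model id and task support are not yet confirmed):
# from transformers import pipeline
# pipe = pipeline("image-text-to-text", model="OceanirAI/Asch-0.1")
# print(pipe(text=messages)[0]["generated_text"])
```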

Contact

  • Organization: OceanirAI
  • GitHub: github.com/Oceanir