Nanny7 committed on
Commit
8335a6f
·
1 Parent(s): aa62f23

Add Asch 0.1 model files with image-text-to-text description

Files changed (2)
  1. README.md +49 -0
  2. config.json +13 -0
README.md ADDED
@@ -0,0 +1,49 @@
+ ---
+ license: cc-by-nc-4.0
+ language:
+ - en
+ pipeline_tag: image-text-to-text
+ tags:
+ - vision
+ - multimodal
+ - reasoning
+ base_model: tbd
+ ---
+
+ # Asch 0.1
+
+ An experimental image-text-to-text model by OceanirAI.
+
+ ## What is this?
+
+ Asch 0.1 is an image-text-to-text model: given an image and a text prompt, it generates a text response based on what it sees. Think of it as a vision-language model that can answer questions about an image, describe what is happening in it, or help you understand visual content.
+
+ ## Model Overview
+
+ Asch 0.1 is a compact, efficient vision-language model designed for advanced reasoning and multimodal understanding.
+
+ ### Key Features
+
+ - Hybrid Reasoning: Structured thinking traces for multi-step decisions
+ - Perceptive Tool Calling: Focus system with zoom and crop capabilities
+ - Structured Outputs: Reliable JSON generation
+ - Advanced OCR: Text recognition in challenging conditions
+ - UI Understanding: Optimized for desktop and mobile interfaces
+ - Edge-Optimized: Efficient architecture for resource-constrained devices
+
+ ## Model Details
+
+ - Model Type: Vision-Language Model (Image-Text-to-Text)
+ - Parameters: ~2B
+ - Architecture: Transformer-based hybrid model
+ - License: CC-BY-NC-4.0
+ - Developed by: OceanirAI
+
+ ## Usage
+
+ Coming soon; the model is still under development.
+
+ ## Contact
+
+ - Organization: OceanirAI
+ - GitHub: github.com/Oceanir
config.json ADDED
@@ -0,0 +1,13 @@
+ {
+   "architectures": [
+     "ASCHForCausalLM"
+   ],
+   "model_type": "asch",
+   "torch_dtype": "float16",
+   "transformers_version": "4.40.0",
+   "vocab_size": 151936,
+   "hidden_size": 2048,
+   "num_hidden_layers": 24,
+   "num_attention_heads": 16,
+   "max_position_embeddings": 8192
+ }
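As a sanity check on the "~2B parameters" figure in the README, the text-backbone size can be estimated from the config values above. This is a sketch under stated assumptions: a standard decoder layout with Q/K/V/O attention projections, a 4×hidden MLP, and tied input/output embeddings — none of which the config confirms (the card describes a "hybrid" architecture, and any vision-encoder parameters are not represented here).

```python
import json

# Hyperparameters copied from the committed config.json
config = json.loads("""
{
  "vocab_size": 151936,
  "hidden_size": 2048,
  "num_hidden_layers": 24,
  "num_attention_heads": 16,
  "max_position_embeddings": 8192
}
""")

h = config["hidden_size"]
v = config["vocab_size"]
layers = config["num_hidden_layers"]

embedding = v * h                 # token embeddings (lm_head tied: assumption)
attn_per_layer = 4 * h * h        # Q, K, V, O projections
mlp_per_layer = 2 * h * (4 * h)   # up + down, assuming a 4h intermediate size
total = embedding + layers * (attn_per_layer + mlp_per_layer)

print(f"~{total / 1e9:.2f}B text-backbone parameters")  # → ~1.52B
```

Under these assumptions the text side alone comes to roughly 1.5B, which would leave the balance of the advertised ~2B to the vision encoder and projector.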