Nanny7 committed on
Commit
8335a6f
·
1 Parent(s): aa62f23

Add Asch 0.1 model files with image-text-to-text description

Files changed (2)
  1. README.md +49 -0
  2. config.json +13 -0
README.md ADDED
@@ -0,0 +1,49 @@
+ ---
+ license: cc-by-nc-4.0
+ language:
+ - en
+ pipeline_tag: image-text-to-text
+ tags:
+ - vision
+ - multimodal
+ - reasoning
+ base_model: tbd
+ ---
+
+ # Asch 0.1
+
+ An experimental image-text-to-text model by OceanirAI.
+
+ ## What is this?
+
+ Asch 0.1 is an image-text-to-text model: given an image and a text prompt, it generates a text response based on what it sees. Think of it as a vision-language model that can answer questions about an image, describe what is happening in it, or help you understand visual content.
+
+ ## Model Overview
+
+ Asch 0.1 is a compact, efficient vision-language model designed for advanced reasoning and multimodal understanding.
+
+ ### Key Features
+
+ - Hybrid Reasoning: Structured thinking traces for multi-step decisions
+ - Perceptive Tool Calling: Focus system with zoom and crop capabilities
+ - Structured Outputs: Reliable JSON generation
+ - Advanced OCR: Text recognition in challenging conditions
+ - UI Understanding: Optimized for desktop and mobile interfaces
+ - Edge-Optimized: Efficient architecture for resource-constrained devices
+
+ ## Model Details
+
+ - Model Type: Vision-Language Model (Image-Text-to-Text)
+ - Parameters: ~2B
+ - Architecture: Transformer-based hybrid model
+ - License: CC-BY-NC-4.0
+ - Developed by: OceanirAI
+
+ ## Usage
+
+ Coming soon; the model is still under development.
+
+ ## Contact
+
+ - Organization: OceanirAI
+ - GitHub: github.com/Oceanir
config.json ADDED
@@ -0,0 +1,13 @@
+ {
+   "architectures": [
+     "ASCHForCausalLM"
+   ],
+   "model_type": "asch",
+   "torch_dtype": "float16",
+   "transformers_version": "4.40.0",
+   "vocab_size": 151936,
+   "hidden_size": 2048,
+   "num_hidden_layers": 24,
+   "num_attention_heads": 16,
+   "max_position_embeddings": 8192
+ }
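As a sanity check on the "~2B parameters" figure in the README, the text-backbone size can be estimated from the config values above. This is a sketch under stated assumptions: a standard decoder layout with Q/K/V/O attention projections, a 4×hidden MLP, and tied input/output embeddings — none of which the config confirms (the card describes a "hybrid" architecture, and any vision-encoder parameters are not represented here).

```python
import json

# Hyperparameters copied from the committed config.json
config = json.loads("""
{
  "vocab_size": 151936,
  "hidden_size": 2048,
  "num_hidden_layers": 24,
  "num_attention_heads": 16,
  "max_position_embeddings": 8192
}
""")

h = config["hidden_size"]
v = config["vocab_size"]
layers = config["num_hidden_layers"]

embedding = v * h                 # token embeddings (lm_head tied: assumption)
attn_per_layer = 4 * h * h        # Q, K, V, O projections
mlp_per_layer = 2 * h * (4 * h)   # up + down, assuming a 4h intermediate size
total = embedding + layers * (attn_per_layer + mlp_per_layer)

print(f"~{total / 1e9:.2f}B text-backbone parameters")  # → ~1.52B
```

Under these assumptions the text side alone comes to roughly 1.5B, which would leave the balance of the advertised ~2B to the vision encoder and projector.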