vikhyatk committed on
Commit 704a11b · verified · 1 Parent(s): b805734

Update README.md

Files changed (1)
  1. README.md +11 -1
README.md CHANGED
@@ -3,4 +3,14 @@ library_name: transformers
  tags: []
  ---

- Coming soon.
+ Moondream 3 (Preview) is a vision language model with a mixture-of-experts architecture (9B total parameters, 2B active).
+
+ Architecture details:
+
+ 1. 24 layers; the first four are dense, the rest have MoE FFNs with 64 experts, 8 activated per token
+ 2. MoE FFNs use a GeGLU architecture with an inner/gate dim of 1024; the model's hidden dim is 2048.
+ 3. Usable context length increased to 32K, with [a custom efficient SuperBPE tokenizer](https://huggingface.co/moondream/starmie-v1)
+ 4. Multi-headed attention with learned position- and data-dependent temperature scaling
+ 5. Vision encoder initialized from SigLIP-SO-400M, with multi-crop channel concatenation for token-efficient high resolution image processing
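As a rough illustration of items 1–2 above, here is a minimal NumPy sketch of top-k expert routing combined with GeGLU expert FFNs. Toy dimensions are used so it runs instantly (the card states the real model uses hidden dim 2048, inner/gate dim 1024, and 64 experts with 8 active); the routing function, softmax-normalized weights, and initialization are illustrative assumptions, not the model's actual implementation.

```python
import numpy as np

def gelu(x):
    # tanh approximation of GELU
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))

def geglu_ffn(x, w_gate, w_up, w_down):
    # GeGLU: a GELU-activated gate multiplied elementwise with a linear "up"
    # projection, then projected back down to the hidden dim
    return (gelu(x @ w_gate) * (x @ w_up)) @ w_down

def route_top_k(router_logits, k):
    # keep the k highest-scoring experts per token, softmax-normalize their scores
    top_k = np.argsort(router_logits, axis=-1)[:, -k:]
    scores = np.take_along_axis(router_logits, top_k, axis=-1)
    scores = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return top_k, scores / scores.sum(axis=-1, keepdims=True)

# toy dims; the real model uses hidden=2048, inner/gate=1024, 64 experts, 8 active
hidden_dim, inner_dim, n_experts, k = 32, 16, 8, 2
rng = np.random.default_rng(0)
tokens = rng.standard_normal((3, hidden_dim))
experts = [tuple(rng.standard_normal(shape) * 0.02
                 for shape in [(hidden_dim, inner_dim),
                               (hidden_dim, inner_dim),
                               (inner_dim, hidden_dim)])
           for _ in range(n_experts)]
expert_ids, weights = route_top_k(rng.standard_normal((3, n_experts)), k)

# each token's output is the routing-weighted sum of its active experts
out = np.zeros_like(tokens)
for t in range(len(tokens)):
    for e, w in zip(expert_ids[t], weights[t]):
        out[t] += w * geglu_ffn(tokens[t], *experts[e])
print(out.shape)  # (3, 32)
```

In production MoE implementations the per-token loop is replaced by grouped/batched expert matmuls, but the math is the same weighted sum shown here.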
+
+ For more details, please refer to our release blog post (coming soon).
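The "learned position- and data-dependent temperature scaling" in item 4 can be sketched as follows. The card does not specify the parameterization, so the softplus form and the `w_temp`/`pos_scale` parameters below are assumptions: a per-query temperature, combining a data-dependent term with a learned positional term, rescales the attention logits before the softmax, letting the model sharpen or flatten each query's attention distribution.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention_with_temperature(q, k, v, w_temp, pos_scale):
    # Hypothetical temperature: a data-dependent term from each query (q @ w_temp)
    # plus a learned per-position term, squashed through softplus to stay positive.
    d = q.shape[-1]
    raw = q @ w_temp + pos_scale          # (seq,) temperature logits, one per query
    temp = np.log1p(np.exp(raw))          # softplus > 0
    logits = (q @ k.T) / np.sqrt(d)       # standard scaled dot-product logits
    logits = logits * temp[:, None]       # sharpen/flatten each query's distribution
    return softmax(logits) @ v

seq, d = 6, 16
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((seq, d)) for _ in range(3))
w_temp = rng.standard_normal(d) * 0.1     # illustrative learned parameters
pos_scale = rng.standard_normal(seq) * 0.1
out = attention_with_temperature(q, k, v, w_temp, pos_scale)
print(out.shape)  # (6, 16)
```

A multi-headed version would simply carry one temperature per head; this single-head sketch only shows where the scaling enters relative to the softmax.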