Dcas89 committed on
Commit d7355a4 · verified · 1 Parent(s): ddf1e19

Update README.md

Files changed (1):
  1. README.md +11 -0

README.md CHANGED
@@ -1,3 +1,14 @@
+---
+license: apache-2.0
+language:
+- en
+base_model:
+- microsoft/Phi-4-mini-instruct
+- facebook/dinov2-with-registers-giant
+- google/siglip2-so400m-patch14-224
+pipeline_tag: visual-question-answering
+---
+
 # Aurea: Adaptive Multimodal Fusion for Vision-Language Models

 Aurea is an open-source research project aimed at advancing vision-language model (VLM) pretraining by leveraging cutting-edge vision encoders, DINOv2 and SigLIP2. The core of Aurea is a novel adaptive **spatial-range attention mechanism** that fuses spatial and semantic information from encoder-derived visual features, enabling richer, more context-aware representations for downstream tasks.
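
Aurea's actual spatial-range attention is defined in the project's code, not in this README. Purely as an illustration of the general idea of fusing two encoders' patch features, the sketch below uses plain scaled dot-product cross-attention as a stand-in (the `fuse_features` helper, the shared width, and all weights are hypothetical, not Aurea's implementation; the input widths 1536 and 1152 are the hidden sizes of DINOv2-giant and SigLIP2-so400m):

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def fuse_features(dino_feats, siglip_feats, d_model=64, seed=0):
    """Hypothetical sketch: queries come from one encoder's patch
    features, keys/values from the other's, and the attended output is
    blended back with the query stream. Random matrices stand in for
    learned projections."""
    rng = np.random.default_rng(seed)
    d1, d2 = dino_feats.shape[-1], siglip_feats.shape[-1]
    Wq = rng.normal(size=(d1, d_model)) / np.sqrt(d1)
    Wk = rng.normal(size=(d2, d_model)) / np.sqrt(d2)
    Wv = rng.normal(size=(d2, d_model)) / np.sqrt(d2)
    Q, K, V = dino_feats @ Wq, siglip_feats @ Wk, siglip_feats @ Wv
    attn = softmax(Q @ K.T / np.sqrt(d_model), axis=-1)  # (patches, patches)
    return (Q + attn @ V) / 2  # residual-style blend of the two streams

# toy example: 16 patches per encoder, DINOv2-giant width vs SigLIP2 width
fused = fuse_features(np.ones((16, 1536)), np.ones((16, 1152)))
print(fused.shape)  # (16, 64)
```

The fused tokens would then feed the language model (here, Phi-4-mini-instruct per the `base_model` list above); Aurea's adaptive spatial-range weighting replaces the plain attention used in this toy.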