jquenum committed on
Commit
6173a15
·
verified ·
1 Parent(s): 2832e94

Update README.md

Files changed (1)
  1. README.md +49 -3
README.md CHANGED
@@ -1,3 +1,49 @@
1
- ---
2
- license: cc-by-nc-sa-4.0
3
- ---
1
+ ---
2
+ license: cc-by-nc-sa-4.0
3
+ ---
4
+ # LISAt_PRE
5
+
6
+ **LISAt_PRE** is a multimodal large language model (MLLM) for remote sensing, tailored to tasks that require detailed visual understanding and natural-language reasoning over satellite and aerial imagery.
7
+
8
+ ---
9
+
10
+ ## Overview
11
+
12
+ LISAt_PRE enhances the [LISAt](https://huggingface.co/jquenum/LISAt-7b) framework by adapting it to remote-sensing applications, which require better handling of diverse visual data and specialized query types. The architecture integrates:
13
+
14
+ - A **Remote-CLIP ViT-L/14** vision encoder
15
+ - A **Vicuna-7B** LLM for text understanding and reasoning
16
+ - A **linear projection module** to align vision and language representations
17
+ - A segmentation model trained on high-quality mask annotations
18
+
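The role of the linear projection module in the list above can be illustrated with a small, dependency-free sketch. It is not taken from the LISAt code base: the function names (`make_linear`, `project`), the toy dimensions, and the random initialisation are all assumptions chosen for readability. The idea is only that each patch feature from the vision encoder is mapped to the embedding width of the language model, so visual and text tokens can share one input sequence.

```python
import random


def make_linear(in_dim, out_dim, seed=0):
    """Build a randomly initialised weight matrix (out_dim x in_dim) and a zero bias."""
    rng = random.Random(seed)
    weight = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weight, bias


def project(patch_features, weight, bias):
    """Map every patch feature (length in_dim) to the LLM embedding width (out_dim)."""
    return [
        [sum(w * x for w, x in zip(row, feat)) + b for row, b in zip(weight, bias)]
        for feat in patch_features
    ]


# Toy sizes for illustration only (assumption). A ViT-L/14 encoder and a 7B
# LLaMA-style model typically use widths of 1024 and 4096 respectively.
VISION_DIM, LLM_DIM, NUM_PATCHES = 4, 6, 3

weight, bias = make_linear(VISION_DIM, LLM_DIM)
patches = [[float(i + j) for j in range(VISION_DIM)] for i in range(NUM_PATCHES)]

visual_tokens = project(patches, weight, bias)
text_tokens = [[0.0] * LLM_DIM for _ in range(2)]  # placeholder text embeddings

# Visual tokens now have the same width as text embeddings and can be concatenated.
llm_input = visual_tokens + text_tokens
print(len(llm_input), len(llm_input[0]))  # prints: 5 6
```

In a real model the projection weights are learned during training; the random values here only demonstrate the shape change.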
19
+ An architectural overview is shown in Figure 3 of the paper.
20
+
21
+ ---
22
+
23
+ ## Key Features
24
+
25
+ - **Remote-Sensing Specialization**: Trained on domain-specific imagery to handle the unique challenges of satellite data.
26
+ - **Multimodal Alignment**: Combines textual and visual inputs through a unified architecture.
27
+ - **Training with PreGRES**: LISAt_PRE is pre-trained on the [PreGRES](https://huggingface.co/datasets/jquenum/PreGRES/blob/main/README.md) dataset using LoRA (Hu et al., 2021) and then fine-tuned on GRES.
28
+
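For readers unfamiliar with LoRA, the sketch below shows the low-rank update it relies on: a frozen weight matrix `W0` is adapted as `W0 + (alpha / r) * B @ A`, where only the small matrices `A` and `B` are trained. The rank, scaling factor, and matrix values are toy assumptions for illustration; the actual hyperparameters used for LISAt_PRE are not stated in this README.

```python
def matmul(a, b):
    """Plain-Python matrix product of a (m x k) and b (k x n)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_weight(w0, lora_b, lora_a, alpha, rank):
    """Return the adapted weight W0 + (alpha / rank) * B @ A."""
    delta = matmul(lora_b, lora_a)
    scale = alpha / rank
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(w0, delta)
    ]


# Frozen 3x3 weight and a rank-1 adapter (toy values, assumption).
w0 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
lora_a = [[0.5, -0.5, 0.25]]        # shape (rank x in_dim)
lora_b_init = [[0.0], [0.0], [0.0]]  # shape (out_dim x rank), zero at initialisation

# With B initialised to zero, the adapted weight equals the frozen weight.
assert lora_weight(w0, lora_b_init, lora_a, alpha=2.0, rank=1) == w0

# After training, B is non-zero and the effective weight shifts by a rank-1 update.
lora_b_trained = [[1.0], [0.0], [0.0]]
adapted = lora_weight(w0, lora_b_trained, lora_a, alpha=2.0, rank=1)
print(adapted[0])  # prints: [2.0, -1.0, 0.5]
```

The adapter here holds 6 trainable values against 9 in the full matrix; the saving grows quickly for the much larger matrices in a 7B-parameter model.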
29
+ ---
30
+
31
+ ## Architecture
32
+
33
+ - **Language Model**: [Vicuna-7B](https://lmsys.org/blog/2023-03-30-vicuna/) (Chiang et al., 2023)
34
+ - **Vision Encoder**: Remote-CLIP ViT-L/14 (Liu et al., 2024a)
35
+
36
+ ---
37
+
38
+ ## Citation
39
+
40
+ If you use LISAt_PRE in your work, please cite:
41
+
42
+ ```bibtex
43
+ @article{quenum2025lisat,
44
+ title={LISAt: Language-Instructed Segmentation Assistant for Satellite Imagery},
45
+ author={Quenum, Jerome and Hsieh, Wen-Han and Wu, Tsung-Han and Gupta, Ritwik and Darrell, Trevor and Chan, David M},
46
+ journal={arXiv preprint arXiv:2505.02829},
47
+ year={2025},
48
+ url={https://arxiv.org/pdf/2505.02829}
49
+ }