---
library_name: transformers
license: apache-2.0
language:
- en
- zh
tags:
- remote-sensing
- mllm
- multimodal
- earth-observation
- satellite-imagery
pipeline_tag: image-text-to-text
---

# 🌍 TerraSense-Base

A multimodal large language model (MLLM) for remote sensing and Earth observation, supporting image-text-to-text tasks in English and Chinese.

## πŸ“– Documentation

For usage instructions, examples, and detailed documentation, please visit:

πŸ‘‰ **[GitHub Repository](https://github.com/TerraSense-CASM/terrasense)**

## πŸš€ Quick Start

```python
from transformers import AutoModelForVision2Seq, AutoProcessor
from qwen_vl_utils import process_vision_info
import torch

# Load the model in bfloat16 and let Accelerate place it on the available device(s).
model = AutoModelForVision2Seq.from_pretrained(
    "TerraSense-CASM/TerraSense-Base",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
processor = AutoProcessor.from_pretrained("TerraSense-CASM/TerraSense-Base", trust_remote_code=True)

messages = [{"role": "user", "content": [
    {"type": "image", "image": "path/to/image.jpg"},
    {"type": "text", "text": "Describe this remote sensing image."},
]}]

# Build the chat prompt, extract the image inputs, and run generation.
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
image_inputs, _ = process_vision_info(messages)
inputs = processor(text=[text], images=image_inputs, padding=True, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=512)
print(processor.batch_decode(output, skip_special_tokens=True)[0])
```
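Note that `model.generate` returns each prompt's tokens followed by the newly generated ones, so decoding `output` directly echoes the prompt back into the printed text. To keep only the response, trim each sequence by its prompt length before decoding. A minimal sketch of the trimming logic, using dummy token ids in place of the real `inputs.input_ids` and `output` tensors from the Quick Start:

```python
# Trim prompt tokens from generated sequences before decoding.
# Dummy token ids stand in for inputs.input_ids and the model.generate output.
input_ids = [[101, 7592, 2088]]          # prompt tokens fed to generate()
output = [[101, 7592, 2088, 2023, 102]]  # generate() repeats the prompt first
trimmed = [out[len(inp):] for inp, out in zip(input_ids, output)]
print(trimmed)  # [[2023, 102]] -- only the newly generated tokens remain
```

With the real tensors, pass the trimmed sequences to `processor.batch_decode(trimmed, skip_special_tokens=True)` to get the response text alone.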

## πŸ“œ License

[Apache 2.0](https://github.com/TerraSense-CASM/terrasense/blob/main/LICENSE)