MagCrop-TMRA-v2

Model Description

TMRA (Task-Multi-Resolution Aggregator) is a BERT-based text classifier that predicts the level of granularity a Remote Sensing Visual Question Answering (RS-VQA) query requires. It assigns each input query to one of three classes: image-level, region-level, or pixel-level.

Model Details

Base Model: bert-base-uncased
Task: Multi-class Text Classification (3 classes)
Classes:
- image: Global scene understanding queries
- region: Localized object detection queries
- pixel: Fine-grained segmentation queries
Training Data: Synthetic RS-VQA query dataset

Intended Use

TMRA is designed as a preprocessing component for the OmniCrop pipeline, enabling adaptive multi-granular processing of remote sensing imagery based on query semantics.
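
As a rough illustration of how a predicted granularity label might steer downstream processing, the sketch below dispatches a query to one of three handlers. The handler functions and the `route` helper are hypothetical placeholders invented for this example; they are not part of the OmniCrop pipeline, whose actual interface is not described here.

```python
from typing import Callable, Dict


# Hypothetical handlers: stand-ins for whatever each processing path does.
def handle_image(query: str) -> str:
    return f"image-level pass over the full scene for: {query}"


def handle_region(query: str) -> str:
    return f"region-level pass over candidate crops for: {query}"


def handle_pixel(query: str) -> str:
    return f"pixel-level pass with dense masks for: {query}"


# Keys match the three class labels listed under Model Details.
HANDLERS: Dict[str, Callable[[str], str]] = {
    "image": handle_image,
    "region": handle_region,
    "pixel": handle_pixel,
}


def route(query: str, granularity: str) -> str:
    """Send the query to the handler registered for the predicted label."""
    try:
        handler = HANDLERS[granularity]
    except KeyError:
        raise ValueError(f"unknown granularity: {granularity!r}") from None
    return handler(query)
```

In practice the `granularity` argument would be the label produced by the classifier, so the routing table is the only place that needs to change if a processing path is swapped out.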

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("beingamanforever/MagCrop-TMRA-v2")
model = AutoModelForSequenceClassification.from_pretrained("beingamanforever/MagCrop-TMRA-v2")

# Classify query
query = "Count the number of white cars in the parking lot"
inputs = tokenizer(query, return_tensors="pt", padding=True, truncation=True)

# Inference only, so gradient tracking is disabled
with torch.no_grad():
    outputs = model(**inputs)
    prediction = outputs.logits.argmax(-1).item()

# Class index to granularity label
granularity_map = {0: "image", 1: "region", 2: "pixel"}
print(f"Predicted Granularity: {granularity_map[prediction]}")
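
The snippet above reports only the top class. If a confidence score is also useful, the logits can be turned into probabilities with a softmax. The following is a minimal, dependency-free sketch; the `decode` helper and the `min_confidence` threshold of 0.5 are assumptions made for illustration, not values recommended by the model authors.

```python
import math
from typing import List, Tuple

# Same index-to-label mapping as in the usage snippet.
GRANULARITY_MAP = {0: "image", 1: "region", 2: "pixel"}


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a flat list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits: List[float], min_confidence: float = 0.5) -> Tuple[str, float, bool]:
    """Return (label, confidence, trusted) for one query's logits.

    `min_confidence` is a hypothetical cutoff; tune it on held-out data.
    """
    probs = softmax(logits)
    idx = max(range(len(probs)), key=probs.__getitem__)
    confidence = probs[idx]
    return GRANULARITY_MAP[idx], confidence, confidence >= min_confidence
```

With the variables from the usage snippet, this could be called as `decode(outputs.logits[0].tolist())`. A low-confidence result might then be sent to a default processing path rather than trusted outright.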

Example Predictions

Query	Predicted Granularity
"Describe the overall landscape"	image
"Locate the industrial buildings"	region
"Segment individual vehicles in the parking area"	pixel
"Count the number of tennis courts"	region
"What is the color of the central building's roof?"	pixel

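The rows in the table above can double as a quick smoke test after loading the model. The sketch below is one way to do that: it takes any callable that maps a query string to a label, so the `check_predictions` helper and its `classify` parameter are hypothetical names, and the real model call has to be wrapped by the user.

```python
from typing import Callable, List, Sequence, Tuple

# Query/label pairs copied from the Example Predictions table.
EXPECTED: List[Tuple[str, str]] = [
    ("Describe the overall landscape", "image"),
    ("Locate the industrial buildings", "region"),
    ("Segment individual vehicles in the parking area", "pixel"),
    ("Count the number of tennis courts", "region"),
    ("What is the color of the central building's roof?", "pixel"),
]


def check_predictions(
    classify: Callable[[str], str],
    cases: Sequence[Tuple[str, str]] = EXPECTED,
) -> List[Tuple[str, str, str]]:
    """Return (query, expected, got) for every case the classifier misses."""
    mismatches = []
    for query, expected in cases:
        got = classify(query)
        if got != expected:
            mismatches.append((query, expected, got))
    return mismatches
```

An empty list means every listed example was reproduced; any returned tuples show which queries diverged and how.
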
Limitations

- Trained on synthetic RS-VQA queries; performance may vary on real-world datasets
- Optimized for English-language queries
- May require fine-tuning for domain-specific terminology
