# Earthmind-R1

EarthMind-4B fine-tuned with GRPO (Group Relative Policy Optimization) for geospatial visual question answering.

## Model Details

- **Base Model**: EarthMind-4B (InternVL-based architecture)
- **Training Method**: GRPO with LoRA adapters
- **Training Data**: Geospatial instruction dataset
- **Output Format**: Chain-of-thought reasoning with `<think>` and `<answer>` tags

## Usage

```python
import torch
import torchvision.transforms as T
from transformers import AutoModel, AutoTokenizer
from PIL import Image

model = AutoModel.from_pretrained(
    "aadex/Earthmind-R1",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("aadex/Earthmind-R1", trust_remote_code=True)

# Prepare the model for generation (custom method provided by the remote code)
model.preparing_for_generation(tokenizer=tokenizer, max_new_tokens=512, torch_dtype=torch.bfloat16)

# Load and preprocess your image into pixel values.
# The exact preprocessing is defined by the model's remote code; the transform
# below is the standard InternVL-style 448x448 resize with ImageNet normalization.
image = Image.open("your_image.jpg").convert("RGB")
transform = T.Compose([
    T.Resize((448, 448)),
    T.ToTensor(),
    T.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])
pixel_values = transform(image).unsqueeze(0).to(torch.bfloat16).to(model.device)

# Create the chain-of-thought prompt
question = "Describe what you see in this satellite image."
prompt = f"""User: {question}
First output the thinking process in <think> tags and then output the final answer in <answer> tags.
Assistant:"""

# Generate (use the model's chat method)
generation_config = dict(max_new_tokens=512, do_sample=False)
response = model.chat(tokenizer, pixel_values, prompt, generation_config)
print(response)
```

## Training

Trained using GRPO with:

- LoRA rank: 16
- LoRA alpha: 32
- Learning rate: 5e-6
- Epochs: 3
- Reward functions: accuracy, format

## License

Please refer to the base EarthMind-4B model license.
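
## Parsing the Output

Since the model emits its reasoning inside `<think>` tags and the final answer inside `<answer>` tags, you will usually want to split the two. A minimal sketch (the `parse_response` helper is not part of the model's API, just an illustration):

```python
import re

def parse_response(text):
    """Split a model response into its <think> reasoning and <answer> parts.

    Returns (thinking, answer); either element is None if its tag is absent.
    """
    think = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    answer = re.search(r"<answer>(.*?)</answer>", text, re.DOTALL)
    return (
        think.group(1).strip() if think else None,
        answer.group(1).strip() if answer else None,
    )

# Example with a hypothetical response string
demo = "<think>The image shows dense rooftops and a road grid.</think><answer>An urban area.</answer>"
thinking, answer = parse_response(demo)
print(answer)  # An urban area.
```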
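
## Reproducing the Setup

The hyperparameters above can be expressed with PEFT and TRL. This is a hedged configuration sketch, not the actual training script: the `output_dir`, `task_type`, and the omitted dataset/reward wiring are assumptions.

```python
from peft import LoraConfig
from trl import GRPOConfig

# LoRA adapter configuration matching the values reported above
lora_config = LoraConfig(
    r=16,           # LoRA rank
    lora_alpha=32,  # LoRA alpha
    task_type="CAUSAL_LM",  # assumed task type
)

# GRPO training arguments matching the reported learning rate and epochs;
# the reward functions (accuracy, format) would be passed to GRPOTrainer.
training_args = GRPOConfig(
    output_dir="earthmind-r1-grpo",  # placeholder path
    learning_rate=5e-6,
    num_train_epochs=3,
)
```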