jquenum committed · verified
Commit 4c3c38f · 1 parent: 1e10baaa

Update README.md

Files changed (1): README.md (+1, −42)
README.md CHANGED
@@ -27,7 +27,7 @@ With its advanced reasoning capabilities and superior performance on geospatial
 - **Training data**: we introduce the Geospatial Reasoning Segmentation Dataset (GRES), a collection of vision and language data designed around
 remote-sensing applications. GRES consists of two core components: PreGRES, a dataset consisting of over 1M remote-sensing specific visual instruction-tuning Q/A pairs for pre-training geospatial models, and GRES, a semi-synthetic dataset specialized for reasoning segmentation of remote-sensing data and consisting of 9,205 images and 27,615 natural language queries/answers within those images. From this LISAt dataset, we generate train, test, and validation splits consisting of 7,205, 1,500, and 500 images respectively.
 
-- **mplementation Details**: LISAT and LISATPRE are trained on eight DGX A100 80GB GPUs. In the first stage, we pretrain LISATPRE (context length = 2048) using LoRA for 1 epoch on PreGRES with next-token prediction cross-entropy loss. We employ the AdamW optimizer with a learning rate of 3e−4 and a cosine-decay learning rate scheduler, setting the batch size to 2 and gradient accumulation
+- **Implementation Details**: LISAT and LISATPRE are trained on eight DGX A100 80GB GPUs. In the first stage, we pretrain LISATPRE (context length = 2048) using LoRA for 1 epoch on PreGRES with next-token prediction cross-entropy loss. We employ the AdamW optimizer with a learning rate of 3e−4 and a cosine-decay learning rate scheduler, setting the batch size to 2 and gradient accumulation
 steps to 6.
 
 In the second stage, we train LISAT using GRES, as well as two traditional natural image referring segmentation datasets, FP-Ref-COCO and ReasonSeg. LoRA is applied to LISATPRE , while the SAM decoder undergoes full fine-tuning. The learning rate is set to 3e−4, with all other configurations remaining the
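The hyperparameters in the hunk above (AdamW at a peak learning rate of 3e−4, cosine decay, batch size 2 with 6 gradient-accumulation steps on eight GPUs) imply a specific schedule and effective batch size. The following is a minimal sketch of that arithmetic, not code from the LISAT repository; the helper name and the assumption that decay runs from the peak rate to zero are illustrative.

```python
import math

# Hyperparameters as stated in the README; NUM_GPUS reflects the
# eight DGX A100 80GB GPUs mentioned for training.
PEAK_LR = 3e-4
BATCH_SIZE = 2
GRAD_ACCUM_STEPS = 6
NUM_GPUS = 8

def cosine_decay_lr(step: int, total_steps: int, peak_lr: float = PEAK_LR) -> float:
    """Cosine decay from peak_lr at step 0 down to ~0 at total_steps.

    Illustrative helper: the README says only "cosine-decay learning
    rate scheduler", so the end value of 0 is an assumption.
    """
    progress = min(step / total_steps, 1.0)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))

# Effective (global) batch size per optimizer update:
# per-GPU batch * accumulation steps * number of GPUs.
effective_batch = BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_GPUS  # 2 * 6 * 8 = 96
```

With these settings each optimizer step sees 96 samples globally, which is why a small per-GPU batch of 2 is workable for a 7B-scale model.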
@@ -92,47 +92,6 @@ outputs = model.generate(**inputs)
 ```
 
 
-
-## Intended Use
-
-### Intended Use Cases
-LISAT-7B is intended for both **commercial** and **research** use, specifically in **geospatial** and **remote-sensing image analysis** tasks. The model is designed to generate segmentation masks, provide visual descriptions, and answer complex queries about remote-sensing images. It excels in reasoning over implicit queries referring to multiple objects, enabling more advanced analysis compared to traditional segmentation models. LISAT-7B is suitable for tasks in fields such as environmental monitoring, urban planning, and disaster response, where high-quality, detailed analysis of remote-sensing imagery is required.
-
-Additionally, LISAT-7B can be adapted for various **natural language processing** tasks, such as generating captions, answering questions, or improving other models through synthetic data generation and fine-tuning. The LISAT-7B model is licensed under the **LISAT Community License**, allowing for these use cases in a responsible manner.
-
-### Out-of-scope Use
-LISAT-7B is **not** intended for use in any applications that violate applicable laws or regulations, including trade compliance and data protection laws. Use in any way that violates the **Acceptable Use Policy** or the **LISAT Community License** is prohibited. Additionally, any use of LISAT-7B beyond the supported **remote-sensing imagery** tasks outlined in this model card, including use in unsupported domains or languages, is not permitted.
-
-## Responsibility & Safety
-
-As part of our responsible release approach, we followed a multi-pronged strategy to manage trust and safety risks in LISAT-7B:
-
-1. **Enable developers** to deploy helpful, safe, and flexible experiences tailored to their target audience and the specific use cases supported by LISAT-7B.
-2. **Protect developers** against adversarial users who may attempt to exploit the model's capabilities for harmful purposes.
-3. **Provide safeguards** for the community to help prevent the misuse of LISAT-7B, ensuring responsible use in remote-sensing applications.
-
-### Responsible Deployment
-
-LISAT-7B is a model designed to be used in various geospatial tasks, particularly remote-sensing applications. Our approach is focused on building helpful models that enable the world to benefit from the power of this technology, while aligning model safety with generic use cases to address a standard set of risks and harms.
-
-Developers are encouraged to tailor safety measures specific to their use cases, defining their own policies and deploying LISAT-7B with the necessary safeguards in their systems. LISAT-7B was developed following best practices outlined in our Responsible Use Guide. You can refer to the [Responsible Use Guide](#) to learn more about how to implement and enforce these safety practices in your deployment.
-
-## Critical and Other Risks
-
-We specifically focused our efforts on mitigating the following critical risk areas:
-
-1. **Geospatial Misuse**
-We conducted risk assessments to evaluate whether **LISAT-7B** could be exploited by malicious actors for harmful purposes, such as generating misleading or inaccurate geospatial interpretations that may contribute to dangerous decision-making in critical applications (e.g., military or surveillance purposes).
-
-2. **Environmental Impact**
-Given the model's domain in remote-sensing imagery, we performed analyses to assess whether **LISAT-7B** could potentially misinterpret environmental data, leading to unsafe or harmful recommendations related to climate or natural resource management.
-
-3. **Privacy and Security Risks**
-As **LISAT-7B** processes and analyzes images, we ensured that the model does not inadvertently expose sensitive or personal data, adhering to privacy standards and avoiding the generation of harmful insights from publicly or privately sourced imagery.
-
-4. **Model Bias in Object Segmentation**
-Our evaluation includes assessing whether **LISAT-7B** exhibits bias in its segmentation or interpretation of geospatial data across various terrains, object types, or demographics. We worked with domain experts to ensure that the model's performance is accurate and fair, minimizing biases that could affect decision-making in areas like land use planning or environmental protection.
-
 ### Community
 
 Generative AI safety requires ongoing expertise and collaboration, and we believe in the strength of the open community to accelerate progress in this area. We encourage contributions and collaboration within open consortiums focused on AI safety and responsible model deployment. We also engage with communities through relevant standards, including the MLCommons safety evaluation framework.
 