---
title: OSINT Threat Analyst
emoji: 🌐
colorFrom: indigo
colorTo: blue
sdk: gradio
sdk_version: 5.23.0
app_file: app.py
pinned: false
license: apache-2.0
tags:
- osint
- security
- geopolitics
- threat-intelligence
- agentic
- smolagents
- conflict-analysis
- nlp
---
# 🌐 OSINT Threat Analyst
An agentic multi-source open-source intelligence (OSINT) briefing tool that autonomously gathers, fuses, and synthesizes geopolitical conflict data into structured threat assessments.
## Architecture
This Space demonstrates a **context-aware agentic analyst** loop:
1. **Tool use**: The agent autonomously decides when to call ACLED (structured conflict event data) and RSS news feeds (unstructured open-source reporting).
2. **Multi-source fusion**: Raw data from heterogeneous sources is normalized and co-analyzed.
3. **Structured synthesis**: Output is produced as a typed threat brief with severity, confidence, key findings, escalation indicators, and watch items.
4. **Dual output**: A rendered HTML brief plus a transparent agent trace for auditability.
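As a rough illustration of what a "typed threat brief" could look like, here is a minimal sketch using a standard-library dataclass. The class name `ThreatBrief`, the field names, the four-level severity scale, and the 0.0 to 1.0 confidence range are all assumptions made for this example; the actual schema in `app.py` may differ.

```python
import json
from dataclasses import asdict, dataclass, field

# Assumed severity scale; the real tool may use different labels.
SEVERITY_LEVELS = ("low", "moderate", "high", "critical")


@dataclass
class ThreatBrief:
    """Hypothetical typed brief mirroring the fields listed above."""

    region: str
    severity: str
    confidence: float  # assumed to lie in [0.0, 1.0]
    key_findings: list[str] = field(default_factory=list)
    escalation_indicators: list[str] = field(default_factory=list)
    watch_items: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject values outside the assumed scale so downstream
        # automation can rely on the shape of the brief.
        if self.severity not in SEVERITY_LEVELS:
            raise ValueError(f"unknown severity: {self.severity!r}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")

    def to_json(self) -> str:
        """Serialize the brief for rendering or downstream tooling."""
        return json.dumps(asdict(self), indent=2)


# Placeholder values only; not real assessment data.
brief = ThreatBrief(
    region="Example Region",
    severity="high",
    confidence=0.7,
    key_findings=["Placeholder finding"],
)
print(brief.to_json())
```

Validating in `__post_init__` means a malformed model response fails early instead of producing a brief with an unrecognized severity label.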
## Data Sources
| Source | Type | Coverage |
|--------|------|----------|
| [ACLED](https://acleddata.com) | Structured conflict events | 80+ countries |
| Reuters, BBC, Al Jazeera | RSS news | Global |
| Bellingcat, Crisis Group | OSINT/analysis RSS | Conflict-specific |
| UN News, Foreign Policy | Policy/humanitarian | Global |
## Setup (Self-hosting)
Set the following Space secrets:
```
ACLED_API_KEY - Register free at https://developer.acleddata.com
ACLED_EMAIL   - Email used for ACLED registration
HF_TOKEN      - Auto-injected in HuggingFace Spaces
```
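When self-hosting, it can help to check that these secrets are present before starting the app. The sketch below assumes the secrets are exposed as environment variables, which is how Hugging Face Spaces surfaces them; the helper name `missing_secrets` is hypothetical and not part of this repository.

```python
import os

# Secret names taken from the list above.
REQUIRED_SECRETS = ("ACLED_API_KEY", "ACLED_EMAIL", "HF_TOKEN")


def missing_secrets(env=None):
    """Return the names of required secrets that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SECRETS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_secrets()
    if missing:
        print("Missing secrets: " + ", ".join(missing))
    else:
        print("All required secrets are set.")
```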
## Research Context
This tool operationalizes the **context-aware agentic security analyst architecture** explored in ongoing research on converged physical/cyber security intelligence operations. The agentic loop pattern (autonomous tool selection, multi-source ingestion, and structured synthesis) mirrors the design described in the supporting academic work.
Key research contributions demonstrated here:
- Local-language and multilingual source integration (extend `RSS_FEED_REGISTRY` in `tools.py`)
- Structured output schema enabling downstream automation
- Transparent reasoning trace for analyst oversight
## Extending
To add new RSS sources, add entries to `RSS_FEED_REGISTRY` in `tools.py`:
```python
RSS_FEED_REGISTRY["your_source"] = "https://example.com/feed.xml"
```
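Before registering a feed, you may want to confirm that its items can be reduced to the fields an analyst cares about. The following is a minimal sketch using only the standard library. It assumes the feed is RSS 2.0 (Atom feeds use `<entry>` elements and would need separate handling), and `parse_rss_items`, `SAMPLE_FEED`, and the local `RSS_FEED_REGISTRY` dict are stand-ins for illustration rather than the code in `tools.py`.

```python
import xml.etree.ElementTree as ET

# Stand-in for the registry in tools.py, assumed to map name -> feed URL.
RSS_FEED_REGISTRY = {}
RSS_FEED_REGISTRY["your_source"] = "https://example.com/feed.xml"


def parse_rss_items(xml_text, source):
    """Extract title, link, and publication date from RSS 2.0 <item> elements."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "source": source,
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "published": (item.findtext("pubDate") or "").strip(),
        })
    return items


# Inline sample so the sketch runs offline; a real run would download
# the XML from RSS_FEED_REGISTRY["your_source"] instead.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0"><channel><title>Example</title>
<item><title>Sample headline</title><link>https://example.com/a</link>
<pubDate>Mon, 01 Jan 2024 00:00:00 GMT</pubDate></item>
</channel></rss>"""

for entry in parse_rss_items(SAMPLE_FEED, "your_source"):
    print(entry["source"], "|", entry["title"], "|", entry["link"])
```

Tagging each item with its registry key keeps the origin visible once items from several feeds are merged.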
To add new tools (e.g. GDELT, Twitter/X, Telegram), decorate functions with `@tool` and pass them to `ToolCallingAgent`.
## Stack
- [`smolagents`](https://github.com/huggingface/smolagents): HuggingFace-native agentic framework
- [`Qwen/Qwen2.5-72B-Instruct`](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct): Free HF Inference API model
- [ACLED API](https://acleddata.com/acleddatanew/wp-content/uploads/dlm_uploads/2021/11/ACLED_API-User-Guide_2021.pdf): Armed Conflict Location & Event Data
- [Gradio](https://gradio.app): UI framework
## Disclaimer
⚠️ AI-generated from open sources. Not for operational use without independent verification. ACLED data is third-party and subject to their terms of use.