---
language: en
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
tags:
- mistral
- lora
- merged
- gguf
- text-generation
base_model: mistralai/Mistral-7B-Instruct-v0.2
---
# 🌍 Geopolitical Analysis Agent
**Advanced strategic forecasting and simulation engine combining RAG, SQLite, ChromaDB, and Claude AI**
## What Is This?
A production-ready geopolitical analysis system that:
- **Answers complex "what-if" questions** about world events
- **Models quantitative scenarios** (tank stocks, production rates, timelines)
- **Combines structured data + unstructured knowledge** via RAG
- **Provides rigorous analysis** in the style of a think-tank war-game coordinator
- **Prepares training data** for fine-tuning specialized models
## Key Features
### 🧠 Intelligent RAG Architecture
- **Vector search** with ChromaDB for semantic retrieval
- **Structured database** with SQLite for facts, metrics, inventories
- **Hybrid retrieval** combining both sources for comprehensive context
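The hybrid retrieval idea can be sketched in a few lines. This is a minimal illustration, not the repository's `rag_service.py`: a toy bag-of-words similarity stands in for Sentence Transformers and ChromaDB, the `military_assets` table is a simplified hypothetical schema, and `embed`, `cosine`, and `hybrid_context` are names invented for this example.

```python
import math
import sqlite3
from collections import Counter


def embed(text):
    """Toy bag-of-words 'embedding'; a stand-in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hybrid_context(query, documents, conn, country, top_k=2):
    """Combine the top_k most similar documents with structured rows for one country."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    rows = conn.execute(
        "SELECT asset_name, quantity FROM military_assets WHERE country = ?",
        (country,),
    ).fetchall()
    return {"documents": ranked[:top_k], "facts": rows}


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE military_assets (country TEXT, asset_name TEXT, quantity INTEGER)"
    )
    conn.execute("INSERT INTO military_assets VALUES ('Pakistan', 'JF-17 Thunder', 150)")
    docs = [
        "Placeholder note about fighter aircraft production in Pakistan",
        "Placeholder note about LNG import terminals in Europe",
    ]
    print(hybrid_context("fighter aircraft in Pakistan", docs, conn, "Pakistan", top_k=1))
```

Both parts of the returned dictionary would then be placed in the prompt, so the model sees narrative context and exact figures side by side.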
### 📊 Quantitative Modeling
- Project military inventories over time
- Calculate attrition rates and production capacities
- Model economic sustainability scenarios
- Compare alternative pathways
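In the simplest case, an inventory projection of this kind reduces to a yearly recurrence: next year's stock equals this year's stock times (1 - attrition rate), plus production. The sketch below assumes constant rates and uses a hypothetical helper, `project_inventory`, which is not a function from this repository. The starting stock of 3,000 is an illustrative figure, not data.

```python
def project_inventory(initial, attrition_rate, production_per_year, years):
    """Return the projected stock for each year (index 0 is the starting stock)."""
    if not 0.0 <= attrition_rate <= 1.0:
        raise ValueError("attrition_rate must be between 0 and 1")
    stock = float(initial)
    trajectory = [stock]
    for _ in range(years):
        # Lose a fixed fraction of the current stock, then add new production
        stock = max(stock * (1.0 - attrition_rate) + production_per_year, 0.0)
        trajectory.append(stock)
    return trajectory


if __name__ == "__main__":
    # 15% annual losses and 200 units/year production, as in the example query
    for year, stock in enumerate(project_inventory(3000, 0.15, 200, 5)):
        print(f"year {year}: {stock:,.0f}")
```

Under these assumptions the stock converges toward production divided by attrition rate (here 200 / 0.15, roughly 1,333), whatever the starting value. Comparing alternative pathways amounts to calling the function with different rates.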
### 💾 Production-Ready Stack
- FastAPI backend with async support
- SQLAlchemy ORM for database management
- Sentence Transformers for embeddings
- Claude Sonnet 4 for analysis
- Clean HTML/JS frontend
### 🎯 Example Queries
```
"Where will Russia's tank stock be in 5 years with 15% annual
losses and 200 tanks/year production?"
"What's China's timeline to semiconductor parity with Taiwan
if sanctions continue vs. if they're lifted?"
"How long can Iran sustain its proxy network at $60/barrel
vs $100/barrel oil prices?"
"Model European energy security in 2030 under three scenarios:
diversified LNG, accelerated renewables, or partial Russian
reconciliation"
```
## Quick Start
### 1. Install
```bash
cd geopolitical-agent/backend
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```
### 2. Configure
Create a `.env` file:
```bash
ANTHROPIC_API_KEY=your_key_here
```
### 3. Initialize
```bash
python -c "from models.database import init_db; init_db()"
```
### 4. Run
```bash
python app.py
```
The server starts on http://localhost:8000.
### 5. Open Frontend
Open `frontend/index.html` in a browser, or serve it locally:
```bash
cd frontend
python -m http.server 8080
```
### 6. Load Sample Data
Click the "Load Sample Data" button in the UI, or run:
```bash
curl -X POST http://localhost:8000/api/data/load-sample-data
```
## Architecture
```
┌─────────────────────────────────────────────────┐
│              Frontend (HTML/JS)                 │
└─────────────────┬───────────────────────────────┘
                  │ REST API
┌─────────────────┴───────────────────────────────┐
│              FastAPI Backend                    │
│                                                 │
│  ┌──────────────────────────────────────────┐   │
│  │   Analysis Service (Claude + RAG)        │   │
│  └─────────┬────────────────────┬───────────┘   │
│            │                    │               │
│   ┌────────┴────────┐    ┌──────┴───────────┐   │
│   │  RAG Service    │    │ Data Ingestion   │   │
│   └────────┬────────┘    └──────┬───────────┘   │
│            │                    │               │
│   ┌────────┴────────┐    ┌──────┴───────────┐   │
│   │   ChromaDB      │    │    SQLite        │   │
│   │   (Vectors)     │    │  (Structured)    │   │
│   └─────────────────┘    └──────────────────┘   │
└─────────────────────────────────────────────────┘
```
## Project Structure
```
geopolitical-agent/
├── backend/
│   ├── app.py                    # Main FastAPI app
│   ├── config.py                 # Configuration
│   ├── requirements.txt          # Dependencies
│   ├── models/
│   │   ├── database.py           # SQLAlchemy models
│   │   └── embeddings.py         # ChromaDB manager
│   ├── services/
│   │   ├── rag_service.py        # RAG orchestration
│   │   ├── analysis_service.py   # Analysis engine
│   │   └── data_ingestion.py     # Data loading
│   ├── routes/
│   │   ├── query.py              # Query endpoints
│   │   └── data.py               # Data endpoints
│   └── data/
│       ├── geopolitical.db       # SQLite database
│       └── chroma_db/            # Vector store
├── frontend/
│   └── index.html                # Web interface
├── data/
│   ├── sample_data/              # Sample datasets
│   └── training/                 # Fine-tuning prep
└── docs/
    ├── SETUP.md                  # Setup guide
    └── API.md                    # API documentation
```
## Database Schema
### Countries
- Basic country attributes
- GDP, population, military budget
- Regional categorization
### Military Assets
- Equipment inventories (tanks, aircraft, etc.)
- Operational rates
- Production and attrition rates
### Geopolitical Events
- Timeline of significant events
- Impact scoring
- Related countries tracking
### Metrics Time Series
- Economic indicators
- Production statistics
- Any quantitative metric over time
### Knowledge Sources
- Document provenance tracking
- Credibility scoring
- Source metadata
## API Examples
### Analyze Query
```bash
curl -X POST http://localhost:8000/api/query/analyze \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Your geopolitical question here",
    "use_cache": true
  }'
```
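The same request can be issued from Python with only the standard library. This is a sketch that mirrors the curl call above; `build_analyze_request` and `analyze` are hypothetical helper names, and the code assumes the endpoint returns a JSON body, whose exact shape is described in `docs/API.md` rather than here.

```python
import json
import urllib.request


def build_analyze_request(query, use_cache=True, base_url="http://localhost:8000"):
    """Build (but do not send) the POST request for /api/query/analyze."""
    body = json.dumps({"query": query, "use_cache": use_cache}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/query/analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(query, use_cache=True, timeout=120):
    """Send the request to a running server and decode the JSON response."""
    request = build_analyze_request(query, use_cache)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `analyze("Your geopolitical question here")` requires the backend from the Quick Start to be running on port 8000.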
### Add Knowledge
```bash
curl -X POST http://localhost:8000/api/data/add-document \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Your geopolitical knowledge document",
    "metadata": {"type": "report", "country": "China"}
  }'
```
Full API documentation: `docs/API.md`
## Fine-Tuning Preparation
### Export Training Data
```python
import json

from models.database import SessionLocal, AnalysisCache

db = SessionLocal()
analyses = db.query(AnalysisCache).all()

# Convert each cached analysis into one chat-format training example
training_data = []
for analysis in analyses:
    training_data.append({
        "messages": [
            {
                "role": "system",
                "content": "You are a geopolitical analysis expert..."
            },
            {
                "role": "user",
                "content": analysis.query_text
            },
            {
                "role": "assistant",
                "content": analysis.analysis_result
            }
        ]
    })

db.close()

# Write one JSON object per line (JSONL)
with open("training_data.jsonl", "w", encoding="utf-8") as f:
    for item in training_data:
        f.write(json.dumps(item) + "\n")
```
### LoRA Training
Use the exported data to fine-tune a LoRA adapter on geopolitical data:
1. Export queries/responses from `analysis_cache` table
2. Format as JSONL for LoRA training
3. Train LoRA adapter on domain-specific data
4. Deploy fine-tuned model for specialized analysis
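Before step 3, it may be worth checking that every exported line has the expected shape, since cached analyses can be empty or malformed. The following is a minimal sketch, assuming the three-message system/user/assistant layout produced by the export script above; `validate_record` and `count_valid` are hypothetical helpers, not part of this repository.

```python
import json

EXPECTED_ROLES = ["system", "user", "assistant"]


def validate_record(line):
    """Return True if one JSONL line holds system/user/assistant messages with non-empty text."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return False
    messages = record.get("messages") if isinstance(record, dict) else None
    if not isinstance(messages, list) or not all(isinstance(m, dict) for m in messages):
        return False
    if [m.get("role") for m in messages] != EXPECTED_ROLES:
        return False
    return all(
        isinstance(m.get("content"), str) and m["content"].strip() != ""
        for m in messages
    )


def count_valid(path="training_data.jsonl"):
    """Return (valid, total) counts for a JSONL file, skipping blank lines."""
    valid = total = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue
            total += 1
            valid += validate_record(line)
    return valid, total
```

Records that fail the check can be dropped or inspected by hand before training.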
## Extending the System
### Add New Countries
```python
from models.database import SessionLocal, Country

db = SessionLocal()
country = Country(
    name="Pakistan",
    iso_code="PAK",
    region="South Asia",
    population=235000000,
    gdp_usd=376000000000,
    military_budget_usd=11000000000
)
db.add(country)
db.commit()
```
### Add Military Assets
```python
from models.database import MilitaryAsset

# Reuses `db` and `country` from the "Add New Countries" example
asset = MilitaryAsset(
    country_id=country.id,
    asset_type="Fighter Aircraft",
    asset_name="JF-17 Thunder",
    quantity=150,
    operational_rate=0.75,
    production_rate_yearly=25,
    attrition_rate_yearly=0.05
)
db.add(asset)
db.commit()
```
### Add Knowledge Documents
```python
from services.data_ingestion import DataIngestionService

service = DataIngestionService()
service.add_knowledge_document(
    text="Your geopolitical analysis or fact...",
    metadata={
        "type": "intelligence_assessment",
        "country": "Iran",
        "classification": "open_source"
    }
)
```
## Configuration
Edit `backend/config.py`:
```python
# Embedding model (smaller = faster, larger = better)
EMBEDDING_MODEL = "sentence-transformers/all-MiniLM-L6-v2"
# RAG retrieval settings
TOP_K_RESULTS = 5 # Number of relevant chunks
SIMILARITY_THRESHOLD = 0.7 # Minimum relevance score
# Claude settings
DEFAULT_MODEL = "claude-sonnet-4-20250514"
MAX_TOKENS = 4000
TEMPERATURE = 0.3 # Lower = more analytical
```
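To make the two retrieval settings concrete, here is one plausible way they could interact. It is a sketch under stated assumptions, not the repository's actual logic: scores are taken to be similarities where higher means more relevant, and `select_chunks` is a hypothetical helper. ChromaDB queries report distances, so a real implementation would need to convert them to similarities first.

```python
TOP_K_RESULTS = 5
SIMILARITY_THRESHOLD = 0.7


def select_chunks(scored_chunks, top_k=TOP_K_RESULTS, threshold=SIMILARITY_THRESHOLD):
    """Keep chunks scoring at or above threshold, best first, at most top_k of them."""
    kept = [(text, score) for text, score in scored_chunks if score >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]


if __name__ == "__main__":
    candidates = [("chunk a", 0.91), ("chunk b", 0.52), ("chunk c", 0.75)]
    print(select_chunks(candidates))
```

Raising `SIMILARITY_THRESHOLD` trades recall for precision, while `TOP_K_RESULTS` caps how much retrieved text reaches the prompt.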
## Performance Tips
1. **Adjust retrieval**: Tune `TOP_K_RESULTS` and `SIMILARITY_THRESHOLD`
2. **Enable caching**: Send `"use_cache": true` in the request body for repeated queries
3. **Batch document ingestion**: Use bulk-add for multiple documents
4. **Index optimization**: Add SQLite indexes for frequent queries
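For the index tip, the sketch below shows the general pattern with the standard-library `sqlite3` module. The table and column names are hypothetical stand-ins for the metrics time series schema; check `models/database.py` for the real ones before adapting it.

```python
import sqlite3


def add_metric_index(conn):
    """Create a composite index for lookups by country and metric name."""
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_metrics_country_metric "
        "ON metrics_time_series (country_id, metric_name)"
    )


def explain_lookup(conn):
    """Return SQLite's query plan for a typical country/metric lookup."""
    return conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT value FROM metrics_time_series "
        "WHERE country_id = ? AND metric_name = ?",
        (1, "gdp"),
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE metrics_time_series "
        "(country_id INTEGER, metric_name TEXT, date TEXT, value REAL)"
    )
    add_metric_index(conn)
    for row in explain_lookup(conn):
        print(row[-1])  # The plan detail should mention the index name
```

Running `EXPLAIN QUERY PLAN` before and after adding an index is a quick way to confirm that a frequent query actually uses it.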
## Use Cases
### Strategic Planning
- War games scenario modeling
- Resource sustainability analysis
- Timeline projections
### Intelligence Analysis
- Capability gap assessments
- Economic constraint modeling
- Production capacity tracking
### Academic Research
- Geopolitical trend analysis
- Historical pattern recognition
- Comparative case studies
### Policy Analysis
- Sanction impact modeling
- Alliance dynamics assessment
- Economic leverage analysis
## Roadmap
- [ ] Real-time data ingestion from news sources
- [ ] Multi-agent debate for competing analyses
- [ ] Temporal reasoning for historical patterns
- [ ] Export to PDF reports
- [ ] WebSocket streaming for long analyses
- [ ] Named Entity Recognition for auto-tagging
- [ ] Graph database for relationship modeling
## Contributing
Areas for contribution:
1. **Data**: Add domain-specific geopolitical datasets
2. **Models**: Integrate specialized embedding models
3. **Analysis**: Enhance quantitative modeling functions
4. **UI**: Improve frontend visualization
5. **Documentation**: Add tutorials and examples
## License
MIT License - See LICENSE file
## Citation
If you use this system in research:
```bibtex
@software{geopolitical_analysis_agent,
  title={Geopolitical Analysis Agent: RAG-based Strategic Forecasting},
  author={[Your Name]},
  year={2025},
  url={https://github.com/yourusername/geopolitical-agent}
}
```
## Support
- Documentation: `docs/`
- API Reference: `docs/API.md`
- Setup Guide: `docs/SETUP.md`
- Issues: GitHub Issues
## Acknowledgments
Built with:
- [FastAPI](https://fastapi.tiangolo.com/)
- [ChromaDB](https://www.trychroma.com/)
- [Anthropic Claude](https://www.anthropic.com/)
- [Sentence Transformers](https://www.sbert.net/)
---
**Ready to analyze the world? Start with `python app.py`** 🚀