	---
	language: en
	license: apache-2.0
	library_name: transformers
	pipeline_tag: text-generation
	tags:
	- mistral
	- lora
	- merged
	- gguf
	- text-generation
	base_model: mistralai/Mistral-7B-Instruct-v0.2
	---

	# 🌍 Geopolitical Analysis Agent

	A strategic forecasting and simulation engine that combines retrieval-augmented generation (RAG), SQLite, ChromaDB, and Claude.

	## What Is This?

	A production-ready geopolitical analysis system that:
	- Answers complex "what-if" questions about world events
	- Models quantitative scenarios (tank stocks, production rates, timelines)
	- Combines structured data + unstructured knowledge via RAG
	- Produces structured, scenario-based analysis in the style of a think-tank war-game coordinator
	- Prepares training data for fine-tuning specialized models

	## Key Features

	### 🧠 Intelligent RAG Architecture
	- Vector search with ChromaDB for semantic retrieval
	- Structured database with SQLite for facts, metrics, inventories
	- Hybrid retrieval combining both sources for comprehensive context
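As a rough illustration of the hybrid idea, the sketch below pairs a toy vector search with a SQLite lookup and merges both into one context string. It is not the project's `rag_service.py`: the bag-of-words cosine similarity stands in for Sentence Transformers embeddings and ChromaDB, and the table, column, and function names are assumptions made for the example.

```python
import math
import sqlite3
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase bag-of-words counts (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_search(query: str, docs: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:top_k]


def structured_lookup(conn: sqlite3.Connection, country: str) -> list[tuple]:
    """Fetch inventory rows for one country from a hypothetical assets table."""
    return conn.execute(
        "SELECT asset_name, quantity FROM assets WHERE country = ?", (country,)
    ).fetchall()


def build_context(query: str, country: str, docs: list[str],
                  conn: sqlite3.Connection) -> str:
    """Merge unstructured passages and structured facts into one prompt context."""
    passages = semantic_search(query, docs)
    facts = structured_lookup(conn, country)
    lines = ["## Retrieved passages"] + passages + ["## Structured facts"]
    lines += [f"{name}: {qty}" for name, qty in facts]
    return "\n".join(lines)


# Placeholder data purely for demonstration; not real figures.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE assets (country TEXT, asset_name TEXT, quantity INTEGER)")
conn.execute("INSERT INTO assets VALUES ('Examplia', 'Example Tank', 100)")
docs = [
    "Examplia expanded tank production at its main plant.",
    "Oil prices fell sharply last quarter.",
    "Tank production in Examplia depends on imported optics.",
]
context = build_context("Examplia tank production", "Examplia", docs, conn)
print(context)
```

The point of the merge step is that the language model sees both narrative passages and exact figures in a single prompt, rather than having to infer numbers from prose.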

	### 📊 Quantitative Modeling
	- Project military inventories over time
	- Calculate attrition rates and production capacities
	- Model economic sustainability scenarios
	- Compare alternative pathways

	### 💾 Production-Ready Stack
	- FastAPI backend with async support
	- SQLAlchemy ORM for database management
	- Sentence Transformers for embeddings
	- Claude Sonnet 4 for analysis
	- Clean HTML/JS frontend

	### 🎯 Example Queries

	```
	"Where will Russia's tank stock be in 5 years with 15% annual
	losses and 200 tanks/year production?"

	"What's China's timeline to semiconductor parity with Taiwan
	if sanctions continue vs. if they're lifted?"

	"How long can Iran sustain its proxy network at $60/barrel
	vs $100/barrel oil prices?"

	"Model European energy security in 2030 under three scenarios:
	diversified LNG, accelerated renewables, or partial Russian
	reconciliation"
	```
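The first query above reduces to a simple stock-and-flow recurrence, which the sketch below makes concrete. It assumes attrition is applied to the start-of-year stock before that year's production is added, and the starting inventory of 1,000 is an arbitrary placeholder, not a real estimate; `project_inventory` is an illustrative name, not a function from this codebase.

```python
def project_inventory(start: float, attrition_rate: float,
                      production_per_year: float, years: int) -> list[float]:
    """Return the stock for year 0 (the start) through year `years`."""
    stocks = [float(start)]
    for _ in range(years):
        surviving = stocks[-1] * (1.0 - attrition_rate)  # losses hit existing stock
        stocks.append(surviving + production_per_year)   # then new production arrives
    return stocks


# 15% annual losses and 200 units/year come from the example query;
# the starting stock of 1,000 is a placeholder assumption.
trajectory = project_inventory(start=1000, attrition_rate=0.15,
                               production_per_year=200, years=5)
for year, stock in enumerate(trajectory):
    print(f"year {year}: {stock:,.0f}")

# Under this model the stock converges toward production / attrition,
# regardless of the starting value.
steady_state = 200 / 0.15
print(f"long-run level: {steady_state:,.0f}")
```

Comparing alternative pathways is then a matter of calling the function again with different rates and comparing the resulting lists.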

	## Quick Start

	### 1. Install

	```bash
	cd geopolitical-agent/backend
	python3 -m venv venv
	source venv/bin/activate
	pip install -r requirements.txt
	```

	### 2. Configure

	Create a `.env` file:
	```bash
	ANTHROPIC_API_KEY=your_key_here
	```
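How the backend reads this key is not shown in this README. The following is a minimal standard-library sketch of one way to load it, assuming a simple `KEY=value` file format; `load_env` and `get_api_key` are hypothetical helpers (many projects use the `python-dotenv` package for this instead).

```python
import os
from pathlib import Path


def load_env(path: str = ".env") -> dict[str, str]:
    """Parse KEY=value lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    env_file = Path(path)
    if not env_file.exists():
        return values
    for raw in env_file.read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_api_key(env_path: str = ".env") -> str:
    """Prefer the process environment, then fall back to the .env file."""
    key = os.environ.get("ANTHROPIC_API_KEY") or load_env(env_path).get("ANTHROPIC_API_KEY")
    if not key:
        raise RuntimeError("ANTHROPIC_API_KEY is not set; add it to .env or export it")
    return key
```

Failing early with a clear message is usually preferable to letting the first API call fail with an authentication error.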

	### 3. Initialize

	```bash
	python -c "from models.database import init_db; init_db()"
	```

	### 4. Run

	```bash
	python app.py
	```

	The server starts on http://localhost:8000.

	### 5. Open Frontend

	Open `frontend/index.html` in a browser, or serve it from the project root in a second terminal:
	```bash
	cd frontend
	python -m http.server 8080
	```

	### 6. Load Sample Data

	Click the "Load Sample Data" button in the UI, or:
	```bash
	curl -X POST http://localhost:8000/api/data/load-sample-data
	```

	## Architecture

	```
	┌─────────────────────────────────────────────────┐
	│               Frontend (HTML/JS)                │
	└─────────────────┬───────────────────────────────┘
	                  │ REST API
	┌─────────────────┴───────────────────────────────┐
	│                 FastAPI Backend                 │
	│                                                 │
	│  ┌──────────────────────────────────────────┐   │
	│  │     Analysis Service (Claude + RAG)      │   │
	│  └─────────┬────────────────────┬───────────┘   │
	│            │                    │               │
	│   ┌────────┴────────┐    ┌──────┴───────────┐   │
	│   │   RAG Service   │    │  Data Ingestion  │   │
	│   └────────┬────────┘    └──────┬───────────┘   │
	│            │                    │               │
	│   ┌────────┴────────┐    ┌──────┴───────────┐   │
	│   │    ChromaDB     │    │      SQLite      │   │
	│   │    (Vectors)    │    │   (Structured)   │   │
	│   └─────────────────┘    └──────────────────┘   │
	└─────────────────────────────────────────────────┘
	```

	## Project Structure

	```
	geopolitical-agent/
	├── backend/
	│   ├── app.py                  # Main FastAPI app
	│   ├── config.py               # Configuration
	│   ├── requirements.txt        # Dependencies
	│   ├── models/
	│   │   ├── database.py         # SQLAlchemy models
	│   │   └── embeddings.py       # ChromaDB manager
	│   ├── services/
	│   │   ├── rag_service.py      # RAG orchestration
	│   │   ├── analysis_service.py # Analysis engine
	│   │   └── data_ingestion.py   # Data loading
	│   ├── routes/
	│   │   ├── query.py            # Query endpoints
	│   │   └── data.py             # Data endpoints
	│   └── data/
	│       ├── geopolitical.db     # SQLite database
	│       └── chroma_db/          # Vector store
	├── frontend/
	│   └── index.html              # Web interface
	├── data/
	│   ├── sample_data/            # Sample datasets
	│   └── training/               # Fine-tuning prep
	└── docs/
	    ├── SETUP.md                # Setup guide
	    └── API.md                  # API documentation
	```

	## Database Schema

	### Countries
	- Basic country attributes
	- GDP, population, military budget
	- Regional categorization

	### Military Assets
	- Equipment inventories (tanks, aircraft, etc.)
	- Operational rates
	- Production and attrition rates

	### Geopolitical Events
	- Timeline of significant events
	- Impact scoring
	- Related countries tracking

	### Metrics Time Series
	- Economic indicators
	- Production statistics
	- Any quantitative metric over time

	### Knowledge Sources
	- Document provenance tracking
	- Credibility scoring
	- Source metadata
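The actual table definitions live in `models/database.py` and are not reproduced in this README. As an approximation, the first two tables might look like the plain-SQL sketch below. Column names are taken from the constructor arguments shown under "Extending the System"; table names, types, constraints, and the unit interpretations in the comments are assumptions.

```python
import sqlite3

SCHEMA = """
CREATE TABLE countries (
    id                  INTEGER PRIMARY KEY,
    name                TEXT NOT NULL UNIQUE,
    iso_code            TEXT,
    region              TEXT,
    population          INTEGER,
    gdp_usd             REAL,
    military_budget_usd REAL
);
CREATE TABLE military_assets (
    id                     INTEGER PRIMARY KEY,
    country_id             INTEGER NOT NULL REFERENCES countries(id),
    asset_type             TEXT,
    asset_name             TEXT,
    quantity               INTEGER,
    operational_rate       REAL,   -- assumed: fraction of units in service, 0..1
    production_rate_yearly REAL,   -- assumed: units built per year
    attrition_rate_yearly  REAL    -- assumed: fraction of stock lost per year
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Same values as the "Add New Countries" and "Add Military Assets" examples.
cur = conn.execute(
    "INSERT INTO countries (name, iso_code, region, population, gdp_usd, military_budget_usd) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    ("Pakistan", "PAK", "South Asia", 235_000_000, 376e9, 11e9),
)
conn.execute(
    "INSERT INTO military_assets (country_id, asset_type, asset_name, quantity, "
    "operational_rate, production_rate_yearly, attrition_rate_yearly) "
    "VALUES (?, ?, ?, ?, ?, ?, ?)",
    (cur.lastrowid, "Fighter Aircraft", "JF-17 Thunder", 150, 0.75, 25, 0.05),
)

# Join structured facts the way an analysis query might need them:
# quantity * operational_rate approximates the number of usable units.
row = conn.execute(
    "SELECT c.name, a.asset_name, a.quantity * a.operational_rate "
    "FROM military_assets a JOIN countries c ON c.id = a.country_id"
).fetchone()
print(row)
```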

	## API Examples

	### Analyze Query
	```bash
	curl -X POST http://localhost:8000/api/query/analyze \
	  -H "Content-Type: application/json" \
	  -d '{
	    "query": "Your geopolitical question here",
	    "use_cache": true
	  }'
	```

	### Add Knowledge
	```bash
	curl -X POST http://localhost:8000/api/data/add-document \
	  -H "Content-Type: application/json" \
	  -d '{
	    "text": "Your geopolitical knowledge document",
	    "metadata": {"type": "report", "country": "China"}
	  }'
	```
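The same two requests can be issued from Python. The sketch below uses only the standard library and builds the requests without sending them, so it runs without a server. The endpoint paths and JSON fields are copied from the curl examples; `BASE_URL`, the helper names, and the assumption that responses are JSON are not confirmed by this README.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # default address from the Quick Start


def build_post(path: str, payload: dict) -> urllib.request.Request:
    """Create a JSON POST request for the given API path."""
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 120.0) -> dict:
    """Send the request and decode the response body (assumed to be JSON)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


analyze = build_post(
    "/api/query/analyze",
    {"query": "Your geopolitical question here", "use_cache": True},
)
add_doc = build_post(
    "/api/data/add-document",
    {
        "text": "Your geopolitical knowledge document",
        "metadata": {"type": "report", "country": "China"},
    },
)

# With the backend running, uncomment to execute the call:
# result = send(analyze)
```

A generous timeout is used because an analysis call waits on a language model response, which can take much longer than a typical REST request.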

	Full API documentation: `docs/API.md`

	## Fine-Tuning Preparation

	### Export Training Data

	```python
	import json

	from models.database import SessionLocal, AnalysisCache

	db = SessionLocal()
	try:
	    analyses = db.query(AnalysisCache).all()

	    # Build one chat-format record per cached analysis
	    training_data = []
	    for analysis in analyses:
	        training_data.append({
	            "messages": [
	                {
	                    "role": "system",
	                    "content": "You are a geopolitical analysis expert..."
	                },
	                {
	                    "role": "user",
	                    "content": analysis.query_text
	                },
	                {
	                    "role": "assistant",
	                    "content": analysis.analysis_result
	                }
	            ]
	        })
	finally:
	    db.close()  # release the session even if the query fails

	# Write one JSON object per line (JSONL)
	with open("training_data.jsonl", "w", encoding="utf-8") as f:
	    for item in training_data:
	        f.write(json.dumps(item) + "\n")
	```

	### LoRA Training

	Use the exported data to fine-tune a LoRA adapter on geopolitical data:

	1. Export queries/responses from `analysis_cache` table
	2. Format as JSONL for LoRA training
	3. Train LoRA adapter on domain-specific data
	4. Deploy fine-tuned model for specialized analysis
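Before step 3, it can help to sanity-check the exported file and hold out a validation set. The following sketch assumes the `messages` layout produced by the export script above; the function names, the 90/10 split, and the fixed seed are illustrative choices, not project settings.

```python
import json
import random

EXPECTED_ROLES = ["system", "user", "assistant"]


def is_valid_example(record: dict) -> bool:
    """Check that a record has the three expected roles with non-empty text."""
    messages = record.get("messages", [])
    if [m.get("role") for m in messages] != EXPECTED_ROLES:
        return False
    return all(isinstance(m.get("content"), str) and bool(m["content"].strip())
               for m in messages)


def load_examples(lines: list[str]) -> list[dict]:
    """Parse JSONL lines, dropping malformed or incomplete records."""
    examples = []
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(record, dict) and is_valid_example(record):
            examples.append(record)
    return examples


def train_val_split(examples: list[dict], val_fraction: float = 0.1, seed: int = 0):
    """Shuffle deterministically and split into (train, validation) lists."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * val_fraction)) if len(shuffled) > 1 else 0
    return shuffled[n_val:], shuffled[:n_val]


# Typical use once training_data.jsonl exists:
# with open("training_data.jsonl", encoding="utf-8") as f:
#     train, val = train_val_split(load_examples(f.readlines()))
```

Cached analyses with empty results would otherwise become training examples with blank assistant turns, so filtering them out before training is a cheap safeguard.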

	## Extending the System

	### Add New Countries

	```python
	from models.database import SessionLocal, Country

	db = SessionLocal()
	country = Country(
	    name="Pakistan",
	    iso_code="PAK",
	    region="South Asia",
	    population=235000000,
	    gdp_usd=376000000000,
	    military_budget_usd=11000000000,
	)
	db.add(country)
	db.commit()
	```

	### Add Military Assets

	```python
	from models.database import MilitaryAsset

	# Reuses `db` and `country` from the previous example
	asset = MilitaryAsset(
	    country_id=country.id,
	    asset_type="Fighter Aircraft",
	    asset_name="JF-17 Thunder",
	    quantity=150,
	    operational_rate=0.75,
	    production_rate_yearly=25,
	    attrition_rate_yearly=0.05,
	)
	db.add(asset)
	db.commit()
	```

	### Add Knowledge Documents

	```python
	from services.data_ingestion import DataIngestionService

	service = DataIngestionService()
	service.add_knowledge_document(
	    text="Your geopolitical analysis or fact...",
	    metadata={
	        "type": "intelligence_assessment",
	        "country": "Iran",
	        "classification": "open_source",
	    },
	)
	```
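Long documents are usually split into overlapping chunks before embedding so that retrieval returns focused passages. Whether and how `add_knowledge_document` does this is not stated in this README; the sketch below shows one common word-based approach, with `chunk_text`, the chunk size, and the overlap all chosen for illustration.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into chunks of up to chunk_size words that share `overlap` words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
print(chunk_text(sample, chunk_size=4, overlap=1))
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of storing some words twice.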

	## Configuration

	Edit `backend/config.py`:

	```python
	# Embedding model (smaller = faster, larger = better)
	EMBEDDING_MODEL = "sentence-transformers/all-MiniLM-L6-v2"

	# RAG retrieval settings
	TOP_K_RESULTS = 5 # Number of relevant chunks
	SIMILARITY_THRESHOLD = 0.7 # Minimum relevance score

	# Claude settings
	DEFAULT_MODEL = "claude-sonnet-4-20250514"
	MAX_TOKENS = 4000
	TEMPERATURE = 0.3 # Lower = more analytical
	```
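To make the two retrieval settings concrete, the sketch below applies them to a list of scored hits. It assumes scores are similarities where higher is better (ChromaDB itself reports distances, so a real service would need a conversion), and `select_hits` is a hypothetical helper rather than part of this codebase.

```python
TOP_K_RESULTS = 5
SIMILARITY_THRESHOLD = 0.7


def select_hits(hits: list[tuple[str, float]],
                top_k: int = TOP_K_RESULTS,
                threshold: float = SIMILARITY_THRESHOLD) -> list[tuple[str, float]]:
    """Keep hits scoring at or above the threshold, best first, at most top_k."""
    relevant = [hit for hit in hits if hit[1] >= threshold]
    relevant.sort(key=lambda hit: hit[1], reverse=True)
    return relevant[:top_k]


# Made-up scores for illustration.
hits = [("chunk-a", 0.91), ("chunk-b", 0.42), ("chunk-c", 0.78), ("chunk-d", 0.69)]
print(select_hits(hits))
```

Raising the threshold trims marginal context from the prompt, while raising `top_k` admits more passages at the cost of a longer prompt.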

	## Performance Tips

	1. Adjust retrieval: Tune `TOP_K_RESULTS` and `SIMILARITY_THRESHOLD`
	2. Enable caching: Send `"use_cache": true` in the request body for repeated queries
	3. Batch document ingestion: Use bulk-add for multiple documents
	4. Index optimization: Add SQLite indexes for frequent queries
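Tip 4 can be checked directly in SQLite. The sketch below creates an index and inspects the query plan before and after. The `military_assets` table and `country_id` column mirror the ORM fields shown earlier, but the actual table and index names in `geopolitical.db` may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # point this at the real database file instead
conn.execute(
    "CREATE TABLE military_assets "
    "(id INTEGER PRIMARY KEY, country_id INTEGER, quantity INTEGER)"
)


def query_plan(connection: sqlite3.Connection, sql: str) -> str:
    """Return SQLite's plan description for a statement."""
    rows = connection.execute(f"EXPLAIN QUERY PLAN {sql}").fetchall()
    return " | ".join(row[-1] for row in rows)


sql = "SELECT quantity FROM military_assets WHERE country_id = 1"
before = query_plan(conn, sql)  # full table scan

conn.execute(
    "CREATE INDEX IF NOT EXISTS idx_assets_country ON military_assets (country_id)"
)
after = query_plan(conn, sql)   # lookup through the new index

print(before)
print(after)
```

If the plan still reports a scan after adding an index, the filter column in the query probably does not match the indexed column.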

	## Use Cases

	### Strategic Planning
	- War games scenario modeling
	- Resource sustainability analysis
	- Timeline projections

	### Intelligence Analysis
	- Capability gap assessments
	- Economic constraint modeling
	- Production capacity tracking

	### Academic Research
	- Geopolitical trend analysis
	- Historical pattern recognition
	- Comparative case studies

	### Policy Analysis
	- Sanction impact modeling
	- Alliance dynamics assessment
	- Economic leverage analysis

	## Roadmap

	- [ ] Real-time data ingestion from news sources
	- [ ] Multi-agent debate for competing analyses
	- [ ] Temporal reasoning for historical patterns
	- [ ] Export to PDF reports
	- [ ] WebSocket streaming for long analyses
	- [ ] Named Entity Recognition for auto-tagging
	- [ ] Graph database for relationship modeling

	## Contributing

	Areas for contribution:
	1. Data: Add domain-specific geopolitical datasets
	2. Models: Integrate specialized embedding models
	3. Analysis: Enhance quantitative modeling functions
	4. UI: Improve frontend visualization
	5. Documentation: Add tutorials and examples

	## License

	MIT License - See LICENSE file

	## Citation

	If you use this system in research:

	```bibtex
	@software{geopolitical_analysis_agent,
	  title={Geopolitical Analysis Agent: RAG-based Strategic Forecasting},
	  author={[Your Name]},
	  year={2025},
	  url={https://github.com/yourusername/geopolitical-agent}
	}
	```

	## Support

	- Documentation: `docs/`
	- API Reference: `docs/API.md`
	- Setup Guide: `docs/SETUP.md`
	- Issues: GitHub Issues

	## Acknowledgments

	Built with:
	- [FastAPI](https://fastapi.tiangolo.com/)
	- [ChromaDB](https://www.trychroma.com/)
	- [Anthropic Claude](https://www.anthropic.com/)
	- [Sentence Transformers](https://www.sbert.net/)

	---

	Ready to analyze the world? Start with `python app.py` 🚀