---
title: Optimized Data Harvester
emoji: 🚀
colorFrom: blue
colorTo: purple
sdk: streamlit
sdk_version: 1.28.0
app_file: app_simplified.py
pinned: false
license: mit
---

# 🚀 Optimized Data Harvester

Research-verified API endpoints with intelligent data extraction, compression, and optimized database storage.

## ✨ Features

### 🌍 **10 Data Sources**

- Research-verified API endpoints based on 2024 official documentation
- 🇸🇪 **Swedish Government**: Skolverket, SCB, Kolada, Riksbanken, Swecris, CSN
- 🌍 **International**: Eurostat, WHO, OECD, World Bank

### 🎯 **Modern Interface**

- **Minimalist Design** - Clean, gradient interface with glass morphism
- **Real-time Metrics** - Live status indicators and progress tracking
- **Interactive Cards** - Hover effects and modern UI components
- **Responsive Layout** - Works perfectly on all devices

### 🚀 **Optimized Features**

- **One-Click Collection** - Single button fetches from all 10 APIs automatically
- **Intelligent Data Extraction** - Custom parsers for each API response format
- **Smart Compression** - Automatic gzip compression for data >512 bytes (up to 80% space savings)
- **Enhanced Database** - SQLite with WAL mode, proper indexing, and performance optimization
- **Real-time Analytics** - Live performance metrics, compression ratios, and error tracking
- **Advanced Viewer** - Enhanced database explorer with summary statistics

### 🔧 **Technical Architecture**

- **Response-Tested APIs** - All endpoints verified by live testing in 2024
- **Smart Data Paths** - Configurable extraction using dot notation (e.g., "body._embedded.schoolUnits")
- **Performance Database** - WAL mode, 6 strategic indexes, compression support
- **Error Resilience** - Graceful degradation with detailed error logging
- **Format Support** - JSON-stat, SDMX-JSON, HAL+JSON, OData, PX-Web
- **Authentication Handling** - Bearer tokens, API keys, and public endpoints

## 📊 **API Details**

### Swedish Sources (Research-Verified 2024)

- **Skolverket**: Education data, compact school units with coordinates (HAL+JSON v3)
- **SCB**: Population statistics via POST requests (Rate limit: 10/10 sec)
- **Kolada**: Municipal KPIs and regional data (No auth required)
- **Riksbanken**: Latest exchange rate observations (SWEA v1 API with public key)
- **Swecris**: Research projects (Bearer token: VRSwecrisAPI2025-1)
- **CSN**: Student finance statistics (PX-Web API format)

### International Sources (Research-Verified 2024)

- **Eurostat**: EU demographic density statistics (JSON-stat 2.0 format)
- **WHO**: Global health dimensions via GHO OData API (No auth required)
- **OECD**: Economic data via SDMX-JSON format (stats.oecd.org)
- **World Bank**: Population indicators via API v2 (format=json)

## 🚀 **Simple Workflow**

### **One-Click Operation**

1. **Click Button** - Press "FETCH ALL DATA FROM ALL APIS"
2. **Watch Progress** - Real-time updates as each API is contacted
3. **View Results** - Automatic display of success metrics and data preview
4. **Explore Data** - Built-in database viewer and analytics charts

### **What Happens Automatically:**

- **API Requests** - All 10 APIs contacted with proper headers and authentication
- **Data Processing** - Responses parsed and meaningful data extracted
- **Storage** - Automatic saving to SQLite with compression and deduplication
- **Analytics** - Instant charts and statistics generated
- **Error Handling** - Failed requests logged with detailed error messages

## 🎨 **Modern Design**

- **Glass Morphism** - Translucent cards with backdrop blur
- **Gradient Backgrounds** - Beautiful blue-to-purple gradients
- **Status Indicators** - Color-coded API health monitoring
- **Smooth Animations** - CSS transitions and hover effects
- **Responsive Grid** - Adaptive layout for all screen sizes

## 📈 **Data Visualization**

- **Success Rate Pie Charts** - Visual fetch status overview
- **Records Bar Charts** - Compare data volume by API
- **Real-time Metrics** - Live updating counters and indicators
- **Interactive Tables** - Sortable, filterable data preview

The simplest and most efficient way to collect data from multiple international APIs with a single click.
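
The "Smart Data Paths" feature describes extraction with dot notation such as `body._embedded.schoolUnits`. The sketch below shows one way such a lookup could work. The function name `extract_path`, its `default` parameter, and the sample response are illustrative assumptions, not code taken from `app_simplified.py`.

```python
from typing import Any


def extract_path(payload: Any, path: str, default: Any = None) -> Any:
    """Follow a dot-separated path through nested dicts and lists.

    Hypothetical helper: returns `default` as soon as a segment is missing,
    so a changed API response degrades gracefully instead of raising.
    """
    current = payload
    for key in path.split("."):
        if isinstance(current, dict) and key in current:
            current = current[key]
        elif isinstance(current, list) and key.isdigit() and int(key) < len(current):
            # Numeric segments index into lists, e.g. "items.0.name".
            current = current[int(key)]
        else:
            return default
    return current


# Made-up response shaped like the path quoted in this README.
response = {"body": {"_embedded": {"schoolUnits": [{"name": "Example School"}]}}}

units = extract_path(response, "body._embedded.schoolUnits", default=[])
print(f"Extracted {len(units)} school unit(s)")
```

Returning a default rather than raising keeps one malformed response from stopping a run that covers several APIs, which fits the "Error Resilience" behavior listed above.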
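
The storage features mention gzip compression for data larger than 512 bytes and SQLite in WAL mode with indexes. A minimal sketch of how those two pieces might fit together follows. The table name `harvested_data`, its columns, the single index, and the helper names are assumptions made for illustration; the real schema (including its 6 indexes) is not shown in this README.

```python
import gzip
import json
import sqlite3

COMPRESSION_THRESHOLD = 512  # bytes, the threshold stated in this README


def encode_payload(records):
    """Serialize records to JSON bytes, gzip-compressing only large payloads."""
    raw = json.dumps(records).encode("utf-8")
    if len(raw) > COMPRESSION_THRESHOLD:
        return gzip.compress(raw), True
    return raw, False


def decode_payload(blob, compressed):
    """Reverse encode_payload using the stored compression flag."""
    raw = gzip.decompress(blob) if compressed else blob
    return json.loads(raw.decode("utf-8"))


def open_database(path):
    """Open SQLite, request WAL journaling, and create a hypothetical table."""
    conn = sqlite3.connect(path)
    # WAL applies to file-backed databases; an in-memory database ignores it.
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        """CREATE TABLE IF NOT EXISTS harvested_data (
               id INTEGER PRIMARY KEY,
               api_name TEXT NOT NULL,
               fetched_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
               compressed INTEGER NOT NULL,
               payload BLOB NOT NULL
           )"""
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_harvested_api ON harvested_data (api_name)"
    )
    return conn


def store(conn, api_name, records):
    """Encode the records and insert one row for the given API."""
    blob, compressed = encode_payload(records)
    conn.execute(
        "INSERT INTO harvested_data (api_name, compressed, payload) VALUES (?, ?, ?)",
        (api_name, int(compressed), blob),
    )
    conn.commit()


def load_latest(conn, api_name):
    """Return the most recently stored records for an API, or None."""
    row = conn.execute(
        "SELECT compressed, payload FROM harvested_data "
        "WHERE api_name = ? ORDER BY id DESC LIMIT 1",
        (api_name,),
    ).fetchone()
    return None if row is None else decode_payload(row[1], bool(row[0]))
```

Storing the compression flag next to the payload lets small responses skip gzip overhead while still being read back through the same code path.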