---
title: Madrid Content Analyzer
emoji: 🏛️
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: mit
---
# 🏛️ Madrid Content Analyzer
Automated analysis of language clarity in Madrid City Council communications.
## 📋 What It Does
This application:
- 📥 **Fetches content** from Madrid City Council RSS feeds and Open Data Portal
- 🔍 **Analyzes language clarity** using the Aclarador system
- 📊 **Tracks trends** over time in a DuckDB database
- 📈 **Visualizes results** in an interactive dashboard
## 🎯 Features
### 📊 Dashboard
- Real-time statistics on analyzed content
- Clarity score distribution charts
- Timeline of content and scores
- Category breakdown
### 📝 Content Browser
- Search and filter content by date, category, and clarity score
- View detailed analysis for each item
- Identify areas for improvement
### 📈 Analytics
- Find low-clarity items that need improvement
- Track trends over time
- Export data for further analysis
### ⚙️ Settings
- Manual content fetch trigger
- Database statistics
- Fetch logs and history
## 🔄 Automatic Updates
Content is fetched and analyzed automatically every 6 hours. The app runs continuously on Hugging Face Spaces with persistent data storage.
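
As a rough illustration, the 6-hour cycle could be wired up with APScheduler's `BackgroundScheduler` along these lines. This is a minimal sketch that assumes the APScheduler 3.x API; `fetch_and_analyze` and `start_scheduler` are hypothetical names, not taken from `app.py`, and the job body is only a placeholder.

```python
FETCH_INTERVAL_HOURS = 6  # matches the 6-hour cycle described above


def fetch_and_analyze() -> None:
    """Placeholder job body (hypothetical): fetch new items, score them, store results."""
    print("Fetching and analyzing new content...")


def start_scheduler():
    """Start a background scheduler that runs the job every FETCH_INTERVAL_HOURS."""
    # Imported lazily so this module can be imported without APScheduler installed.
    from apscheduler.schedulers.background import BackgroundScheduler

    scheduler = BackgroundScheduler()
    scheduler.add_job(
        fetch_and_analyze,
        trigger="interval",
        hours=FETCH_INTERVAL_HOURS,
        id="fetch_and_analyze",  # stable id so a restart replaces the job instead of duplicating it
        replace_existing=True,
    )
    scheduler.start()
    return scheduler
```

Calling `start_scheduler()` once at app startup would be enough in this sketch; the background thread then triggers the job without blocking the Gradio UI.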
## 🛠️ Technology Stack
- **Frontend**: Gradio 4.44
- **Database**: DuckDB (analytics-optimized, 16GB storage)
- **Scheduler**: APScheduler (background tasks)
- **Visualization**: Plotly
- **Analysis**: Aclarador language clarity analyzer
## 💾 Data Storage
Data is stored in `/data/madrid.duckdb`, which persists across Space restarts. The database can hold millions of content items within the 16GB Space storage limit.
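
For orientation, opening the database file with the `duckdb` Python package might look like the sketch below. The `content` table and its columns are assumptions made for illustration; the actual schema used by the app may differ.

```python
DB_PATH = "/data/madrid.duckdb"  # path stated above

# Hypothetical schema: the real table and column names may differ.
SCHEMA_SQL = """
CREATE TABLE IF NOT EXISTS content (
    url           VARCHAR PRIMARY KEY,
    title         VARCHAR,
    category      VARCHAR,
    published_at  TIMESTAMP,
    clarity_score DOUBLE
)
"""


def open_database(path: str = DB_PATH):
    """Open (or create) the DuckDB file and make sure the table exists."""
    import duckdb  # lazy import: third-party dependency

    con = duckdb.connect(path)
    con.execute(SCHEMA_SQL)
    return con
```

For local experiments, `open_database(":memory:")` gives an in-memory database with the same table, so nothing is written to `/data`.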
## 🚀 Usage
Simply visit the Space URL and explore the tabs:
1. **Dashboard** - See overall statistics and trends
2. **Browse Content** - Search and filter analyzed content
3. **Analytics** - Find improvement opportunities
4. **Settings** - Trigger manual updates, view logs
## 📈 Data Sources
- **RSS Feed**: https://diario.madrid.es/feed
- **Open Data Portal**: https://datos.madrid.es/portal/site/egob
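
To give a feel for the fetch step, here is a minimal sketch that reads the RSS feed using only the Python standard library. It assumes the feed is plain RSS 2.0 with `<item>` elements carrying `<title>`, `<link>`, and `<pubDate>`; the function names are hypothetical, and the app itself may use a different parser.

```python
import urllib.request
import xml.etree.ElementTree as ET
from typing import Dict, List, Union

FEED_URL = "https://diario.madrid.es/feed"  # RSS feed listed above


def parse_rss(xml_data: Union[str, bytes]) -> List[Dict[str, str]]:
    """Extract title, link, and publication date from each <item> in an RSS 2.0 document."""
    root = ET.fromstring(xml_data)
    items = []
    for item in root.iter("item"):
        items.append(
            {
                # findtext returns None for a missing child, so fall back to "".
                "title": (item.findtext("title") or "").strip(),
                "link": (item.findtext("link") or "").strip(),
                "published": (item.findtext("pubDate") or "").strip(),
            }
        )
    return items


def fetch_feed(url: str = FEED_URL, timeout: float = 30.0) -> List[Dict[str, str]]:
    """Download the feed and parse it (requires network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_rss(response.read())
```

Keeping `parse_rss` separate from the download makes it easy to test the parsing against a saved copy of the feed.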
## 🔒 Privacy
This Space analyzes publicly available content from Madrid City Council. No personal data is collected or stored.
## 📄 License
MIT License - Feel free to fork and adapt!
## 🀝 Contributing
Contributions are welcome! This project helps improve government communication clarity.
---
**Built with** 🤗 Hugging Face Spaces | **Free forever** 💰