---
title: Madrid Content Analyzer
emoji: 🏛️
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: mit
---

# 🏛️ Madrid Content Analyzer

Automated analysis of language clarity in Madrid City Council communications.

## 📋 What It Does

This application:
- 📥 **Fetches content** from Madrid City Council RSS feeds and Open Data Portal
- 🔍 **Analyzes language clarity** using the Aclarador system
- 📊 **Tracks trends** over time in a DuckDB database
- 📈 **Visualizes results** in an interactive dashboard

## 🎯 Features

### 📊 Dashboard
- Real-time statistics on analyzed content
- Clarity score distribution charts
- Timeline of content and scores
- Category breakdown

### 📝 Content Browser
- Search and filter content by date, category, and clarity score
- View detailed analysis for each item
- Identify areas for improvement

### 📈 Analytics
- Find low-clarity items that need improvement
- Track trends over time
- Export data for further analysis

### ⚙️ Settings
- Manual content fetch trigger
- Database statistics
- Fetch logs and history

## 🔄 Automatic Updates

Content is fetched and analyzed automatically every 6 hours. The app runs continuously on Hugging Face Spaces with persistent data storage.
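
As a rough illustration of the 6-hour cycle, the sketch below registers an interval job with APScheduler's `BackgroundScheduler`. The names `fetch_and_analyze`, `start_scheduler`, and the job id are hypothetical and are not taken from `app.py`; the actual wiring in this Space may differ.

```python
FETCH_INTERVAL_HOURS = 6  # interval stated in this README


def fetch_and_analyze() -> None:
    """Hypothetical placeholder for the fetch-and-analyze pipeline."""
    print("Fetching and analyzing Madrid City Council content...")


def start_scheduler():
    """Start a background scheduler that runs the job every 6 hours."""
    # Imported here so this module still loads where APScheduler is absent.
    from apscheduler.schedulers.background import BackgroundScheduler

    scheduler = BackgroundScheduler()
    scheduler.add_job(
        fetch_and_analyze,
        trigger="interval",
        hours=FETCH_INTERVAL_HOURS,
        id="fetch_and_analyze",  # hypothetical job id
        replace_existing=True,
    )
    scheduler.start()
    return scheduler
```

Calling `start_scheduler()` once at app startup would keep the job running in a background thread alongside the Gradio interface; call `shutdown()` on the returned scheduler to stop it.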

## 🛠️ Technology Stack

- **Frontend**: Gradio 4.44
- **Database**: DuckDB (analytics-optimized, 16 GB storage)
- **Scheduler**: APScheduler (background tasks)
- **Visualization**: Plotly
- **Analysis**: Aclarador language clarity analyzer

## 💾 Data Storage

Data is stored in `/data/madrid.duckdb`, which persists across Space restarts. The database can hold millions of content items within the 16 GB Space storage limit.
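
To sanity-check the "millions of items" figure, here is a back-of-the-envelope estimate. The average item sizes are assumptions chosen for illustration, not measurements from this Space, and the calculation ignores DuckDB compression and file overhead.

```python
# The 16 GB limit from this README, read here as 16 GiB.
STORAGE_LIMIT_BYTES = 16 * 1024**3


def max_items(avg_item_bytes: int, limit_bytes: int = STORAGE_LIMIT_BYTES) -> int:
    """Return how many items of the given average size fit within the limit."""
    if avg_item_bytes <= 0:
        raise ValueError("avg_item_bytes must be positive")
    return limit_bytes // avg_item_bytes


# Assumed average sizes per stored item (text plus analysis results).
for size_kib in (1, 4, 16):
    count = max_items(size_kib * 1024)
    print(f"{size_kib} KiB per item -> about {count:,} items")
```

Under these assumptions the limit allows roughly 1 to 17 million items, so the claim holds as long as stored items average about 16 KiB or less.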

## 🚀 Usage

Simply visit the Space URL and explore the tabs:

1. **Dashboard** - See overall statistics and trends
2. **Browse Content** - Search and filter analyzed content
3. **Analytics** - Find improvement opportunities
4. **Settings** - Trigger manual updates, view logs

## 📈 Data Sources

- **RSS Feed**: https://diario.madrid.es/feed
- **Open Data Portal**: https://datos.madrid.es/portal/site/egob
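
A minimal sketch of reading the RSS feed with only the Python standard library is shown below. It assumes the feed follows the usual RSS 2.0 layout (`item` elements with `title`, `link`, and `pubDate`); the function names and the sample document are hypothetical, and the app itself may use a different parser.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET
from urllib.request import urlopen

FEED_URL = "https://diario.madrid.es/feed"


def parse_rss_items(xml_data: str | bytes) -> list[dict[str, str]]:
    """Extract title, link, and publication date from each RSS <item>."""
    root = ET.fromstring(xml_data)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "published": (item.findtext("pubDate") or "").strip(),
        })
    return items


def fetch_feed(url: str = FEED_URL, timeout: float = 10.0) -> list[dict[str, str]]:
    """Download the feed and parse it (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        return parse_rss_items(response.read())


# Made-up sample document, used so the parser can be tried offline.
SAMPLE_RSS = """<rss version="2.0"><channel>
  <title>Sample feed</title>
  <item>
    <title>Sample item A</title>
    <link>https://example.com/a</link>
    <pubDate>Mon, 01 Jan 2024 10:00:00 +0000</pubDate>
  </item>
  <item>
    <title>Sample item B</title>
    <link>https://example.com/b</link>
  </item>
</channel></rss>"""

for entry in parse_rss_items(SAMPLE_RSS):
    print(entry["title"], "|", entry["link"])
```

Missing fields come back as empty strings rather than raising, which keeps a single malformed entry from interrupting a scheduled fetch.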

## 🔒 Privacy

This Space analyzes publicly available content from Madrid City Council. No personal data is collected or stored.

## 📄 License

MIT License - Feel free to fork and adapt!

## 🤝 Contributing

Contributions are welcome! This project helps improve government communication clarity.

---

**Built with** 🤗 Hugging Face Spaces | **Free forever** 💰