Alexander Oh committed on
Commit 097fe34 · unverified · 2 Parent(s): d4f4ff7 eabeab3

Merge pull request #1 from alexoh2bd/initial-structure

Files changed (10)
  1. .env.example +1 -0
  2. .gitattributes +35 -0
  3. .gitignore +23 -0
  4. Dockerfile +20 -0
  5. README.md +187 -1
  6. config.json +32 -0
  7. requirements.txt +7 -0
  8. src/api_handler.py +272 -0
  9. src/cli_demo.py +181 -0
  10. src/streamlit_app.py +271 -0
.env.example ADDED
@@ -0,0 +1 @@
+ NEWSAPI_KEY=your_newsapi_key_here
.gitattributes ADDED
@@ -0,0 +1,35 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tar filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
.gitignore ADDED
@@ -0,0 +1,23 @@
+ # Environment variables
+ .env
+
+ # Python cache
+ __pycache__/
+ *.pyc
+ *.pyo
+ *.pyd
+
+ # Virtual environment
+ .venv/
+ venv/
+
+ # IDE
+ .vscode/
+ .idea/
+
+ # OS
+ .DS_Store
+ Thumbs.db
+
+ # Streamlit
+ .streamlit/
Dockerfile ADDED
@@ -0,0 +1,20 @@
+ FROM python:3.13.5-slim
+
+ WORKDIR /app
+
+ RUN apt-get update && apt-get install -y \
+     build-essential \
+     curl \
+     git \
+     && rm -rf /var/lib/apt/lists/*
+
+ COPY requirements.txt ./
+ COPY src/ ./src/
+
+ RUN pip3 install -r requirements.txt
+
+ EXPOSE 8501
+
+ HEALTHCHECK CMD curl --fail http://localhost:8501/_stcore/health
+
+ ENTRYPOINT ["streamlit", "run", "src/streamlit_app.py", "--server.port=8501", "--server.address=0.0.0.0"]
README.md CHANGED
@@ -1 +1,187 @@
- # BootcampFinalProject
+ ---
+ title: AI News Sentiment Analyzer
+ emoji: 🤖
+ colorFrom: blue
+ colorTo: purple
+ sdk: streamlit
+ sdk_version: "1.28.0"
+ app_file: src/streamlit_app.py
+ pinned: false
+ ---
+
+ # 🤖 AI News Sentiment Analyzer
+
+ An interactive web application that fetches the latest AI-related news and analyzes the sentiment of headlines and articles. Built with Python and Streamlit, powered by NewsAPI.
+
+ ## 🚀 Live Demo
+
+ 🌐 **Try it live**: [https://huggingface.co/spaces/jonasneves/BootcampFinalProject](https://huggingface.co/spaces/jonasneves/BootcampFinalProject)
+
+ ## 🛠️ Installation
+
+ ### Prerequisites
+ - Python 3.9+
+ - NewsAPI key (get a free key at [newsapi.org](https://newsapi.org))
+
+ ### Setup Instructions
+
+ 1. **Clone the repository**
+ ```bash
+ git clone https://github.com/alexoh2bd/BootcampFinalProject
+ cd BootcampFinalProject
+ ```
+
+ 2. **Create virtual environment**
+ ```bash
+ # macOS/Linux
+ python3 -m venv .venv
+ source .venv/bin/activate
+ ```
+
+ 3. **Install dependencies**
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ 4. **Set up environment variables**
+
+ Create a `.env` file in the project root:
+ ```bash
+ NEWSAPI_KEY=your_newsapi_key_here
+ ```
+
+ ## 🎯 Usage
+
+ ### Web Application
+
+ Run the Streamlit app:
+ ```bash
+ streamlit run src/streamlit_app.py
+ ```
+
+ Then open your browser to `http://localhost:8501`.
+
+ ### Command Line Interface
+
+ For quick sentiment analysis:
+
+ ```bash
+ # Basic usage
+ python src/cli_demo.py
+
+ # Custom search query
+ python src/cli_demo.py --query "ChatGPT" --days 3
+
+ # Filter to specific sources
+ python src/cli_demo.py --sources "techcrunch,wired" --max-articles 5
+
+ # Show only positive articles
+ python src/cli_demo.py --positive-only
+
+ # Show only the sentiment analysis summary
+ python src/cli_demo.py --sentiment-only
+ ```
+
+ #### CLI Options
+ - `--query, -q`: Search query (default: "artificial intelligence")
+ - `--days, -d`: Days to look back (default: 7)
+ - `--sources, -s`: Comma-separated news sources
+ - `--max-articles, -m`: Maximum articles to display (default: 10)
+ - `--positive-only`: Show only positive sentiment articles
+ - `--negative-only`: Show only negative sentiment articles
+ - `--sentiment-only`: Show only sentiment analysis summary
+
+ ## 🔧 Technical Architecture
+
+ ```mermaid
+ flowchart TB
+     subgraph Frontend["🎨 Frontend Layer"]
+         A["🌐 Streamlit UI"]
+         B["💻 CLI Interface"]
+     end
+
+     subgraph Application["⚙️ Application Layer"]
+         C["api_handler.py<br/>🔧 Core Logic"]
+         D["streamlit_app.py<br/>📊 Web Framework"]
+         E["cli_demo.py<br/>⌨️ Command Line"]
+     end
+
+     subgraph Processing["🧠 Data Processing"]
+         F["TextBlob<br/>Sentiment Engine"]
+         G["Plotly<br/>Visualizations"]
+         H["Pandas<br/>Data Processing"]
+     end
+
+     subgraph External["🌐 External Services"]
+         I["📡 NewsAPI<br/>TechCrunch, Wired, etc."]
+         J["🔐 Environment<br/>API Keys"]
+     end
+
+     A --> D
+     B --> E
+     D --> C
+     E --> C
+     C --> F
+     C --> H
+     D --> G
+     C --> I
+     C --> J
+
+     classDef frontend fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
+     classDef application fill:#fff3e0,stroke:#f57c00,stroke-width:2px
+     classDef processing fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
+     classDef external fill:#fce4ec,stroke:#c2185b,stroke-width:2px
+
+     class A,B frontend
+     class C,D,E application
+     class F,G,H processing
+     class I,J external
+ ```
+
+ ## 📈 Example Output
+
+ ### CLI Example
+ ```bash
+ 🤖 AI News Sentiment Analyzer
+ ==================================================
+
+ 🔍 Searching for: "artificial intelligence"
+ 📅 Looking back: 7 days
+
+ 📰 Found 43 articles
+
+ Sentiment Distribution:
+   😊 Positive: 18 articles (41.9%)
+   😐 Neutral: 15 articles (34.9%)
+   😞 Negative: 10 articles (23.3%)
+
+ 📄 Top 10 Articles:
+ --------------------------------------------------------------------------------
+  1. 😊 [TechCrunch] 2024-01-20 14:30
+     AI startup raises $50M for breakthrough in healthcare diagnosis
+     Sentiment: Positive (Score: 0.45)
+     📝 Revolutionary AI technology promises to transform medical diagnosis...
+     🔗 https://techcrunch.com/...
+
+  2. 😞 [Reuters] 2024-01-20 12:15
+     Concerns grow over AI job displacement in manufacturing
+     Sentiment: Negative (Score: -0.32)
+     📝 Labor unions express worry about automation replacing workers...
+     🔗 https://reuters.com/...
+ ```
+
+ ## 🤝 Contributing
+
+ This project was built as part of the Duke AIPI 503 Bootcamp.
+
+ ### Development Setup
+
+ 1. Fork the repository
+ 2. Create a feature branch: `git checkout -b feature/some-feature`
+ 3. Make your changes and commit: `git commit -m 'Add some feature'`
+ 4. Push to the branch: `git push origin feature/some-feature`
+ 5. Open a Pull Request
+
+ ## 📝 License
+
+ This project is licensed under the MIT License - see the LICENSE file for details.
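
The CLI example above buckets each article by a TextBlob polarity score using ±0.1 cutoffs (the thresholds are this project's convention, not TextBlob's). A minimal sketch of the bucketing, taking a plain float so it runs without TextBlob installed:

```python
def label_polarity(polarity: float, threshold: float = 0.1) -> str:
    """Map a polarity score in [-1.0, 1.0] to the label used in the app."""
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"

# Scores from the example output above
print(label_polarity(0.45))   # positive
print(label_polarity(-0.32))  # negative
```

Note that a score of exactly ±0.1 falls in the neutral band, matching the strict inequalities in `analyze_sentiment`.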
config.json ADDED
@@ -0,0 +1,32 @@
+ {
+   "search_queries": [
+     "artificial intelligence",
+     "machine learning",
+     "ChatGPT",
+     "OpenAI",
+     "deep learning",
+     "neural networks",
+     "AI ethics",
+     "robotics",
+     "computer vision",
+     "natural language processing"
+   ],
+   "news_sources": {
+     "tech_media": "techcrunch,wired,ars-technica,the-verge,engadget",
+     "general_news": "reuters,associated-press,bbc-news",
+     "us_news": "cnn,fox-news,abc-news",
+     "financial_news": "financial-times,wall-street-journal,bloomberg"
+   },
+   "source_categories": [
+     "All Sources",
+     "Tech Media",
+     "General News",
+     "US News",
+     "Financial News"
+   ],
+   "test_texts": [
+     "AI breakthrough promises to revolutionize healthcare",
+     "Concerns grow over AI job displacement",
+     "New machine learning model shows mixed results"
+   ]
+ }
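
`streamlit_app.py` maps the `source_categories` shown in the sidebar onto `news_sources` entries with an if/elif chain; the same lookup can be done generically by normalizing the category name. A sketch, assuming the key convention in the config above holds (`"Tech Media"` ↔ `"tech_media"`):

```python
def sources_for_category(config: dict, category: str):
    """Return the comma-separated sources for a category, or None for 'All Sources'."""
    key = category.lower().replace(" ", "_")   # "Tech Media" -> "tech_media"
    return config["news_sources"].get(key)

config = {
    "news_sources": {
        "tech_media": "techcrunch,wired,ars-technica,the-verge,engadget",
        "general_news": "reuters,associated-press,bbc-news",
    }
}
print(sources_for_category(config, "Tech Media"))
print(sources_for_category(config, "All Sources"))  # None -> NewsAPI searches all sources
```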
requirements.txt ADDED
@@ -0,0 +1,7 @@
+ streamlit>=1.28.0
+ pandas>=2.0.0
+ requests>=2.31.0
+ python-dotenv>=1.0.0
+ textblob>=0.17.1
+ plotly>=5.15.0
+ numpy>=1.24.0
src/api_handler.py ADDED
@@ -0,0 +1,272 @@
+ """
+ AI News API Handler
+ Fetches AI-related news from NewsAPI and performs sentiment analysis
+ """
+ import requests
+ import pandas as pd
+ from datetime import datetime, timedelta
+ import os
+ import json
+ from dotenv import load_dotenv
+ from textblob import TextBlob
+ from typing import List, Dict, Optional
+
+ # Load environment variables
+ load_dotenv()
+
+ class AINewsAnalyzer:
+     def __init__(self):
+         self.api_key = os.getenv('NEWSAPI_KEY')
+         self.base_url = "https://newsapi.org/v2/everything"
+
+         if not self.api_key:
+             raise ValueError("NewsAPI key not found. Please set NEWSAPI_KEY in your .env file")
+
+     def fetch_ai_news(self,
+                       query: str = "artificial intelligence",
+                       days: int = 7,
+                       language: str = "en",
+                       sources: Optional[str] = None,
+                       page_size: int = 100) -> List[Dict]:
+         """
+         Fetch AI-related news from NewsAPI
+
+         Args:
+             query: Search query for news articles
+             days: Number of days to look back
+             language: Language code (default: "en")
+             sources: Comma-separated string of news sources
+             page_size: Number of articles to fetch (max 100)
+
+         Returns:
+             List of news articles with metadata
+         """
+         # Calculate date range
+         to_date = datetime.now()
+         from_date = to_date - timedelta(days=days)
+
+         # Prepare API parameters
+         params = {
+             'q': query,
+             'from': from_date.strftime('%Y-%m-%d'),
+             'to': to_date.strftime('%Y-%m-%d'),
+             'language': language,
+             'sortBy': 'publishedAt',
+             'pageSize': page_size,
+             'apiKey': self.api_key
+         }
+
+         # Add sources if specified
+         if sources:
+             params['sources'] = sources
+
+         try:
+             # Make API request
+             response = requests.get(self.base_url, params=params)
+             response.raise_for_status()
+
+             data = response.json()
+
+             if data['status'] == 'ok':
+                 return data['articles']
+             else:
+                 print(f"API Error: {data.get('message', 'Unknown error')}")
+                 return []
+
+         except requests.exceptions.RequestException as e:
+             print(f"Request failed: {e}")
+             return []
+
+     def analyze_sentiment(self, text: str) -> Dict:
+         """
+         Analyze sentiment of given text using TextBlob
+
+         Args:
+             text: Text to analyze
+
+         Returns:
+             Dictionary with sentiment metrics
+         """
+         if not text:
+             return {
+                 'polarity': 0.0,
+                 'subjectivity': 0.0,
+                 'label': 'neutral',
+                 'confidence': 0.0
+             }
+
+         blob = TextBlob(text)
+         polarity = blob.sentiment.polarity
+         subjectivity = blob.sentiment.subjectivity
+
+         # Determine sentiment label
+         if polarity > 0.1:
+             label = 'positive'
+         elif polarity < -0.1:
+             label = 'negative'
+         else:
+             label = 'neutral'
+
+         # Calculate confidence (distance from neutral)
+         confidence = abs(polarity)
+
+         return {
+             'polarity': polarity,
+             'subjectivity': subjectivity,
+             'label': label,
+             'confidence': confidence
+         }
+
+     def process_news_articles(self, articles: List[Dict]) -> pd.DataFrame:
+         """
+         Process news articles and add sentiment analysis
+
+         Args:
+             articles: List of news articles from API
+
+         Returns:
+             DataFrame with processed articles and sentiment data
+         """
+         processed_articles = []
+
+         for article in articles:
+             # Skip articles with missing essential data
+             if not article.get('title') or not article.get('publishedAt'):
+                 continue
+
+             # Analyze sentiment of title and description
+             title_sentiment = self.analyze_sentiment(article['title'])
+             description_sentiment = self.analyze_sentiment(article.get('description', ''))
+
+             # Combine title and description sentiment (weighted toward title)
+             combined_polarity = (title_sentiment['polarity'] * 0.7 +
+                                  description_sentiment['polarity'] * 0.3)
+             combined_subjectivity = (title_sentiment['subjectivity'] * 0.7 +
+                                      description_sentiment['subjectivity'] * 0.3)
+
+             # Determine overall sentiment
+             if combined_polarity > 0.1:
+                 overall_sentiment = 'positive'
+             elif combined_polarity < -0.1:
+                 overall_sentiment = 'negative'
+             else:
+                 overall_sentiment = 'neutral'
+
+             processed_article = {
+                 'title': article['title'],
+                 'description': article.get('description', ''),
+                 'url': article['url'],
+                 'source': article['source']['name'],
+                 'published_at': article['publishedAt'],
+                 'author': article.get('author', 'Unknown'),
+                 'sentiment_label': overall_sentiment,
+                 'sentiment_polarity': combined_polarity,
+                 'sentiment_subjectivity': combined_subjectivity,
+                 'title_sentiment': title_sentiment['label'],
+                 'title_polarity': title_sentiment['polarity'],
+                 'description_sentiment': description_sentiment['label'],
+                 'description_polarity': description_sentiment['polarity']
+             }
+
+             processed_articles.append(processed_article)
+
+         # Convert to DataFrame
+         df = pd.DataFrame(processed_articles)
+
+         # Convert published_at to datetime
+         if not df.empty:
+             df['published_at'] = pd.to_datetime(df['published_at'])
+             df = df.sort_values('published_at', ascending=False)
+
+         return df
+
+     def get_ai_news_with_sentiment(self,
+                                    query: str = "artificial intelligence",
+                                    days: int = 7,
+                                    sources: Optional[str] = None) -> pd.DataFrame:
+         """
+         Complete pipeline: fetch news and analyze sentiment
+
+         Args:
+             query: Search query for news articles
+             days: Number of days to look back
+             sources: Comma-separated string of news sources
+
+         Returns:
+             DataFrame with news articles and sentiment analysis
+         """
+         print(f"Fetching {query} news from the last {days} days...")
+
+         # Fetch articles
+         articles = self.fetch_ai_news(query=query, days=days, sources=sources)
+
+         if not articles:
+             print("No articles found.")
+             return pd.DataFrame()
+
+         print(f"Found {len(articles)} articles. Analyzing sentiment...")
+
+         # Process and analyze
+         df = self.process_news_articles(articles)
+
+         print(f"Processed {len(df)} articles with sentiment analysis.")
+         return df
+
+ def fetch_ai_news(query="artificial intelligence", days=7, sources=None):
+     """Standalone function to fetch AI news"""
+     analyzer = AINewsAnalyzer()
+     return analyzer.fetch_ai_news(query, days, sources=sources)
+
+ def analyze_sentiment(text):
+     """Standalone function to analyze sentiment"""
+     analyzer = AINewsAnalyzer()
+     return analyzer.analyze_sentiment(text)
+
+ def get_ai_news_with_sentiment(query="artificial intelligence", days=7, sources=None):
+     """Standalone function for complete pipeline"""
+     analyzer = AINewsAnalyzer()
+     return analyzer.get_ai_news_with_sentiment(query, days, sources)
+
+ def load_config():
+     """Load configuration from config.json"""
+     with open('config.json', 'r') as f:
+         return json.load(f)
+
+ if __name__ == "__main__":
+     # Test the API when run directly
+     analyzer = AINewsAnalyzer()
+     config = load_config()
+
+     print("Testing AI News Sentiment Analyzer...")
+     print("=" * 50)
+
+     # Test sentiment analysis
+     test_texts = config["test_texts"]
+
+     print("\nSentiment Analysis Examples:")
+     for text in test_texts:
+         sentiment = analyzer.analyze_sentiment(text)
+         print(f"Text: {text}")
+         print(f"Sentiment: {sentiment['label']} (polarity: {sentiment['polarity']:.2f})")
+         print()
+
+     # Test news fetching
+     print("Fetching recent AI news...")
+     df = analyzer.get_ai_news_with_sentiment(days=3)
+
+     if not df.empty:
+         print(f"\nFound {len(df)} articles")
+         print("\nSentiment Distribution:")
+         print(df['sentiment_label'].value_counts())
+
+         print("\nTop 3 Most Positive Headlines:")
+         positive_articles = df[df['sentiment_label'] == 'positive'].nlargest(3, 'sentiment_polarity')
+         for _, article in positive_articles.iterrows():
+             print(f"📈 {article['title']} (Score: {article['sentiment_polarity']:.2f})")
+
+         print("\nTop 3 Most Negative Headlines:")
+         negative_articles = df[df['sentiment_label'] == 'negative'].nsmallest(3, 'sentiment_polarity')
+         for _, article in negative_articles.iterrows():
+             print(f"📉 {article['title']} (Score: {article['sentiment_polarity']:.2f})")
+     else:
+         print("No articles found. Check your API key and internet connection.")
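
`process_news_articles` blends the title and description scores with a fixed 70/30 weighting before re-applying the ±0.1 thresholds. The arithmetic in isolation, as a standalone sketch:

```python
def combine_polarity(title_pol: float, desc_pol: float,
                     title_weight: float = 0.7) -> float:
    """Weighted average favoring the headline, mirroring process_news_articles."""
    return title_pol * title_weight + desc_pol * (1 - title_weight)

# A positive headline with a mildly negative description stays positive overall
print(round(combine_polarity(0.5, -0.1), 2))  # 0.32
```

The weighting means a strongly worded headline dominates the combined score even when the description is flat or mildly contrary.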
src/cli_demo.py ADDED
@@ -0,0 +1,181 @@
+ #!/usr/bin/env python3
+ """
+ CLI Demo for AI News Sentiment Analyzer
+ Demonstrates the functionality via command line interface
+ """
+
+ import argparse
+ import sys
+ from datetime import datetime
+ from api_handler import AINewsAnalyzer
+
+ def print_header():
+     """Print a nice header for the CLI"""
+     print("🤖 AI News Sentiment Analyzer")
+     print("=" * 50)
+     print()
+
+ def print_sentiment_emoji(sentiment):
+     """Return emoji based on sentiment"""
+     emoji_map = {
+         'positive': '😊',
+         'negative': '😞',
+         'neutral': '😐'
+     }
+     return emoji_map.get(sentiment, '🤷')
+
+ def display_articles(df, max_articles=10):
+     """Display articles in a formatted way"""
+     if df.empty:
+         print("❌ No articles found.")
+         return
+
+     print(f"📰 Found {len(df)} articles")
+     print("\nSentiment Distribution:")
+     sentiment_counts = df['sentiment_label'].value_counts()
+     for sentiment, count in sentiment_counts.items():
+         emoji = print_sentiment_emoji(sentiment)
+         percentage = (count / len(df)) * 100
+         print(f"  {emoji} {sentiment.title()}: {count} articles ({percentage:.1f}%)")
+
+     print(f"\n📄 Top {min(max_articles, len(df))} Articles:")
+     print("-" * 80)
+
+     for idx, (_, article) in enumerate(df.head(max_articles).iterrows(), 1):
+         sentiment_emoji = print_sentiment_emoji(article['sentiment_label'])
+         score = article['sentiment_polarity']
+         published = article['published_at'].strftime('%Y-%m-%d %H:%M')
+
+         print(f"{idx:2}. {sentiment_emoji} [{article['source']}] {published}")
+         print(f"    {article['title']}")
+         print(f"    Sentiment: {article['sentiment_label'].title()} (Score: {score:.2f})")
+         if article['description'] and len(article['description']) > 100:
+             description = article['description'][:100] + "..."
+         else:
+             description = article['description'] or "No description available"
+         print(f"    📝 {description}")
+         print(f"    🔗 {article['url']}")
+         print()
+
+ def display_sentiment_analysis(df):
+     """Display detailed sentiment analysis"""
+     if df.empty:
+         return
+
+     print("\n📊 Sentiment Analysis Summary:")
+     print("-" * 40)
+
+     # Overall statistics
+     avg_polarity = df['sentiment_polarity'].mean()
+     avg_subjectivity = df['sentiment_subjectivity'].mean()
+
+     print(f"Average Polarity: {avg_polarity:.3f} (Range: -1.0 to +1.0)")
+     print(f"Average Subjectivity: {avg_subjectivity:.3f} (Range: 0.0 to 1.0)")
+
+     if avg_polarity > 0.1:
+         overall_mood = "📈 Generally Positive"
+     elif avg_polarity < -0.1:
+         overall_mood = "📉 Generally Negative"
+     else:
+         overall_mood = "➡️ Generally Neutral"
+
+     print(f"Overall Mood: {overall_mood}")
+
+     # Most positive and negative articles
+     if len(df[df['sentiment_label'] == 'positive']) > 0:
+         most_positive = df.loc[df['sentiment_polarity'].idxmax()]
+         print(f"\n😊 Most Positive: \"{most_positive['title']}\" ({most_positive['sentiment_polarity']:.2f})")
+
+     if len(df[df['sentiment_label'] == 'negative']) > 0:
+         most_negative = df.loc[df['sentiment_polarity'].idxmin()]
+         print(f"😞 Most Negative: \"{most_negative['title']}\" ({most_negative['sentiment_polarity']:.2f})")
+
+ def display_sources(df):
+     """Display source breakdown"""
+     if df.empty:
+         return
+
+     print("\n📺 News Sources:")
+     print("-" * 30)
+     source_counts = df['source'].value_counts()
+     for source, count in source_counts.head(10).items():
+         print(f"  📰 {source}: {count} articles")
+
+ def main():
+     parser = argparse.ArgumentParser(description='AI News Sentiment Analyzer CLI Demo')
+     parser.add_argument('--query', '-q',
+                         default='artificial intelligence',
+                         help='Search query for news articles (default: "artificial intelligence")')
+     parser.add_argument('--days', '-d',
+                         type=int,
+                         default=7,
+                         help='Number of days to look back (default: 7)')
+     parser.add_argument('--sources', '-s',
+                         help='Comma-separated list of news sources (e.g., "techcrunch,wired")')
+     parser.add_argument('--max-articles', '-m',
+                         type=int,
+                         default=10,
+                         help='Maximum number of articles to display (default: 10)')
+     parser.add_argument('--sentiment-only',
+                         action='store_true',
+                         help='Show only sentiment analysis summary')
+     parser.add_argument('--positive-only',
+                         action='store_true',
+                         help='Show only positive articles')
+     parser.add_argument('--negative-only',
+                         action='store_true',
+                         help='Show only negative articles')
+
+     args = parser.parse_args()
+
+     print_header()
+
+     try:
+         # Initialize analyzer
+         analyzer = AINewsAnalyzer()
+
+         print(f"🔍 Searching for: \"{args.query}\"")
+         print(f"📅 Looking back: {args.days} days")
+         if args.sources:
+             print(f"📰 Sources: {args.sources}")
+         print()
+
+         # Fetch and analyze news
+         df = analyzer.get_ai_news_with_sentiment(
+             query=args.query,
+             days=args.days,
+             sources=args.sources
+         )
+
+         if df.empty:
+             print("❌ No articles found. Try adjusting your search parameters.")
+             return
+
+         # Filter by sentiment if requested
+         if args.positive_only:
+             df = df[df['sentiment_label'] == 'positive']
+             print("🔽 Filtered to show only POSITIVE articles")
+         elif args.negative_only:
+             df = df[df['sentiment_label'] == 'negative']
+             print("🔽 Filtered to show only NEGATIVE articles")
+
+         # Display results based on options
+         if args.sentiment_only:
+             display_sentiment_analysis(df)
+         else:
+             display_articles(df, args.max_articles)
+             display_sentiment_analysis(df)
+             display_sources(df)
+
+         print(f"\n✅ Analysis complete! Processed {len(df)} articles.")
+
+     except KeyboardInterrupt:
+         print("\n👋 Analysis interrupted by user.")
+         sys.exit(0)
+     except Exception as e:
+         print(f"❌ Error occurred: {e}")
+         print("Please check your API key and internet connection.")
+         sys.exit(1)
+
+ if __name__ == "__main__":
+     main()
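
`display_articles` derives the distribution shown in the README example from pandas' `value_counts()`; with the standard library alone the same numbers come out of a `Counter`. A sketch (not the script's code) reproducing the 43-article example:

```python
from collections import Counter

def sentiment_distribution(labels):
    """Return {label: (count, percent)} as display_articles prints them."""
    total = len(labels)
    return {s: (c, round(100 * c / total, 1))
            for s, c in Counter(labels).most_common()}

labels = ["positive"] * 18 + ["neutral"] * 15 + ["negative"] * 10
print(sentiment_distribution(labels))
# {'positive': (18, 41.9), 'neutral': (15, 34.9), 'negative': (10, 23.3)}
```

Rounding each share independently can make the percentages sum to slightly more or less than 100, which is why 41.9 + 34.9 + 23.3 = 100.1 here.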
src/streamlit_app.py ADDED
@@ -0,0 +1,271 @@
+ """
+ AI News Sentiment Analyzer - Streamlit Web Application
+ Interactive dashboard for analyzing sentiment of AI-related news
+ """
+
+ import streamlit as st
+ import pandas as pd
+ import plotly.express as px
+ import json
+ from api_handler import AINewsAnalyzer
+
+ # Page configuration
+ st.set_page_config(
+     page_title="AI News Sentiment Analyzer",
+     page_icon="🤖",
+     layout="wide",
+     initial_sidebar_state="expanded"
+ )
+
+ # Custom CSS for better styling
+ st.markdown("""
+ <style>
+     .main-header {
+         font-size: 2.5rem;
+         font-weight: bold;
+         color: #1f77b4;
+         text-align: center;
+         margin-bottom: 2rem;
+     }
+     .metric-card {
+         background-color: #f0f2f6;
+         padding: 1rem;
+         border-radius: 0.5rem;
+         border-left: 5px solid #1f77b4;
+     }
+     .positive { color: #28a745; }
+     .negative { color: #dc3545; }
+     .neutral { color: #6c757d; }
+ </style>
+ """, unsafe_allow_html=True)
+
+ @st.cache_data(ttl=1800)  # Cache for 30 minutes
+ def load_config():
+     """Load configuration from config.json"""
+     with open('config.json', 'r') as f:
+         return json.load(f)
+
+ @st.cache_data(ttl=1800)  # Cache for 30 minutes
+ def load_news_data(query, days, sources=None):
+     """Load and cache news data"""
+     try:
+         analyzer = AINewsAnalyzer()
+         df = analyzer.get_ai_news_with_sentiment(query=query, days=days, sources=sources)
+         return df, None
+     except Exception as e:
+         return pd.DataFrame(), str(e)
+
+ def create_sentiment_distribution(df):
+     """Create sentiment distribution pie chart"""
+     if df.empty:
+         return None
+
+     sentiment_counts = df['sentiment_label'].value_counts()
+
+     fig = px.pie(
+         values=sentiment_counts.values,
+         names=sentiment_counts.index,
+         title="🎯 Sentiment Distribution",
+         color_discrete_map={
+             'positive': '#28a745',
+             'negative': '#dc3545',
+             'neutral': '#6c757d'
+         }
+     )
+
+     fig.update_traces(textposition='inside', textinfo='percent+label')
+     return fig
+
+ def create_source_analysis(df):
+     """Create source analysis chart"""
+     if df.empty:
+         return None
+
+     source_sentiment = df.groupby(['source', 'sentiment_label']).size().unstack(fill_value=0)
+     source_sentiment = source_sentiment.loc[source_sentiment.sum(axis=1).nlargest(10).index]
+
+     fig = px.bar(
+         source_sentiment.reset_index(),
+         x='source',
+         y=['positive', 'negative', 'neutral'],
+         title="📰 Sentiment by News Source (Top 10)",
+         color_discrete_map={
+             'positive': '#28a745',
+             'negative': '#dc3545',
+             'neutral': '#6c757d'
+         }
+     )
+
+     fig.update_layout(
+         xaxis_title="News Source",
+         yaxis_title="Number of Articles",
+         xaxis_tickangle=-45
+     )
+
+     return fig
+
+ def create_polarity_distribution(df):
+     """Create sentiment polarity distribution"""
+     if df.empty:
+         return None
+
+     fig = px.histogram(
+         df,
+         x='sentiment_polarity',
+         nbins=30,
+         title="📊 Sentiment Polarity Distribution",
+         labels={'sentiment_polarity': 'Sentiment Polarity', 'count': 'Number of Articles'}
+     )
+
+     # Add vertical lines for sentiment boundaries
+     fig.add_vline(x=0.1, line_dash="dash", line_color="green", annotation_text="Positive Threshold")
+     fig.add_vline(x=-0.1, line_dash="dash", line_color="red", annotation_text="Negative Threshold")
+     fig.add_vline(x=0, line_dash="dash", line_color="gray", annotation_text="Neutral")
+
+     return fig
+
+ def main():
+     # Header
+     st.markdown("<h1 class='main-header'>🤖 AI News Sentiment Analyzer</h1>", unsafe_allow_html=True)
+     st.markdown("### Discover the sentiment trends in AI-related news from around the world")
+
+     # Load configuration
+     config = load_config()
+
+     # Sidebar controls
+     st.sidebar.header("🔧 Analysis Settings")
+
+     # Query input
+     query_options = config["search_queries"]
+
+     selected_query = st.sidebar.selectbox(
+         "🔍 Search Topic:",
+         options=query_options,
+         index=0
+     )
+
+     custom_query = st.sidebar.text_input(
+         "Or enter custom search:",
+         placeholder="e.g., 'generative AI'"
+     )
+
+     # Use custom query if provided
+     final_query = custom_query if custom_query else selected_query
+
+     # Time range
+     days = st.sidebar.slider(
+         "📅 Days to analyze:",
+         min_value=1,
+         max_value=30,
+         value=7,
+         help="How many days back to search for news"
+     )
+
+     # News sources from config
+     news_sources = config["news_sources"]
+
+     source_option = st.sidebar.selectbox(
+         "📰 Source Category:",
+         options=config["source_categories"],
+         index=0
+     )
+
+     if source_option == "Tech Media":
+         sources = news_sources["tech_media"]
+     elif source_option == "General News":
+         sources = news_sources["general_news"]
+     elif source_option == "US News":
+         sources = news_sources["us_news"]
+     elif source_option == "Financial News":
+         sources = news_sources["financial_news"]
+     else:
+         sources = None
+
+     # Load data
+     if st.sidebar.button("🚀 Analyze News", type="primary"):
+         with st.spinner(f"Fetching and analyzing news about '{final_query}'..."):
+             df, error = load_news_data(final_query, days, sources)
+
+             if error:
+                 st.error(f"Error loading data: {error}")
+                 st.stop()
+
+             if df.empty:
+                 st.warning("No articles found. Try adjusting your search parameters.")
+                 st.stop()
+
+             # Store results in session state
+             st.session_state.df = df
+             st.session_state.query = final_query
+             st.session_state.days = days
+
+     # Display results if data is available
+     if 'df' in st.session_state:
+         df = st.session_state.df
+
+         # Summary metrics
+         st.markdown("### 📊 Analysis Summary")
+         col1, col2, col3, col4 = st.columns(4)
+
+         with col1:
+             st.metric("📰 Total Articles", len(df))
+
+         with col2:
+             avg_polarity = df['sentiment_polarity'].mean()
+             delta_polarity = f"{avg_polarity:+.3f}"
+             st.metric("🎭 Avg Sentiment", f"{avg_polarity:.3f}", delta_polarity)
+
+         with col3:
+             positive_pct = (len(df[df['sentiment_label'] == 'positive']) / len(df) * 100)
+             st.metric("😊 Positive %", f"{positive_pct:.1f}%")
+
+         with col4:
+             unique_sources = df['source'].nunique()
+             st.metric("📺 News Sources", unique_sources)
+
+         # Charts
+         st.markdown("### 📈 Visual Analysis")
+
+         # Row 1: Distribution and source analysis
+         col1, col2 = st.columns(2)
+
+         with col1:
+             dist_fig = create_sentiment_distribution(df)
+             if dist_fig:
+                 st.plotly_chart(dist_fig, use_container_width=True)
+
+         with col2:
+             source_fig = create_source_analysis(df)
+             if source_fig:
+                 st.plotly_chart(source_fig, use_container_width=True)
+
+         # Row 2: Polarity distribution (full width)
+         polarity_fig = create_polarity_distribution(df)
+         if polarity_fig:
+             st.plotly_chart(polarity_fig, use_container_width=True)
+
+     else:
+         # Welcome message
+         st.info("👋 Welcome! Configure your analysis settings in the sidebar and click 'Analyze News' to get started.")
+
+         # Sample visualization or instructions
+         st.markdown("""
+         ### 🚀 How to Use:
+
+         1. **Choose a topic** from the dropdown or enter your own search term
+         2. **Select time range** (1-30 days) to analyze recent news
+         3. **Pick news sources** or leave as 'All Sources' for comprehensive coverage
+         4. **Click 'Analyze News'** to fetch and analyze articles
+
+         ### 📊 What You'll Get:
+
+         - **Sentiment Analysis** of headlines and descriptions
+         - **Interactive Charts** showing trends over time
+         - **Source Breakdown** to see which outlets cover your topic
+         """)
+
+ if __name__ == "__main__":
+     main()
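
`create_source_analysis` keeps only the ten busiest outlets before plotting (via `nlargest` on the groupby sums); outside pandas the same selection is a `Counter.most_common` call. A stdlib-only sketch of that selection step:

```python
from collections import Counter

def top_sources(sources, n=10):
    """The n most frequent source names, as the chart limits itself to the top 10."""
    return [name for name, _ in Counter(sources).most_common(n)]

print(top_sources(["Wired", "Reuters", "Wired", "BBC", "Wired", "Reuters"], n=2))
# ['Wired', 'Reuters']
```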