AA-6055 committed on
Commit
09673a0
·
verified ·
1 Parent(s): 443a737

Upload readme.md

  1. readme.md +115 -0
readme.md ADDED
@@ -0,0 +1,115 @@
+ # News Sentiment Analysis & Hindi Text-to-Speech (TTS) Web Application
+
+ ## Objective
+ This project is a web-based application that:
+ - Extracts key details from multiple news articles about a **user-provided company**
+ - Performs **sentiment analysis**
+ - Conducts **comparative sentiment analysis**
+ - Generates a **Hindi Text-to-Speech (TTS)** audio report
+ - Provides a user-friendly interface for interaction
+
+ The tool allows users to input a company name and receive a **structured sentiment report** along with **audio output**.
+
+ ---
+
+ ## Features
+ 1. **News Extraction**: Scrapes and displays the title, summary, and metadata from at least 10 unique news articles (non-JavaScript pages) using **BeautifulSoup (bs4)**.
+ 2. **Sentiment Analysis**: Determines the sentiment (positive, negative, neutral) of each article.
+ 3. **Comparative Analysis**: Compares sentiment across articles to visualize sentiment distribution.
+ 4. **Hindi TTS Generation**: Converts summarized insights into Hindi audio using an open-source TTS model.
+ 5. **User Interface**: Simple web UI built with **Streamlit** or **Gradio**.
+ 6. **API Communication**: Frontend and backend communicate via **REST APIs**.
+ 7. **Deployment**: Live deployment on **Hugging Face Spaces**.
+ 8. **Documentation**: Complete setup and usage guide.
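The "at least 10 unique news articles" requirement in feature 1 implies some de-duplication step after scraping. The sketch below shows one possible way to do it, assuming each scraped article is a dict with hypothetical `title` and `url` keys; the helper names `article_key` and `dedupe_articles` are illustrative and not taken from the project code.

```python
from urllib.parse import urlsplit


def article_key(article):
    """Build a comparison key from a normalized URL, falling back to the title."""
    url = article.get("url", "").strip()
    if url:
        parts = urlsplit(url)
        host = parts.netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        # Scheme, query string, fragment, and a trailing slash are ignored.
        return (host, parts.path.rstrip("/"))
    # No URL available: compare on a lowercased, whitespace-collapsed title.
    return ("", " ".join(article.get("title", "").lower().split()))


def dedupe_articles(articles, limit=10):
    """Return up to `limit` articles, keeping the first occurrence of each key."""
    seen = set()
    unique = []
    for article in articles:
        key = article_key(article)
        if key in seen:
            continue
        seen.add(key)
        unique.append(article)
        if len(unique) >= limit:
            break
    return unique
```

Keying on host and path treats the same story reached through different tracking parameters as one article. Whether that is the right notion of "unique" depends on the news sources being scraped.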
+
+ ---
+
+ ## Project Setup
+
+ ### Clone the Repository
+ ```bash
+ git clone https://github.com/yourusername/news-sentiment-tts.git
+ cd news-sentiment-tts
+ ```
+
+ ### Create Virtual Environment & Install Dependencies
+ ```bash
+ python -m venv venv
+ source venv/bin/activate # On Windows: venv\Scripts\activate
+ pip install -r requirements.txt
+ ```
+
+ ### Run the Application Locally
+ Start the backend API:
+ ```bash
+ python api.py
+ ```
+ Then start the Streamlit frontend in a second terminal:
+ ```bash
+ streamlit run app.py
+ ```
+ _or_
+ ```bash
+ python app.py # If using Gradio
+ ```
+
+ ### Deployment
+ The app is deployed on Hugging Face Spaces:
+ ```
+ https://huggingface.co/spaces/yourusername/news-sentiment-tts
+ ```
+
+ ---
+
+ ## Model Details
+
+ | Task | Model Used | Description |
+ |-----------------------|-------------------------------------|------------|
+ | **Summarization** | `transformers.pipeline('summarization')` | Hugging Face pre-trained model for article summary generation |
+ | **Sentiment Analysis**| `transformers.pipeline('sentiment-analysis')` | Classifies article sentiment into Positive/Negative/Neutral |
+ | **Text-to-Speech** | Coqui TTS / `indic-tts` | Open-source TTS model to generate Hindi speech from text |
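The table lists three sentiment classes, but a `sentiment-analysis` pipeline may return only `POSITIVE` and `NEGATIVE` labels with a confidence score, depending on the checkpoint it loads. The sketch below shows one possible way to derive a Neutral class and the comparative distribution. The 0.75 threshold and the function names are assumptions for illustration, not the project's actual logic.

```python
from collections import Counter


def to_three_way(label, score, threshold=0.75):
    """Map a binary label to Positive/Negative/Neutral.

    Assumption: predictions with confidence below `threshold` count as Neutral.
    """
    if score < threshold:
        return "Neutral"
    return label.capitalize()


def sentiment_distribution(labels):
    """Count how many articles fall into each sentiment class."""
    counts = Counter(labels)
    return {name: counts.get(name, 0) for name in ("Positive", "Negative", "Neutral")}


def classify_articles(texts):
    """Run the Hugging Face pipeline and return three-way labels.

    transformers is imported lazily so the helpers above work without it.
    """
    from transformers import pipeline

    classifier = pipeline("sentiment-analysis")
    results = classifier(texts, truncation=True)
    return [to_three_way(r["label"], r["score"]) for r in results]
```

For example, `sentiment_distribution(classify_articles(summaries))` would give the per-class counts that a comparative chart can be drawn from, where `summaries` is a list of article texts.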
+
+ ---
+
+ ## API Development & Usage
+
+ ### Backend APIs:
+ - **Endpoint**: `/analyze_news`
+ - **Method**: POST
+ - **Input**: `{"company": "Company Name"}`
+ - **Output**: List of news articles with metadata
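As a rough illustration of the endpoint described above, the following sketch builds and sends the POST request with the Python standard library. The base URL `http://localhost:8000` matches the one used in the Postman steps; the helper names and the assumption that the response body is JSON are illustrative.

```python
import json
from urllib.request import Request, urlopen


def build_request(company, base_url="http://localhost:8000"):
    """Build a POST request for /analyze_news with a JSON body."""
    body = json.dumps({"company": company}).encode("utf-8")
    return Request(
        base_url.rstrip("/") + "/analyze_news",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze_news(company, base_url="http://localhost:8000"):
    """Send the request and decode the response (assumed to be JSON)."""
    with urlopen(build_request(company, base_url)) as response:
        return json.load(response)
```

With the backend running locally, `analyze_news("Tesla")` would return the decoded response for that company.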
+
+ ### Testing with Postman:
+ 1. Set the API base URL to `http://localhost:8000/`
+ 2. Select the appropriate endpoint and POST body
+ 3. Check that responses return JSON or stream an audio file
+
+ ---
+
+ ## Third-Party API Usage
+ | API/Library | Purpose |
+ |------------------------|--------------------------------------------------------|
+ | **BeautifulSoup** | News scraping (no external news APIs used) |
+ | **Hugging Face models**| Sentiment analysis & summarization |
+ | **gtts** | Hindi Text-to-Speech generation |
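A minimal sketch of the Hindi audio step with `gtts` follows. The wording of the Hindi summary, the function names, and the output path `report.mp3` are assumptions for illustration; the README does not specify how the report text is produced. Note that `gTTS` calls an online service, so `synthesize_hindi` needs network access.

```python
def build_hindi_summary(company, counts):
    """Compose a short Hindi sentence from sentiment counts (illustrative wording)."""
    total = sum(counts.values())
    return (
        f"{company} के बारे में {total} लेखों का विश्लेषण किया गया। "
        f"सकारात्मक: {counts.get('Positive', 0)}, "
        f"नकारात्मक: {counts.get('Negative', 0)}, "
        f"तटस्थ: {counts.get('Neutral', 0)}।"
    )


def synthesize_hindi(text, path="report.mp3"):
    """Convert Hindi text to an MP3 file with gTTS and return the file path."""
    from gtts import gTTS  # imported lazily; requires the gtts package

    gTTS(text=text, lang="hi").save(path)
    return path
```

For example, `synthesize_hindi(build_hindi_summary("Tesla", {"Positive": 6, "Negative": 3, "Neutral": 1}))` would write the spoken report to `report.mp3`.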
+
+ ---
+
+ ## Assumptions & Limitations
+ - **Scraping Limitations**: Only static, non-JS websites are scraped due to `BeautifulSoup` limitations. The News API (newsapi.org) is also used.
+ - **Article Count**: A minimum of 10 articles is targeted, but the count may vary based on search results.
+ - **Language Support**: Sentiment analysis is primarily for English text; TTS output is Hindi only.
+ - **Deployment**: Optimized for Hugging Face Spaces; heavy TTS models may experience latency. Because the backend uses Flask, Hugging Face Spaces cannot run the frontend and backend files at the same time.
+
+ ---
+
+ ## Dependencies
+ ```text
+ beautifulsoup4
+ requests
+ transformers
+ torch
+ streamlit
+ gtts
+ scikit-learn
+ ```
+
+ ---