vishnu-coder committed on
Commit
d76a82f
·
1 Parent(s): e80c084

Added Hugging Face Space metadata

Files changed (1)
  1. README.md +24 -184
README.md CHANGED
@@ -1,184 +1,24 @@
- 💼 Twitter Sentiment Intelligence
- The solution spans data ingestion, preprocessing, model training, governance, deployment readiness, and business storytelling, all within a lightweight, reproducible ML pipeline.
-
- 🧰 Technology Stack & Analytics Capabilities
-
- Python, Pandas, Scikit-learn, NLTK – rapid experimentation & production-ready sentiment pipelines
-
- TF-IDF Vectorization + Multiclass Logistic Regression – explainable predictions across positive, neutral, negative classes
-
- Streamlit UI (app/app.py) – displays class probabilities & narrative insights for stakeholders
-
- Notebook-driven exploration (notebooks/) – retraining lineage and model interpretability
-
- Container-ready CI/CD pipeline – automates tests, linting, and retraining
-
- 🧭 Why This Project Stands Out
-
- Oracle ecosystem alignment – optional ingestion from Oracle Autonomous Database with OCI deployment playbooks
-
- Consulting-grade engineering – modular Python package, logging, configuration management, GitHub Actions CI
-
- Business storytelling – dashboards, KPIs, and executive-friendly insights translating data into measurable ROI
-
- 🔗 Live Resources & Quick Links
- Asset Purpose Link
- Live Streamlit Demo Interview-ready interactive dashboard https://<your-app>.streamlit.app
-
- Architecture Blueprint Deep dive on data, model & DevOps layers docs/architecture.md
-
- OCI Deployment Playbook Containerized rollout on Oracle Cloud deployment/oracle_cloud.md
-
- Streamlit Cloud Guide Launch public URL in minutes deployment/streamlit_cloud.md
-
- Vercel Redirect Playbook Vanity domain / QR-friendly link deployment/vercel_redirect.md
-
- GitHub Live Validation Verify live deployment health deployment/github_live_validation.md
-
- Training Data Sample Quick-start dataset for retraining data/twitter_training.csv
-
- ✅ After publishing, replace the placeholder URL above and embed it directly in your resume / interview slides.
-
- 🏗️ Architecture Overview
- flowchart LR
- subgraph Data_Layer
- A[Oracle Autonomous DB] -- optional --> B[(CSV Data)]
- end
- subgraph Processing_Layer
- B --> C[Data Loader]
- C --> D[Text Preprocessor]
- D --> E[Scikit-learn Pipeline]
- E --> F[Model Metrics]
- end
- subgraph Experience_Layer
- E --> G[Streamlit Dashboard]
- E --> H[CLI Automation]
- F --> G
- F --> I[Reporting / GitHub Pages]
- end
- subgraph DevOps_Layer
- J[GitHub Actions CI]
- J --> E
- J --> G
- end
-
- 📁 Repository Structure
- Path Description
- app/ Streamlit app for demos
- artifacts/ Generated model artifacts (gitignored; created via scripts/train.py)
- config/ Central YAML settings controlling ingestion & deployment toggles
- data/ Sample labelled tweets
- deployment/ Multi-cloud deployment guides
- docs/ Architecture diagrams & KPI catalogue
- scripts/ Automation scripts for training & inference
- src/twitter_sentiment/ Core reusable Python package
- tests/ Pytest-based unit tests
- .github/workflows/ CI pipeline (lint + tests)
- 🚀 Quick Start
- # 1️⃣ Create virtual environment
- python -m venv .venv
- source .venv/bin/activate # Windows: .venv\Scripts\activate
-
- # 2️⃣ Install dependencies
- pip install --upgrade pip
- pip install -r requirements.txt
-
- # 3️⃣ Train pipeline & generate artifacts
- python scripts/train.py
-
- # 4️⃣ Launch Deloitte Storytelling Dashboard
- streamlit run app/app.py
-
-
- Note: Artifacts are excluded from source control.
- Run python scripts/train.py whenever cloning the repo to recreate artifacts/sentiment_pipeline.joblib.
-
- 🧪 Quality Gates
- Check Command
- Unit tests pytest -q
- Package compile check python -m compileall src
- Model retrain python scripts/train.py
-
- CI runs automatically on every push or PR to main.
-
- 🧩 Feature Highlights
-
- Config-driven ingestion switch (CSV ⇄ Oracle DB)
-
- Reusable text-cleaning module (URL, mention, punctuation handling)
-
- Auditable Scikit-learn pipeline with persisted metrics
-
- Streamlit dashboard → probabilities + governance tab
-
- CLI utilities for automated workflows
-
- Pytest suite + GitHub Actions CI/CD
-
- ☁️ Cloud Deployment
-
- See deployment/oracle_cloud.md
- for the Oracle Cloud Infrastructure (OCI) guide:
-
- Containerize Streamlit app using OCI Container Instances
-
- Automate retraining via OCI Data Science jobs / GitHub Actions
-
- Connect to Autonomous DB with wallet credentials
-
- Or follow deployment/streamlit_cloud.md
- for a free Streamlit Cloud public link, perfect for interviews.
-
- 🗃️ Oracle Database Integration (Optional)
- oracle_integration:
- enabled: true
- wallet_location: /path/to/wallet
- user: DATA_ENGINEER
- dsn: myadb_high
- sql_query: |
- SELECT text, sentiment
- FROM analytics.twitter_training_data
- WHERE created_at >= SYSDATE - 30
-
- 📊 Storytelling in Interviews
-
- Business Impact – sentiment tracking → improved ROI & retention
-
- Oracle Expertise – wallet connectivity, SQL modeling, OCI integration
-
- Engineering Rigor – modular design, testing, CI/CD
-
- Consulting Mindset – executive-ready Streamlit deliverable
-
- 📦 GitHub Best Practices
-
- Use feature branches (e.g., feature/oracle-ingestion)
-
- Raise PRs with CI status + screenshots
-
- Track backlog via GitHub Projects
-
- Tag releases (e.g., v1.0.0) post-deployment
-
- 🧠 Training Artifacts
-
- Generated after running scripts/train.py:
-
- sentiment_pipeline.joblib – Full TF-IDF + Logistic Regression pipeline
-
- Legacy (compatible) files:
-
- app/logistic_model.pkl
-
- app/tfidf_vectorizer.pkl
-
- sentiment_model.pkl
-
- Use these for quick demos or backward comparisons.
-
- 🤝 Contributing
-
- Pull requests welcome!
- Please open an issue describing any enhancement or bugfix before major changes.
-
- "# redeploy"
 
+ ---
+ title: Twitter Sentiment Intelligence
+ emoji: 📊
+ colorFrom: blue
+ colorTo: indigo
+ sdk: streamlit
+ sdk_version: 1.37.0
+ app_file: app.py
+ pinned: false
+ license: mit
+ ---
+
+ # 🧠 Twitter Sentiment Intelligence
+ A Deloitte-ready AI dashboard that performs **real-time sentiment analysis** on tweets using a fine-tuned pipeline deployed via Hugging Face.
+
+ ## 🚀 Features
+ - Real-time tweet sentiment prediction
+ - Visual sentiment distribution graphs
+ - Interactive Streamlit dashboard
+ - Integrated Hugging Face model inference
+
+ ---
+
+ 👨‍💻 **Developed by:** [Vishnu Singh](https://github.com/Youranalyst-coder)
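
The metadata block this commit adds is a flat set of `key: value` pairs between two `---` lines at the top of README.md. As a rough illustration of how such a block could be checked before pushing, here is a minimal sketch using only the Python standard library. The helper names `read_front_matter` and `missing_keys` are hypothetical and not part of the repository, the reader handles only flat pairs (it is not a YAML parser), and the list of expected keys is simply the set used in this commit, not an official list of required fields.

```python
# Hypothetical helpers: a tiny reader for flat "key: value" front matter.
# This is an illustrative sketch, not a YAML parser and not repository code.

README_HEAD = """\
---
title: Twitter Sentiment Intelligence
emoji: 📊
colorFrom: blue
colorTo: indigo
sdk: streamlit
sdk_version: 1.37.0
app_file: app.py
pinned: false
license: mit
---

# Twitter Sentiment Intelligence
"""

# Assumption: the keys that appear in this commit's metadata block.
EXPECTED_KEYS = (
    "title", "emoji", "colorFrom", "colorTo", "sdk",
    "sdk_version", "app_file", "pinned", "license",
)


def read_front_matter(text: str) -> dict[str, str]:
    """Return the key/value pairs between the leading '---' delimiters.

    Returns an empty dict when the text does not start with '---' or the
    closing '---' is never found.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta  # closing delimiter reached
        key, sep, value = line.partition(":")
        if sep:  # skip lines that carry no "key: value" pair
            meta[key.strip()] = value.strip()
    return {}  # the block was never closed


def missing_keys(meta: dict[str, str], expected=EXPECTED_KEYS) -> list[str]:
    """List the expected keys that are absent from the metadata."""
    return [key for key in expected if key not in meta]


if __name__ == "__main__":
    meta = read_front_matter(README_HEAD)
    print(meta["sdk"], meta["app_file"])
    print("missing:", missing_keys(meta))
```

A check like this only confirms that the block is present and well delimited. It does not verify that the file named by `app_file` exists; note that the removed README launched the dashboard from `app/app.py`, while the new metadata points at `app.py`.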