# Architecture & Solution Blueprint
## High-level flow
1. **Ingest**: CSV files for local dev, Oracle Autonomous Database for enterprise deployments.
2. **Process**: Config-driven preprocessing, packaged as a reusable Python package.
3. **Model**: scikit-learn pipeline combining TF-IDF features with a logistic regression classifier.
4. **Serve**: Streamlit dashboard and CLI automation.
5. **Operate**: GitHub Actions CI, retraining script, and OCI deployment path.
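
The Model step above can be sketched as a two-stage scikit-learn pipeline. This is a minimal illustration, not the project's actual training code: the `build_pipeline` helper, the hyperparameters, and the toy tweets are all assumptions made for the example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline


def build_pipeline() -> Pipeline:
    """Hypothetical factory: TF-IDF features feeding a logistic regression classifier."""
    return Pipeline(
        steps=[
            # Unigrams and bigrams are an assumed setting, not taken from the project.
            ("tfidf", TfidfVectorizer(lowercase=True, ngram_range=(1, 2))),
            ("clf", LogisticRegression(max_iter=1000)),
        ]
    )


# Toy data for illustration only.
texts = [
    "I love this phone",
    "great service, very happy",
    "this is terrible",
    "worst experience ever",
    "the package arrived today",
    "meeting is at noon",
]
labels = ["positive", "positive", "negative", "negative", "neutral", "neutral"]

pipeline = build_pipeline()
pipeline.fit(texts, labels)

# One row per input text, one column per class in pipeline.classes_.
probabilities = pipeline.predict_proba(["I love the service"])
```

Keeping the vectorizer and classifier in one `Pipeline` object means the same fitted vocabulary is applied at training and inference time, which is what lets the dashboard and the CLI share a single serialized artifact.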
```mermaid
sequenceDiagram
participant User
participant Streamlit
participant Predictor
participant Pipeline
participant OracleDB
User->>Streamlit: Input tweet
Streamlit->>Predictor: call predict_with_threshold
Predictor->>Pipeline: transform + predict_proba
Pipeline-->>Predictor: labels & probabilities
Predictor-->>Streamlit: curated response
Streamlit-->>User: sentiment insights & KPIs
Predictor->>OracleDB: (optional) pull latest training data
```
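
The diagram names `predict_with_threshold` but does not show its contract. One plausible shape, assuming the function returns the most probable label only when its probability clears a confidence threshold and otherwise falls back to a default label, is sketched below. The signature, the `threshold` and `fallback_label` parameters, and the response keys are assumptions, not the project's confirmed API.

```python
from typing import Any, Dict, List


def predict_with_threshold(
    pipeline: Any,
    text: str,
    threshold: float = 0.6,          # assumed default
    fallback_label: str = "neutral",  # assumed fallback
) -> Dict[str, Any]:
    """Return the top label, or fallback_label when confidence is below threshold.

    `pipeline` is any fitted object exposing `predict_proba` and `classes_`,
    such as a scikit-learn Pipeline ending in a classifier.
    """
    # predict_proba returns one row per input; a single tweet is passed in.
    probabilities: List[float] = [float(p) for p in pipeline.predict_proba([text])[0]]
    classes: List[str] = list(pipeline.classes_)

    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    confidence = probabilities[best]
    label = classes[best] if confidence >= threshold else fallback_label

    return {
        "label": label,
        "confidence": confidence,
        "probabilities": dict(zip(classes, probabilities)),
    }
```

Returning the full probability map alongside the label gives the Streamlit layer enough to render both the headline sentiment and a per-class breakdown without calling the model twice.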
## Key metrics & KPIs
| KPI | Description | Target |
| --- | --- | --- |
| Macro F1 | Unweighted mean of per-class F1 across positive, neutral, and negative | ≥ 0.80 |
| Prediction latency | Streamlit inference response time | < 200 ms |
| Data freshness | Time since last Oracle sync | < 24 hours |
| Model drift PSI | Population stability index | < 0.2 |
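
To make the drift KPI concrete, the sketch below computes the population stability index between a baseline sample and a recent sample of a numeric score. It uses the standard formula, the sum over bins of `(actual - expected) * ln(actual / expected)`. Which quantity the project monitors is not stated in this document; applying PSI to model scores, the equal-width binning, and the helper names are assumptions.

```python
import math
from typing import List, Sequence


def bin_fractions(values: Sequence[float], edges: Sequence[float], eps: float = 1e-6) -> List[float]:
    """Share of values per bin; out-of-range values are clamped to the end bins."""
    if not values:
        raise ValueError("values must not be empty")
    counts = [0] * (len(edges) - 1)
    for v in values:
        idx = 0
        while idx < len(counts) - 1 and v >= edges[idx + 1]:
            idx += 1
        counts[idx] += 1
    # eps keeps empty bins from producing log(0) or division by zero.
    return [max(c / len(values), eps) for c in counts]


def population_stability_index(
    expected: Sequence[float], actual: Sequence[float], n_bins: int = 10
) -> float:
    """PSI between a baseline (expected) and a recent (actual) sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins
    if width == 0:
        width = 1.0  # degenerate baseline: all values identical
    edges = [lo + i * width for i in range(n_bins + 1)]
    e = bin_fractions(expected, edges)
    a = bin_fractions(actual, edges)
    return sum((ai - ei) * math.log(ai / ei) for ai, ei in zip(a, e))


# Synthetic samples for illustration only.
baseline = [i / 100 for i in range(100)]
recent = [0.5 + i / 200 for i in range(100)]
psi = population_stability_index(baseline, recent)
drift_detected = psi >= 0.2  # compared against the target in the table above
```

Bin edges are derived from the baseline sample only, so the same edges can be stored with the model and reused for each new batch.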
## Extensibility roadmap
- Plug-in architecture for additional languages.
- OCI Data Science jobs for scheduled retraining.
- Oracle APEX dashboard embedding the Streamlit app.
- Integration with Deloitte's accelerators for risk & compliance logging.