Architecture & Solution Blueprint

High-level flow

  1. Ingest: CSV files for local dev, Oracle Autonomous Database for enterprise deployments.
  2. Process: Config-driven preprocessing with a reusable Python package.
  3. Model: Scikit-learn pipeline with TF-IDF + Logistic Regression.
  4. Serve: Streamlit dashboard and CLI automation.
  5. Operate: GitHub Actions CI, retraining script, and OCI deployment path.
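
The Model step above can be sketched roughly as follows. This is a minimal illustration, not the project's actual training code: `build_pipeline`, the step names, the hyperparameters, and the toy tweets are all assumptions made for the example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline


def build_pipeline() -> Pipeline:
    """Hypothetical factory: TF-IDF features feeding a Logistic Regression classifier."""
    return Pipeline(
        steps=[
            # Unigrams and bigrams are an assumption; the source only says "TF-IDF".
            ("tfidf", TfidfVectorizer(lowercase=True, ngram_range=(1, 2))),
            # max_iter is raised so the solver converges on small, sparse data.
            ("clf", LogisticRegression(max_iter=1000)),
        ]
    )


# Toy in-memory data standing in for the CSV or Oracle ingest step.
texts = [
    "I love this phone",
    "great service and friendly staff",
    "this is terrible",
    "worst experience ever",
    "the package arrived today",
    "it is a phone",
]
labels = ["positive", "positive", "negative", "negative", "neutral", "neutral"]

model = build_pipeline()
model.fit(texts, labels)

# One row per input text, one column per class in model.classes_.
probs = model.predict_proba(["great phone"])
```

Keeping the vectorizer and classifier in one `Pipeline` object means the same preprocessing is applied at training and inference time, which is what lets the dashboard and the CLI share a single serialized artifact.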

```mermaid
sequenceDiagram
    participant User
    participant Streamlit
    participant Predictor
    participant Pipeline
    participant OracleDB

    User->>Streamlit: Input tweet
    Streamlit->>Predictor: call predict_with_threshold
    Predictor->>Pipeline: transform + predict_proba
    Pipeline-->>Predictor: labels & probabilities
    Predictor-->>Streamlit: curated response
    Streamlit-->>User: sentiment insights & KPIs
    Predictor->>OracleDB: (optional) pull latest training data
```
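
The Predictor step in the sequence can be sketched as below. Only the name `predict_with_threshold` comes from the diagram; its signature, the default threshold of 0.6, the `"uncertain"` fallback label, the response fields, and the `StubPipeline` stand-in are assumptions for illustration.

```python
from typing import Dict, List, Sequence


def predict_with_threshold(
    pipeline,
    texts: Sequence[str],
    threshold: float = 0.6,
    fallback_label: str = "uncertain",
) -> List[Dict[str, object]]:
    """Return the top label per text, or `fallback_label` when confidence is low.

    `pipeline` is any object exposing `predict_proba` and `classes_`,
    such as a fitted scikit-learn pipeline.
    """
    probabilities = pipeline.predict_proba(list(texts))
    classes = list(pipeline.classes_)

    results = []
    for text, row in zip(texts, probabilities):
        scores = {label: float(p) for label, p in zip(classes, row)}
        best_label = max(scores, key=scores.get)
        confident = scores[best_label] >= threshold
        results.append(
            {
                "text": text,
                "label": best_label if confident else fallback_label,
                "confidence": scores[best_label],
                "probabilities": scores,
            }
        )
    return results


class StubPipeline:
    """Stand-in for a fitted pipeline so the sketch runs without a trained model."""

    classes_ = ["negative", "neutral", "positive"]

    def predict_proba(self, texts):
        # Fixed scores: confident for texts containing "love", ambiguous otherwise.
        return [
            [0.1, 0.1, 0.8] if "love" in t.lower() else [0.4, 0.35, 0.25]
            for t in texts
        ]


responses = predict_with_threshold(StubPipeline(), ["I love this", "It is a phone"])
```

Returning the full probability map alongside the thresholded label lets the dashboard show both a headline sentiment and the underlying scores.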

Key metrics & KPIs

| KPI | Description | Target |
| --- | --- | --- |
| Macro F1 | Balanced view across positive/neutral/negative | ≥ 0.80 |
| Prediction latency | Streamlit inference response time | < 200 ms |
| Data freshness | Time since last Oracle sync | < 24 hours |
| Model drift PSI | Population stability index | < 0.2 |
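
Two of the KPIs above have standard definitions that can be computed without extra dependencies. The sketch below is one possible way to do it, not the project's monitoring code: the function names, the ten equal-width bins, the epsilon floor for empty bins, and the assumption that drift is measured on scores in [0, 1] (for example, predicted probabilities) are all choices made for this example.

```python
import math
from typing import Sequence


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 over every label seen in either sequence."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denominator = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(f1_scores) / len(f1_scores)


def population_stability_index(
    expected: Sequence[float],
    actual: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """PSI between baseline scores (`expected`) and recent scores (`actual`).

    Both inputs are assumed to be non-empty and to lie in [0, 1].
    """

    def bin_fractions(values: Sequence[float]) -> list:
        counts = [0] * bins
        for value in values:
            # Clamp so that 1.0 falls into the last bin and stray values stay in range.
            index = max(0, min(int(value * bins), bins - 1))
            counts[index] += 1
        # Floor each fraction at eps so the logarithm is defined for empty bins.
        return [max(count / len(values), eps) for count in counts]

    expected_fractions = bin_fractions(expected)
    actual_fractions = bin_fractions(actual)
    return sum(
        (a - e) * math.log(a / e)
        for a, e in zip(actual_fractions, expected_fractions)
    )


score = macro_f1(
    ["pos", "neg", "neu", "pos"],
    ["pos", "neg", "pos", "pos"],
)
stable = population_stability_index([0.1, 0.2, 0.3, 0.4], [0.1, 0.2, 0.3, 0.4])
shifted = population_stability_index([0.1, 0.2, 0.3, 0.4], [0.6, 0.7, 0.8, 0.9])
```

A retraining job could compare `score` against the 0.80 target and the PSI value against the 0.2 limit to decide whether to flag the model for review.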

Extensibility roadmap

  • Plug-in architecture for additional languages.
  • OCI Data Science jobs for scheduled retraining.
  • Oracle APEX dashboard embedding the Streamlit app.
  • Integration with Deloitte's accelerators for risk & compliance logging.