---
title: InnerVoice
emoji: 🗣️
colorFrom: indigo
colorTo: purple
sdk: docker
app_port: 7860
---

<div align="center">
  <img src="https://raw.githubusercontent.com/LonelyGuy12/InnerVoice/main/frontend/public/logo.png" alt="InnerVoice Logo" width="120" />
  <h1>InnerVoice</h1>
  <p><strong>AI-driven Emotional Wellness Tracker & Proactive Support Network</strong></p>
</div>

---

## 🚀 The Problem

*Problem Statement inspired by the Smart India Hackathon framework:*

Mental health decline is often a silent trajectory. When individuals experience prolonged periods of stress or depression, their primary support networks (therapists, partners, and close friends) are often disconnected from their day-to-day emotional state, so interventions tend to be reactive (during a crisis) rather than proactive.

Furthermore, traditional written journaling carries high cognitive friction, which leads to low adherence, and standard mood-tracking apps rely on subjective 1-to-10 sliders that fail to capture subconscious emotional exhaustion.

## 💡 The Solution

**InnerVoice** is a full-stack platform that minimizes the friction of self-reflection. Users simply speak their mind for 60 seconds into their device.

Instead of just parsing text, InnerVoice **listens**. It combines a locally hosted ML model (`wav2vec2`) for emotion classification with extracted acoustic features (pitch, speech rate, energy variance, conversational pauses) to uncover latent emotions.
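
As a rough illustration of what two of the acoustic features named above measure, the sketch below computes RMS energy, zero-crossing rate, and frame-to-frame energy variance using only the Python standard library. It is not the InnerVoice pipeline: the function names, the frame size, and the synthetic test tone are assumptions made for this example. In practice, `librosa.feature.rms` and `librosa.feature.zero_crossing_rate` compute comparable quantities.

```python
import math


def frame_signal(samples, frame_size):
    """Split samples into consecutive non-overlapping frames (remainder dropped)."""
    return [samples[i:i + frame_size]
            for i in range(0, len(samples) - frame_size + 1, frame_size)]


def rms_energy(frame):
    """Root-mean-square amplitude of one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose sign differs."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def energy_variance(samples, frame_size):
    """Variance of per-frame RMS energy; higher means loudness fluctuates more."""
    energies = [rms_energy(f) for f in frame_signal(samples, frame_size)]
    mean = sum(energies) / len(energies)
    return sum((e - mean) ** 2 for e in energies) / len(energies)


# Hypothetical input: one second of a 220 Hz tone followed by one second of silence.
SAMPLE_RATE = 16000
tone = [0.5 * math.sin(2 * math.pi * 220 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]
silence = [0.0] * SAMPLE_RATE

print("tone RMS:", round(rms_energy(tone), 4))
print("tone ZCR:", round(zero_crossing_rate(tone), 4))
print("energy variance:", round(energy_variance(tone + silence, 400), 4))
```

A recording that alternates between speech and long pauses would show a larger energy variance than a steady one, which is the kind of signal the feature list refers to.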

Additionally, InnerVoice breaks the isolation of mental health struggles through its **Trusted Circle** architecture. The platform automatically broadcasts Weekly Emotional Trend Reports (synthesized by an LLM) to a pre-authenticated support system to facilitate early human intervention.

---

## 🔥 Key Features

1. **Zero-Friction Voice Journaling**
   - Client-side recording with real-time waveform visualization.
   - Immediate audio-to-text transcription powered by OpenAI Whisper.

2. **Acoustic Analysis**
   - `librosa` extracts physiological distress markers: pitch deviation, energy (RMS), speech rate, and zero-crossing rate (pauses/filler words).
   - Emotion classification via locally hosted Hugging Face models (`wav2vec2-lg-xlsr-en-speech-emotion-recognition`).

3. **Trusted Circle Architecture (Brevo Integration)**
   - Users can invite therapists or partners to their secure network.
   - Proactive broadcasting of LLM-synthesized Weekly Trend Reports directly to trusted members via automatic transactional emails sent through the Brevo API.
   - Granular CRUD management of trusted members.

4. **Multi-Factor Correlation & Contextual Engagement**
   - **Sleep/Mood Correlation**: Visual and statistical mapping that links sleep deprivation to mood degradation.
   - **Contextual Prompts**: AI reads your previous 5 entries to generate daily journaling prompts tailored to your emotional state.
   - **Advanced Theming**: Fully responsive Dark/Light mode architecture.
   - **Crisis Safety Net**: Local and international resources triggered automatically.
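
The sleep/mood correlation in feature 4 could be quantified with a Pearson coefficient. The sketch below is a minimal illustration, assuming each journal entry carries the hours slept and a numeric mood score; the variable names and sample values are hypothetical and are not taken from the InnerVoice schema or from real user data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical week of entries: hours slept and a mood score in [0, 1].
sleep_hours = [8.0, 7.5, 6.0, 5.0, 4.5, 7.0, 8.5]
mood_scores = [0.8, 0.7, 0.4, 0.3, 0.2, 0.6, 0.9]

r = pearson(sleep_hours, mood_scores)
print(f"sleep/mood correlation: {r:.2f}")
```

A coefficient near +1 would indicate that mood tends to rise with sleep; with only a week of entries, the value should be read as a rough trend rather than a statistically robust result.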

---

## 🛠️ Tech Stack

- **Frontend**: Next.js 14, React context, TailwindCSS, Recharts, Framer Motion
- **Backend**: FastAPI, SQLAlchemy, SQLite/PostgreSQL
- **Communications**: Brevo Transactional Email REST API via `httpx`
- **AI/ML**: `wav2vec2`, `librosa`, OpenRouter / OpenAI for generation
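
For orientation, a request to Brevo's transactional email endpoint (`POST https://api.brevo.com/v3/smtp/email`, authenticated with an `api-key` header) might be assembled as in the sketch below. The helper name, subject line, sender name, and addresses are assumptions for illustration, not the project's actual code. The request is only built, not sent.

```python
import json
from html import escape

BREVO_SEND_URL = "https://api.brevo.com/v3/smtp/email"


def build_report_request(api_key, sender_email, recipients, summary):
    """Return (headers, payload) for a Brevo transactional email.

    Hypothetical helper: names and subject text are illustrative only.
    """
    headers = {
        "api-key": api_key,
        "accept": "application/json",
        "content-type": "application/json",
    }
    payload = {
        "sender": {"name": "InnerVoice", "email": sender_email},
        "to": [{"email": address} for address in recipients],
        "subject": "Weekly Emotional Trend Report",
        # Escape the LLM-generated summary before embedding it in HTML.
        "htmlContent": f"<p>{escape(summary)}</p>",
    }
    return headers, payload


headers, payload = build_report_request(
    api_key="your_brevo_api_key",
    sender_email="sender@example.com",
    recipients=["therapist@example.com"],
    summary="Mood was stable & sleep improved.",
)
print(json.dumps(payload, indent=2))

# Sending with httpx (not executed here):
#   httpx.post(BREVO_SEND_URL, headers=headers, json=payload, timeout=10)
```

Building the payload separately from the HTTP call keeps the email content easy to unit-test without network access.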

---

## ⚙️ Quick Start (Local Deployment)

### 1. Backend Setup

The backend requires Python 3.10+ and downloads ~1.5GB of ML models on first run.

```bash
cd backend
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

**Configuration**:
Rename `.env.example` to `.env` and configure your API keys:
```env
OPENROUTER_API_KEY="your_ai_key"
BREVO_API_KEY="your_brevo_api_key"
BREVO_SENDER_EMAIL="your_brevo_verified_sender@email.com"
```
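
A quick way to confirm that all three settings are present before starting the server is sketched below. It assumes the values from `.env` have already been loaded into the process environment; the helper name `missing_settings` is hypothetical and not part of the backend.

```python
import os

REQUIRED_KEYS = ("OPENROUTER_API_KEY", "BREVO_API_KEY", "BREVO_SENDER_EMAIL")


def missing_settings(env):
    """Return the required keys that are absent or blank in `env`."""
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


missing = missing_settings(os.environ)
if missing:
    print("Missing settings:", ", ".join(missing))
else:
    print("All required settings are present.")
```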

**Run Server**:
```bash
python -m uvicorn main:app --reload
```
*API runs locally at http://localhost:8000*

### 2. Frontend Setup

```bash
cd frontend
npm install

# Start the dev server 
npm run dev
```

*Visit `http://localhost:3000` to interact with InnerVoice!*

### 3. Database Seeding (Testing Mode)
If you want to test the multi-week timeline and Trusted Circle functionality without recording 30 days of real audio, you can seed the SQLite database with rich mock entries:

```bash
cd backend
source venv/bin/activate
python seed_data.py
```
This script generates a user ID. Add that ID to your `frontend/.env.local` as `NEXT_PUBLIC_DEMO_USER_ID` or simply browse in the generic UI Demo Mode.

---

## 🐳 Deployment (Hugging Face Spaces)

This project includes a unified `Dockerfile` that exposes the whole app on a single port (`7860`, matching `app_port` in the Space metadata), which makes it compatible with **Hugging Face Spaces**.

### To Deploy on Hugging Face:
1. Create a new **Docker Space** on Hugging Face.
2. Link it to your GitHub repository OR upload these files directly.
3. In the Space Settings, map your Secrets (Environment Variables):
   - `OPENROUTER_API_KEY`
   - `BREVO_API_KEY`
   - `BREVO_SENDER_EMAIL`
4. The Space will automatically build the React static files, install the Python ML stack, and boot both servers behind a unified endpoint!

---

## 🔒 Privacy & Architecture

Audio files are converted in-memory within the backend ML pipeline and are **deleted securely and immediately** after the acoustic feature-extraction completes. We only permanently store quantified acoustic metrics (floats), the speech transcription string, and the categorical detected emotion.
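
The retention policy described above can be sketched as follows. This is an illustration under stated assumptions, not the backend's code: `EntryRecord`, `process_upload`, and the stand-in analyzer are hypothetical names, and the sketch uses a temporary file with a plain `os.remove`, which unlinks the file but does not overwrite its contents on disk.

```python
import os
import tempfile
from dataclasses import dataclass


@dataclass
class EntryRecord:
    """The only data kept after processing: text, a label, and float metrics."""
    transcript: str
    emotion: str
    metrics: dict


def process_upload(audio_bytes, analyze):
    """Write audio to a temp file, run `analyze(path)`, and always delete the file.

    Returns the record plus the temp path (returned only so the demo can
    verify the deletion).
    """
    fd, path = tempfile.mkstemp(suffix=".wav")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(audio_bytes)
        record = analyze(path)
    finally:
        # Runs whether analysis succeeded or raised.
        if os.path.exists(path):
            os.remove(path)
    return record, path


def fake_analyze(path):
    """Stand-in for the real ML pipeline; it only inspects the file size."""
    size = os.path.getsize(path)
    return EntryRecord(transcript="(transcript)", emotion="neutral",
                       metrics={"bytes_seen": float(size)})


record, temp_path = process_upload(b"\x00" * 1024, fake_analyze)
print(record)
print("audio still on disk:", os.path.exists(temp_path))
```

Placing the deletion in a `finally` block means the audio is removed even when analysis fails partway through.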