# CASL Voice Bot

An AI-powered speech pathology assistant that assesses students' speaking abilities using the CASL-2 framework, supporting speech-language pathologists (SLPs) who conduct assessments in school settings.

## Implementations

This project provides multiple implementations:

1. **LiveKit Implementation** - Uses LiveKit agents with OpenAI's real-time voice API for low-latency, high-quality audio streaming.
2. **Direct API Implementation** - Uses OpenAI's API directly without LiveKit, for simpler deployment.
3. **Hugging Face Spaces** - An adaptive implementation that works on Hugging Face Spaces, automatically detecting whether LiveKit is available.

## Features

- Voice-to-voice interaction with AI speech pathologist
- CASL-2 framework assessment
- Real-time assessment tracking
- Session recording and saving
- Custom note-taking for SLPs
- Gradio web interface for easy sharing and use in school settings

## CASL-2 Assessment Areas

The AI speech pathologist assesses students in these key areas:

1. **Lexical/Semantic Skills**: Vocabulary knowledge, word meanings, and contextual word use
2. **Syntactic Skills**: Grammar and sentence structure understanding
3. **Supralinguistic Skills**: Higher-level language skills beyond literal meanings
4. **Pragmatic Skills**: Language use in social contexts (less emphasis for younger students)
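
As a sketch only, the four areas above could be tracked with a simple per-session structure. The dataclass, field names, and area keys below are illustrative, not the project's actual code:

```python
# Hypothetical sketch: tracking per-area observations during a session.
# Area keys mirror the CASL-2 categories listed above; the class and
# method names are assumptions for illustration.
from dataclasses import dataclass, field
from typing import Dict, List

CASL2_AREAS = [
    "lexical_semantic",
    "syntactic",
    "supralinguistic",
    "pragmatic",
]

@dataclass
class AssessmentTracker:
    notes: Dict[str, List[str]] = field(
        default_factory=lambda: {area: [] for area in CASL2_AREAS}
    )

    def record(self, area: str, observation: str) -> None:
        """Attach an SLP observation to one CASL-2 area."""
        if area not in self.notes:
            raise ValueError(f"Unknown CASL-2 area: {area}")
        self.notes[area].append(observation)

tracker = AssessmentTracker()
tracker.record("syntactic", "Uses compound sentences with correct conjunctions")
```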

## Setup Instructions

### Prerequisites

- Python 3.8+
- OpenAI API key with access to GPT-4o and TTS models

### Installation

1. Clone the repository:
   ```
   git clone https://github.com/yourusername/CASLVoiceBot.git
   cd CASLVoiceBot
   ```

2. Create a virtual environment and install dependencies:
   ```
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   pip install -r requirements.txt
   ```

3. For the LiveKit implementation, install the LiveKit dependencies — either uncomment the `livekit-agents` line in `requirements.txt` and re-run `pip install -r requirements.txt`, or install it directly (quoted so the shell does not treat `>` as a redirect):
   ```
   pip install "livekit-agents>=0.7.0"
   ```

4. Set up environment variables:
   ```
   cp .env.example .env
   ```
   Then edit `.env` to add your OpenAI API key.
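
   The app is expected to read the key from the environment. As a self-contained sketch of what such loading typically does (a minimal stand-in for a package like `python-dotenv`; the function name is hypothetical):

   ```python
   import os
   from pathlib import Path

   def load_env_file(path: str = ".env") -> None:
       """Minimal .env loader: KEY=VALUE lines; blank lines and '#' comments ignored."""
       env_path = Path(path)
       if not env_path.exists():
           return
       for line in env_path.read_text().splitlines():
           line = line.strip()
           if not line or line.startswith("#") or "=" not in line:
               continue
           key, _, value = line.partition("=")
           # setdefault: a key already exported in the shell wins over .env
           os.environ.setdefault(key.strip(), value.strip())

   load_env_file()
   api_key = os.environ.get("OPENAI_API_KEY")
   ```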

### Running the Application

#### LiveKit Implementation (recommended for best performance)
```
python run_livekit.py
```

#### Direct API Implementation (simpler deployment)
```
python run_direct.py
```

#### Command Line Options
Both implementations support these options:
- `--share`: Share the app publicly (enabled by default)
- `--local`: Run the app locally without sharing
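
A hypothetical sketch of how these two flags could be parsed with `argparse` (the flag names come from this README; the parser itself and the `parse_args` helper are illustrative):

```python
import argparse

def parse_args(argv=None):
    """Parse runner flags; --local overrides the share-by-default behavior."""
    parser = argparse.ArgumentParser(description="CASL Voice Bot runner")
    parser.add_argument("--share", action="store_true", default=True,
                        help="share the app publicly (enabled by default)")
    parser.add_argument("--local", action="store_true",
                        help="run the app locally without sharing")
    args = parser.parse_args(argv)
    args.share = args.share and not args.local
    return args
```

The resulting `args.share` would then be passed to something like Gradio's `demo.launch(share=args.share)`.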

## Deployment on Hugging Face Spaces

1. Create a new Space on Hugging Face with the Gradio SDK
2. Upload the repository contents to the Space
3. Add your OPENAI_API_KEY as a secret in the Space settings

By default, the Hugging Face Spaces deployment will try to use LiveKit if available, and fall back to direct API if not.
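
That fallback decision can be sketched as a simple import probe (the function name and dispatch below are illustrative, not the actual entry-point code in `app.py`):

```python
import importlib.util

def livekit_available() -> bool:
    """Return True if the livekit package can be imported in this environment."""
    return importlib.util.find_spec("livekit") is not None

# Hypothetical dispatch; the real implementations live under implementations/.
backend = "livekit" if livekit_available() else "direct"
```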

## Project Structure

```
CASLVoiceBot/
β”œβ”€β”€ app.py                  # Hugging Face Spaces entry point
β”œβ”€β”€ run_direct.py           # Direct API implementation runner
β”œβ”€β”€ run_livekit.py          # LiveKit implementation runner
β”œβ”€β”€ requirements.txt        # Common dependencies
β”œβ”€β”€ .env.example            # Environment variables template
β”œβ”€β”€ implementations/
β”‚   β”œβ”€β”€ common/             # Shared utilities
β”‚   β”‚   β”œβ”€β”€ casl_utils.py   # CASL-2 assessment utilities
β”‚   β”œβ”€β”€ direct/             # Direct API implementation
β”‚   β”‚   β”œβ”€β”€ app.py          # Direct OpenAI API app
β”‚   β”œβ”€β”€ livekit/            # LiveKit implementation
β”‚   β”‚   β”œβ”€β”€ app.py          # LiveKit app
β”‚   β”‚   β”œβ”€β”€ livekit_gradio_hf.py # HF-compatible LiveKit app
β”œβ”€β”€ session_data/           # Saved session data
```

## Usage

1. Optionally enter a Student ID to track sessions
2. Select your preferred AI voice
3. Click "Start Session" to begin a speech assessment
4. Wait for the AI to introduce itself, then speak when prompted
5. View real-time assessment in the interface
6. SLPs can add notes throughout the session
7. Save the session when finished
8. Click "Stop Session" to end
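
Saved sessions land under `session_data/`. As a hypothetical sketch of what writing a session record might look like (the helper name, filename pattern, and record fields are assumptions for illustration):

```python
import json
import time
from pathlib import Path

def save_session(student_id, notes, out_dir="session_data"):
    """Hypothetical helper: write one session record as timestamped JSON."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    record = {
        "student_id": student_id,
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "notes": notes,  # e.g. per-area observations collected during the session
    }
    path = out / f"session_{student_id}_{int(time.time())}.json"
    path.write_text(json.dumps(record, indent=2))
    return path
```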

## License

[MIT License](LICENSE)