kesbeast23 committed on
Commit 953f504 · unverified · 0 Parent(s):

Initial commit with clean history (no audio files)
.gitattributes ADDED
@@ -0,0 +1,36 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tar filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
+ *.wav filter=lfs diff=lfs merge=lfs -text
.gitignore ADDED
@@ -0,0 +1,38 @@
+ # Python bytecode
+ __pycache__/
+ *.py[cod]
+ *$py.class
+
+ # Virtual environment
+ venv/
+ env/
+ ENV/
+
+ # Distribution / packaging
+ dist/
+ build/
+ *.egg-info/
+
+ # Jupyter Notebook
+ .ipynb_checkpoints
+
+ # Temp directory
+ /tmp/
+
+ # Log files
+ *.log
+
+ # Mac OS files
+ .DS_Store
+
+ # Temp audio files
+ *.wav.tmp
+
+ # Audio files
+ *.wav
+
+ # Exclude data directories except README files
+ torgo_original/data/*
+ !torgo_original/data/README.md
+ torgo-synthetic/data/*
+ !torgo-synthetic/data/README.md
.gradio/certificate.pem ADDED
@@ -0,0 +1,31 @@
+ -----BEGIN CERTIFICATE-----
+ MIIFazCCA1OgAwIBAgIRAIIQz7DSQONZRGPgu2OCiwAwDQYJKoZIhvcNAQELBQAw
+ TzELMAkGA1UEBhMCVVMxKTAnBgNVBAoTIEludGVybmV0IFNlY3VyaXR5IFJlc2Vh
+ cmNoIEdyb3VwMRUwEwYDVQQDEwxJU1JHIFJvb3QgWDEwHhcNMTUwNjA0MTEwNDM4
+ WhcNMzUwNjA0MTEwNDM4WjBPMQswCQYDVQQGEwJVUzEpMCcGA1UEChMgSW50ZXJu
+ ZXQgU2VjdXJpdHkgUmVzZWFyY2ggR3JvdXAxFTATBgNVBAMTDElTUkcgUm9vdCBY
+ MTCCAiIwDQYJKoZIhvcNAQEBBQADggIPADCCAgoCggIBAK3oJHP0FDfzm54rVygc
+ h77ct984kIxuPOZXoHj3dcKi/vVqbvYATyjb3miGbESTtrFj/RQSa78f0uoxmyF+
+ 0TM8ukj13Xnfs7j/EvEhmkvBioZxaUpmZmyPfjxwv60pIgbz5MDmgK7iS4+3mX6U
+ A5/TR5d8mUgjU+g4rk8Kb4Mu0UlXjIB0ttov0DiNewNwIRt18jA8+o+u3dpjq+sW
+ T8KOEUt+zwvo/7V3LvSye0rgTBIlDHCNAymg4VMk7BPZ7hm/ELNKjD+Jo2FR3qyH
+ B5T0Y3HsLuJvW5iB4YlcNHlsdu87kGJ55tukmi8mxdAQ4Q7e2RCOFvu396j3x+UC
+ B5iPNgiV5+I3lg02dZ77DnKxHZu8A/lJBdiB3QW0KtZB6awBdpUKD9jf1b0SHzUv
+ KBds0pjBqAlkd25HN7rOrFleaJ1/ctaJxQZBKT5ZPt0m9STJEadao0xAH0ahmbWn
+ OlFuhjuefXKnEgV4We0+UXgVCwOPjdAvBbI+e0ocS3MFEvzG6uBQE3xDk3SzynTn
+ jh8BCNAw1FtxNrQHusEwMFxIt4I7mKZ9YIqioymCzLq9gwQbooMDQaHWBfEbwrbw
+ qHyGO0aoSCqI3Haadr8faqU9GY/rOPNk3sgrDQoo//fb4hVC1CLQJ13hef4Y53CI
+ rU7m2Ys6xt0nUW7/vGT1M0NPAgMBAAGjQjBAMA4GA1UdDwEB/wQEAwIBBjAPBgNV
+ HRMBAf8EBTADAQH/MB0GA1UdDgQWBBR5tFnme7bl5AFzgAiIyBpY9umbbjANBgkq
+ hkiG9w0BAQsFAAOCAgEAVR9YqbyyqFDQDLHYGmkgJykIrGF1XIpu+ILlaS/V9lZL
+ ubhzEFnTIZd+50xx+7LSYK05qAvqFyFWhfFQDlnrzuBZ6brJFe+GnY+EgPbk6ZGQ
+ 3BebYhtF8GaV0nxvwuo77x/Py9auJ/GpsMiu/X1+mvoiBOv/2X/qkSsisRcOj/KK
+ NFtY2PwByVS5uCbMiogziUwthDyC3+6WVwW6LLv3xLfHTjuCvjHIInNzktHCgKQ5
+ ORAzI4JMPJ+GslWYHb4phowim57iaztXOoJwTdwJx4nLCgdNbOhdjsnvzqvHu7Ur
+ TkXWStAmzOVyyghqpZXjFaH3pO3JLF+l+/+sKAIuvtd7u+Nxe5AW0wdeRlN8NwdC
+ jNPElpzVmbUq4JUagEiuTDkHzsxHpFKVK7q4+63SM1N95R1NbdWhscdCb+ZAJzVc
+ oyi3B43njTOQ5yOf+1CceWxG1bQVs5ZufpsMljq4Ui0/1lvh+wjChP4kqKOJ2qxq
+ 4RgqsahDYVvTH9w7jXbyLeiNdd8XM2w9U/t7y0Ff/9yi0GE44Za4rF2LN9d11TPA
+ mRGunUHBcnWEvgJBQl9nJEiU0Zsnvgc/ubhPgXRR4Xq37Z0j4r7g1SgEEzwxA57d
+ emyPxgcYxn/eR44/KJ4EBs+lVDR3veyJm+kXQ99b21/+jh5Xos1AnX5iItreGCc=
+ -----END CERTIFICATE-----
README.md ADDED
@@ -0,0 +1,39 @@
+ ---
+ title: Speech Evaluation Experiment
+ emoji: 👀
+ colorFrom: yellow
+ colorTo: pink
+ sdk: gradio
+ sdk_version: 5.31.0
+ app_file: app.py
+ pinned: false
+ license: apache-2.0
+ ---
+
+ # Speech Evaluation Experiment
+
+ This application allows users to evaluate synthetic speech samples against original samples, rating their naturalness and intelligibility.
+
+ ## Requirements
+
+ All dependencies are listed in `requirements.txt`.
+
+ ## Setup
+
+ 1. Clone this repository
+ 2. Install dependencies: `pip install -r requirements.txt`
+ 3. Run the application: `python experiment1.py`
+
+ ## Data Structure
+
+ The experiment uses two datasets:
+ - `torgo_original` - Contains original speech samples
+ - `torgo-synthetic` - Contains synthetic speech samples
+
+ Each dataset has its own `metadata.csv` file that describes the audio files.
+
+ ## Deployment
+
+ This application is designed to work with Hugging Face Spaces or similar platforms that support Gradio applications.
+
+ Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
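The metadata layout the README refers to can be sketched with the standard `csv` module. The column names below are the ones `experiment1.py` actually reads (`file_name`, `original_file`, `original_speaker`, `synthetic_speaker`), and the original file name is taken from its example list; the synthetic file name and speaker values are invented for illustration:

```python
import csv
import io

# Hypothetical snippet of torgo-synthetic/metadata.csv; only the column names
# are taken from the code, the row values are made up.
synthetic_csv = io.StringIO(
    "file_name,original_file,original_speaker,synthetic_speaker\n"
    "data/synth_0001.wav,data/F_F03_Session3_0164.wav,F03,VoiceA\n"
)

rows = list(csv.DictReader(synthetic_csv))
print(rows[0]["original_file"])  # → data/F_F03_Session3_0164.wav
```

The `original_file` column is what lets the app pair each synthetic clip with the natural recording it was generated from.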
app.py ADDED
@@ -0,0 +1,12 @@
+ import os
+ from experiment1 import create_experiment_interface
+
+ # Create the Gradio interface
+ demo = create_experiment_interface()
+
+ # Launch the app
+ if __name__ == "__main__":
+     # For local development
+     demo.launch()
+
+ # For Hugging Face Spaces, the demo variable will be used automatically
deploy_instructions.md ADDED
@@ -0,0 +1,46 @@
+ # Deployment Instructions for Hugging Face Spaces
+
+ Follow these steps to deploy your speech evaluation app to Hugging Face Spaces:
+
+ ## 1. Create a Hugging Face Account
+ - Go to https://huggingface.co/join if you don't have an account
+
+ ## 2. Create a New Space
+ - Go to https://huggingface.co/spaces
+ - Click "Create new Space"
+ - Choose "Gradio" as the SDK
+ - Name your Space (e.g., "speech-evaluation-experiment")
+ - Set visibility (Public or Private)
+ - Click "Create Space"
+
+ ## 3. Push Your Code to the Space
+ Run these commands in your terminal:
+
+ ```bash
+ # Initialize git repository (if not already done)
+ git init
+
+ # Add all files
+ git add .
+
+ # Commit changes
+ git commit -m "Initial commit for deployment"
+
+ # Add Hugging Face Space as remote (replace YOUR_USERNAME with your Hugging Face username)
+ git remote add space https://huggingface.co/spaces/YOUR_USERNAME/speech-evaluation-experiment
+
+ # Push to Hugging Face
+ git push --force space main
+ ```
+
+ ## 4. Monitor Deployment
+ - Go to your Space at https://huggingface.co/spaces/YOUR_USERNAME/speech-evaluation-experiment
+ - Watch the build logs to see when your app is deployed
+
+ ## 5. Custom Domain (Optional)
+ - You can set up a custom domain in the Space settings
+
+ ## Important Notes
+ - Make sure all audio files are included in the repository
+ - The app.py file will be automatically detected and run
+ - `requirements.txt` will be used to install dependencies
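The notes above depend on a `requirements.txt` that is not shown in this commit. Based purely on the modules `experiment1.py` imports, a plausible sketch might look like the following (the `gradio` pin matches the `sdk_version` in README.md; other pins are left open as an assumption):

```
gradio==5.31.0
pandas
numpy
soundfile
librosa
noisereduce
```

Pinning exact versions of the audio libraries is advisable once the app is working, so Space rebuilds stay reproducible.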
deploy_to_huggingface.md ADDED
@@ -0,0 +1,42 @@
+ # Deploy to Hugging Face Spaces with Token Authentication
+
+ ## Step 1: Create a Hugging Face Access Token
+
+ 1. Go to https://huggingface.co/settings/tokens
+ 2. Click "New token"
+ 3. Give it a name (e.g., "Speech Evaluation App")
+ 4. Set permissions to "Write"
+ 5. Click "Generate a token"
+ 6. Copy the token (you'll only see it once)
+
+ ## Step 2: Deploy to Hugging Face Spaces
+
+ Run these commands in your terminal:
+
+ ```bash
+ # Initialize git repository
+ git init
+
+ # Add all files
+ git add .
+
+ # Commit changes
+ git commit -m "Initial deployment"
+
+ # Add Hugging Face Space as remote
+ git remote add origin https://huggingface.co/spaces/kesbeast23/speech-evaluation-experiment
+
+ # Push to Hugging Face
+ git push -u origin main
+ ```
+
+ When prompted for username and password:
+ - Username: your Hugging Face username
+ - Password: paste your access token (not your account password)
+
+ ## Step 3: Check Deployment
+
+ Once the push is complete, visit your Space at:
+ https://huggingface.co/spaces/kesbeast23/speech-evaluation-experiment
+
+ It may take a few minutes for the app to build and deploy.
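If you would rather avoid the interactive password prompt, one common alternative is to embed the token in the remote URL. This is a sketch, not part of the commit above: the username matches the Space owner, but `hf_xxx` is a placeholder, and note that this approach stores the token in plain text in `.git/config`:

```bash
# Hypothetical values - substitute your own username and token
HF_USERNAME="kesbeast23"
HF_TOKEN="hf_xxx"            # never commit a real token
SPACE_URL="https://${HF_USERNAME}:${HF_TOKEN}@huggingface.co/spaces/${HF_USERNAME}/speech-evaluation-experiment"
echo "$SPACE_URL"
# git remote set-url origin "$SPACE_URL"
# git push -u origin main
```

Revoke and rotate the token from https://huggingface.co/settings/tokens if it is ever exposed.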
experiment1.py ADDED
@@ -0,0 +1,662 @@
+ import gradio as gr
+ import os
+ import random
+ import pandas as pd
+ from datetime import datetime
+ import numpy as np
+ import uuid
+ import soundfile as sf
+ import librosa
+ import noisereduce as nr
+ import tempfile
+ import atexit
+ import shutil
+
+ # Constants
+ # Get absolute paths
+ WORKSPACE_ROOT = os.path.dirname(os.path.abspath(__file__))
+ ORIGINAL_DATA_DIR = os.path.join(WORKSPACE_ROOT, "torgo_original")
+ SYNTHETIC_DATA_DIR = os.path.join(WORKSPACE_ROOT, "torgo-synthetic")
+ RESULTS_FILE = os.path.join(WORKSPACE_ROOT, "experiment_Results.csv")
+ TEMP_DIR = os.path.join(tempfile.gettempdir(), "speech_evaluation")
+
+ # Create directories if they don't exist
+ os.makedirs(TEMP_DIR, exist_ok=True)
+ os.makedirs(os.path.join(ORIGINAL_DATA_DIR, "data"), exist_ok=True)
+ os.makedirs(os.path.join(SYNTHETIC_DATA_DIR, "data"), exist_ok=True)
+
+ # Track generated temp files for cleanup
+ temp_files = []
+
+ # Flag to check if running in demo mode (no audio files)
+ DEMO_MODE = True
+
+ # Check if we're in demo mode (no audio files)
+ def check_demo_mode():
+     original_data_path = os.path.join(ORIGINAL_DATA_DIR, "data")
+     synthetic_data_path = os.path.join(SYNTHETIC_DATA_DIR, "data")
+
+     # Check if data directories exist and contain files
+     if (os.path.exists(original_data_path) and len(os.listdir(original_data_path)) > 0 and
+             os.path.exists(synthetic_data_path) and len(os.listdir(synthetic_data_path)) > 0):
+         return False
+     return True
+
+ # Set demo mode flag
+ DEMO_MODE = check_demo_mode()
+ if DEMO_MODE:
+     print("Running in DEMO MODE - No audio files found")
+
+ # Register cleanup function to run on exit
+ def cleanup_temp_files():
+     """Remove temporary files and directory on exit"""
+     for temp_file in temp_files:
+         try:
+             if os.path.exists(temp_file):
+                 os.remove(temp_file)
+         except Exception as e:
+             print(f"Error removing temp file {temp_file}: {e}")
+
+     try:
+         if os.path.exists(TEMP_DIR):
+             shutil.rmtree(TEMP_DIR)
+     except Exception as e:
+         print(f"Error removing temp directory {TEMP_DIR}: {e}")
+
+ atexit.register(cleanup_temp_files)
+
+ # Sample type mapping
+ SAMPLE_TYPE_MAPPING = {
+     "Original": "Natural",  # For display purposes
+     "Natural": "Original"   # For database storage
+ }
+
+ # Define columns for results DataFrame
+ COLUMNS = [
+     'timestamp', 'participant_id', 'sample_id', 'sample_type',
+     'naturalness_rating', 'intelligibility_rating', 'comments',
+     'transcription', 'original_speaker', 'synthetic_speaker',
+     'participant_guess', 'guess_correct'
+ ]
+
+ # Initialize results DataFrame
+ try:
+     results_df = pd.read_csv(RESULTS_FILE)
+     # Verify columns match expected structure
+     if list(results_df.columns) != COLUMNS:
+         results_df = pd.DataFrame(columns=COLUMNS)
+         results_df.to_csv(RESULTS_FILE, index=False)
+ except (pd.errors.EmptyDataError, FileNotFoundError):
+     # Create new DataFrame if file is empty or doesn't exist
+     results_df = pd.DataFrame(columns=COLUMNS)
+     results_df.to_csv(RESULTS_FILE, index=False)
+
+ # Read metadata files
+ original_metadata = pd.read_csv(os.path.join(ORIGINAL_DATA_DIR, "metadata.csv"))
+ synthetic_metadata = pd.read_csv(os.path.join(SYNTHETIC_DATA_DIR, "metadata.csv"))
+
+ # Set a fixed random seed for reproducibility
+ RANDOM_SEED = 42
+ random.seed(RANDOM_SEED)
+
+ def convert_display_type_to_storage(display_type):
+     """Convert display sample type to storage type"""
+     if display_type == "Natural":
+         return "Original"
+     return display_type
+
+ def convert_storage_type_to_display(storage_type):
+     """Convert storage sample type to display type"""
+     if storage_type == "Original":
+         return "Natural"
+     return storage_type
+
+ def get_audio_path(file_path, is_original=True):
+     """Convert metadata file path to actual audio file path"""
+     # Remove 'data/' prefix if present
+     file_path = file_path.replace('data/', '')
+
+     # Construct absolute path
+     if is_original:
+         return os.path.join(ORIGINAL_DATA_DIR, "data", file_path)
+     else:
+         return os.path.join(SYNTHETIC_DATA_DIR, "data", file_path)
+
+ def verify_audio_file(file_path):
+     """Verify that audio file exists and is readable"""
+     if DEMO_MODE:
+         # In demo mode, pretend all files exist
+         return True
+
+     try:
+         if os.path.exists(file_path):
+             data, samplerate = sf.read(file_path)
+             return True
+         return False
+     except Exception:
+         return False
+
+ def generate_participant_id():
+     """Generate a unique participant ID"""
+     # Get existing participant IDs
+     existing_ids = set()
+     if os.path.exists(RESULTS_FILE):
+         try:
+             results = pd.read_csv(RESULTS_FILE)
+             if not results.empty:
+                 existing_ids = set(results['participant_id'].unique())
+         except pd.errors.EmptyDataError:
+             pass
+
+     # Find the next available number
+     counter = 1
+     while f"P{counter:03d}" in existing_ids:
+         counter += 1
+
+     return f"P{counter:03d}"
+
+ def preprocess_audio(file_path):
+     """Remove background noise from the audio file and return a temporary file path"""
+     global temp_files
+
+     if DEMO_MODE:
+         # In demo mode, return a placeholder empty audio file
+         temp_path = os.path.join(TEMP_DIR, f"demo_{uuid.uuid4()}.wav")
+         # Create a short silent wav file
+         sr = 16000
+         silent_audio = np.zeros(int(sr * 1.5))  # 1.5 seconds of silence
+         sf.write(temp_path, silent_audio, sr)
+         temp_files.append(temp_path)
+         return temp_path
+
+     try:
+         # Load audio file
+         audio, sr = librosa.load(file_path, sr=None)
+
+         # Apply noise reduction
+         reduced_noise = nr.reduce_noise(y=audio, sr=sr)
+
+         # Create a temporary file to store the noise-reduced audio
+         temp_path = os.path.join(TEMP_DIR, f"processed_{os.path.basename(file_path)}")
+         sf.write(temp_path, reduced_noise, sr)
+
+         # Track the temp file for later cleanup
+         temp_files.append(temp_path)
+
+         return temp_path
+     except Exception as e:
+         print(f"Error preprocessing audio: {e}")
+         # Create a silent audio file in case of error
+         temp_path = os.path.join(TEMP_DIR, f"error_{uuid.uuid4()}.wav")
+         sr = 16000
+         silent_audio = np.zeros(int(sr * 1.5))  # 1.5 seconds of silence
+         sf.write(temp_path, silent_audio, sr)
+         temp_files.append(temp_path)
+         return temp_path
+
+ # Create a fixed set of original samples and their synthetic versions
+ def create_sample_pairs():
+     """Create a selection of original samples and their synthetic versions
+     - Each original sample is included only once
+     - All synthetic versions of each original sample are included"""
+
+     # Extract speaker IDs from file names (format: data/X_YYY_Session...)
+     original_metadata['speaker_id'] = original_metadata['file_name'].apply(
+         lambda x: x.split('_')[1] if '_' in x else 'unknown'
+     )
+
+     # Get unique speakers from the extracted speaker IDs
+     original_speakers = original_metadata['speaker_id'].unique()
+     print(f"Found {len(original_speakers)} unique original speakers: {original_speakers}")
+
+     # First, identify files that have synthetic versions
+     original_files_with_synthetic = synthetic_metadata['original_file'].unique()
+     print(f"Found {len(original_files_with_synthetic)} original files that have synthetic versions")
+
+     # Group by original file to structure the experiment properly
+     organized_samples = []
+
+     # Set of selected original files (to avoid duplicates)
+     selected_original_files = set()
+
+     # First approach: Select specific example samples from the user's data
+     example_samples = [
+         "data/F_F03_Session3_0164.wav",  # "sing"
+         "data/F_F03_Session3_0170.wav",  # "leak"
+         "data/F_F03_Session3_0158.wav"   # "brought"
+     ]
+
+     for orig_file in example_samples:
+         if orig_file in original_metadata['file_name'].values and orig_file not in selected_original_files:
+             # Find the transcription for this original file
+             orig_row = original_metadata[original_metadata['file_name'] == orig_file].iloc[0]
+
+             # Find all synthetic versions of this original file
+             matching_synthetic = synthetic_metadata[
+                 synthetic_metadata['original_file'] == orig_file
+             ]
+
+             if not matching_synthetic.empty:
+                 # Verify original file exists
+                 orig_path = get_audio_path(orig_file, is_original=True)
+                 if verify_audio_file(orig_path):
+                     # Create a group with one original and all its synthetic versions
+                     group = {
+                         'original': {
+                             'file': orig_file,
+                             'path': orig_path,
+                             'transcription': orig_row['transcription'],
+                         },
+                         'synthetic': []
+                     }
+
+                     # Add synthetic versions
+                     for _, synth_row in matching_synthetic.iterrows():
+                         synth_path = get_audio_path(synth_row['file_name'], is_original=False)
+                         if verify_audio_file(synth_path):
+                             group['synthetic'].append({
+                                 'file': synth_row['file_name'],
+                                 'path': synth_path,
+                                 'transcription': orig_row['transcription'],
+                                 'original_speaker': synth_row['original_speaker'],
+                                 'synthetic_speaker': synth_row['synthetic_speaker']
+                             })
+
+                     # Only add group if it has synthetic versions
+                     if group['synthetic']:
+                         organized_samples.append(group)
+                         selected_original_files.add(orig_file)
+                         print(f"Added example group: {orig_file} with {len(group['synthetic'])} synthetic versions")
+
+     # Second approach: If needed, add more samples from other original speakers
+     if len(organized_samples) < 7:  # Aim for at least 7 original samples
+         # Filter original metadata to only include files that have synthetic versions
+         filterable_originals = original_metadata[
+             original_metadata['file_name'].isin(original_files_with_synthetic) &
+             ~original_metadata['file_name'].isin(selected_original_files)
+         ]
+
+         # Select samples from each speaker
+         for speaker in original_speakers:
+             # Skip if we already have enough samples
+             if len(organized_samples) >= 7:
+                 break
+
+             speaker_samples = filterable_originals[filterable_originals['speaker_id'] == speaker]
+
+             # Skip if no samples for this speaker have synthetic versions
+             if len(speaker_samples) == 0:
+                 print(f"No additional samples with synthetic versions for speaker {speaker}")
+                 continue
+
+             # Select one sample per speaker
+             selected_sample = speaker_samples.sample(n=1, random_state=RANDOM_SEED).iloc[0]
+             orig_file = selected_sample['file_name']
+
+             # Skip if already selected
+             if orig_file in selected_original_files:
+                 continue
+
+             # Find all synthetic versions of this original file
+             matching_synthetic = synthetic_metadata[
+                 synthetic_metadata['original_file'] == orig_file
+             ]
+
+             # Verify original file exists
+             orig_path = get_audio_path(orig_file, is_original=True)
+             if verify_audio_file(orig_path):
+                 # Create a group with one original and all its synthetic versions
+                 group = {
+                     'original': {
+                         'file': orig_file,
+                         'path': orig_path,
+                         'transcription': selected_sample['transcription'],
+                     },
+                     'synthetic': []
+                 }
+
+                 # Add synthetic versions
+                 for _, synth_row in matching_synthetic.iterrows():
+                     synth_path = get_audio_path(synth_row['file_name'], is_original=False)
+                     if verify_audio_file(synth_path):
+                         group['synthetic'].append({
+                             'file': synth_row['file_name'],
+                             'path': synth_path,
+                             'transcription': selected_sample['transcription'],
+                             'original_speaker': synth_row['original_speaker'],
+                             'synthetic_speaker': synth_row['synthetic_speaker']
+                         })
+
+                 # Only add group if it has synthetic versions
+                 if group['synthetic']:
+                     organized_samples.append(group)
+                     selected_original_files.add(orig_file)
+                     print(f"Added additional group: {orig_file} with {len(group['synthetic'])} synthetic versions")
+
+     # Now flatten the organized samples into a list of samples to play in sequence
+     playback_sequence = []
+
+     for group in organized_samples:
+         # First add the original
+         playback_sequence.append({
+             'is_original': True,
+             'file_name': group['original']['file'],
+             'file_path': group['original']['path'],
+             'transcription': group['original']['transcription'],
+             'original_speaker': '',
+             'synthetic_speaker': ''
+         })
+
+         # Then add all synthetic versions
+         for synth in group['synthetic']:
+             playback_sequence.append({
+                 'is_original': False,
+                 'file_name': synth['file'],
+                 'file_path': synth['path'],
+                 'transcription': synth['transcription'],
+                 'original_speaker': synth['original_speaker'],
+                 'synthetic_speaker': synth['synthetic_speaker']
+             })
+
+     # Print statistics
+     print(f"Created sequence with {len(playback_sequence)} samples:")
+     print(f"- {len(organized_samples)} original samples")
+     print(f"- {len(playback_sequence) - len(organized_samples)} synthetic versions")
+
+     if len(playback_sequence) == 0:
+         print("WARNING: No samples were created. Please check metadata files.")
+
+     return playback_sequence
+
+ # Initialize sample playback sequence
+ playback_sequence = create_sample_pairs()
+ print(f"Created playback sequence with {len(playback_sequence)} samples")
+ current_sample_index = 0
+
+ def save_rating(participant_id, sample_id, sample_type, naturalness, intelligibility, comments,
+                 transcription, original_speaker, synthetic_speaker, participant_guess):
+     """Save the rating to the CSV file"""
+     global results_df  # results_df is reassigned below
+
+     timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
+
+     # Map Yes/No to Original/Synthetic
+     if participant_guess == "Yes":
+         storage_guess = "Original"
+     elif participant_guess == "No":
+         storage_guess = "Synthetic"
+     else:
+         storage_guess = ""
+
+     # Check if guess was correct
+     guess_correct = storage_guess.lower() == sample_type.lower()
+
+     # Get current accuracy
+     participant_results = results_df[results_df['participant_id'] == participant_id]
+     current_accuracy = participant_results['guess_correct'].mean() * 100 if not participant_results.empty else 100
+
+     # Create feedback message
+     display_type = convert_storage_type_to_display(sample_type)
+     feedback = f"System Feedback: Your guess was {'correct' if guess_correct else 'incorrect'} (It was {display_type}). Current accuracy: {current_accuracy:.1f}%"
+
+     # Save user comments as is; don't combine them with the system feedback
+     new_row = {
+         'timestamp': timestamp,
+         'participant_id': participant_id,
+         'sample_id': sample_id,
+         'sample_type': sample_type,
+         'naturalness_rating': naturalness,
+         'intelligibility_rating': intelligibility,
+         'comments': comments or "",
+         'transcription': transcription,
+         'original_speaker': original_speaker,
+         'synthetic_speaker': synthetic_speaker,
+         'participant_guess': storage_guess,
+         'guess_correct': guess_correct
+     }
+
+     results_df = pd.concat([results_df, pd.DataFrame([new_row])], ignore_index=True)
+     results_df.to_csv(RESULTS_FILE, index=False)
+     return feedback
+
+ def create_experiment_interface():
+     """Create the Gradio interface for the experiment"""
+
+     with gr.Blocks(title="Dysarthric Speech Evaluation") as demo:
+         gr.Markdown("""
+         # Dysarthric Speech Evaluation Experiment
+
+         Welcome to the experiment! You will be asked to evaluate speech samples and determine if they are natural recordings or synthetic (computer-generated) speech.
+
+         ## Instructions:
+         1. Listen to each audio sample carefully (background noise has been reduced for better clarity)
+         2. Guess whether the sample is Natural (real human recording) or Synthetic (computer-generated)
+         3. Rate the naturalness and intelligibility on a scale of 1-5
+         4. Add any comments about the speech sample (optional)
+         5. Click 'Submit Rating' to save your evaluation and see if your guess was correct
+
+         ## Rating Scale:
+         - 1: Poor/Unintelligible
+         - 2: Fair
+         - 3: Good
+         - 4: Very Good
+         - 5: Excellent/Highly Intelligible
+
+         Note: After each submission, feedback about your guess and current accuracy will appear in the system feedback area.
+         """)
+
+         # State variables
+         current_participant_id = gr.State(value=generate_participant_id())
+
+         with gr.Row():
+             with gr.Column():
+                 participant_id_display = gr.Textbox(
+                     label="Participant ID",
+                     interactive=False
+                 )
+
+                 # Add progress indicator
+                 progress_text = gr.Textbox(
+                     label="Progress",
+                     interactive=False,
+                     value="Progress: 0/0 samples"
+                 )
+
+                 sample_id = gr.Textbox(label="Sample ID", visible=False)
+                 sample_type = gr.Textbox(label="True Sample Type", visible=False)
+                 transcription = gr.Textbox(label="Transcription (What should be said)")
+                 original_speaker = gr.Textbox(label="Original Speaker", visible=False)
+                 synthetic_speaker = gr.Textbox(label="Synthetic Speaker", visible=False)
+
+                 audio_player = gr.Audio(
+                     label="Speech Sample",
+                     type="filepath",
+                     format="wav",
+                     autoplay=False
+                 )
+
+                 participant_guess = gr.Radio(
+                     choices=["Yes", "No"],
+                     label="Does this audio sound natural to you?",
+                     value=None
+                 )
+
+                 # Declare sliders directly (no gr.Row wrappers)
+                 naturalness = gr.Slider(
+                     minimum=1,
+                     maximum=5,
+                     step=1,
+                     value=3,
+                     label="Naturalness Rating",
+                     info="Rate how natural/human-like the speech sounds",
+                     interactive=True
+                 )
+                 intelligibility = gr.Slider(
+                     minimum=1,
+                     maximum=5,
+                     step=1,
+                     value=3,
+                     label="Intelligibility Rating",
+                     info="Rate how easy it is to understand the speech",
+                     interactive=True
+                 )
+
+                 comments = gr.Textbox(
+                     label="Additional Comments",
+                     placeholder="Enter any observations about the speech sample.",
+                     lines=5
+                 )
+
+                 # Add a status textbox below comments for feedback
+                 status = gr.Textbox(
+                     label="Status / System Feedback",
+                     interactive=False,
+                     lines=2
+                 )
+
+                 submit_btn = gr.Button("Submit Rating", variant="primary")
+                 next_btn = gr.Button("Next Sample", variant="secondary")
+
+         def reset_interface():
+             """Reset interface elements to default values"""
+             return {
+                 participant_guess: None,
+                 naturalness: 3,
+                 intelligibility: 3,
+                 comments: ""
+             }
+
+         def load_next_sample(participant_id):
+             """Load the next sample from the playback sequence"""
+             global current_sample_index
+
+             if current_sample_index >= len(playback_sequence):
+                 participant_results = results_df[results_df['participant_id'] == participant_id]
+                 final_accuracy = participant_results['guess_correct'].mean() * 100 if not participant_results.empty else 0
+
+                 return [
+                     None,  # audio_player
+                     "Experiment Complete",  # sample_id
+                     "Complete",  # sample_type
+                     "",  # transcription
+                     3,  # naturalness
+                     3,  # intelligibility
+                     f"Experiment complete! Final accuracy: {final_accuracy:.1f}%",  # comments
+                     "Experiment complete!",  # status
+                     "",  # original_speaker
+                     "",  # synthetic_speaker
+                     participant_id,  # participant_id_display
+                     None,  # participant_guess
+                     f"Progress: {len(playback_sequence)}/{len(playback_sequence)} samples"  # progress_text
+                 ]
+
+             current_sample = playback_sequence[current_sample_index]
+
+             # Calculate progress information
+             progress_text = f"Progress: {current_sample_index + 1}/{len(playback_sequence)} samples"
+
+             # Get sample type and file path
+             sample_type_val = "Original" if current_sample['is_original'] else "Synthetic"
+             audio_file = current_sample['file_path']
+
+             # Apply noise reduction to the audio file
+             preprocessed_audio = preprocess_audio(audio_file)
+
+             # Move to next sample
+             current_sample_index += 1
+
+             return [
+                 preprocessed_audio,  # audio_player (now with reduced noise)
+                 current_sample['file_name'],  # sample_id
+                 sample_type_val,  # sample_type
+                 current_sample['transcription'],  # transcription
+                 3,  # naturalness
+                 3,  # intelligibility
+                 "",  # comments
+                 "",  # status
+                 current_sample['original_speaker'],  # original_speaker
+                 current_sample['synthetic_speaker'],  # synthetic_speaker
+                 participant_id,  # participant_id_display
+                 None,  # participant_guess
+                 progress_text  # progress_text
+             ]
+
+         def submit_rating(participant_id, sample_id, sample_type, naturalness, intelligibility, comments,
+                           transcription, original_speaker, synthetic_speaker, participant_guess):
+             """Handle rating submission"""
+             if not participant_guess:
+                 return [
+                     gr.skip(),  # audio_player
+                     gr.skip(),  # sample_id
+                     gr.skip(),  # sample_type
+                     gr.skip(),  # transcription
+                     gr.skip(),  # naturalness
+                     gr.skip(),  # intelligibility
+                     gr.skip(),  # comments (do not update)
+                     "Please make a guess before submitting",  # status
+                     gr.skip(),  # original_speaker
+                     gr.skip(),  # synthetic_speaker
+                     gr.skip(),  # participant_id_display
+                     gr.skip(),  # participant_guess
+                     gr.skip()  # progress_text
+                 ]
+
+             # Save and get feedback
+             feedback = save_rating(
+                 participant_id, sample_id, sample_type, naturalness, intelligibility, comments,
+                 transcription, original_speaker, synthetic_speaker, participant_guess
+             )
+
+             # Get next sample
+             next_outputs = load_next_sample(participant_id)
+
+             # The order in load_next_sample is:
+             # [audio_file, file_id, sample_type_val, transcription, naturalness, intelligibility,
+             #  comments, status, original_speaker, synthetic_speaker, participant_id, participant_guess, progress_text]
+
+             # Just update the status field (index 7) with the feedback
+             next_outputs[7] = feedback
+
+             return next_outputs
+
+         # Event handlers
+         submit_btn.click(
+             submit_rating,
625
+ inputs=[
626
+ current_participant_id, sample_id, sample_type, naturalness, intelligibility,
627
+ comments, transcription, original_speaker, synthetic_speaker, participant_guess
628
+ ],
629
+ outputs=[
630
+ audio_player, sample_id, sample_type, transcription, naturalness,
631
+ intelligibility, comments, status, original_speaker, synthetic_speaker,
632
+ participant_id_display, participant_guess, progress_text
633
+ ]
634
+ )
635
+
636
+ next_btn.click(
637
+ load_next_sample,
638
+ inputs=[current_participant_id],
639
+ outputs=[
640
+ audio_player, sample_id, sample_type, transcription, naturalness,
641
+ intelligibility, comments, status, original_speaker, synthetic_speaker,
642
+ participant_id_display, participant_guess, progress_text
643
+ ]
644
+ )
645
+
646
+ # Load first sample
647
+ demo.load(
648
+ load_next_sample,
649
+ inputs=[current_participant_id],
650
+ outputs=[
651
+ audio_player, sample_id, sample_type, transcription, naturalness,
652
+ intelligibility, comments, status, original_speaker, synthetic_speaker,
653
+ participant_id_display, participant_guess, progress_text
654
+ ]
655
+ )
656
+
657
+ return demo
658
+
659
+ # Create the interface
660
+ if __name__ == "__main__":
661
+ demo = create_experiment_interface()
662
+ demo.launch()
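The final-accuracy calculation in `load_next_sample` above can be sanity-checked in isolation. A minimal sketch, assuming a `results_df` with the same `participant_id` and `guess_correct` columns as the real results frame (the sample values here are purely illustrative):

```python
import pandas as pd

# Stand-in for results_df, using the same column names as the app
results_df = pd.DataFrame({
    "participant_id": ["p1", "p1", "p1", "p2"],
    "guess_correct": [True, False, True, True],
})

# Same expression as in load_next_sample: mean of booleans -> fraction correct
participant_results = results_df[results_df["participant_id"] == "p1"]
final_accuracy = participant_results["guess_correct"].mean() * 100 if not participant_results.empty else 0

print(f"Final accuracy: {final_accuracy:.1f}%")  # -> Final accuracy: 66.7%
```

The `if not participant_results.empty else 0` guard matters: `.mean()` on an empty Series returns `NaN`, which would otherwise leak into the completion message.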
experiment_Results.csv ADDED
@@ -0,0 +1 @@
+ timestamp,participant_id,sample_id,sample_type,naturalness_rating,intelligibility_rating,comments,transcription,original_speaker,synthetic_speaker,participant_guess,guess_correct
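Rows matching this header are presumably appended by the `save_rating` helper referenced in `experiment1.py`. A stdlib-only sketch of writing one such row (all field values below are illustrative, not taken from a real session):

```python
import csv
import io
from datetime import datetime, timezone

# Column order must match the experiment_Results.csv header exactly
FIELDS = ["timestamp", "participant_id", "sample_id", "sample_type",
          "naturalness_rating", "intelligibility_rating", "comments",
          "transcription", "original_speaker", "synthetic_speaker",
          "participant_guess", "guess_correct"]

row = {
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "participant_id": "demo-participant",        # hypothetical values
    "sample_id": "F_F03_Session1_0355.wav",
    "sample_type": "Synthetic",
    "naturalness_rating": 3,
    "intelligibility_rating": 4,
    "comments": "",
    "transcription": "air",
    "original_speaker": "F03",
    "synthetic_speaker": "F03",
    "participant_guess": "Synthetic",
    "guess_correct": True,                       # guess matched sample_type
}

# Written to an in-memory buffer here; the app would open the CSV in append mode
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=FIELDS)
writer.writeheader()
writer.writerow(row)
print(buf.getvalue())
```

Using `csv.DictWriter` keeps the field order fixed and quotes free-text fields such as `comments` and `transcription` automatically.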
gradio_app.py ADDED
@@ -0,0 +1,9 @@
+ import gradio as gr
+ from experiment1 import create_experiment_interface
+ 
+ # Create the Gradio interface
+ demo = create_experiment_interface()
+ 
+ # For Gradio Cloud deployment
+ if __name__ == "__main__":
+     demo.launch()
requirements.txt ADDED
@@ -0,0 +1,7 @@
+ gradio>=4.44.0
+ pandas
+ numpy
+ soundfile
+ librosa
+ noisereduce
+ # uuid is part of the Python standard library and does not need to be installed
torgo-synthetic/data/README.md ADDED
@@ -0,0 +1,13 @@
+ # Synthetic Audio Files
+ 
+ This directory should contain the synthetic audio files for the experiment.
+ 
+ Due to file size constraints, the audio files are not included in the git repository and must be uploaded separately.
+ 
+ ## Audio File Format
+ 
+ The audio files should be WAV files named according to the pattern in the metadata.csv file.
+ 
+ ## Demo Mode
+ 
+ The application will run in demo mode if no audio files are found in this directory.
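The demo-mode check described above can be expressed as: load metadata.csv, keep only rows whose audio file actually exists, and fall back to demo mode when nothing remains. A stdlib sketch under those assumptions (the real app's helper names may differ):

```python
import csv
from pathlib import Path

def available_samples(metadata_csv, base_dir):
    """Return metadata rows whose referenced audio file exists under base_dir.

    An empty result means no audio was uploaded, so the app
    should fall back to demo mode.
    """
    with open(metadata_csv, newline="") as f:
        rows = list(csv.DictReader(f))
    # file_name values are relative paths such as "data/F_F03_Session1_0355.wav"
    return [r for r in rows if (Path(base_dir) / r["file_name"]).exists()]
```

A caller would then do something like `demo_mode = not available_samples("torgo-synthetic/metadata.csv", "torgo-synthetic")`.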
torgo-synthetic/metadata.csv ADDED
@@ -0,0 +1,99 @@
+ file_name,transcription,original_speaker,synthetic_speaker,original_file
+ data/F_F03_Session1_0355.wav,air,F03,F03,data/F_F03_Session1_0038.wav
+ data/F_F04_Session1_0355.wav,air,F03,F04,data/F_F03_Session1_0038.wav
+ data/F_F01_Session1_0355.wav,air,F03,F01,data/F_F03_Session1_0038.wav
+ data/F_M01_Session1_0355.wav,air,F03,M01,data/F_F03_Session1_0038.wav
+ data/F_M04_Session1_0355.wav,air,F03,M04,data/F_F03_Session1_0038.wav
+ data/F_M03_Session1_0355.wav,air,F03,M03,data/F_F03_Session1_0038.wav
+ data/F_M02_Session1_0355.wav,air,F03,M02,data/F_F03_Session1_0038.wav
+ data/F_F03_Session1_0483.wav,when he speaks his voice is just a bit cracked and quivers a trifle,F03,F03,data/F_F03_Session1_0095.wav
+ data/F_F04_Session1_0483.wav,when he speaks his voice is just a bit cracked and quivers a trifle,F03,F04,data/F_F03_Session1_0095.wav
+ data/F_F01_Session1_0483.wav,when he speaks his voice is just a bit cracked and quivers a trifle,F03,F01,data/F_F03_Session1_0095.wav
+ data/F_M01_Session1_0483.wav,when he speaks his voice is just a bit cracked and quivers a trifle,F03,M01,data/F_F03_Session1_0095.wav
+ data/F_M04_Session1_0483.wav,when he speaks his voice is just a bit cracked and quivers a trifle,F03,M04,data/F_F03_Session1_0095.wav
+ data/F_M03_Session1_0483.wav,when he speaks his voice is just a bit cracked and quivers a trifle,F03,M03,data/F_F03_Session1_0095.wav
+ data/F_M02_Session1_0483.wav,when he speaks his voice is just a bit cracked and quivers a trifle,F03,M02,data/F_F03_Session1_0095.wav
+ data/F_F03_Session1_0840.wav,the quick brown fox jumps over the lazy dog,F04,F03,data/F_F04_Session1_0065.wav
+ data/F_F04_Session1_0840.wav,the quick brown fox jumps over the lazy dog,F04,F04,data/F_F04_Session1_0065.wav
+ data/F_F01_Session1_0840.wav,the quick brown fox jumps over the lazy dog,F04,F01,data/F_F04_Session1_0065.wav
+ data/F_M01_Session1_0840.wav,the quick brown fox jumps over the lazy dog,F04,M01,data/F_F04_Session1_0065.wav
+ data/F_M04_Session1_0840.wav,the quick brown fox jumps over the lazy dog,F04,M04,data/F_F04_Session1_0065.wav
+ data/F_M03_Session1_0840.wav,the quick brown fox jumps over the lazy dog,F04,M03,data/F_F04_Session1_0065.wav
+ data/F_M02_Session1_0840.wav,the quick brown fox jumps over the lazy dog,F04,M02,data/F_F04_Session1_0065.wav
+ data/F_F03_Session1_0959.wav,knew,F04,F03,data/F_F04_Session1_0008.wav
+ data/F_F04_Session1_0959.wav,knew,F04,F04,data/F_F04_Session1_0008.wav
+ data/F_F01_Session1_0959.wav,knew,F04,F01,data/F_F04_Session1_0008.wav
+ data/F_M01_Session1_0959.wav,knew,F04,M01,data/F_F04_Session1_0008.wav
+ data/F_M04_Session1_0959.wav,knew,F04,M04,data/F_F04_Session1_0008.wav
+ data/F_M03_Session1_0959.wav,knew,F04,M03,data/F_F04_Session1_0008.wav
+ data/F_M02_Session1_0959.wav,knew,F04,M02,data/F_F04_Session1_0008.wav
+ data/F_F03_Session1_0973.wav,tear,F01,F03,data/F_F01_Session1_0007.wav
+ data/F_F04_Session1_0973.wav,tear,F01,F04,data/F_F01_Session1_0007.wav
+ data/F_F01_Session1_0973.wav,tear,F01,F01,data/F_F01_Session1_0007.wav
+ data/F_M01_Session1_0973.wav,tear,F01,M01,data/F_F01_Session1_0007.wav
+ data/F_M04_Session1_0973.wav,tear,F01,M04,data/F_F01_Session1_0007.wav
+ data/F_M03_Session1_0973.wav,tear,F01,M03,data/F_F01_Session1_0007.wav
+ data/F_M02_Session1_0973.wav,tear,F01,M02,data/F_F01_Session1_0007.wav
+ data/F_F03_Session1_1071.wav,storm,F01,F03,data/F_F01_Session1_0019.wav
+ data/F_F04_Session1_1071.wav,storm,F01,F04,data/F_F01_Session1_0019.wav
+ data/F_F01_Session1_1071.wav,storm,F01,F01,data/F_F01_Session1_0019.wav
+ data/F_M01_Session1_1071.wav,storm,F01,M01,data/F_F01_Session1_0019.wav
+ data/F_M04_Session1_1071.wav,storm,F01,M04,data/F_F01_Session1_0019.wav
+ data/F_M03_Session1_1071.wav,storm,F01,M03,data/F_F01_Session1_0019.wav
+ data/F_M02_Session1_1071.wav,storm,F01,M02,data/F_F01_Session1_0019.wav
+ data/F_F03_Session1_1143.wav,don't ask me to carry an oily rag like that,M01,F03,data/M_M01_Session1_0044.wav
+ data/F_F04_Session1_1143.wav,don't ask me to carry an oily rag like that,M01,F04,data/M_M01_Session1_0044.wav
+ data/F_F01_Session1_1143.wav,don't ask me to carry an oily rag like that,M01,F01,data/M_M01_Session1_0044.wav
+ data/F_M01_Session1_1143.wav,don't ask me to carry an oily rag like that,M01,M01,data/M_M01_Session1_0044.wav
+ data/F_M04_Session1_1143.wav,don't ask me to carry an oily rag like that,M01,M04,data/M_M01_Session1_0044.wav
+ data/F_M03_Session1_1143.wav,don't ask me to carry an oily rag like that,M01,M03,data/M_M01_Session1_0044.wav
+ data/F_M02_Session1_1143.wav,don't ask me to carry an oily rag like that,M01,M02,data/M_M01_Session1_0044.wav
+ data/F_F03_Session1_1173.wav,fee,M01,F03,data/M_M01_Session1_0008.wav
+ data/F_F04_Session1_1173.wav,fee,M01,F04,data/M_M01_Session1_0008.wav
+ data/F_F01_Session1_1173.wav,fee,M01,F01,data/M_M01_Session1_0008.wav
+ data/F_M01_Session1_1173.wav,fee,M01,M01,data/M_M01_Session1_0008.wav
+ data/F_M04_Session1_1173.wav,fee,M01,M04,data/M_M01_Session1_0008.wav
+ data/F_M03_Session1_1173.wav,fee,M01,M03,data/M_M01_Session1_0008.wav
+ data/F_M02_Session1_1173.wav,fee,M01,M02,data/M_M01_Session1_0008.wav
+ data/F_F03_Session1_1540.wav,both injuries were to the same leg,M04,F03,data/M_M04_Session2_0298.wav
+ data/F_F04_Session1_1540.wav,both injuries were to the same leg,M04,F04,data/M_M04_Session2_0298.wav
+ data/F_F01_Session1_1540.wav,both injuries were to the same leg,M04,F01,data/M_M04_Session2_0298.wav
+ data/F_M01_Session1_1540.wav,both injuries were to the same leg,M04,M01,data/M_M04_Session2_0298.wav
+ data/F_M04_Session1_1540.wav,both injuries were to the same leg,M04,M04,data/M_M04_Session2_0298.wav
+ data/F_M03_Session1_1540.wav,both injuries were to the same leg,M04,M03,data/M_M04_Session2_0298.wav
+ data/F_M02_Session1_1540.wav,both injuries were to the same leg,M04,M02,data/M_M04_Session2_0298.wav
+ data/F_F03_Session1_1837.wav,fee,M04,F03,data/M_M04_Session1_0024.wav
+ data/F_F04_Session1_1837.wav,fee,M04,F04,data/M_M04_Session1_0024.wav
+ data/F_F01_Session1_1837.wav,fee,M04,F01,data/M_M04_Session1_0024.wav
+ data/F_M01_Session1_1837.wav,fee,M04,M01,data/M_M04_Session1_0024.wav
+ data/F_M04_Session1_1837.wav,fee,M04,M04,data/M_M04_Session1_0024.wav
+ data/F_M03_Session1_1837.wav,fee,M04,M03,data/M_M04_Session1_0024.wav
+ data/F_M02_Session1_1837.wav,fee,M04,M02,data/M_M04_Session1_0024.wav
+ data/F_F03_Session1_1916.wav,know,M03,F03,data/M_M03_Session2_0003.wav
+ data/F_F04_Session1_1916.wav,know,M03,F04,data/M_M03_Session2_0003.wav
+ data/F_F01_Session1_1916.wav,know,M03,F01,data/M_M03_Session2_0003.wav
+ data/F_M01_Session1_1916.wav,know,M03,M01,data/M_M03_Session2_0003.wav
+ data/F_M04_Session1_1916.wav,know,M03,M04,data/M_M03_Session2_0003.wav
+ data/F_M03_Session1_1916.wav,know,M03,M03,data/M_M03_Session2_0003.wav
+ data/F_M02_Session1_1916.wav,know,M03,M02,data/M_M03_Session2_0003.wav
+ data/F_F03_Session1_2002.wav,but he always answers banana oil,M03,F03,data/M_M03_Session2_0074.wav
+ data/F_F04_Session1_2002.wav,but he always answers banana oil,M03,F04,data/M_M03_Session2_0074.wav
+ data/F_F01_Session1_2002.wav,but he always answers banana oil,M03,F01,data/M_M03_Session2_0074.wav
+ data/F_M01_Session1_2002.wav,but he always answers banana oil,M03,M01,data/M_M03_Session2_0074.wav
+ data/F_M04_Session1_2002.wav,but he always answers banana oil,M03,M04,data/M_M03_Session2_0074.wav
+ data/F_M03_Session1_2002.wav,but he always answers banana oil,M03,M03,data/M_M03_Session2_0074.wav
+ data/F_M02_Session1_2002.wav,but he always answers banana oil,M03,M02,data/M_M03_Session2_0074.wav
+ data/F_F03_Session1_2531.wav,he dresses himself in an ancient black frock coat,M02,F03,data/M_M02_Session1_0044.wav
+ data/F_F04_Session1_2531.wav,he dresses himself in an ancient black frock coat,M02,F04,data/M_M02_Session1_0044.wav
+ data/F_F01_Session1_2531.wav,he dresses himself in an ancient black frock coat,M02,F01,data/M_M02_Session1_0044.wav
+ data/F_M01_Session1_2531.wav,he dresses himself in an ancient black frock coat,M02,M01,data/M_M02_Session1_0044.wav
+ data/F_M04_Session1_2531.wav,he dresses himself in an ancient black frock coat,M02,M04,data/M_M02_Session1_0044.wav
+ data/F_M03_Session1_2531.wav,he dresses himself in an ancient black frock coat,M02,M03,data/M_M02_Session1_0044.wav
+ data/F_M02_Session1_2531.wav,he dresses himself in an ancient black frock coat,M02,M02,data/M_M02_Session1_0044.wav
+ data/F_F03_Session1_2631.wav,pat,M02,F03,data/M_M02_Session1_0009.wav
+ data/F_F04_Session1_2631.wav,pat,M02,F04,data/M_M02_Session1_0009.wav
+ data/F_F01_Session1_2631.wav,pat,M02,F01,data/M_M02_Session1_0009.wav
+ data/F_M01_Session1_2631.wav,pat,M02,M01,data/M_M02_Session1_0009.wav
+ data/F_M04_Session1_2631.wav,pat,M02,M04,data/M_M02_Session1_0009.wav
+ data/F_M03_Session1_2631.wav,pat,M02,M03,data/M_M02_Session1_0009.wav
+ data/F_M02_Session1_2631.wav,pat,M02,M02,data/M_M02_Session1_0009.wav
torgo_original/data/README.md ADDED
@@ -0,0 +1,13 @@
+ # Original Audio Files
+ 
+ This directory should contain the original audio files for the experiment.
+ 
+ Due to file size constraints, the audio files are not included in the git repository and must be uploaded separately.
+ 
+ ## Audio File Format
+ 
+ The audio files should be WAV files named according to the pattern in the metadata.csv file.
+ 
+ ## Demo Mode
+ 
+ The application will run in demo mode if no audio files are found in this directory.
torgo_original/metadata.csv ADDED
@@ -0,0 +1,15 @@
+ file_name,transcription
+ data/M_M02_Session1_0009.wav,pat
+ data/M_M01_Session1_0008.wav,fee
+ data/F_F04_Session1_0065.wav,the quick brown fox jumps over the lazy dog
+ data/F_F01_Session1_0019.wav,storm
+ data/F_F01_Session1_0007.wav,tear
+ data/F_F03_Session1_0095.wav,when he speaks his voice is just a bit cracked and quivers a trifle
+ data/M_M03_Session2_0003.wav,know
+ data/M_M04_Session1_0024.wav,fee
+ data/M_M03_Session2_0074.wav,but he always answers banana oil
+ data/M_M02_Session1_0044.wav,he dresses himself in an ancient black frock coat
+ data/M_M01_Session1_0044.wav,don't ask me to carry an oily rag like that
+ data/F_F03_Session1_0038.wav,air
+ data/M_M04_Session2_0298.wav,both injuries were to the same leg
+ data/F_F04_Session1_0008.wav,knew
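The two metadata files are linked by the `original_file` column in torgo-synthetic/metadata.csv, which points at a `file_name` in torgo_original/metadata.csv. A hedged pandas sketch of the consistency check one might run (a couple of rows are inlined here rather than read from disk):

```python
import pandas as pd

# Two synthetic rows and their originals, copied from the metadata files above
synthetic = pd.DataFrame({
    "file_name": ["data/F_F03_Session1_0355.wav", "data/F_F04_Session1_0959.wav"],
    "transcription": ["air", "knew"],
    "original_file": ["data/F_F03_Session1_0038.wav", "data/F_F04_Session1_0008.wav"],
})
original = pd.DataFrame({
    "file_name": ["data/F_F03_Session1_0038.wav", "data/F_F04_Session1_0008.wav"],
    "transcription": ["air", "knew"],
})

# Join each synthetic row to the original recording it was generated from
merged = synthetic.merge(original, left_on="original_file", right_on="file_name",
                         suffixes=("_syn", "_orig"))

# Every synthetic clip should carry the same transcription as its source
assert (merged["transcription_syn"] == merged["transcription_orig"]).all()
```

Run over the full files, a non-empty mismatch here would flag a metadata error before the listening experiment starts.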