# TREA 2.0 Pipeline

Audio question-answering dataset generator using ESC-50. Creates four task types: COUNT, DURATION, ORDER, and VOLUME.

## Quick Start

```bash
# 1. Install dependencies
pip install -r requirements.txt

# 2. Preprocess ESC-50 (required for the DURATION task only)
python preprocess_esc50.py --config config.yaml

# 3. Generate datasets
python main.py --config config.yaml
```

## Configuration

Edit `config.yaml` to set:

- **Task duration**: `task_duration_size` (hours) per task
- **Clip duration range**: `min_clip_duration` to `max_clip_duration` (seconds)
- **ESC-50 paths**: point to your ESC-50 dataset location
- **Enable/disable tasks**: set `enabled: true/false` for each task

## Key Files

- **`config.yaml`** - All configuration parameters
- **`main.py`** - Pipeline entry point (runs all tasks)
- **`preprocess_esc50.py`** - Preprocesses ESC-50 for the DURATION task
- **`tasks/task_*.py`** - Individual task generators

## Tasks

| Task | Question | Example |
|------|----------|---------|
| **COUNT** | "How many unique sounds?" | Audio with distinct sound types |
| **DURATION** | "Which sound is longest/shortest?" | Compare sound durations |
| **ORDER** | "Which sound is first/last/after X?" | Temporal sequence questions |
| **VOLUME** | "Which sound is loudest/softest?" | Loudness comparison |

## Output Structure

```
output/{task}/
├── audios/*.wav          # Generated audio files
├── {task}_mcq.csv        # Multiple-choice questions
├── {task}_open_text.csv  # Open-ended questions
└── {task}_metadata.csv   # Detailed metadata
```

## Shell Scripts (Quick)

Use the provided shell helpers for simple runs.
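For reference, a minimal sketch of what such a wrapper can look like. This is an illustration, not the shipped `run_pipeline.sh`: it only assembles and echoes the `main.py` command, so the translation of the wrapper's comma-separated `--tasks` value into `main.py`'s space-separated arguments is visible.

```bash
# Hypothetical wrapper sketch (bash): parse --config/--tasks/--output
# and build the corresponding main.py invocation.
run_pipeline() {
  local config="config.yaml" tasks="" output=""
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --config) config="$2"; shift 2 ;;
      --tasks)  tasks="$2";  shift 2 ;;
      --output) output="$2"; shift 2 ;;
      *) echo "Unknown option: $1" >&2; return 1 ;;
    esac
  done
  local cmd="python main.py --config $config"
  # main.py expects space-separated task names, so rewrite the
  # comma-separated list from the wrapper's CLI.
  [ -n "$tasks" ] && cmd="$cmd --tasks ${tasks//,/ }"
  [ -n "$output" ] && cmd="$cmd --output $output"
  # Echo instead of executing, so the sketch is side-effect free.
  echo "$cmd"
}

run_pipeline --tasks count,order
# → python main.py --config config.yaml --tasks count order
```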
Run the full pipeline (uses `python main.py` under the hood):

```bash
# Make executable and run (from pipeline/)
./run_pipeline.sh

# With custom config, tasks, and output
./run_pipeline.sh --config my_config.yaml --tasks count,order --output ./my_dataset
```

Run LLM answer generation across splits (uses `llm_answer_generator.py`):

```bash
# Processes open_text CSVs across the splits/tasks defined in the script
./run_llm_answers_all.sh

# Or run per-file with the helper script directly
python llm_answer_generator.py --input /path/to/count_open_text.csv --mode open_text --task count
```

## Advanced Usage

```bash
# Run specific tasks only
python main.py --tasks count order

# Use a custom config
python main.py --config my_config.yaml

# Custom output directory
python main.py --output /path/to/output

# Preprocess with custom parameters
python preprocess_esc50.py --config config.yaml \
    --threshold-strategy noise_floor \
    --noise-floor-percentile 2.0 \
    --noise-floor-delta-db 5.0
```

## Documentation

See **`DOCS.md`** for complete technical documentation, including:

- Mathematical formulations
- Detailed algorithm explanations
- Configuration parameter reference
- Preprocessing pipeline details
- Balancing mechanisms

## Requirements

- Python 3.8+
- pydub
- numpy
- pandas
- tqdm
- pyyaml
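A quick way to verify that the requirements above are importable before running the pipeline. This is a convenience sketch, not part of the repository; note that the `pyyaml` package is imported under the name `yaml`.

```bash
# Resolve the interpreter (some systems only ship python3).
PY=$(command -v python || command -v python3)

# Report whether a single Python module can be imported.
check_dep() {
  if "$PY" -c "import $1" 2>/dev/null; then
    echo "$1: ok"
  else
    echo "$1: MISSING"
  fi
}

# pyyaml installs under the import name "yaml".
for mod in pydub numpy pandas tqdm yaml; do
  check_dep "$mod"
done
```

Any line reporting `MISSING` points at a package to install via `pip install -r requirements.txt`.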