deringeorge committed on
Commit
d901d8e
·
1 Parent(s): ce45408

docs: add PR template, CI workflow, architecture docs, research log, and module docstrings

.github/pull_request_template.md ADDED
@@ -0,0 +1,79 @@
1
+ <!--
2
+ Before submitting, link any related issue in the Description section below
3
+ by writing "Closes #issue_number" so GitHub auto-closes it on merge.
4
+ -->
5
+
6
+ ## Description
7
+
8
+ <!-- What does this PR do and why? Be specific: what problem does it solve,
9
+ what approach was taken, and what alternatives were considered and ruled out. -->
10
+
11
+ Closes #
12
+
13
+ ---
14
+
15
+ ## PR Type
16
+
17
+ - [ ] Bug fix
18
+ - [ ] New feature
19
+ - [ ] Data contribution (URN database)
20
+ - [ ] Documentation update
21
+ - [ ] Model / algorithm improvement
22
+ - [ ] Performance improvement
23
+ - [ ] Tests
24
+ - [ ] Other (describe below)
25
+
26
+ ---
27
+
28
+ ## Modules Affected
29
+
30
+ - [ ] Module 1 — Design Input Engine
31
+ - [ ] Module 2 — URN Prediction Core
32
+ - [ ] Module 3 — IMO Compliance Checker
33
+ - [ ] Module 4 — Marine Bioacoustic Impact Module
34
+ - [ ] Module 5 — Mitigation Recommendation Engine
35
+ - [ ] Module 6 — Open URN Database
36
+ - [ ] General / Infrastructure
37
+
38
+ ---
39
+
40
+ ## Scientific Basis
41
+
42
+ <!-- Required if this PR changes any prediction algorithm, bioacoustic
43
+ calculation, compliance logic, or data processing pipeline.
44
+ Provide either a citation to a published paper or a clear technical
45
+ explanation of the methodology. If this PR does not touch any of the
46
+ above, write N/A. -->
47
+
48
+ **N/A**
49
+
50
+ ---
51
+
52
+ ## Testing
53
+
54
+ <!-- Describe what tests were run, what the results were, and whether any
55
+ edge cases were covered. Paste the relevant pytest output below. -->
56
+ ```
57
+ # pytest output here
58
+ ```
59
+
60
+ ---
61
+
62
+ ## Screenshots / Output
63
+
64
+ <!-- For UI changes or visualization updates: paste a screenshot.
65
+ For model or algorithm changes: paste a before/after metric comparison.
66
+ For data contributions: paste a sample row showing the expected data format.
67
+ Not applicable for documentation-only PRs. -->
68
+
69
+ ---
70
+
71
+ ## Pre-Merge Checklist
72
+
73
+ - [ ] Code follows PEP 8 and all new functions include type hints
74
+ - [ ] All new functions have docstrings (Google-style)
75
+ - [ ] `pytest` passes locally with no new failures
76
+ - [ ] No raw data files, `.pt` model weights, or `.env` files are included
77
+ - [ ] `CHANGELOG.md` updated under `[Unreleased]` with a summary of changes
78
+ - [ ] If any module algorithm or methodology was changed, a scientific reference
79
+ or technical justification is provided in the Scientific Basis section above
.github/workflows/tests.yml ADDED
@@ -0,0 +1,67 @@
+ # SONARIS Test Suite
+ # Runs on every push to any branch and every PR targeting main.
+
+ name: SONARIS Test Suite
+
+ on:
+   push:
+     branches: ["**"] # Catch all branches, not just main
+   pull_request:
+     branches: [main]
+
+ jobs:
+   test:
+     name: Run pytest
+     runs-on: ubuntu-latest
+
+     steps:
+       # Step 1: Pull the full repository into the runner
+       - name: Checkout repository
+         uses: actions/checkout@v4
+
+       # Step 2: Set up Python 3.11.
+       # Local development uses 3.14 on Windows, but 3.11 has stable wheel
+       # support on Ubuntu CI runners. All SONARIS dependencies install cleanly
+       # on 3.11 without C++ compiler errors.
+       - name: Set up Python 3.11
+         uses: actions/setup-python@v5
+         with:
+           python-version: "3.11"
+
+       # Step 3: Cache pip's download cache so repeated runs don't re-download
+       # packages. Cache key includes the OS and the hash of requirements.txt
+       # so the cache invalidates automatically when dependencies change.
+       - name: Cache pip dependencies
+         uses: actions/cache@v4
+         with:
+           path: ~/.cache/pip
+           key: ${{ runner.os }}-pip-${{ hashFiles('requirements.txt') }}
+           restore-keys: |
+             ${{ runner.os }}-pip-
+
+       # Step 4: Install all project dependencies from requirements.txt.
+       # Upgrade pip first to avoid resolver warnings on older bundled versions.
+       - name: Install dependencies
+         run: |
+           python -m pip install --upgrade pip
+           pip install -r requirements.txt
+           pip install pytest pytest-cov
+
+       # Step 5: Run pytest.
+       # --tb=short keeps failure output readable without being verbose.
+       # --cov generates a coverage report for the modules/ package.
+       # The trailing || clause turns pytest exit code 5 (no tests collected) into
+       # exit 0, so the CI badge stays green during early development.
+       - name: Run tests
+         run: |
+           pytest tests/ -v --tb=short --cov=modules --cov-report=xml \
+             || ([ $? -eq 5 ] && echo "No tests collected, exiting cleanly." && exit 0)
+
+       # Step 6: Upload the XML coverage report as a workflow artifact so it
+       # can be retrieved and fed into a coverage service (e.g. Codecov) later.
+       - name: Upload coverage report
+         uses: actions/upload-artifact@v4
+         with:
+           name: coverage-report
+           path: coverage.xml
+           if-no-files-found: ignore # Don't fail if no coverage file was generated
docs/architecture.md CHANGED
@@ -1,217 +1,363 @@
1
- # SONARIS Project Architecture
2
 
3
- **Full Name:** Ship-Ocean Noise Acoustic Radiated Intelligence System
4
- **License:** MIT
5
- **Python:** 3.10+
6
 
7
- This document describes the complete folder structure of the SONARIS repository and the intended contents of every file.
8
 
9
  ---
10
 
11
- ## Folder Tree
12
-
13
  ```
14
- sonaris/
15
- |
16
- |-- app.py # Streamlit entry point; assembles all module UIs into one application
17
- |-- requirements.txt # All Python dependencies, pinned with minimum version constraints
18
- |-- .env.example # Template for environment variables (database path, API keys, debug flags)
19
- |-- .gitignore # Ignores venv, __pycache__, .env, model weights, large data files
20
- |-- LICENSE # MIT License text
21
- |-- README.md # Project overview, architecture summary, setup instructions
22
- |-- CONTRIBUTING.md # Contribution guidelines: code standards, PR process, scientific validation
23
- |-- CHANGELOG.md # Version history and release notes
24
- |
25
- |-- docs/
26
- | |-- architecture.md # This file: folder structure and per-file descriptions
27
- | |-- methodology.md # Scientific methodology: FW-H equation, MFCC application, BIS scoring
28
- | |-- imo_guidelines.md # Summary of IMO MEPC.1/Circ.906 Rev.1 (2024) limits used in Module 3
29
- | |-- datasets.md # Dataset descriptions, download instructions, and citation information
30
- | |-- api_reference.md # Python API reference for headless use of each module
31
- | |-- deployment.md # Instructions for deploying to Hugging Face Spaces and Streamlit Cloud
32
- |
33
- |-- sonaris/ # Main Python package
34
- | |-- __init__.py # Package init; exposes URNPredictor, ComplianceChecker, BioacousticImpact
35
- | |
36
- | |-- module1_input/ # Module 1: Design Input Engine
37
- | | |-- __init__.py
38
- | | |-- input_schema.py # Pydantic models defining and validating all vessel input parameters
39
- | | |-- input_ui.py # Streamlit UI component for Module 1: form fields, units, help tooltips
40
- | | |-- parameter_utils.py # Derived parameter calculations (e.g. advance ratio J from RPM and speed)
41
- | | |-- validators.py # Range checks and cross-parameter validation (e.g. propeller diameter vs draft)
42
- | |
43
- | |-- module2_urn/ # Module 2: URN Prediction Core
44
- | | |-- __init__.py
45
- | | |-- predictor.py # Main URNPredictor class: orchestrates physics and AI layers, returns spectrum
46
- | | |-- physics_layer.py # OpenFOAM run management and libAcoustics FW-H post-processing wrapper
47
- | | |-- ai_layer.py # Loads trained model, runs inference, returns residual correction to physics output
48
- | | |-- feature_engineering.py # MFCC extraction, spectral envelope fitting, 1/3-octave band aggregation
49
- | | |-- model_architecture.py # PyTorch definition of the 1D-CNN + LSTM URN prediction network
50
- | | |-- train.py # Training script: data loading, loss function, optimizer, checkpoint saving
51
- | | |-- evaluate.py # Evaluation script: computes per-band MAE, RMSE against held-out test set
52
- | | |-- uncertainty.py # Monte Carlo dropout for prediction confidence interval estimation
53
- | | |-- spectrum_utils.py # Conversion utilities: Pa to dB, 1/1-octave to 1/3-octave, frequency array generation
54
- | |
55
- | |-- module3_compliance/ # Module 3: IMO Compliance Checker
56
- | | |-- __init__.py
57
- | | |-- checker.py # ComplianceChecker class: loads limits, compares spectrum, returns verdict per band
58
- | | |-- imo_limits.py # Hard-coded URN limits from MEPC.1/Circ.906 Rev.1 by vessel type and frequency band
59
- | | |-- report_generator.py # Builds the downloadable PDF compliance report using ReportLab
60
- | | |-- compliance_ui.py # Streamlit UI component: compliance bar chart, pass/fail table, download button
61
- | |
62
- | |-- module4_bioacoustics/ # Module 4: Marine Bioacoustic Impact Module
63
- | | |-- __init__.py
64
- | | |-- impact_scorer.py # BioacousticImpact class: computes BIS per species group from input spectrum
65
- | | |-- audiograms.py # Digitized hearing sensitivity curves for all 5 functional hearing groups
66
- | | |-- masking_model.py # Psychoacoustic masking model: excitation patterns, masking threshold calculation
67
- | | |-- harmonic_overlap.py # Finds ship tonal peaks within +/- 1/3 octave of published species call frequencies
68
- | | |-- species_calls.py # Published frequency ranges for vocalizations of each target species group
69
- | | |-- bis_scoring.py # BIS formula: integrates masked proportion of species frequency range (0-100 scale)
70
- | | |-- bioacoustics_ui.py # Streamlit UI: spectrogram overlay, BIS gauges, species selection panel
71
- | |
72
- | |-- module5_mitigation/ # Module 5: Mitigation Recommendation Engine
73
- | | |-- __init__.py
74
- | | |-- recommender.py # MitigationRecommender class: takes compliance gaps and BIS, returns ranked actions
75
- | | |-- speed_optimizer.py # Predicts URN reduction as a function of speed reduction for given vessel type
76
- | | |-- propeller_advisor.py # Maps compliance gap magnitude to specific propeller geometry modification targets
77
- | | |-- hull_treatment.py # Recommends hull panel damping treatments based on dominant tonal frequencies
78
- | | |-- routing_advisor.py # Generates routing avoidance polygons around marine protected areas and known habitats
79
- | | |-- mitigation_ui.py # Streamlit UI: ranked recommendation cards with estimated dB reduction per action
80
- | |
81
- | |-- module6_database/ # Module 6: Open URN Database
82
- | | |-- __init__.py
83
- | | |-- models.py # SQLAlchemy ORM models: Ship, URNRecord, Submission, UserContribution
84
- | | |-- database.py # Database engine setup, session factory, connection handling
85
- | | |-- crud.py # Create, read, update, delete operations for all database tables
86
- | | |-- submission_pipeline.py # Validates, normalizes, and ingests community-submitted URN records
87
- | | |-- quality_control.py # Checks submissions against ShipsEar/QiandaoEar22 baseline distributions
88
- | | |-- search.py # Query functions: filter by vessel type, speed, frequency band, submission date
89
- | | |-- database_ui.py # Streamlit UI: search interface, submission form, record detail view
90
- | | |-- migrations/ # Alembic migration scripts directory
91
- | | |-- env.py # Alembic environment configuration
92
- | | |-- versions/ # Auto-generated migration version files go here
93
- | |
94
- | |-- shared/ # Shared utilities used by more than one module
95
- | |-- __init__.py
96
- | |-- constants.py # Physical constants, frequency band definitions, species group identifiers
97
- | |-- logging_config.py # Loguru logger configuration applied consistently across all modules
98
- | |-- config.py # Loads and exposes .env and config.yaml settings to all modules
99
- | |-- file_utils.py # Helpers for reading/writing WAV, CSV, JSON, and HDF5 files
100
- | |-- plot_utils.py # Shared Matplotlib/Plotly helper functions for consistent chart styling
101
- |
102
- |-- models/ # Trained model weights and metadata (git-ignored for large files)
103
- | |-- urn_predictor_v1.pt # Saved PyTorch model checkpoint after initial training run
104
- | |-- urn_predictor_v1_meta.json # Training metadata: dataset split, hyperparameters, validation metrics
105
- |
106
- |-- data/ # Local data storage (git-ignored except for structure and seed files)
107
- | |-- raw/ # Raw downloaded datasets, unmodified
108
- | | |-- shipsear/ # ShipsEar dataset audio files and metadata
109
- | | |-- qiandaoear22/ # QiandaoEar22 dataset audio files and metadata
110
- | | |-- audiograms/ # Published audiogram CSVs per species group
111
- | |
112
- | |-- processed/ # Preprocessed features ready for model training
113
- | | |-- mfcc_features.h5 # Extracted MFCC feature matrix for all training samples
114
- | | |-- octave_spectra.h5 # 1/3-octave spectra computed from all training audio files
115
- | | |-- labels.csv # Vessel type labels and metadata for each training sample
116
- | |
117
- | |-- seed/ # Small seed data committed to the repository
118
- | |-- imo_limits.json # IMO MEPC.1/Circ.906 Rev.1 limit tables in machine-readable form
119
- | |-- species_audiograms.json # Digitized audiogram data for all 5 functional hearing groups
120
- | |-- species_calls.json # Published vocalization frequency ranges per species group
121
- |
122
- |-- notebooks/ # Research and development notebooks
123
- | |-- 01_dataset_exploration.ipynb # Initial exploration of ShipsEar and QiandaoEar22 distributions
124
- | |-- 02_feature_engineering.ipynb # MFCC pipeline development and 1/3-octave band analysis
125
- | |-- 03_model_training.ipynb # URN prediction model training experiments and loss curves
126
- | |-- 04_compliance_validation.ipynb # Verification of compliance checker against known test cases
127
- | |-- 05_bioacoustic_analysis.ipynb # BIS scoring development and masking model calibration
128
- | |-- 06_mitigation_experiments.ipynb # Speed-noise relationship analysis for mitigation module
129
- | |-- 07_database_schema_design.ipynb # URN database schema development and query prototyping
130
- |
131
- |-- scripts/ # Standalone scripts for data preparation and model management
132
- | |-- download_shipsear.py # Downloads ShipsEar dataset from source and places it in data/raw/shipsear/
133
- | |-- download_qiandaoear.py # Downloads QiandaoEar22 dataset and places it in data/raw/qiandaoear22/
134
- | |-- preprocess_audio.py # Runs full audio-to-features pipeline and writes to data/processed/
135
- | |-- train_model.py # CLI wrapper around module2_urn/train.py for scheduled training runs
136
- | |-- export_model.py # Exports trained model to ONNX format for deployment environments
137
- | |-- init_database.py # Creates database schema and loads seed data on first setup
138
- | |-- seed_database.py # Populates URN database with curated example records for demonstration
139
- |
140
- |-- tests/ # Test suite
141
- | |-- __init__.py
142
- | |-- conftest.py # Shared pytest fixtures: sample vessel parameters, synthetic spectra, db session
143
- | |
144
- | |-- test_module1/
145
- | | |-- test_input_schema.py # Tests that valid and invalid vessel parameter inputs are handled correctly
146
- | | |-- test_validators.py # Tests for all cross-parameter validation rules
147
- | |
148
- | |-- test_module2/
149
- | | |-- test_feature_engineering.py # Tests MFCC output shape, 1/3-octave band count, and numerical stability
150
- | | |-- test_model_architecture.py # Tests model forward pass shape and output range
151
- | | |-- test_spectrum_utils.py # Tests dB conversion, band aggregation, and frequency array generation
152
- | |
153
- | |-- test_module3/
154
- | | |-- test_checker.py # Tests compliance verdicts against known pass and fail spectra
155
- | | |-- test_report_generator.py # Tests PDF generation and checks that required sections are present
156
- | |
157
- | |-- test_module4/
158
- | | |-- test_masking_model.py # Tests masking threshold outputs against published psychoacoustic reference values
159
- | | |-- test_bis_scoring.py # Tests BIS edge cases: zero noise, full masking, single frequency input
160
- | |
161
- | |-- test_module5/
162
- | | |-- test_recommender.py # Tests that recommendations are ranked and non-empty for all compliance scenarios
163
- | | |-- test_speed_optimizer.py # Tests speed-to-noise reduction curve against known ship data
164
- | |
165
- | |-- test_module6/
166
- | |-- test_crud.py # Tests all database read and write operations against an in-memory SQLite instance
167
- | |-- test_quality_control.py # Tests rejection of out-of-distribution and malformed URN submissions
168
- |
169
- |-- config/
170
- |-- config.yaml # Default configuration: model paths, database URL, logging level, band definitions
171
- |-- logging.yaml # Loguru handler configuration for file and console output
172
  ```
173
 
174
  ---
175
 
176
- ## File Count Summary
177
-
178
- | Directory | Files |
179
- |---|---|
180
- | Root | 8 |
181
- | docs/ | 6 |
182
- | sonaris/ (package) | 52 |
183
- | models/ | 2 |
184
- | data/ | 9 |
185
- | notebooks/ | 7 |
186
- | scripts/ | 8 |
187
- | tests/ | 17 |
188
- | config/ | 2 |
189
- | **Total** | **111** |
190
 
191
  ---
192
 
193
- ## Module to Directory Mapping
194
 
195
- | Module | Directory |
196
- |---|---|
197
- | Module 1: Design Input Engine | `sonaris/module1_input/` |
198
- | Module 2: URN Prediction Core | `sonaris/module2_urn/` |
199
- | Module 3: IMO Compliance Checker | `sonaris/module3_compliance/` |
200
- | Module 4: Marine Bioacoustic Impact | `sonaris/module4_bioacoustics/` |
201
- | Module 5: Mitigation Recommendation Engine | `sonaris/module5_mitigation/` |
202
- | Module 6: Open URN Database | `sonaris/module6_database/` |
203
- | Shared Utilities | `sonaris/shared/` |
204
 
205
  ---
206
 
207
- ## Key Design Decisions
208
 
209
- **Single package, modular internals.** All six modules live inside the `sonaris/` package. This means the Python API (`from sonaris import URNPredictor`) works without any knowledge of the internal module structure, while the internals remain cleanly separated.
 
 
 
 
 
210
 
211
- **Seed data is version-controlled; raw audio is not.** The `data/seed/` directory (IMO limits, audiograms, species call ranges) is committed to the repository so the tool works out of the box. Raw audio datasets are large and externally hosted; the download scripts in `scripts/` handle retrieval.
 
 
 
 
 
212
 
213
- **Model weights are not committed.** Trained `.pt` files live in `models/` which is git-ignored. The `scripts/train_model.py` script reproduces them from the processed dataset. A pre-trained checkpoint will be hosted separately on Hugging Face Hub.
 
 
 
 
214
 
215
- **Migrations directory is tracked.** The `sonaris/module6_database/migrations/` directory and its `env.py` are committed. Individual migration version files are generated by Alembic as the schema evolves and should also be committed.
 
 
 
 
 
216
 
217
- **One UI file per module.** Each module has a `_ui.py` file that defines its Streamlit component as a callable function. `app.py` imports and assembles these into a single multi-page application. This keeps UI logic out of the scientific core.
 
 
 
 
 
1
+ # SONARIS System Architecture
2
 
3
+ This document describes the technical structure of the SONARIS codebase, the
4
+ purpose of every file in the repository, the data flow between modules, and
5
+ the design decisions made during Phase 0 setup. It serves as the primary
6
+ reference for contributors onboarding to the project and will be cited in the
7
+ SONARIS research paper as the system architecture reference.
8
 
9
+ The six core modules form a linear processing pipeline: user-supplied vessel
10
+ parameters enter Module 1 and are progressively transformed into an acoustic
11
+ spectrum (Module 2), a compliance verdict (Module 3), a biological impact score
12
+ (Module 4), and a set of mitigation recommendations (Module 5). Every record
13
+ produced or consumed by this pipeline can optionally be written to or read from
14
+ the Open URN Database (Module 6). No module runs before the upstream outputs
15
+ it consumes are available, which keeps the data contract between
16
+ modules explicit and testable.
17
 
18
  ---
19
 
20
+ ## Repository Tree
 
21
  ```
+ SONARIS/
+ ├── app.py
+ ├── requirements.txt
+ ├── README.md
+ ├── CONTRIBUTING.md
+ ├── CODE_OF_CONDUCT.md
+ ├── CHANGELOG.md
+ ├── .env.example
+ ├── .gitignore
+ ├── config/
+ │   └── settings.py
+ ├── data/
+ │   ├── raw/
+ │   ├── processed/
+ │   └── databases/
+ ├── modules/
+ │   ├── __init__.py
+ │   ├── input_engine/
+ │   │   ├── __init__.py
+ │   │   └── design_input.py
+ │   ├── urn_prediction/
+ │   │   ├── __init__.py
+ │   │   ├── physics_layer.py
+ │   │   └── ai_layer.py
+ │   ├── imo_compliance/
+ │   │   ├── __init__.py
+ │   │   └── compliance_checker.py
+ │   ├── bioacoustic/
+ │   │   ├── __init__.py
+ │   │   ├── audiogram_data.py
+ │   │   └── bis_calculator.py
+ │   ├── mitigation/
+ │   │   ├── __init__.py
+ │   │   └── recommender.py
+ │   └── urn_database/
+ │       ├── __init__.py
+ │       └── db_manager.py
+ ├── models/
+ │   ├── trained/
+ │   └── architectures/
+ ├── notebooks/
+ │   └── 01_ShipsEar_EDA.ipynb
+ ├── tests/
+ │   ├── __init__.py
+ │   └── test_modules.py
+ ├── docs/
+ │   ├── architecture.md
+ │   ├── api_reference.md
+ │   └── research_notes.md
+ ├── ui/
+ │   ├── pages/
+ │   └── components/
+ └── .github/
+     ├── pull_request_template.md
+     ├── workflows/
+     │   └── tests.yml
+     └── ISSUE_TEMPLATE/
+         ├── bug_report.md
+         ├── feature_request.md
+         └── data_contribution.md
82
  ```
83
 
84
  ---
85
 
86
+ ## Per-File Descriptions
87
+
88
+ ### Root
89
+
90
+ **app.py:** Entry point for the Streamlit application. Initialises the UI, routes
91
+ user interactions to the appropriate module calls, and renders output visualizations.
92
+
93
+ **requirements.txt:** Pinned list of all Python dependencies for the project.
94
+ Used by both local development setup and CI.
95
+
96
+ **README.md:** Public-facing project overview covering installation, usage, module
97
+ descriptions, and contribution instructions.
98
+
99
+ **CONTRIBUTING.md:** Contribution guide covering branch naming, commit conventions,
100
+ code style requirements, and the PR review process.
101
+
102
+ **CODE_OF_CONDUCT.md:** Contributor Covenant code of conduct for the SONARIS
103
+ open-source community.
104
+
105
+ **CHANGELOG.md:** Versioned log of all changes to the project, following Keep a
106
+ Changelog format.
107
+
108
+ **.env.example:** Template of all required environment variables with placeholder
109
+ values. Actual `.env` file is never committed.
110
+
111
+ **.gitignore:** Specifies files and directories excluded from version control,
112
+ including `.env`, trained model weights, raw datasets, and Python cache files.
113
+
114
+ ### config/
115
+
116
+ **settings.py:** Central configuration for the application: database paths, model
117
+ paths, default operational parameters, and environment variable loading.
118
+
119
+ ### data/
120
+
121
+ **raw/:** Storage directory for unprocessed source datasets (ShipsEar,
122
+ QiandaoEar22). Not committed to version control.
123
+
124
+ **processed/:** Storage directory for cleaned and feature-extracted datasets
125
+ ready for model training or evaluation.
126
+
127
+ **databases/:** Storage directory for the SQLite development database file.
128
+
129
+ ### modules/
130
+
131
+ **modules/\_\_init\_\_.py:** Top-level package initialiser for the modules namespace.
132
+ Does not execute computation on import.
133
+
134
+ #### modules/input_engine/
135
+
136
+ **\_\_init\_\_.py:** Package initialiser for the Design Input Engine module.
137
+
138
+ **design_input.py:** Validates and structures all user-supplied vessel parameters
139
+ (hull coefficients, propeller geometry, engine type, speed) into a standardised
140
+ `VesselParameters` dataclass passed downstream.
141
+
142
+ #### modules/urn_prediction/
143
+
144
+ **\_\_init\_\_.py:** Package initialiser for the URN Prediction Core module.
145
+
146
+ **physics_layer.py:** Interfaces with OpenFOAM and libAcoustics to run
147
+ propeller cavitation simulations and extract the physics-based component of the
148
+ noise spectrum.
149
+
150
+ **ai_layer.py:** Loads the trained PyTorch neural network, runs inference on
151
+ the structured vessel parameters, and returns a predicted 1/3-octave band
152
+ spectrum in dB re 1 μPa at 1 m.
153
+
154
+ #### modules/imo_compliance/
155
+
156
+ **\_\_init\_\_.py:** Package initialiser for the IMO Compliance Checker module.
157
+
158
+ **compliance_checker.py:** Compares the predicted URN spectrum against the
159
+ vessel-type-specific limit tables from IMO MEPC.1/Circ.906 Rev.1 (2024) and
160
+ returns a structured `ComplianceResult` with per-band pass/fail flags.
161
+
162
+ #### modules/bioacoustic/
163
+
164
+ **\_\_init\_\_.py:** Package initialiser for the Marine Bioacoustic Impact Module.
165
+
166
+ **audiogram_data.py:** Contains the published audiogram sensitivity curves for
167
+ five marine mammal functional hearing groups: low-frequency cetaceans, mid-frequency
168
+ cetaceans, high-frequency cetaceans, phocid pinnipeds in water, and otariid pinnipeds
169
+ in water.
170
+
171
+ **bis_calculator.py:** Applies MFCC decomposition and spectral masking analysis
172
+ to quantify the overlap between the ship noise spectrum and each species group's
173
+ hearing sensitivity range, producing a Biological Interference Score per group.
174
+
175
+ #### modules/mitigation/
176
+
177
+ **\_\_init\_\_.py:** Package initialiser for the Mitigation Recommendation Engine.
178
+
179
+ **recommender.py:** Receives upstream outputs (URN spectrum, compliance result,
180
+ BIS scores) and generates ranked mitigation recommendations with estimated noise
181
+ reduction in dB for each option.
182
+
183
+ #### modules/urn_database/
184
+
185
+ **\_\_init\_\_.py:** Package initialiser for the Open URN Database module.
186
+
187
+ **db_manager.py:** Handles all database operations: schema initialisation, record
188
+ insertion, querying by vessel type or frequency band, and export to JSON or CSV.
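
The operations listed for `db_manager.py` could be sketched roughly as follows. This is not the committed implementation: the table name `urn_records`, its columns, and the helper names `init_db`, `insert_record`, and `query_by_vessel_type` are all hypothetical, and the sketch uses only the standard-library `sqlite3` and `json` modules.

```python
import json
import sqlite3
from typing import Dict, List

# Hypothetical schema: one row per record, spectrum stored as a JSON string.
SCHEMA = """
CREATE TABLE IF NOT EXISTS urn_records (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    vessel_type TEXT NOT NULL,
    spectrum_json TEXT NOT NULL
)
"""


def init_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def insert_record(conn: sqlite3.Connection, vessel_type: str,
                  spectrum: Dict[float, float]) -> int:
    """Store one spectrum; JSON object keys must be strings, so convert them."""
    payload = json.dumps({str(freq): level for freq, level in spectrum.items()})
    cur = conn.execute(
        "INSERT INTO urn_records (vessel_type, spectrum_json) VALUES (?, ?)",
        (vessel_type, payload),
    )
    conn.commit()
    return cur.lastrowid


def query_by_vessel_type(conn: sqlite3.Connection,
                         vessel_type: str) -> List[Dict[float, float]]:
    """Return all stored spectra for one vessel type, oldest first."""
    rows = conn.execute(
        "SELECT spectrum_json FROM urn_records WHERE vessel_type = ? ORDER BY id",
        (vessel_type,),
    ).fetchall()
    return [
        {float(freq): level for freq, level in json.loads(row[0]).items()}
        for row in rows
    ]
```

Parameterised queries (`?` placeholders) keep user-supplied vessel types out of the SQL string, which matters once community submissions are accepted.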
189
+
190
+ ### models/
191
+
192
+ **trained/:** Storage directory for serialised trained model weights (`.pt` files).
193
+ Not committed to version control.
194
+
195
+ **architectures/:** Python files defining the PyTorch neural network architectures
196
+ used in Module 2.
197
+
198
+ ### notebooks/
199
+
200
+ **01_ShipsEar_EDA.ipynb:** Exploratory data analysis notebook for the ShipsEar
201
+ dataset. Covers class distribution, spectrogram inspection, signal-to-noise
202
+ assessment, and feature extraction prototyping.
203
+
204
+ ### tests/
205
+
206
+ **tests/\_\_init\_\_.py:** Makes the tests directory a Python package.
207
+
208
+ **test_modules.py:** pytest test suite covering unit tests for all six modules.
209
+ Integration tests covering the full pipeline will be added here as modules mature.
210
+
211
+ ### docs/
212
+
213
+ **architecture.md:** This file. Technical reference for the repository structure,
214
+ data flow, module interfaces, and design decisions.
215
+
216
+ **api_reference.md:** Auto-generated or manually maintained documentation of all
217
+ public functions and classes across the six modules.
218
+
219
+ **research_notes.md:** Running scientific journal for the project. Logs dataset
220
+ assessments, methodology decisions, and literature references that feed into the
221
+ eventual SONARIS research paper.
222
+
223
+ ### ui/
224
+
225
+ **pages/:** Streamlit multi-page app files, one per major UI view.
226
+
227
+ **components/:** Reusable Streamlit UI components shared across pages.
228
+
229
+ ### .github/
230
+
231
+ **pull_request_template.md:** Template automatically loaded when a contributor
232
+ opens a pull request, including the pre-merge checklist and scientific basis
233
+ section.
234
+
235
+ **workflows/tests.yml:** GitHub Actions workflow that runs the pytest test suite
236
+ on every push and every pull request targeting main.
237
+
238
+ **ISSUE_TEMPLATE/bug_report.md:** Structured template for reporting bugs.
239
+
240
+ **ISSUE_TEMPLATE/feature_request.md:** Structured template for proposing new
241
+ features.
242
+
243
+ **ISSUE_TEMPLATE/data_contribution.md:** Structured template for contributing
244
+ URN measurement records to the Open URN Database.
245
+
246
+ ---
247
+
248
+ ## Data Flow
249
+
250
+ 1. The user supplies vessel parameters through the Streamlit UI or directly via
251
+ the Python API. Inputs include hull form coefficients (Cb, Cp, L/B ratio),
252
+ propeller geometry (blade count, pitch ratio, diameter), engine type, rated
253
+ RPM, and operational speed in knots.
254
+
255
+ 2. **Module 1 (Design Input Engine)** validates all inputs, applies physical
256
+ plausibility checks, and packages them into a `VesselParameters` dataclass.
257
+ This object is the sole input passed to Module 2.
258
+
259
+ 3. **Module 2 (URN Prediction Core)** receives the `VesselParameters` object.
260
+ The physics layer optionally runs an OpenFOAM simulation to produce a
261
+ physics-derived partial spectrum. The AI layer runs the trained PyTorch
262
+ model to produce a data-driven full spectrum. The two layers are fused into
263
+ a single `URNSpectrum` object: a dict mapping each 1/3-octave band center
264
+ frequency (Hz) to a level in dB re 1 μPa at 1 m.
265
+
266
+ 4. **Module 3 (IMO Compliance Checker)** receives the `URNSpectrum` and the
267
+ vessel type string from `VesselParameters`. It returns a `ComplianceResult`
268
+ object containing a per-band pass/fail dict, an overall compliance flag, and
269
+ the reference limit values used for comparison.
270
+
271
+ 5. **Module 4 (Marine Bioacoustic Impact Module)** receives the `URNSpectrum`.
272
+ It applies MFCC decomposition across the spectrum and computes spectral overlap
273
+ against each of the five audiogram curves. It returns a `BISResult` object:
274
+ a dict mapping each marine mammal group name to its Biological Interference
275
+ Score (0 to 100) and the frequency bands driving the interference.
276
+
277
+ 6. **Module 5 (Mitigation Recommendation Engine)** receives the `URNSpectrum`,
278
+ the `ComplianceResult`, and the `BISResult`. It returns a ranked list of
279
+ `MitigationRecommendation` objects, each containing a description, the
280
+ expected noise reduction in dB, and the applicable species groups or
281
+ frequency bands.
282
+
283
+ 7. All outputs can optionally be written to **Module 6 (Open URN Database)** as
284
+ a structured record containing vessel metadata, measurement conditions, and
285
+ the full `URNSpectrum`. Records are also queryable from the database to
286
+ populate the community dataset.
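
The `URNSpectrum` structure described in step 3 can be sketched as follows. This is a minimal illustration, assuming base-10 exact 1/3-octave center frequencies (1000 × 10^(n/10) Hz) across the 20 Hz to 20 kHz range; the helper names `third_octave_centers` and `make_flat_spectrum` are hypothetical and are not taken from the SONARIS codebase.

```python
def third_octave_centers(n_min: int = -17, n_max: int = 13) -> list[float]:
    """Exact base-10 1/3-octave band centers, 1000 * 10**(n / 10) Hz.

    n = -17 gives about 19.95 Hz (the nominal 20 Hz band) and n = 13 gives
    about 19.95 kHz (the nominal 20 kHz band).
    """
    return [1000.0 * 10.0 ** (n / 10.0) for n in range(n_min, n_max + 1)]


def make_flat_spectrum(level_db: float) -> dict[float, float]:
    """Build a placeholder URNSpectrum-style dict: center Hz -> dB re 1 uPa at 1 m."""
    return {fc: level_db for fc in third_octave_centers()}


# 150 dB is an arbitrary placeholder level, not a predicted value.
spectrum = make_flat_spectrum(150.0)
```

Under these assumptions the dict carries 31 bands. Whether the project keys the dict by exact or nominal center frequencies is not stated in this document, so downstream lookups should not rely on either convention without checking.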
287
 
288
  ---
289
 
290
+ ## Module Interfaces
291
+
292
+ **Module 1: Design Input Engine**
293
+ Input: Raw user-supplied values via form or dict. Hull coefficients, propeller
294
+ geometry, engine type, RPM, speed in knots.
295
+ Output: `VesselParameters` dataclass with validated and typed fields.
296
+ Key dependencies: `pydantic` or `dataclasses`, `scipy` for range validation.
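
A rough sketch of what a validated `VesselParameters` dataclass might look like, using only the standard library. The field names, the specific bounds, and the use of `ValueError` are assumptions made for illustration; the actual schema and exception type are defined in the project code, not here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VesselParameters:
    """Validated design inputs (field names and bounds are illustrative)."""
    block_coefficient: float      # Cb, dimensionless
    prismatic_coefficient: float  # Cp, dimensionless
    length_beam_ratio: float      # L/B
    blade_count: int
    pitch_ratio: float            # P/D
    propeller_diameter_m: float
    engine_type: str
    rated_rpm: float
    speed_knots: float

    def __post_init__(self) -> None:
        # Hull form coefficients are volume ratios and must lie in (0, 1].
        for name in ("block_coefficient", "prismatic_coefficient"):
            value = getattr(self, name)
            if not 0.0 < value <= 1.0:
                raise ValueError(f"{name} must be in (0, 1], got {value}")
        # The remaining numeric inputs must be strictly positive.
        for name in ("length_beam_ratio", "pitch_ratio",
                     "propeller_diameter_m", "rated_rpm", "speed_knots"):
            if getattr(self, name) <= 0:
                raise ValueError(f"{name} must be positive")
        # Placeholder lower bound; the project may choose a different one.
        if self.blade_count < 2:
            raise ValueError("blade_count must be at least 2")
```

Running the checks in `__post_init__` means an invalid object can never be constructed, which matches the stated goal that nothing unvalidated reaches Module 2.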
297
+
298
+ **Module 2: URN Prediction Core**
299
+ Input: `VesselParameters` dataclass.
300
+ Output: `URNSpectrum` — dict mapping 1/3-octave band center frequencies (Hz)
301
+ to levels in dB re 1 μPa at 1 m, covering 20 Hz to 20 kHz.
302
+ Key dependencies: `torch`, `numpy`, `scipy`, OpenFOAM CLI (optional physics layer).
303
+
304
+ **Module 3: IMO Compliance Checker**
305
+ Input: `URNSpectrum`, vessel type string.
306
+ Output: `ComplianceResult` — overall pass/fail flag, per-band results, limit
307
+ values from IMO MEPC.1/Circ.906 Rev.1 (2024).
308
+ Key dependencies: Internal lookup tables encoding IMO limit values by vessel type.
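
The per-band comparison behind a compliance result can be sketched as below. The names `BandCheck` and `check_bands` are hypothetical, the function takes a limit table directly rather than a vessel type string, and the numbers are placeholders that do not come from any IMO document.

```python
from dataclasses import dataclass


@dataclass
class BandCheck:
    """Illustrative stand-in for a compliance result."""
    per_band_pass: dict[float, bool]   # center Hz -> True if level <= limit
    limits_db: dict[float, float]      # center Hz -> limit used
    compliant: bool                    # True only if every checked band passes


def check_bands(spectrum: dict[float, float],
                limits_db: dict[float, float]) -> BandCheck:
    """Compare each band level with its limit; bands without a limit are skipped."""
    per_band = {fc: level <= limits_db[fc]
                for fc, level in spectrum.items() if fc in limits_db}
    return BandCheck(per_band, limits_db, all(per_band.values()))


# Placeholder levels and limits, chosen only to show one failing band.
spectrum = {100.0: 165.0, 1000.0: 150.0}
limits = {100.0: 160.0, 1000.0: 155.0}
result = check_bands(spectrum, limits)
```

In this example the 100 Hz band exceeds its placeholder limit, so the overall flag is false even though the 1 kHz band passes.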
309
+
310
+ **Module 4: Marine Bioacoustic Impact Module**
311
+ Input: `URNSpectrum`.
312
+ Output: `BISResult` — per-species-group Biological Interference Score and
313
+ contributing frequency band list. Spectrogram overlay data for visualization.
314
+ Key dependencies: `librosa`, `scipy.signal`, `numpy`, `matplotlib`.
315
+
316
+ **Module 5: Mitigation Recommendation Engine**
317
+ Input: `URNSpectrum`, `ComplianceResult`, `BISResult`.
318
+ Output: List of `MitigationRecommendation` objects ranked by expected dB
319
+ reduction.
320
+ Key dependencies: Internal rule engine and lookup tables. No external ML.
321
 
322
+ **Module 6: Open URN Database**
323
+ Input (write): Vessel metadata dict + `URNSpectrum` + measurement conditions dict.
324
+ Input (read): Query parameters (vessel type, IMO number, frequency range, date range).
325
+ Output (read): List of matching database records as dicts or Pandas DataFrames.
326
+ Key dependencies: `sqlite3` (development), `psycopg2` (production), `sqlalchemy` (abstraction layer), `pandas`.
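
As a hedged illustration of the Module 6 write and read paths, the sketch below stores a spectrum in an in-memory SQLite database using only the standard library. The table name `urn_records`, its columns, and the JSON serialization of the spectrum are assumptions for this example, not the project's schema.

```python
import json
import sqlite3

# In-memory database for illustration; a file path would be used in practice.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE urn_records ("
    "id INTEGER PRIMARY KEY, vessel_type TEXT NOT NULL, "
    "provenance TEXT NOT NULL, spectrum_json TEXT NOT NULL)"
)


def write_record(vessel_type: str, provenance: str,
                 spectrum: dict[float, float]) -> None:
    """Insert one record; the spectrum dict is serialized as JSON text."""
    conn.execute(
        "INSERT INTO urn_records (vessel_type, provenance, spectrum_json) "
        "VALUES (?, ?, ?)",
        (vessel_type, provenance, json.dumps(spectrum)),
    )


def query_by_type(vessel_type: str) -> list[dict]:
    """Return matching records as plain dicts with float band keys restored."""
    rows = conn.execute(
        "SELECT vessel_type, provenance, spectrum_json FROM urn_records "
        "WHERE vessel_type = ?",
        (vessel_type,),
    ).fetchall()
    return [
        {"vessel_type": vt, "provenance": prov,
         "spectrum": {float(k): v for k, v in json.loads(js).items()}}
        for vt, prov, js in rows
    ]


# Placeholder record; the levels are not measured or predicted values.
write_record("tanker", "predicted", {100.0: 160.0, 1000.0: 150.0})
records = query_by_type("tanker")
```

JSON turns the float band keys into strings, so the read path converts them back. A production schema might instead use one row per band, which would make frequency-range queries simpler.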
327
 
328
  ---
329
 
330
+ ## Design Decisions
331
 
332
+ **1. mSOUND removed from pip, manual source integration planned for Phase 3.**
333
+ mSOUND is not available as a pip-installable package compatible with the project's
334
+ Python environment. Rather than vendor an untested integration in Phase 0, the
335
+ physics simulation layer is scoped to OpenFOAM plus libAcoustics for Phases 1 and 2.
336
+ mSOUND will be integrated manually from source in Phase 3 once the core pipeline
337
+ is stable.
338
 
339
+ **2. pyaudio and spectrum removed due to Python 3.14 wheel unavailability.**
340
+ Neither `pyaudio` nor `spectrum` publish binary wheels for Python 3.14 on Windows
341
+ as of Phase 0. Both would require a local C++ build toolchain. All audio processing
342
+ and spectral estimation needed for the MFCC pipeline is available through `librosa`
343
+ and `scipy.signal`, which do provide 3.14-compatible wheels. The two packages were
344
+ removed from `requirements.txt` with no loss of required functionality.
345
 
346
+ **3. MFCC pipeline uses librosa and scipy only.**
347
+ The bioacoustic masking and MFCC analysis in Module 4 are implemented entirely
348
+ with `librosa` for feature extraction and `scipy.signal` for spectral processing.
349
+ This keeps the dependency count low, avoids binary-only packages, and gives full
350
+ access to the intermediate signal representations needed for the BIS calculation.
351
 
352
+ **4. CI uses Python 3.11 while local development uses Python 3.14.**
353
+ GitHub Actions Ubuntu runners have stable binary wheel availability for all SONARIS
354
+ dependencies on Python 3.11. Python 3.14 is used locally on Windows because it is
355
+ the active development environment, but forcing 3.14 in CI would require compiling
356
+ several packages from source and would make CI fragile. The API surface used by
357
+ SONARIS does not differ between 3.11 and 3.14 for any dependency in scope.
358
 
359
+ **5. SQLite used for development, PostgreSQL targeted for production.**
360
+ SQLite requires no server process and works identically across all developer
361
+ machines. The `db_manager.py` abstraction layer uses SQLAlchemy-compatible
362
+ connection strings so that switching to PostgreSQL in production requires changing
363
+ one configuration value, not the query logic.
docs/research_notes.md CHANGED
@@ -0,0 +1,130 @@
1
+ # SONARIS Research Notes
2
+
3
+ This file is a running scientific journal for the SONARIS project. Every
4
+ significant research finding, dataset assessment, methodology decision, and
5
+ literature reference is logged here during development. Entries feed directly
6
+ into the methodology and literature review sections of the SONARIS research paper.
7
+
8
+ **Usage rules:**
9
+ - New entries go at the top of the file, below this header.
10
+ - Each entry must carry a date in ISO format (YYYY-MM-DD).
11
+ - If an entry is based on published literature, include the full reference at the
12
+ bottom of that entry.
13
+ - Write for a technical reader who is encountering this topic for the first time
14
+ in this project's context.
15
+
16
+ ---
17
+
18
+ ## 2026-03-03 — IMO MEPC.1/Circ.906 Rev.1 (2024): Regulatory Gap and Project Motivation
19
+
20
+ ### What the guidelines require
21
+
22
+ IMO MEPC.1/Circ.906 Rev.1 (2024) is the current iteration of the IMO's voluntary
23
+ guidelines for the reduction of underwater noise from commercial shipping. The
24
+ circular applies to all new-build vessels and requests that shipowners and operators
25
+ monitor, manage, and where practicable reduce underwater radiated noise (URN) across
26
+ the ship's operational speed range. The guidelines specify that URN should be
27
+ characterized using 1/3-octave band measurements and reported in dB re 1 μPa at 1 m,
28
+ covering the frequency range relevant to marine mammal hearing (approximately 10 Hz
29
+ to 20 kHz depending on species). Vessels are categorised by type (bulk carrier,
30
+ container ship, tanker, passenger vessel, and others), and the guidelines provide
31
+ recommended management practices for each category. Crucially, the 2024 revision
32
+ sharpened the language from the 2014 original: shipyards and operators are now
33
+ explicitly encouraged to apply URN prediction tools at the design stage, not only
34
+ during sea trials.
35
+
36
+ ### The gap: no open-source design-phase tool exists
37
+
38
+ The IMO GloNoise Partnership Programme conducted a structured gap analysis in October
39
+ 2025. Its finding was direct: while the IMO guidelines call for URN management and design-stage
40
+ assessment, no open-source prediction or compliance-checking tool exists for
41
+ shipyards and researchers to use during the design phase. The only tools capable
42
+ of performing full-spectrum URN prediction (dBSea, ANSYS Fluent with acoustic
43
+ modules, GL ShipNoise) are commercial, proprietary, and priced out of reach for
44
+ most of the world's shipyards. A small shipyard in Southeast Asia, a research
45
+ institution in West Africa, or a student designing a vessel for a class project
46
+ cannot access these tools. This is not a niche gap: the IMO GloNoise report
47
+ notes that the majority of the global shipbuilding output by vessel count comes
48
+ from yards in developing economies where no licensed acoustic software is in use.
49
+
50
+ The consequence of this gap is that URN compliance is assessed, when it is assessed
51
+ at all, only at the sea trial stage. By that point the hull form is fixed, the
52
+ propeller is installed, and the engine mounts are set. Retrofitting for noise
53
+ reduction at sea trial is expensive and typically limited to operational adjustments
54
+ (speed reduction, routing) rather than design-level changes. The entire value of
55
+ design-stage prediction is lost.
56
+
57
+ ### The frequency bands that matter
58
+
59
+ The IMO guidelines and the underlying marine bioacoustics literature converge on
60
+ three frequency bands of primary concern:
61
+
62
+ **Low frequency: 10 Hz to 1000 Hz.** This is the primary hearing range of baleen
63
+ whales (mysticetes), including blue, fin, sei, minke, and humpback whales.
64
+ Commercial shipping noise is dominated in this band by engine tonal components,
65
+ propeller shaft harmonics, and hull flow noise. The overlap is severe: fin whale
66
+ 20 Hz calls sit directly in the band occupied by machinery noise from large
67
+ slow-speed diesel engines. This band is the primary driver of the chronic acoustic
68
+ masking problem for large whales in shipping lanes.
69
+
70
+ **Mid frequency: 1 kHz to 10 kHz.** This is the primary communication and
71
+ echolocation range of dolphins, porpoises, and toothed whales (odontocetes).
72
+ Propeller cavitation noise, which typically peaks between 1 kHz and 5 kHz
73
+ depending on ship speed and propeller design, falls directly in this band.
74
+ Bottlenose dolphin signature whistles, which are critical for individual
75
+ identification and group cohesion, occupy 3 kHz to 20 kHz. The interference
76
+ in this band is intermittent (tied to cavitation onset at specific speeds) but
77
+ acoustically intense.
78
+
79
+ **The blade-pass overlap zone: 50 Hz to 500 Hz.** The propeller blade-pass
80
+ frequency (BPF) is calculated as RPM/60 × blade count. For a typical
81
+ large commercial vessel running at 100 RPM with a 5-blade propeller, BPF is
82
+ approximately 8.3 Hz, with harmonics at 16.7 Hz, 25 Hz, and so on. At higher
83
+ shaft speeds typical of medium vessels, these harmonics fall squarely in the
84
+ 50 Hz to 500 Hz range where both baleen whale low-frequency calls and toothed
85
+ whale social calls are concentrated. This harmonic overlap is what makes the
86
+ SONARIS Module 4 BIS calculation non-trivial: it is not sufficient to compare
87
+ broadband levels. Tonal interference at specific harmonics must be resolved.
88
+ This is precisely where the audio-engineering concept of spectral masking, as
89
+ used in perceptual audio codecs and MFCC-based speech analysis, applies directly
90
+ to the bioacoustic problem.
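
The blade-pass arithmetic above can be checked with a few lines of Python. The function name `blade_pass_harmonics` is hypothetical and exists only for this worked example.

```python
def blade_pass_harmonics(rpm: float, blade_count: int,
                         n_harmonics: int = 3) -> list[float]:
    """Return the blade-pass frequency and its harmonics in Hz.

    BPF = (rpm / 60) * blade_count; harmonic k is k * BPF.
    """
    bpf = rpm / 60.0 * blade_count
    return [k * bpf for k in range(1, n_harmonics + 1)]


# The example from the text: 100 RPM, 5-blade propeller.
harmonics = blade_pass_harmonics(rpm=100, blade_count=5)
```

For the 100 RPM, 5-blade case this gives roughly 8.33 Hz, 16.67 Hz, and 25 Hz, so the fundamental sits below the 50 Hz to 500 Hz zone and only higher harmonics, or higher shaft speeds, reach it.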
91
+
92
+ ### Industry confirmation of demand
93
+
94
+ In October 2024, BIMCO (Baltic and International Maritime Council) and ICS
95
+ (International Chamber of Shipping), the two largest shipping industry bodies by
96
+ member tonnage, published a joint URN Management Guide. The guide explicitly
97
+ recommends that all member operators implement URN monitoring and compliance
98
+ workflows and calls on the industry to develop accessible tools for design-stage
99
+ prediction. This is the clearest statement from industry itself that demand for
100
+ a tool like SONARIS exists and that the current commercial tool landscape does not
101
+ meet it.
102
+
103
+ ### Connection to SONARIS modules
104
+
105
+ Module 2 (URN Prediction Core) closes the prediction gap identified by the IMO
106
+ GloNoise analysis by producing a full 1/3-octave band spectrum at design stage
107
+ from vessel geometry and operational parameters alone, without requiring a physical
108
+ prototype or sea trial.
109
+
110
+ Module 3 (IMO Compliance Checker) closes the compliance assessment gap by
111
+ directly mapping the Module 2 output against the MEPC.1/Circ.906 Rev.1 (2024)
112
+ limit tables and issuing a structured pass/fail report, formatted for inclusion
113
+ in a ship's technical documentation package.
114
+
115
+ Module 4 (Marine Bioacoustic Impact Module) goes beyond what IMO currently
116
+ recommends. The BIS metric provides a species-specific, frequency-resolved measure
117
+ of biological harm that the IMO guidelines acknowledge in principle but do not
118
+ operationalize. This positions SONARIS as a tool not just for compliance but for
119
+ genuine environmental assessment.
120
+
121
+ ### References
122
+
123
+ - IMO MEPC.1/Circ.906 Rev.1 (2024). *2024 Guidelines for the Reduction of
124
+ Underwater Noise from Commercial Shipping to Address Adverse Impacts on Marine
125
+ Life.* International Maritime Organization, London.
126
+ - IMO GloNoise Partnership Programme (October 2025). *Gap Analysis: Underwater
127
+ Radiated Noise Management Tools for Design-Phase Application.* International
128
+ Maritime Organization, London.
129
+ - BIMCO and ICS (October 2024). *URN Management Guide for Shipping Operators.*
130
+ Baltic and International Maritime Council / International Chamber of Shipping.
modules/__init__.py CHANGED
@@ -0,0 +1,15 @@
1
+ """
2
+ SONARIS top-level modules package.
3
+
4
+ This package contains the six core processing modules that form the SONARIS
5
+ pipeline: Design Input Engine, URN Prediction Core, IMO Compliance Checker,
6
+ Marine Bioacoustic Impact Module, Mitigation Recommendation Engine, and Open
7
+ URN Database. Importing this package makes the module namespace available but
8
+ executes no computation. Each sub-module must be imported explicitly by the
9
+ caller.
10
+
11
+ Libraries: None at this level. Dependencies are declared within each sub-module.
12
+
13
+ Pipeline position: This is the root of the processing pipeline. Nothing feeds
14
+ into this package; it exposes the six modules to app.py and to the test suite.
15
+ """
modules/bioacoustic/__init__.py CHANGED
@@ -0,0 +1,37 @@
1
+ """
2
+ Module 4: Marine Bioacoustic Impact Module.
3
+
4
+ Maps the predicted ship noise spectrum against the published audiogram
5
+ sensitivity curves of five marine mammal functional hearing groups and
6
+ quantifies the biological interference caused by the ship at each frequency band.
7
+ The five groups follow the NOAA/NMFS (2018) marine mammal acoustic weighting
8
+ function classification: low-frequency cetaceans (baleen whales), mid-frequency
9
+ cetaceans (dolphins and most toothed whales), high-frequency cetaceans (porpoises
10
+ and some small odontocetes), phocid pinnipeds in water, and otariid pinnipeds in
11
+ water.
12
+
13
+ The core method borrows two techniques from audio signal processing. First,
14
+ Mel-Frequency Cepstral Coefficient (MFCC) decomposition is applied to the ship
15
+ noise spectrum to extract a perceptually weighted frequency representation aligned
16
+ with the non-linear frequency sensitivity of the mammal ear. Second, spectral
17
+ masking analysis quantifies how much of the ship noise spectrum overlaps with and
18
+ energetically masks the frequency bands where each species group communicates and
19
+ echolocates. The combination of these two analyses produces the Biological
20
+ Interference Score (BIS), a value from 0 to 100 for each species group, where
21
+ 100 represents complete masking of the group's functional hearing range. The BIS
22
+ is not a standardised regulatory metric; it is a SONARIS-defined index intended
23
+ to rank relative biological harm across design alternatives and routing scenarios.
24
+
25
+ Libraries: ``librosa`` for MFCC extraction, ``scipy.signal`` for spectral
26
+ masking computation, ``numpy`` for array operations, ``matplotlib`` and ``plotly`` for spectrogram
27
+ overlay visualization.
28
+
29
+ Pipeline position: Fourth stage, running in parallel with Module 3. Receives
30
+ ``URNSpectrum`` from Module 2. Outputs a ``BISResult`` object consumed by
31
+ Module 5.
32
+ """
33
+
34
+ from modules.bioacoustic.audiogram_data import AUDIOGRAM_CURVES
35
+ from modules.bioacoustic.bis_calculator import calculate_bis, BISResult
36
+
37
+ __all__ = ["AUDIOGRAM_CURVES", "calculate_bis", "BISResult"]
modules/imo_compliance/__init__.py CHANGED
@@ -0,0 +1,28 @@
1
+ """
2
+ Module 3: IMO Compliance Checker.
3
+
4
+ Receives the predicted URN spectrum from Module 2 and the vessel type string
5
+ from the ``VesselParameters`` object produced by Module 1. Checks each 1/3-octave
6
+ band level against the vessel-type-specific limit values defined in IMO
7
+ MEPC.1/Circ.906 Rev.1 (2024) and returns a structured compliance result. The
8
+ result includes a per-band pass/fail flag, the limit value applied at each band,
9
+ the measured (predicted) level, and the exceedance in dB where a band fails.
10
+ An overall compliance flag aggregates the per-band results. The module can also
11
+ generate a formatted compliance report suitable for inclusion in a vessel's
12
+ technical documentation package.
13
+
14
+ The limit tables are encoded directly in this module as Python dicts keyed by
15
+ vessel type and center frequency. When IMO updates its guidelines, these tables
16
+ are the only code that requires updating.
17
+
18
+ Libraries: Internal lookup tables only. ``fpdf2`` or ``reportlab`` for PDF
19
+ report generation (added in Phase 2).
20
+
21
+ Pipeline position: Third stage. Receives ``URNSpectrum`` from Module 2 and
22
+ vessel type from Module 1. Outputs a ``ComplianceResult`` object consumed by
23
+ Module 5 and optionally written to Module 6.
24
+ """
25
+
26
+ from modules.imo_compliance.compliance_checker import check_compliance, ComplianceResult
27
+
28
+ __all__ = ["check_compliance", "ComplianceResult"]
modules/input_engine/__init__.py CHANGED
@@ -0,0 +1,26 @@
1
+ """
2
+ Module 1: Design Input Engine.
3
+
4
+ Accepts all user-supplied vessel parameters required for URN prediction and
5
+ validates them against physical plausibility bounds before passing them
6
+ downstream. Inputs include hull form coefficients (block coefficient Cb,
7
+ prismatic coefficient Cp, length-to-beam ratio L/B), propeller geometry (blade
8
+ count, pitch ratio P/D, diameter in meters), engine type (slow-speed diesel,
9
+ medium-speed diesel, gas turbine), rated shaft RPM, and operational speed in
10
+ knots. All inputs are validated and packaged into a ``VesselParameters``
11
+ dataclass. If any input falls outside physically meaningful bounds, this module
12
+ raises a descriptive ``ValidationError`` before any computation reaches Module 2.
13
+ No acoustic prediction logic lives here; the sole responsibility of this module
14
+ is input integrity.
15
+
16
+ Libraries: ``dataclasses`` (stdlib), ``typing`` (stdlib). Optional use of
17
+ ``pydantic`` for schema validation if added in a later phase.
18
+
19
+ Pipeline position: First stage. Receives raw user input from the Streamlit UI
20
+ or from a direct Python API call. Outputs a ``VesselParameters`` object consumed
21
+ by the URN Prediction Core (Module 2).
22
+ """
23
+
24
+ from modules.input_engine.design_input import VesselParameters, validate_inputs
25
+
26
+ __all__ = ["VesselParameters", "validate_inputs"]
modules/mitigation/__init__.py CHANGED
@@ -0,0 +1,34 @@
1
+ """
2
+ Module 5: Mitigation Recommendation Engine.
3
+
4
+ Generates ranked mitigation recommendations based on the upstream outputs of
5
+ Modules 2, 3, and 4. Each recommendation addresses either a compliance failure
6
+ identified by Module 3 or a high Biological Interference Score identified by
7
+ Module 4, or both. Recommendations span two categories: operational (speed
8
+ reduction targets expressed in knots and the corresponding predicted dB
9
+ reduction, geographic routing avoidance zones keyed to known cetacean habitat
10
+ polygons) and design-level (hull coating options with published noise reduction
11
+ coefficients, propeller geometry modifications including skew angle adjustments
12
+ and blade count changes with expected BPF harmonic shifts). Each recommendation
13
+ is returned as a ``MitigationRecommendation`` object carrying a description,
14
+ the expected noise reduction in dB across affected frequency bands, the
15
+ species groups that benefit, and a confidence level based on how well the
16
+ recommendation type is supported by the available literature.
17
+
18
+ Recommendations are ranked by total expected noise reduction weighted by the
19
+ BIS scores of the affected species groups, so recommendations that reduce noise
20
+ in the most biologically sensitive frequency bands rank above those that reduce
21
+ broadband levels without addressing critical masking zones.
22
+
23
+ Libraries: ``numpy`` for ranking arithmetic. No external ML dependencies; the
24
+ recommendation logic is a rule-based system operating on structured inputs.
25
+
26
+ Pipeline position: Fifth and final computation stage. Receives ``URNSpectrum``
27
+ from Module 2, ``ComplianceResult`` from Module 3, and ``BISResult`` from
28
+ Module 4. Outputs a ranked list of ``MitigationRecommendation`` objects
29
+ rendered by the Streamlit UI and optionally stored in Module 6.
30
+ """
31
+
32
+ from modules.mitigation.recommender import generate_recommendations, MitigationRecommendation
33
+
34
+ __all__ = ["generate_recommendations", "MitigationRecommendation"]
modules/urn_database/__init__.py CHANGED
@@ -0,0 +1,34 @@
1
+ """
2
+ Module 6: Open URN Database.
3
+
4
+ Manages the Open URN Database, the first open-source community-contributed
5
+ database of ship underwater radiated noise signatures. Each record stores vessel
6
+ metadata (IMO number, vessel type, length overall, beam, draft, propeller blade
7
+ count, engine type), measurement or prediction conditions (ship speed, loading
8
+ condition, water depth, measurement method), and the full 1/3-octave band
9
+ spectrum from 20 Hz to 20 kHz in dB re 1 μPa at 1 m. Records contributed by
10
+ the community are flagged with a provenance field indicating whether the spectrum
11
+ was measured at sea trial, predicted by SONARIS, or sourced from a published
12
+ dataset.
13
+
14
+ The development database is SQLite, stored locally under ``data/databases/``.
15
+ The production database targets PostgreSQL. The abstraction layer in
16
+ ``db_manager.py`` uses connection strings compatible with both backends so that
17
+ the transition requires a single configuration change. All write operations
18
+ include input validation to enforce the schema before insertion. Query operations
19
+ support filtering by vessel type, IMO number, frequency band range, and date of
20
+ contribution, and return results as either Python dicts or Pandas DataFrames
21
+ depending on the caller's request.
22
+
23
+ Libraries: ``sqlite3`` (stdlib) for development, ``psycopg2`` for production,
24
+ ``sqlalchemy`` for the abstraction layer, ``pandas`` for DataFrame output.
25
+
26
+ Pipeline position: Optional terminal stage that runs alongside or after Module 5.
27
+ Receives any combination of ``VesselParameters``, ``URNSpectrum``,
28
+ ``ComplianceResult``, and ``BISResult`` objects for storage. Also serves as an
29
+ input source when Module 2 queries historical records for training data augmentation.
30
+ """
31
+
32
+ from modules.urn_database.db_manager import DatabaseManager
33
+
34
+ __all__ = ["DatabaseManager"]
modules/urn_prediction/__init__.py CHANGED
@@ -0,0 +1,29 @@
1
+ """
2
+ Module 2: URN Prediction Core.
3
+
4
+ Predicts a ship's underwater radiated noise spectrum from the vessel parameters
5
+ produced by Module 1. The prediction is hybrid: a physics layer and an AI layer
6
+ run in sequence, and their outputs are fused into a single spectrum.
7
+
8
+ The physics layer interfaces with OpenFOAM and libAcoustics to simulate propeller
9
+ cavitation and hull-induced turbulence noise. This layer is computationally
10
+ intensive and optional during development; it can be bypassed to run the AI layer
11
+ alone. The AI layer is a deep neural network implemented in PyTorch, trained on
12
+ the ShipsEar dataset (Santos-Dominguez et al., 2016) and the QiandaoEar22 dataset.
13
+ The network takes the structured ``VesselParameters`` object as input features
14
+ and returns a predicted noise level for each 1/3-octave band from 20 Hz to 20 kHz.
15
+ Output levels are in dB re 1 μPa at 1 m, the standard reference for underwater
16
+ source levels used in IMO MEPC.1/Circ.906 Rev.1 (2024).
17
+
18
+ Libraries: ``torch``, ``numpy``, ``scipy.signal``. OpenFOAM is invoked as a
19
+ subprocess via the physics layer; it is not a Python dependency.
20
+
21
+ Pipeline position: Second stage. Receives ``VesselParameters`` from Module 1.
22
+ Outputs a ``URNSpectrum`` object (a dict mapping Hz to dB re 1 μPa at 1 m)
23
+ consumed by Module 3 and Module 4.
24
+ """
25
+
26
+ from modules.urn_prediction.physics_layer import run_physics_prediction
27
+ from modules.urn_prediction.ai_layer import run_ai_prediction, fuse_spectra
28
+
29
+ __all__ = ["run_physics_prediction", "run_ai_prediction", "fuse_spectra"]