feat(webapp): working local server checkpoint
Core functionality now working:
- Model loading fixed (TensorFlow 2.15 + quad_model imports)
- PSI prediction pipeline operational
- Force plot visualization working
- CSV/JSON/TSV export with proper file downloads
- SQLite database with health check
- ViennaRNA RNA structure prediction
Key changes:
- predictor.py: Simplified model loading using quad_model decorators
- routes.py: Fixed export with Content-Disposition headers
- routes.py: Fixed SQLAlchemy 2.0 text() for health check
- requirements.txt: Pinned TensorFlow 2.15 for Keras 2 compatibility
Added:
- webapp/TODO.md: Comprehensive remaining work documentation
- test_model.py: Simple model test script
- Skills docs for TensorFlow/Keras model loading
Note: UI is basic (inline HTML), needs improvement. See TODO.md.
- .claude/skills/agent-log.md +105 -0
- .claude/skills/tensorflow-keras-model-loading.md +109 -0
- requirements.txt +10 -4
- test_model.py +57 -0
- webapp/TODO.md +455 -0
- webapp/app/api/routes.py +17 -8
- webapp/app/config.py +7 -4
- webapp/app/services/predictor.py +14 -31
- webapp/requirements.txt +4 -2
|
@@ -139,6 +139,111 @@ See: `/Users/sachin/.claude/plans/tingly-sauteeing-bengio.md`
|
|
| 139 |
|
| 140 |
---
|
| 141 |
|
| 142 |
## Future Sessions
|
| 143 |
|
| 144 |
_Sessions will be logged here as work progresses._
|
|
|
|
| 139 |
|
| 140 |
---
|
| 141 |
|
| 142 |
+
## Session 2 - 2026-01-12
|
| 143 |
+
|
| 144 |
+
### Session Start
|
| 145 |
+
- **Task**: Run and test the pre-trained splicing model locally
|
| 146 |
+
- **Status**: COMPLETE
|
| 147 |
+
|
| 148 |
+
### Problem
|
| 149 |
+
User could not load the pre-trained model (`custom_adjacency_regularizer_20210731_124_step3.h5`) with their existing Python 3.12 + TensorFlow 2.20 setup.
|
| 150 |
+
|
| 151 |
+
### Errors Encountered
|
| 152 |
+
1. `ValueError: Unknown layer: 'SlicingOpLambda'`
|
| 153 |
+
2. `ValueError: Unknown layer: 'Custom>RegularizedBiasLayer'`
|
| 154 |
+
3. `IndexError: list index out of range` in Keras functional.py
|
| 155 |
+
|
| 156 |
+
### Investigation
|
| 157 |
+
|
| 158 |
+
#### Key Information from User
|
| 159 |
+
User provided context from the original model creator:
|
| 160 |
+
- Model location: `output/custom_adjacency_regularizer_20210731_124_step3.h5`
|
| 161 |
+
- Reference notebook: `figures/generate_csv_for_supplementary.ipynb`
|
| 162 |
+
- Additional notebooks: `2022_03_11_figures/` folder (visualization notebooks)
|
| 163 |
+
|
| 164 |
+
#### What We Discovered
|
| 165 |
+
1. **From `figures/generate_csv_for_supplementary.ipynb`**:
|
| 166 |
+
- Simple loading approach: `from quad_model import *` then `load_model()`
|
| 167 |
+
- No manual custom_objects needed
|
| 168 |
+
|
| 169 |
+
2. **From `2022_03_11_figures/position_specific_activations.ipynb`**:
|
| 170 |
+
- Notebook was run in April 2022 with TensorFlow ~2.8
|
| 171 |
+
- Model loads with simple `tf.keras.models.load_model()`
|
| 172 |
+
|
| 173 |
+
3. **From `figures/quad_model.py`**:
|
| 174 |
+
- All custom layers use `@tf.keras.utils.register_keras_serializable()` decorator
|
| 175 |
+
- This auto-registers layers when module is imported
|
| 176 |
+
|
| 177 |
+
4. **Root Cause**:
|
| 178 |
+
- TensorFlow 2.16+ uses Keras 3 (breaking changes)
|
| 179 |
+
- Keras 3 cannot load H5 models with Lambda layers from Keras 2
|
| 180 |
+
- `tf_keras` compatibility layer is buggy for complex models
|
| 181 |
+
|
| 182 |
+
### Solution Implemented
|
| 183 |
+
|
| 184 |
+
1. **Installed Python 3.10 via pyenv**:
|
| 185 |
+
```bash
|
| 186 |
+
pyenv install 3.10.13
|
| 187 |
+
```
|
| 188 |
+
|
| 189 |
+
2. **Created new virtual environment**:
|
| 190 |
+
```bash
|
| 191 |
+
~/.pyenv/versions/3.10.13/bin/python -m venv venv310
|
| 192 |
+
source venv310/bin/activate
|
| 193 |
+
```
|
| 194 |
+
|
| 195 |
+
3. **Installed TensorFlow 2.15** (last version with native Keras 2):
|
| 196 |
+
```bash
|
| 197 |
+
pip install tensorflow==2.15.0 numpy pandas joblib scikit-learn matplotlib seaborn tqdm scipy
|
| 198 |
+
```
|
| 199 |
+
|
| 200 |
+
4. **Updated `test_model.py`** to use simple loading approach:
|
| 201 |
+
```python
|
| 202 |
+
import sys
|
| 203 |
+
sys.path.insert(0, 'figures')
|
| 204 |
+
from quad_model import * # Auto-registers custom layers
|
| 205 |
+
from tensorflow.keras.models import load_model
|
| 206 |
+
|
| 207 |
+
model = load_model('output/...h5')
|
| 208 |
+
```
|
| 209 |
+
|
| 210 |
+
5. **Updated `requirements.txt`**:
|
| 211 |
+
- Changed from `tensorflow>=2.15.0` to `tensorflow==2.15.0`
|
| 212 |
+
- Added setup instructions for Python 3.10
|
| 213 |
+
- Removed `tf_keras` (not needed)
|
| 214 |
+
|
| 215 |
+
### Results
|
| 216 |
+
```
|
| 217 |
+
Model loaded successfully!
|
| 218 |
+
Number of test samples: 47962
|
| 219 |
+
MSE: 0.032396
|
| 220 |
+
R2 Score: 0.8224
|
| 221 |
+
Correlation: 0.9069
|
| 222 |
+
```
|
| 223 |
+
|
| 224 |
+
### Files Modified
|
| 225 |
+
- `test_model.py` - Simplified to use quad_model.py approach
|
| 226 |
+
- `requirements.txt` - Pinned TensorFlow 2.15, added setup instructions
|
| 227 |
+
|
| 228 |
+
### Files Created
|
| 229 |
+
- `venv310/` - New Python 3.10 virtual environment
|
| 230 |
+
- `.claude/skills/tensorflow-keras-model-loading.md` - Skill documentation
|
| 231 |
+
|
| 232 |
+
### Key Learnings
|
| 233 |
+
1. **TF 2.16+ breaks old H5 models** - Must use TF 2.15 or earlier for Keras 2 models
|
| 234 |
+
2. **Python 3.12 requires TF 2.16+** - So must downgrade Python to 3.10/3.11
|
| 235 |
+
3. **Check original notebooks first** - They show the working approach
|
| 236 |
+
4. **`@register_keras_serializable()` is key** - Import the module to register layers
|
| 237 |
+
5. **`tf_keras` is unreliable** - For complex models, use native TF 2.15 instead
|
| 238 |
+
|
| 239 |
+
### Environment Summary
|
| 240 |
+
| Environment | Python | TensorFlow | Status |
|
| 241 |
+
|-------------|--------|------------|--------|
|
| 242 |
+
| `venv` (old) | 3.12 | 2.20 | BROKEN - can delete |
|
| 243 |
+
| `venv310` | 3.10.13 | 2.15.0 | WORKING |
|
| 244 |
+
|
| 245 |
+
---
|
| 246 |
+
|
| 247 |
## Future Sessions
|
| 248 |
|
| 249 |
_Sessions will be logged here as work progresses._
|
|
@@ -0,0 +1,109 @@
|
|
| 1 |
+
# Skill: Loading Legacy TensorFlow/Keras Models
|
| 2 |
+
|
| 3 |
+
## Problem Encountered
|
| 4 |
+
When trying to load a pre-trained H5 model created in 2021 with TensorFlow 2.5, we encountered multiple errors with TensorFlow 2.20 (Python 3.12):
|
| 5 |
+
|
| 6 |
+
1. `ValueError: Unknown layer: 'SlicingOpLambda'`
|
| 7 |
+
2. `ValueError: Unknown layer: 'Custom>RegularizedBiasLayer'`
|
| 8 |
+
3. `IndexError: list index out of range` in `process_node`
|
| 9 |
+
|
| 10 |
+
## Root Cause
|
| 11 |
+
- **TensorFlow 2.16+ uses Keras 3** which has breaking changes for loading old H5 models
|
| 12 |
+
- Models with Lambda layers and custom layers saved with Keras 2 cannot be loaded with Keras 3
|
| 13 |
+
- The `tf_keras` compatibility layer does NOT fully work for complex models with Lambda layers
|
| 14 |
+
|
| 15 |
+
## Solution
|
| 16 |
+
|
| 17 |
+
### 1. Use Python 3.10 + TensorFlow 2.15
|
| 18 |
+
TensorFlow 2.15 is the **last version with native Keras 2 support**:
|
| 19 |
+
|
| 20 |
+
```bash
|
| 21 |
+
# Install Python 3.10 via pyenv
|
| 22 |
+
pyenv install 3.10.13
|
| 23 |
+
|
| 24 |
+
# Create virtual environment
|
| 25 |
+
~/.pyenv/versions/3.10.13/bin/python -m venv venv310
|
| 26 |
+
source venv310/bin/activate
|
| 27 |
+
|
| 28 |
+
# Install TensorFlow 2.15 (NOT 2.16+)
|
| 29 |
+
pip install tensorflow==2.15.0
|
| 30 |
+
```
|
| 31 |
+
|
| 32 |
+
### 2. Use `@register_keras_serializable()` Pattern
|
| 33 |
+
The original codebase uses decorators to auto-register custom layers:
|
| 34 |
+
|
| 35 |
+
```python
|
| 36 |
+
@tf.keras.utils.register_keras_serializable()
|
| 37 |
+
class MyCustomLayer(tf.keras.layers.Layer):
|
| 38 |
+
    ...
|
| 39 |
+
```
|
| 40 |
+
|
| 41 |
+
When you import from the module containing these decorators, the layers are automatically registered:
|
| 42 |
+
|
| 43 |
+
```python
|
| 44 |
+
# This auto-registers all custom layers
|
| 45 |
+
from quad_model import *
|
| 46 |
+
|
| 47 |
+
# Then you can load the model directly
|
| 48 |
+
model = load_model('model.h5')
|
| 49 |
+
```
|
| 50 |
+
|
| 51 |
+
### 3. Don't Pass Custom Objects Manually (Usually)
|
| 52 |
+
If the original code uses `@register_keras_serializable()`, you typically don't need to pass `custom_objects` to `load_model()`. The decorators handle registration.
|
| 53 |
+
|
| 54 |
+
## TensorFlow/Keras Version Compatibility Matrix
|
| 55 |
+
|
| 56 |
+
| Python | TensorFlow | Keras | Can Load Old H5? |
|
| 57 |
+
|--------|------------|-------|------------------|
|
| 58 |
+
| 3.12 | 2.16-2.20 | 3.x | NO - Lambda layer bugs |
|
| 59 |
+
| 3.11 | 2.15 | 2.15 | YES |
|
| 60 |
+
| 3.10 | 2.10-2.15 | 2.x | YES |
|
| 61 |
+
| 3.10 | 2.8-2.9 | 2.x | YES |
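
The matrix above can be turned into a small pre-flight check. This is a minimal sketch, not part of the original project: `legacy_h5_loadable` is a hypothetical helper that only encodes the rows of the table (Keras 2 ships with TensorFlow up to 2.15, and Python 3.12 needs TensorFlow 2.16 or later).

```python
def _major_minor(version: str) -> tuple:
    """Turn a version string such as '2.15.0' into (2, 15)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def legacy_h5_loadable(python_version: str, tf_version: str) -> bool:
    """Return True if the pairing matches a 'YES' row of the table above."""
    py = _major_minor(python_version)
    tf = _major_minor(tf_version)
    if tf >= (2, 16):
        return False  # Keras 3: per the table, old H5 files fail to load
    if py >= (3, 12):
        return False  # per the table, Python 3.12 cannot pair with TensorFlow < 2.16
    return True


if __name__ == "__main__":
    print(legacy_h5_loadable("3.10.13", "2.15.0"))  # the working venv310 setup
    print(legacy_h5_loadable("3.12.1", "2.20.0"))   # the broken original setup
```

Running such a check at the top of a loading script would fail fast with a readable message instead of a Keras deserialization traceback.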
|
| 62 |
+
|
| 63 |
+
## Key Lessons
|
| 64 |
+
|
| 65 |
+
### DO:
|
| 66 |
+
- Check when the model was created and what TensorFlow version was used
|
| 67 |
+
- Look for existing notebooks that successfully load the model
|
| 68 |
+
- Match the Python + TensorFlow version to the model's creation era
|
| 69 |
+
- Use `@register_keras_serializable()` for custom layers
|
| 70 |
+
- Pin TensorFlow version in requirements.txt (`tensorflow==2.15.0`)
|
| 71 |
+
|
| 72 |
+
### DON'T:
|
| 73 |
+
- Assume latest TensorFlow will load old models
|
| 74 |
+
- Use `tf_keras` for complex models with Lambda layers (it's buggy)
|
| 75 |
+
- Try to manually pass all custom objects if decorators exist
|
| 76 |
+
- Use Python 3.12 with TensorFlow < 2.16 (incompatible)
|
| 77 |
+
|
| 78 |
+
## Quick Diagnosis
|
| 79 |
+
|
| 80 |
+
If you see these errors, it's likely a Keras 2 vs 3 compatibility issue:
|
| 81 |
+
- `Unknown layer: 'SlicingOpLambda'`
|
| 82 |
+
- `Unknown layer: 'Custom>...'`
|
| 83 |
+
- `IndexError: list index out of range` in functional.py
|
| 84 |
+
- Errors mentioning `_inbound_nodes`
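
The checklist above could be automated roughly as follows. This is an illustrative sketch only: `looks_like_keras_mismatch` is a hypothetical helper, and the marker strings are copied from the list above rather than from any official Keras error catalogue.

```python
# Substrings that, per the list above, point to a Keras 2 vs Keras 3 mismatch.
MISMATCH_MARKERS = (
    "Unknown layer: 'SlicingOpLambda'",
    "Unknown layer: 'Custom>",
    "_inbound_nodes",
)


def looks_like_keras_mismatch(traceback_text: str) -> bool:
    """Heuristically decide whether a traceback matches the symptoms above."""
    if any(marker in traceback_text for marker in MISMATCH_MARKERS):
        return True
    # A bare IndexError is too generic, so also require functional.py in the trace.
    return (
        "list index out of range" in traceback_text
        and "functional.py" in traceback_text
    )
```

A match is only a hint to check the installed TensorFlow version; it does not prove the cause.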
|
| 85 |
+
|
| 86 |
+
## Files to Check in Legacy Projects
|
| 87 |
+
|
| 88 |
+
1. Look for `quad_model.py` or similar files with custom layer definitions
|
| 89 |
+
2. Check if layers use `@tf.keras.utils.register_keras_serializable()`
|
| 90 |
+
3. Find notebooks that successfully load the model (check their imports)
|
| 91 |
+
4. Check model creation date from filename (e.g., `_20210731_` = July 2021)
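
Step 4 can be scripted. The sketch below assumes the date is embedded as an eight-digit `YYYYMMDD` run, as in the `_20210731_` example; `date_from_filename` is a hypothetical helper name.

```python
import re
from datetime import date, datetime
from typing import Optional


def date_from_filename(filename: str) -> Optional[date]:
    """Return the first valid YYYYMMDD date embedded in a filename, if any."""
    # Match runs of exactly eight digits (not part of a longer number).
    for digits in re.findall(r"(?<!\d)\d{8}(?!\d)", filename):
        try:
            return datetime.strptime(digits, "%Y%m%d").date()
        except ValueError:
            continue  # eight digits that do not form a calendar date
    return None


created = date_from_filename("custom_adjacency_regularizer_20210731_124_step3.h5")
```

The resulting date can then be compared against TensorFlow release dates to pick a candidate version.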
|
| 92 |
+
|
| 93 |
+
## Working Example
|
| 94 |
+
|
| 95 |
+
```python
|
| 96 |
+
"""Load legacy Keras model (created with TF 2.5-2.10)"""
|
| 97 |
+
import sys
|
| 98 |
+
sys.path.insert(0, 'figures') # or wherever quad_model.py lives
|
| 99 |
+
|
| 100 |
+
# Import registers all custom layers via decorators
|
| 101 |
+
from quad_model import *
|
| 102 |
+
from tensorflow.keras.models import load_model
|
| 103 |
+
|
| 104 |
+
# Now load works without custom_objects
|
| 105 |
+
model = load_model('output/model.h5')
|
| 106 |
+
|
| 107 |
+
# Make predictions
|
| 108 |
+
predictions = model.predict(data)  # data: inputs already prepared in the shapes the model expects
|
| 109 |
+
```
|
|
@@ -1,14 +1,14 @@
|
|
| 1 |
# Interpretable Splicing Model - Dependencies
|
| 2 |
-
# Python 3.
|
| 3 |
|
| 4 |
# Core ML dependencies
|
| 5 |
-
tensorflow
|
| 6 |
-
|
| 7 |
-
numpy>=1.26.0
|
| 8 |
pandas>=2.1.0
|
| 9 |
joblib>=1.3.0
|
| 10 |
scikit-learn>=1.4.0
|
| 11 |
tqdm
|
|
|
|
| 12 |
|
| 13 |
# Visualization (for figures and notebooks)
|
| 14 |
matplotlib>=3.8.0
|
|
@@ -26,3 +26,9 @@ drawsvg
|
|
| 26 |
# macOS: brew tap brewsci/bio && brew install brewsci/bio/viennarna
|
| 27 |
# Ubuntu: sudo apt install vienna-rna
|
| 28 |
# Verify: RNAfold --version
|
| 1 |
# Interpretable Splicing Model - Dependencies
|
| 2 |
+
# Requires Python 3.10 (TensorFlow 2.15 with Keras 2)
|
| 3 |
|
| 4 |
# Core ML dependencies
|
| 5 |
+
tensorflow==2.15.0 # Must use 2.15 (last version with Keras 2) for model compatibility
|
| 6 |
+
numpy>=1.26.0,<2.0
|
|
|
|
| 7 |
pandas>=2.1.0
|
| 8 |
joblib>=1.3.0
|
| 9 |
scikit-learn>=1.4.0
|
| 10 |
tqdm
|
| 11 |
+
scipy
|
| 12 |
|
| 13 |
# Visualization (for figures and notebooks)
|
| 14 |
matplotlib>=3.8.0
|
|
|
|
| 26 |
# macOS: brew tap brewsci/bio && brew install brewsci/bio/viennarna
|
| 27 |
# Ubuntu: sudo apt install vienna-rna
|
| 28 |
# Verify: RNAfold --version
|
| 29 |
+
|
| 30 |
+
# Setup instructions:
|
| 31 |
+
# 1. Install Python 3.10 via pyenv: pyenv install 3.10.13
|
| 32 |
+
# 2. Create venv: ~/.pyenv/versions/3.10.13/bin/python -m venv venv310
|
| 33 |
+
# 3. Activate: source venv310/bin/activate
|
| 34 |
+
# 4. Install: pip install -r requirements.txt
|
|
@@ -0,0 +1,57 @@
|
|
| 1 |
+
"""Simple script to test the pre-trained splicing model.
|
| 2 |
+
|
| 3 |
+
This script uses the approach from the original notebooks:
|
| 4 |
+
- figures/generate_csv_for_supplementary.ipynb
|
| 5 |
+
- 2022_03_11_figures/position_specific_activations.ipynb
|
| 6 |
+
|
| 7 |
+
Requires: Python 3.10 + TensorFlow 2.15 (see README for setup)
|
| 8 |
+
"""
|
| 9 |
+
|
| 10 |
+
import sys
|
| 11 |
+
|
| 12 |
+
# Add figures directory to path so we can import quad_model
|
| 13 |
+
sys.path.insert(0, 'figures')
|
| 14 |
+
|
| 15 |
+
# Import from quad_model - this auto-registers all custom layers
|
| 16 |
+
# via @tf.keras.utils.register_keras_serializable() decorators
|
| 17 |
+
from quad_model import *
|
| 18 |
+
from tensorflow.keras.models import load_model
|
| 19 |
+
from joblib import load as jload
|
| 20 |
+
import numpy as np
|
| 21 |
+
|
| 22 |
+
print("Loading model...")
|
| 23 |
+
model = load_model('output/custom_adjacency_regularizer_20210731_124_step3.h5')
|
| 24 |
+
print("Model loaded successfully!")
|
| 25 |
+
|
| 26 |
+
print("\nLoading test data...")
|
| 27 |
+
xTe = jload('data/xTe_ES7_HeLa_ABC.pkl.gz')
|
| 28 |
+
yTe = jload('data/yTe_ES7_HeLa_ABC.pkl.gz')
|
| 29 |
+
|
| 30 |
+
num_samples = len(xTe[0]) if isinstance(xTe, list) else len(xTe)
|
| 31 |
+
print(f"Number of test samples: {num_samples}")
|
| 32 |
+
|
| 33 |
+
print("\nRunning predictions...")
|
| 34 |
+
predictions = model.predict(xTe, verbose=0)
|
| 35 |
+
|
| 36 |
+
print(f"\nResults:")
|
| 37 |
+
print(f"Predictions shape: {predictions.shape}")
|
| 38 |
+
print(f"\nFirst 10 predictions vs actual PSI values:")
|
| 39 |
+
print("-" * 50)
|
| 40 |
+
print(f"{'Predicted PSI':<15} {'Actual PSI':<15} {'Diff':<10}")
|
| 41 |
+
print("-" * 50)
|
| 42 |
+
for i in range(min(10, len(predictions))):
|
| 43 |
+
    pred = predictions[i, 0]
|
| 44 |
+
    actual = yTe[i]
|
| 45 |
+
    diff = pred - actual
|
| 46 |
+
    print(f"{pred:<15.4f} {actual:<15.4f} {diff:<10.4f}")
|
| 47 |
+
|
| 48 |
+
# Calculate overall metrics
|
| 49 |
+
from sklearn.metrics import mean_squared_error, r2_score
|
| 50 |
+
mse = mean_squared_error(yTe, predictions)
|
| 51 |
+
r2 = r2_score(yTe, predictions)
|
| 52 |
+
correlation = np.corrcoef(yTe.flatten(), predictions.flatten())[0, 1]
|
| 53 |
+
|
| 54 |
+
print(f"\nOverall Metrics:")
|
| 55 |
+
print(f" MSE: {mse:.6f}")
|
| 56 |
+
print(f" R2 Score: {r2:.4f}")
|
| 57 |
+
print(f" Correlation: {correlation:.4f}")
|
|
@@ -0,0 +1,455 @@
|
|
| 1 |
+
# Splicing Predictor Web Application - Remaining Work
|
| 2 |
+
|
| 3 |
+
> **Current Status**: Core prediction functionality working. UI needs significant improvements.
|
| 4 |
+
>
|
| 5 |
+
> **Last Updated**: 2026-01-12
|
| 6 |
+
|
| 7 |
+
---
|
| 8 |
+
|
| 9 |
+
## Table of Contents
|
| 10 |
+
|
| 11 |
+
1. [Completed Work](#completed-work)
|
| 12 |
+
2. [UI/UX Improvements (HIGH PRIORITY)](#1-uiux-improvements-high-priority)
|
| 13 |
+
3. [Missing Content](#2-missing-content)
|
| 14 |
+
4. [Feature Gaps](#3-feature-gaps)
|
| 15 |
+
5. [Technical Debt](#4-technical-debt)
|
| 16 |
+
6. [Deployment](#5-deployment)
|
| 17 |
+
7. [NAR Web Server Compliance](#6-nar-web-server-compliance)
|
| 18 |
+
8. [Testing](#7-testing)
|
| 19 |
+
|
| 20 |
+
---
|
| 21 |
+
|
| 22 |
+
## Completed Work
|
| 23 |
+
|
| 24 |
+
- [x] Model loading with TensorFlow 2.15 (Keras 2 compatibility)
|
| 25 |
+
- [x] PSI prediction pipeline
|
| 26 |
+
- [x] RNA secondary structure prediction (ViennaRNA integration)
|
| 27 |
+
- [x] Force plot visualization (Plotly)
|
| 28 |
+
- [x] Single sequence prediction API
|
| 29 |
+
- [x] Batch prediction API
|
| 30 |
+
- [x] CSV/JSON/TSV export with proper file downloads
|
| 31 |
+
- [x] SQLite database for job storage
|
| 32 |
+
- [x] Health check endpoint
|
| 33 |
+
- [x] Example sequences endpoint
|
| 34 |
+
- [x] Basic result page with force plot
|
| 35 |
+
|
| 36 |
+
---
|
| 37 |
+
|
| 38 |
+
## 1. UI/UX Improvements (HIGH PRIORITY)
|
| 39 |
+
|
| 40 |
+
### Current Problems
|
| 41 |
+
|
| 42 |
+
The current UI is a basic inline HTML fallback with no design system:
|
| 43 |
+
|
| 44 |
+
- **No proper template system** - HTML is embedded in Python code (`webapp/app/main.py`)
|
| 45 |
+
- **No CSS framework** - Using inline `<style>` tags
|
| 46 |
+
- **No navigation** - Users can't easily move between pages
|
| 47 |
+
- **No responsive design** - Doesn't work well on mobile
|
| 48 |
+
- **No loading states** - No spinners or progress indicators
|
| 49 |
+
- **No error messages UI** - Errors show as basic alerts
|
| 50 |
+
- **Inconsistent styling** - Each page styled separately
|
| 51 |
+
|
| 52 |
+
### Required Improvements
|
| 53 |
+
|
| 54 |
+
- [ ] **Move to Jinja2 templates** (`webapp/templates/`)
|
| 55 |
+
- [ ] `base.html` - Base template with navigation
|
| 56 |
+
- [ ] `index.html` - Home/prediction page
|
| 57 |
+
- [ ] `result.html` - Results display
|
| 58 |
+
- [ ] `about.html` - About the model
|
| 59 |
+
- [ ] `methodology.html` - Technical details
|
| 60 |
+
- [ ] `help.html` - User guide
|
| 61 |
+
- [ ] `batch.html` - Batch upload interface
|
| 62 |
+
|
| 63 |
+
- [ ] **Add CSS framework** (Tailwind CSS recommended)
|
| 64 |
+
- [ ] Install Tailwind or use CDN
|
| 65 |
+
- [ ] Create consistent design system
|
| 66 |
+
- [ ] Add dark mode support (optional)
|
| 67 |
+
|
| 68 |
+
- [ ] **Navigation header**
|
| 69 |
+
- [ ] Logo/branding
|
| 70 |
+
- [ ] Links: Home, About, Methodology, Help, API Docs
|
| 71 |
+
- [ ] Mobile hamburger menu
|
| 72 |
+
|
| 73 |
+
- [ ] **Footer**
|
| 74 |
+
- [ ] Citation information
|
| 75 |
+
- [ ] Contact/feedback link
|
| 76 |
+
- [ ] Privacy policy link
|
| 77 |
+
- [ ] Funding acknowledgments
|
| 78 |
+
|
| 79 |
+
- [ ] **Loading states**
|
| 80 |
+
- [ ] Spinner during prediction
|
| 81 |
+
- [ ] Progress bar for batch uploads
|
| 82 |
+
- [ ] Skeleton loaders for async content
|
| 83 |
+
|
| 84 |
+
- [ ] **Error handling**
|
| 85 |
+
- [ ] Toast notifications for errors
|
| 86 |
+
- [ ] Inline validation messages
|
| 87 |
+
- [ ] Friendly error pages (404, 500)
|
| 88 |
+
|
| 89 |
+
- [ ] **Responsive design**
|
| 90 |
+
- [ ] Mobile-friendly layout
|
| 91 |
+
- [ ] Touch-friendly buttons
|
| 92 |
+
- [ ] Readable text on all devices
|
| 93 |
+
|
| 94 |
+
---
|
| 95 |
+
|
| 96 |
+
## 2. Missing Content
|
| 97 |
+
|
| 98 |
+
### About the Model
|
| 99 |
+
|
| 100 |
+
The landing page has almost no information about what the model does. The following needs to be added:
|
| 101 |
+
|
| 102 |
+
- [ ] **What it predicts**
|
| 103 |
+
- PSI (Percent Spliced In) values
|
| 104 |
+
- Range: 0 (completely skipped) to 1 (completely included)
|
| 105 |
+
- Alternative splicing outcomes
|
| 106 |
+
|
| 107 |
+
- [ ] **How it works (simplified)**
|
| 108 |
+
- Takes 70nt exon sequence as input
|
| 109 |
+
- Adds flanking sequences
|
| 110 |
+
- Predicts RNA secondary structure
|
| 111 |
+
- Neural network predicts splicing outcome
|
| 112 |
+
|
| 113 |
+
- [ ] **Who should use it**
|
| 114 |
+
- Researchers studying RNA splicing
|
| 115 |
+
- Designing synthetic exons
|
| 116 |
+
- Understanding splicing regulation
|
| 117 |
+
|
| 118 |
+
- [ ] **Limitations**
|
| 119 |
+
- Only works with 70nt exon sequences
|
| 120 |
+
- Trained on HeLa cell data (ES7 library)
|
| 121 |
+
- May not generalize to all cell types
|
| 122 |
+
- Does not consider cellular context
|
| 123 |
+
|
| 124 |
+
### Model Architecture Page
|
| 125 |
+
|
| 126 |
+
- [ ] **Input features**
|
| 127 |
+
- Sequence one-hot encoding (90×4)
|
| 128 |
+
- Structure one-hot encoding (90×3)
|
| 129 |
+
- Wobble pair indicators (90×1)
|
| 130 |
+
|
| 131 |
+
- [ ] **Architecture diagram**
|
| 132 |
+
- Sequence branch: Conv1D (20 filters, width 6)
|
| 133 |
+
- Structure branch: Conv1D (8 filters, width 30)
|
| 134 |
+
- Position-specific biases
|
| 135 |
+
- Inclusion vs skipping energy computation
|
| 136 |
+
- Residual tuner MLP
|
| 137 |
+
- Sigmoid output
|
| 138 |
+
|
| 139 |
+
- [ ] **Interpretability features**
|
| 140 |
+
- Position-specific bias visualization
|
| 141 |
+
- Separate inclusion/skipping branches
|
| 142 |
+
- Force plot explanation
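
For the architecture page, the three input features listed above could be illustrated with a toy encoder. This is a sketch under stated assumptions, not the project's actual preprocessing: the channel orders (`ACGU`, `(.)`), the dot-bracket structure notation, and the definition of a wobble position as either member of a G-U pair are all assumed here. For a sequence of length L it yields L×4, L×3, and L×1 features.

```python
NUCLEOTIDES = "ACGU"   # assumed channel order for the sequence encoding
STRUCTURE = "(.)"      # assumed channel order for dot-bracket symbols


def one_hot(symbols: str, alphabet: str) -> list:
    """Encode each symbol as a 0/1 row with one column per alphabet entry."""
    rows = []
    for symbol in symbols:
        if symbol not in alphabet:
            raise ValueError(f"unexpected symbol {symbol!r}")
        rows.append([int(symbol == letter) for letter in alphabet])
    return rows


def wobble_flags(sequence: str, structure: str) -> list:
    """Flag positions whose base-pairing partner forms a G-U wobble pair."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    flags = [0] * len(sequence)
    open_positions = []  # stack of unmatched '(' indices
    for j, symbol in enumerate(structure):
        if symbol == "(":
            open_positions.append(j)
        elif symbol == ")":
            if not open_positions:
                raise ValueError("unbalanced structure")
            i = open_positions.pop()
            if {sequence[i], sequence[j]} == {"G", "U"}:
                flags[i] = flags[j] = 1
    if open_positions:
        raise ValueError("unbalanced structure")
    return flags


seq, struct = "GGGAAAUCC", "(((...)))"
seq_features = one_hot(seq, NUCLEOTIDES)       # 9 rows x 4 columns
struct_features = one_hot(struct, STRUCTURE)   # 9 rows x 3 columns
wobble = wobble_flags(seq, struct)             # positions 2 (G) and 6 (U) pair up
```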
|
| 143 |
+
|
| 144 |
+
### Research Background
|
| 145 |
+
|
| 146 |
+
- [ ] **Citation**
|
| 147 |
+
```
|
| 148 |
+
Liao SE, Sudarshan M, and Regev O.
|
| 149 |
+
"Machine learning for discovery: deciphering RNA splicing logic."
|
| 150 |
+
bioRxiv (2022).
|
| 151 |
+
```
|
| 152 |
+
|
| 153 |
+
- [ ] **Link to paper** (bioRxiv)
|
| 154 |
+
- [ ] **Link to GitHub** (original repo)
|
| 155 |
+
- [ ] **Contact information** for authors
|
| 156 |
+
|
| 157 |
+
### Training Data Information
|
| 158 |
+
|
| 159 |
+
- [ ] **Dataset**: ES7_HeLa (A, B, C libraries)
|
| 160 |
+
- [ ] **Size**: ~150,000 synthetic exons
|
| 161 |
+
- [ ] **Cell type**: HeLa cells
|
| 162 |
+
- [ ] **Experimental method**: MPRA (Massively Parallel Reporter Assay)
|
| 163 |
+
|
| 164 |
+
### Performance Metrics
|
| 165 |
+
|
| 166 |
+
- [ ] **Test R²**: ~0.85
|
| 167 |
+
- [ ] **Test RMSE**: ~0.12
|
| 168 |
+
- [ ] **Correlation**: ~0.92
|
| 169 |
+
- [ ] **Binary KL Loss**: ~0.015-0.020
|
| 170 |
+
|
| 171 |
+
---
|
| 172 |
+
|
| 173 |
+
## 3. Feature Gaps
|
| 174 |
+
|
| 175 |
+
### High Priority
|
| 176 |
+
|
| 177 |
+
- [ ] **Batch file upload**
|
| 178 |
+
- [ ] Accept FASTA format
|
| 179 |
+
- [ ] Accept CSV format (one sequence per line)
|
| 180 |
+
- [ ] Validate all sequences before processing
|
| 181 |
+
- [ ] Show progress during batch processing
|
| 182 |
+
- [ ] Allow download of all results
|
| 183 |
+
|
| 184 |
+
- [ ] **Improved force plot**
|
| 185 |
+
- [ ] Show sequence letters on x-axis
|
| 186 |
+
- [ ] Highlight key positions
|
| 187 |
+
- [ ] Add structure annotation
|
| 188 |
+
- [ ] Export as PNG/SVG
|
| 189 |
+
|
| 190 |
+
- [ ] **Result sharing**
|
| 191 |
+
- [ ] Permalink to results (already have job IDs)
|
| 192 |
+
- [ ] Copy link button
|
| 193 |
+
- [ ] Social sharing (optional)
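
The batch upload items (FASTA input, validating every sequence before processing) might start from something like the sketch below. It is not taken from the existing code: `parse_fasta` and `validate_batch` are hypothetical names, and the accepted alphabet (`ACGTU`, no ambiguity codes) is an assumption. The 70 nt length comes from the model's stated input requirement.

```python
from typing import Dict, Iterable, List, Tuple

EXON_LENGTH = 70            # from the "70nt exon sequence" requirement
VALID_BASES = set("ACGTU")  # assumption: DNA or RNA letters, no ambiguity codes


def parse_fasta(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse FASTA text lines into (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is None:
            raise ValueError("sequence data before the first '>' header")
        else:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def validate_batch(records: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Return {header: problem} for every record that fails validation."""
    problems = {}
    for header, sequence in records:
        bad = sorted(set(sequence) - VALID_BASES)
        if len(sequence) != EXON_LENGTH:
            problems[header] = f"expected {EXON_LENGTH} nt, got {len(sequence)}"
        elif bad:
            problems[header] = f"invalid characters: {''.join(bad)}"
    return problems
```

Collecting all problems before running any prediction lets the UI report every invalid record at once instead of failing on the first one.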
|
| 194 |
+
|
| 195 |
+
### Medium Priority
|
| 196 |
+
|
| 197 |
+
- [ ] **PDF export**
|
| 198 |
+
- [ ] Formatted report with all results
|
| 199 |
+
- [ ] Include force plot image
|
| 200 |
+
- [ ] Include input sequence
|
| 201 |
+
- [ ] Include methodology summary
|
| 202 |
+
|
| 203 |
+
- [ ] **Sequence editor**
|
| 204 |
+
- [ ] Syntax highlighting for nucleotides
|
| 205 |
+
- [ ] Visual feedback for invalid characters
|
| 206 |
+
- [ ] Complement/reverse complement tools
|
| 207 |
+
|
| 208 |
+
- [ ] **Multiple examples**
|
| 209 |
+
- [ ] Show all 3 examples in UI
|
| 210 |
+
- [ ] Explain what each demonstrates
|
| 211 |
+
- [ ] Allow users to modify and re-predict
|
| 212 |
+
|
| 213 |
+
### Low Priority
|
| 214 |
+
|
| 215 |
+
- [ ] **Email notifications**
|
| 216 |
+
- [ ] Send results when job completes
|
| 217 |
+
- [ ] Optional (don't require email)
|
| 218 |
+
|
| 219 |
+
- [ ] **Job history**
|
| 220 |
+
- [ ] Show recent predictions
|
| 221 |
+
- [ ] Allow re-running previous jobs
|
| 222 |
+
- [ ] LocalStorage for client-side history
|
| 223 |
+
|
| 224 |
+
- [ ] **API key management** (if needed for rate limiting)
|
| 225 |
+
|
| 226 |
+
---
|
| 227 |
+
|
| 228 |
+
## 4. Technical Debt
|
| 229 |
+
|
| 230 |
+
### Code Quality
|
| 231 |
+
|
| 232 |
+
- [ ] **Extract HTML to templates**
|
| 233 |
+
- Move all inline HTML from `main.py` to `templates/`
|
| 234 |
+
- Use Jinja2 template inheritance
|
| 235 |
+
|
| 236 |
+
- [ ] **CSS refactoring**
|
| 237 |
+
- Move inline styles to `static/css/`
|
| 238 |
+
- Use CSS variables for theming
|
| 239 |
+
- Consider CSS framework
|
| 240 |
+
|
| 241 |
+
- [ ] **JavaScript improvements**
|
| 242 |
+
- Move inline scripts to `static/js/`
|
| 243 |
+
- Use modern ES6+ syntax
|
| 244 |
+
- Consider Alpine.js or htmx for interactivity
|
| 245 |
+
|
| 246 |
+
### API Improvements
|
| 247 |
+
|
| 248 |
+
- [ ] **Rate limiting**
|
| 249 |
+
- Prevent abuse
|
| 250 |
+
- Per-IP limits
|
| 251 |
+
- Optional API keys for higher limits
|
| 252 |
+
|
| 253 |
+
- [ ] **Request validation**
|
| 254 |
+
- Better error messages
|
| 255 |
+
- Sequence format validation
|
| 256 |
+
- Input sanitization
|
| 257 |
+
|
| 258 |
+
- [ ] **Response caching**
|
| 259 |
+
- Cache identical predictions
|
| 260 |
+
- Reduce computation for repeated requests
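
As one possible starting point for the per-IP limits, here is a minimal in-memory sliding-window limiter. It is a sketch only: the class name and parameters are hypothetical, and it is not wired into the web framework.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client within `window` seconds."""

    def __init__(self, limit: int, window: float, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock                # injectable so tests can fake time
        self._hits = defaultdict(deque)   # client key -> recent timestamps

    def allow(self, client: str) -> bool:
        now = self.clock()
        hits = self._hits[client]
        while hits and now - hits[0] >= self.window:
            hits.popleft()                # drop timestamps outside the window
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

Because the counters live in process memory, each server worker would keep its own count; a multi-worker deployment would need a shared store or proxy-level limiting instead.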
|
| 261 |
+
|
| 262 |
+
### Database
|
| 263 |
+
|
| 264 |
+
- [ ] **Job cleanup**
|
| 265 |
+
- Scheduled task to delete old jobs
|
| 266 |
+
- Configurable retention period
|
| 267 |
+
|
| 268 |
+
- [ ] **Indexes**
|
| 269 |
+
- Add indexes for common queries
|
| 270 |
+
- Optimize job lookup by ID
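
The cleanup and index items could look roughly like this with the standard `sqlite3` module. The `jobs` table and its `id` / `created_at` columns are hypothetical stand-ins; the real schema may differ, and timestamps are assumed to be stored as ISO 8601 UTC strings so that string comparison matches time order.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def delete_old_jobs(conn: sqlite3.Connection, retention_days: int, now=None) -> int:
    """Delete jobs older than the retention period; return how many were removed."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=retention_days)).isoformat()
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute("DELETE FROM jobs WHERE created_at < ?", (cutoff,))
    return cursor.rowcount


# Demo against an in-memory database with a hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id TEXT PRIMARY KEY, created_at TEXT NOT NULL)")
conn.execute("CREATE INDEX idx_jobs_created_at ON jobs (created_at)")
now = datetime(2026, 1, 12, tzinfo=timezone.utc)
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?)",
    [
        ("old", (now - timedelta(days=40)).isoformat()),
        ("new", (now - timedelta(days=1)).isoformat()),
    ],
)
removed = delete_old_jobs(conn, retention_days=30, now=now)
```

The index on `created_at` keeps the cleanup query from scanning the whole table, and the primary key already covers lookup by job ID.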
|
| 271 |
+
|
| 272 |
+
### Logging
|
| 273 |
+
|
| 274 |
+
- [ ] **Structured logging**
|
| 275 |
+
- JSON format for production
|
| 276 |
+
- Request/response logging
|
| 277 |
+
- Error tracking
|
| 278 |
+
|
| 279 |
+
- [ ] **Monitoring**
|
| 280 |
+
- Request latency metrics
|
| 281 |
+
- Error rate tracking
|
| 282 |
+
- Model prediction time
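
For the JSON logging item, the standard library is enough for a first version. The formatter below is a sketch; the field names and the `webapp` logger name are assumptions, not an existing convention in this codebase.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("webapp")  # hypothetical logger name
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.info("prediction finished in %.3f s", 0.42)
```

Logging the prediction duration this way also gives a crude source for the "model prediction time" metric until proper monitoring is in place.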
|
| 283 |
+
|
| 284 |
+
---
|
| 285 |
+
|
| 286 |
+
## 5. Deployment
|
| 287 |
+
|
| 288 |
+
### Docker Configuration
|
| 289 |
+
|
| 290 |
+
- [ ] **Dockerfile**
|
| 291 |
+
```dockerfile
|
| 292 |
+
FROM python:3.10-slim
|
| 293 |
+
# Install ViennaRNA
|
| 294 |
+
# Copy application
|
| 295 |
+
# Install dependencies
|
| 296 |
+
# Run with gunicorn
|
| 297 |
+
```
|
| 298 |
+
|
| 299 |
+
- [ ] **docker-compose.yml**
|
| 300 |
+
- Web service
|
| 301 |
+
- Volume for database
|
| 302 |
+
- Environment variables
|
| 303 |
+
|
| 304 |
+
- [ ] **.dockerignore**
|
| 305 |
+
- Exclude venv, __pycache__, .git
|
| 306 |
+
|
| 307 |
+
### Production Server
|
| 308 |
+
|
| 309 |
+
- [ ] **Gunicorn configuration**
|
| 310 |
+
- Multiple workers
|
| 311 |
+
- Timeout settings
|
| 312 |
+
- Logging
|
| 313 |
+
|
| 314 |
+
- [ ] **Nginx reverse proxy**
|
| 315 |
+
- SSL termination
|
| 316 |
+
- Static file serving
|
| 317 |
+
- Rate limiting
|
| 318 |
+
|
| 319 |
+
- [ ] **SSL/HTTPS**
|
| 320 |
+
- Let's Encrypt certificate
|
| 321 |
+
- Auto-renewal
|
| 322 |
+
|
| 323 |
+
### Environment Management
|
| 324 |
+
|
| 325 |
+
- [ ] **Environment variables**
|
| 326 |
+
- Database path
|
| 327 |
+
- Debug mode
|
| 328 |
+
- Secret key
|
| 329 |
+
- SMTP settings
|
| 330 |
+
|
| 331 |
+
- [ ] **.env.example**
|
| 332 |
+
- Document all variables
|
| 333 |
+
- Provide defaults
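
One way to centralize these variables is a small settings object read from the environment. The variable names (`DATABASE_PATH`, `DEBUG`, `SECRET_KEY`) and defaults below are illustrative assumptions, not the names used in the existing `config.py`.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    database_path: str
    debug: bool
    secret_key: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build settings from environment variables (names are hypothetical)."""
    return Settings(
        database_path=env.get("DATABASE_PATH", "jobs.db"),
        debug=env.get("DEBUG", "false").lower() in {"1", "true", "yes"},
        secret_key=env["SECRET_KEY"],  # no default: fail fast if it is missing
    )
```

Leaving the secret key without a default means a misconfigured deployment fails at startup rather than running with a guessable value.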
|
| 334 |
+
|
| 335 |
+
### Cloud Deployment Options
|
| 336 |
+
|
| 337 |
+
- [ ] **Option A: VPS (DigitalOcean, Linode)**
|
| 338 |
+
- Full control
|
| 339 |
+
- Manual setup required
|
| 340 |
+
|
| 341 |
+
- [ ] **Option B: Platform as a Service**
|
| 342 |
+
- Railway, Render, Fly.io
|
| 343 |
+
- Easier deployment
|
| 344 |
+
- May have cold start issues
|
| 345 |
+
|
| 346 |
+
- [ ] **Option C: Container service**
|
| 347 |
+
- Google Cloud Run
|
| 348 |
+
- AWS Fargate
|
| 349 |
+
- Auto-scaling
|
| 350 |
+
|
| 351 |
+
---
|
| 352 |
+
## 6. NAR Web Server Compliance

For publication in the Nucleic Acids Research Web Server issue:

### Required Pages

- [ ] **Privacy policy**
  - What data is collected
  - How long it's stored
  - Who has access

- [ ] **Terms of service**
  - Usage restrictions
  - Disclaimer
  - License

- [ ] **Contact information**
  - Email for support
  - Issue reporting

- [ ] **Funding acknowledgments**
  - Grant numbers
  - Institution

### Accessibility (WCAG 2.1)

- [ ] **Keyboard navigation**
- [ ] **Screen reader support**
- [ ] **Color contrast ratios**
- [ ] **Alt text for images**
- [ ] **Focus indicators**

### Mobile Support

- [ ] **Responsive layout**
- [ ] **Touch-friendly targets**
- [ ] **Readable font sizes**

### Reliability

- [ ] **99.9% uptime target**
- [ ] **Monitoring and alerting**
- [ ] **Backup strategy**
- [ ] **Disaster recovery plan**

---

## 7. Testing

### Unit Tests

- [ ] **API endpoint tests**
  - Test all routes
  - Test error cases
  - Test validation

- [ ] **Model wrapper tests**
  - Test prediction pipeline
  - Test input preparation
  - Test output format

- [ ] **Database tests**
  - Test job creation
  - Test job retrieval
  - Test job deletion
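
As a starting point for the output-format tests, the delimiter logic used by the export route (`delimiter.join(header) + "\n" + delimiter.join(row)`) can be exercised in isolation. The sketch below rests on assumptions: `build_delimited` is a hypothetical helper that mirrors that expression, not a function that exists in `routes.py`, and the column names are placeholders.

```python
import unittest


def build_delimited(header, row, fmt):
    """Hypothetical helper mirroring the CSV/TSV branch of the export route."""
    if fmt not in ("csv", "tsv"):
        raise ValueError(f"Unsupported format: {fmt}")
    delimiter = "," if fmt == "csv" else "\t"
    return delimiter.join(header) + "\n" + delimiter.join(row)


class ExportFormatTests(unittest.TestCase):
    header = ["job_id", "psi"]  # placeholder column names
    row = ["abc123", "0.42"]    # placeholder values

    def test_csv_uses_commas(self):
        self.assertEqual(
            build_delimited(self.header, self.row, "csv"),
            "job_id,psi\nabc123,0.42",
        )

    def test_tsv_uses_tabs(self):
        self.assertEqual(
            build_delimited(self.header, self.row, "tsv"),
            "job_id\tpsi\nabc123\t0.42",
        )

    def test_unknown_format_is_rejected(self):
        with self.assertRaises(ValueError):
            build_delimited(self.header, self.row, "xml")
```

Saved as a test module, this runs with `python -m unittest`. One design point the tests make visible: a plain `join` does not quote fields that contain the delimiter, so a test with a comma inside a value would be a useful next case and might motivate switching to the standard `csv` module.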

### Integration Tests

- [ ] **End-to-end prediction flow**
- [ ] **Batch processing**
- [ ] **Export functionality**

### Load Testing

- [ ] **Concurrent requests**
- [ ] **Response time under load**
- [ ] **Memory usage**
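
The first two load-testing items can be approximated with the standard library before reaching for a dedicated tool. This is a rough sketch under stated assumptions: the `/health` path and the `http://localhost:8000` base URL in `fetch_health` are guesses, and the request count and concurrency are arbitrary. It measures client-side latency only and does not cover memory usage.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from statistics import median


def timed_call(call):
    """Run one request callable and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    call()
    return time.perf_counter() - start


def run_load(call, n_requests=50, concurrency=10):
    """Issue n_requests calls with at most `concurrency` in flight; return latencies."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(lambda _: timed_call(call), range(n_requests)))


def fetch_health(base_url="http://localhost:8000"):
    """Hit an assumed health endpoint; adjust the path to the app's real route."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=10) as resp:
        resp.read()


def summarize(latencies):
    """Return count, median, approximate 95th percentile, and max latency."""
    ordered = sorted(latencies)
    p95_index = max(0, int(round(0.95 * len(ordered))) - 1)
    return {
        "count": len(ordered),
        "median": median(ordered),
        "p95": ordered[p95_index],
        "max": ordered[-1],
    }
```

With the server running, `summarize(run_load(fetch_health))` gives a first impression of response time under concurrent requests. Because `run_load` accepts any callable, the same harness can be pointed at a prediction request once its payload is settled.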

---

## Quick Start for Next Session

To continue development:

```bash
# 1. Activate environment
source venv310/bin/activate

# 2. Start server
python -m uvicorn webapp.app.main:app --reload --port 8000

# 3. View app
open http://localhost:8000
```

## Priority Order

1. **UI/UX + Content** - Make it look professional and informative
2. **Templates** - Move HTML out of Python code
3. **Batch upload** - Key feature for usability
4. **Docker** - For deployment
5. **Testing** - For reliability
6. **NAR compliance** - For publication

@@ -4,8 +4,10 @@ import uuid
 import json
 from datetime import datetime, timedelta
 from typing import Optional
-from fastapi import APIRouter, Depends, HTTPException, Query
+from fastapi import APIRouter, Depends, HTTPException, Query, Path
+from fastapi.responses import Response
 from sqlalchemy.orm import Session
+from sqlalchemy import text
 
 from webapp.app.database import get_db
 from webapp.app.models.job import Job
@@ -40,7 +42,7 @@ async def health_check(db: Session = Depends(get_db)):
 
     db_connected = False
     try:
-        db.execute("SELECT 1")
+        db.execute(text("SELECT 1"))
         db_connected = True
     except Exception:
         pass
@@ -319,7 +321,7 @@ async def get_example_sequences():
 @router.get("/export/{job_id}/{format}", tags=["export"])
 async def export_results(
     job_id: str,
-    format: str =
+    format: str = Path(..., pattern="^(csv|json|tsv)$"),
     db: Session = Depends(get_db),
 ):
     """
@@ -335,7 +337,12 @@ async def export_results(
         raise HTTPException(status_code=400, detail="Job not yet complete")
 
     if format == "json":
-
+        content = json.dumps(job.to_dict(), indent=2)
+        return Response(
+            content=content,
+            media_type="application/json",
+            headers={"Content-Disposition": f'attachment; filename="result_{job_id}.json"'}
+        )
 
     elif format in ("csv", "tsv"):
         delimiter = "," if format == "csv" else "\t"
@@ -367,9 +374,11 @@ async def export_results(
         ]
         content = delimiter.join(header) + "\n" + delimiter.join(row)
 
-
-
-
-
+        media_type = "text/csv" if format == "csv" else "text/tab-separated-values"
+        return Response(
+            content=content,
+            media_type=media_type,
+            headers={"Content-Disposition": f'attachment; filename="result_{job_id}.{format}"'}
+        )
 
     raise HTTPException(status_code=400, detail=f"Unsupported format: {format}")

@@ -13,14 +13,17 @@ class Settings(BaseSettings):
     app_version: str = "1.0.0"
     debug: bool = False
 
-    # Paths
-
+    # Paths - computed at class definition time
+    # __file__ = webapp/app/config.py
+    # parent.parent.parent = interpretable-splicing-model/
+    project_root: Path = Path(__file__).parent.parent.parent
     model_path: Path = project_root / "output" / "custom_adjacency_regularizer_20210731_124_step3.h5"
     data_path: Path = project_root / "data"
     database_path: Path = Path(__file__).parent.parent / "splicing.db"
 
-
-    database_url
+    @property
+    def database_url(self) -> str:
+        return f"sqlite:///{self.database_path}"
 
     # Job settings
     job_retention_days: int = 30

@@ -6,26 +6,19 @@ import tensorflow as tf
 from typing import List, Tuple, Optional, Dict, Any
 from pathlib import Path
 import logging
+import sys
 
 from webapp.app.config import settings
 
 # Set up logging
 logger = logging.getLogger(__name__)
 
-#
-
-sys.path.insert(0, str(settings.project_root))
-from
-
-
-    ResidualTuner,
-    SumDiff,
-    RegularizedBiasLayer,
-    MultiRegularizer,
-    pos_reg,
-    adj_reg_fo,
-    adj_reg_so,
-)
+# Add figures directory to path - this auto-registers custom layers
+# via @register_keras_serializable decorators when quad_model is imported
+sys.path.insert(0, str(settings.project_root / 'figures'))
+from quad_model import *  # noqa: E402, F401, F403
+
+from tensorflow.keras.models import load_model
 
 
 class SplicingPredictor:
@@ -49,23 +42,13 @@ class SplicingPredictor:
         """Load the pre-trained TensorFlow model."""
         logger.info(f"Loading model from {settings.model_path}")
 
-
-
-
-            "
-
-            "
-
-            "pos_reg": pos_reg,
-            "adj_reg_fo": adj_reg_fo,
-            "adj_reg_so": adj_reg_so,
-        }
-
-        self._model = tf.keras.models.load_model(
-            str(settings.model_path),
-            custom_objects=custom_objects,
-        )
-        logger.info("Model loaded successfully")
+        try:
+            # Simple load - custom layers already registered via quad_model import
+            self._model = load_model(str(settings.model_path))
+            logger.info("Model loaded successfully")
+        except Exception as e:
+            logger.error(f"Failed to load model: {e}")
+            raise
 
     @property
     def model(self) -> tf.keras.Model:

@@ -8,10 +8,12 @@ sqlalchemy>=2.0.0
 aiosqlite>=0.19.0
 
 # Model & ML
-tensorflow
-numpy>=1.26.0
+tensorflow==2.15.0  # Pin to 2.15 (last Keras 2 version) for model compatibility
+numpy>=1.26.0,<2.0
 joblib>=1.3.0
 scikit-learn>=1.4.0
+tqdm  # Required by figutils
+scipy  # Required by figutils
 
 # Visualization
 plotly>=5.18.0