Commit 65da5d3 · sachin1801 committed · 1 parent: fc65a00

feat(webapp): working local server checkpoint


Core functionality now working:
- Model loading fixed (TensorFlow 2.15 + quad_model imports)
- PSI prediction pipeline operational
- Force plot visualization working
- CSV/JSON/TSV export with proper file downloads
- SQLite database with health check
- ViennaRNA RNA structure prediction

Key changes:
- predictor.py: Simplified model loading using quad_model decorators
- routes.py: Fixed export with Content-Disposition headers
- routes.py: Fixed SQLAlchemy 2.0 text() for health check
- requirements.txt: Pinned TensorFlow 2.15 for Keras 2 compatibility

Added:
- webapp/TODO.md: Comprehensive remaining work documentation
- test_model.py: Simple model test script
- Skills docs for TensorFlow/Keras model loading

Note: UI is basic (inline HTML), needs improvement. See TODO.md.

.claude/skills/agent-log.md CHANGED
@@ -139,6 +139,111 @@ See: `/Users/sachin/.claude/plans/tingly-sauteeing-bengio.md`
 
 ---
 
+## Session 2 - 2026-01-12
+
+### Session Start
+- **Task**: Run and test the pre-trained splicing model locally
+- **Status**: COMPLETE
+
+### Problem
+User could not load the pre-trained model (`custom_adjacency_regularizer_20210731_124_step3.h5`) with their existing Python 3.12 + TensorFlow 2.20 setup.
+
+### Errors Encountered
+1. `ValueError: Unknown layer: 'SlicingOpLambda'`
+2. `ValueError: Unknown layer: 'Custom>RegularizedBiasLayer'`
+3. `IndexError: list index out of range` in Keras functional.py
+
+### Investigation
+
+#### Key Information from User
+User provided context from the original model creator:
+- Model location: `output/custom_adjacency_regularizer_20210731_124_step3.h5`
+- Reference notebook: `figures/generate_csv_for_supplementary.ipynb`
+- Additional notebooks: `2022_03_11_figures/` folder (visualization notebooks)
+
+#### What We Discovered
+1. **From `figures/generate_csv_for_supplementary.ipynb`**:
+   - Simple loading approach: `from quad_model import *` then `load_model()`
+   - No manual custom_objects needed
+
+2. **From `2022_03_11_figures/position_specific_activations.ipynb`**:
+   - Notebook was run April 2022 with TensorFlow ~2.8
+   - Model loads with simple `tf.keras.models.load_model()`
+
+3. **From `figures/quad_model.py`**:
+   - All custom layers use the `@tf.keras.utils.register_keras_serializable()` decorator
+   - This auto-registers layers when the module is imported
+
+4. **Root Cause**:
+   - TensorFlow 2.16+ uses Keras 3 (breaking changes)
+   - Keras 3 cannot load H5 models with Lambda layers from Keras 2
+   - The `tf_keras` compatibility layer is buggy for complex models
+
+### Solution Implemented
+
+1. **Installed Python 3.10 via pyenv**:
+   ```bash
+   pyenv install 3.10.13
+   ```
+
+2. **Created new virtual environment**:
+   ```bash
+   ~/.pyenv/versions/3.10.13/bin/python -m venv venv310
+   source venv310/bin/activate
+   ```
+
+3. **Installed TensorFlow 2.15** (last version with native Keras 2):
+   ```bash
+   pip install tensorflow==2.15.0 numpy pandas joblib scikit-learn matplotlib seaborn tqdm scipy
+   ```
+
+4. **Updated `test_model.py`** to use the simple loading approach:
+   ```python
+   import sys
+   sys.path.insert(0, 'figures')
+   from quad_model import *  # Auto-registers custom layers
+   from tensorflow.keras.models import load_model
+
+   model = load_model('output/...h5')
+   ```
+
+5. **Updated `requirements.txt`**:
+   - Changed `tensorflow>=2.15.0` to `tensorflow==2.15.0`
+   - Added setup instructions for Python 3.10
+   - Removed `tf_keras` (not needed)
+
+### Results
+```
+Model loaded successfully!
+Number of test samples: 47962
+MSE: 0.032396
+R2 Score: 0.8224
+Correlation: 0.9069
+```
+
+### Files Modified
+- `test_model.py` - Simplified to use the quad_model.py approach
+- `requirements.txt` - Pinned TensorFlow 2.15, added setup instructions
+
+### Files Created
+- `venv310/` - New Python 3.10 virtual environment
+- `.claude/skills/tensorflow-keras-model-loading.md` - Skill documentation
+
+### Key Learnings
+1. **TF 2.16+ breaks old H5 models** - Must use TF 2.15 or earlier for Keras 2 models
+2. **Python 3.12 requires TF 2.16+** - So Python must be downgraded to 3.10/3.11
+3. **Check the original notebooks first** - They show the working approach
+4. **`@register_keras_serializable()` is key** - Import the module to register the layers
+5. **`tf_keras` is unreliable** - For complex models, use native TF 2.15 instead
+
+### Environment Summary
+| Environment | Python | TensorFlow | Status |
+|--------------|--------|------------|--------|
+| `venv` (old) | 3.12 | 2.20 | BROKEN - can delete |
+| `venv310` | 3.10.13 | 2.15.0 | WORKING |
+
+---
+
 ## Future Sessions
 
 _Sessions will be logged here as work progresses._
.claude/skills/tensorflow-keras-model-loading.md ADDED
@@ -0,0 +1,109 @@
+# Skill: Loading Legacy TensorFlow/Keras Models
+
+## Problem Encountered
+When trying to load a pre-trained H5 model created in 2021 with TensorFlow 2.5, we encountered multiple errors with TensorFlow 2.20 (Python 3.12):
+
+1. `ValueError: Unknown layer: 'SlicingOpLambda'`
+2. `ValueError: Unknown layer: 'Custom>RegularizedBiasLayer'`
+3. `IndexError: list index out of range` in `process_node`
+
+## Root Cause
+- **TensorFlow 2.16+ uses Keras 3**, which has breaking changes for loading old H5 models
+- Models with Lambda layers and custom layers saved with Keras 2 cannot be loaded with Keras 3
+- The `tf_keras` compatibility layer does NOT fully work for complex models with Lambda layers
+
+## Solution
+
+### 1. Use Python 3.10 + TensorFlow 2.15
+TensorFlow 2.15 is the **last version with native Keras 2 support**:
+
+```bash
+# Install Python 3.10 via pyenv
+pyenv install 3.10.13
+
+# Create virtual environment
+~/.pyenv/versions/3.10.13/bin/python -m venv venv310
+source venv310/bin/activate
+
+# Install TensorFlow 2.15 (NOT 2.16+)
+pip install tensorflow==2.15.0
+```
+
+### 2. Use the `@register_keras_serializable()` Pattern
+The original codebase uses decorators to auto-register custom layers:
+
+```python
+@tf.keras.utils.register_keras_serializable()
+class MyCustomLayer(Layer):
+    ...
+```
+
+When you import the module containing these decorators, the layers are automatically registered:
+
+```python
+# This auto-registers all custom layers
+from quad_model import *
+
+# Then you can load the model directly
+model = load_model('model.h5')
+```
+
+### 3. Don't Pass Custom Objects Manually (Usually)
+If the original code uses `@register_keras_serializable()`, you typically don't need to pass `custom_objects` to `load_model()`. The decorators handle registration.
+
+## TensorFlow/Keras Version Compatibility Matrix
+
+| Python | TensorFlow | Keras | Can Load Old H5? |
+|--------|------------|-------|------------------|
+| 3.12 | 2.16-2.20 | 3.x | NO - Lambda layer bugs |
+| 3.11 | 2.15 | 2.15 | YES |
+| 3.10 | 2.10-2.15 | 2.x | YES |
+| 3.10 | 2.8-2.9 | 2.x | YES |
+
+## Key Lessons
+
+### DO:
+- Check when the model was created and what TensorFlow version was used
+- Look for existing notebooks that successfully load the model
+- Match the Python + TensorFlow version to the model's creation era
+- Use `@register_keras_serializable()` for custom layers
+- Pin the TensorFlow version in requirements.txt (`tensorflow==2.15.0`)
+
+### DON'T:
+- Assume the latest TensorFlow will load old models
+- Use `tf_keras` for complex models with Lambda layers (it's buggy)
+- Manually pass all custom objects if decorators exist
+- Use Python 3.12 with TensorFlow < 2.16 (incompatible)
+
+## Quick Diagnosis
+
+If you see these errors, it's likely a Keras 2 vs 3 compatibility issue:
+- `Unknown layer: 'SlicingOpLambda'`
+- `Unknown layer: 'Custom>...'`
+- `IndexError: list index out of range` in functional.py
+- Errors mentioning `_inbound_nodes`
+
+## Files to Check in Legacy Projects
+
+1. Look for `quad_model.py` or similar files with custom layer definitions
+2. Check if layers use `@tf.keras.utils.register_keras_serializable()`
+3. Find notebooks that successfully load the model (check their imports)
+4. Check the model creation date from the filename (e.g., `_20210731_` = July 2021)
+
+## Working Example
+
+```python
+"""Load legacy Keras model (created with TF 2.5-2.10)"""
+import sys
+sys.path.insert(0, 'figures')  # or wherever quad_model.py lives
+
+# Import registers all custom layers via decorators
+from quad_model import *
+from tensorflow.keras.models import load_model
+
+# Now load works without custom_objects
+model = load_model('output/model.h5')
+
+# Make predictions
+predictions = model.predict(data)
+```
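The compatibility matrix above can be encoded as a tiny pre-flight check before attempting a legacy H5 load. This is an illustrative helper, not part of the repo — `can_load_legacy_h5` is a hypothetical name, and it only mirrors the "TF 2.15 and earlier ship Keras 2" rule stated above:

```python
def can_load_legacy_h5(tf_version: str) -> bool:
    """Return True if this TensorFlow version ships Keras 2 and can
    load legacy H5 models (per the matrix: TF 2.x up to 2.15)."""
    major, minor = (int(part) for part in tf_version.split(".")[:2])
    return major == 2 and minor <= 15
```

Run it against `tf.__version__` at startup to fail fast with a clear message instead of an opaque `Unknown layer` error.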
requirements.txt CHANGED
@@ -1,14 +1,14 @@
 # Interpretable Splicing Model - Dependencies
-# Python 3.12+ compatible versions
+# Requires Python 3.10 (TensorFlow 2.15 with Keras 2)
 
 # Core ML dependencies
-tensorflow>=2.15.0
-tf_keras  # Keras 2.x compatibility layer (required for loading pre-trained model)
-numpy>=1.26.0
+tensorflow==2.15.0  # Must use 2.15 (last version with Keras 2) for model compatibility
+numpy>=1.26.0,<2.0
 pandas>=2.1.0
 joblib>=1.3.0
 scikit-learn>=1.4.0
 tqdm
+scipy
 
 # Visualization (for figures and notebooks)
 matplotlib>=3.8.0
@@ -26,3 +26,9 @@ drawsvg
 # macOS: brew tap brewsci/bio && brew install brewsci/bio/viennarna
 # Ubuntu: sudo apt install vienna-rna
 # Verify: RNAfold --version
+
+# Setup instructions:
+# 1. Install Python 3.10 via pyenv: pyenv install 3.10.13
+# 2. Create venv: ~/.pyenv/versions/3.10.13/bin/python -m venv venv310
+# 3. Activate: source venv310/bin/activate
+# 4. Install: pip install -r requirements.txt
test_model.py ADDED
@@ -0,0 +1,57 @@
+"""Simple script to test the pre-trained splicing model.
+
+This script uses the approach from the original notebooks:
+- figures/generate_csv_for_supplementary.ipynb
+- 2022_03_11_figures/position_specific_activations.ipynb
+
+Requires: Python 3.10 + TensorFlow 2.15 (see README for setup)
+"""
+
+import sys
+
+# Add figures directory to path so we can import quad_model
+sys.path.insert(0, 'figures')
+
+# Import from quad_model - this auto-registers all custom layers
+# via @tf.keras.utils.register_keras_serializable() decorators
+from quad_model import *
+from tensorflow.keras.models import load_model
+from joblib import load as jload
+import numpy as np
+
+print("Loading model...")
+model = load_model('output/custom_adjacency_regularizer_20210731_124_step3.h5')
+print("Model loaded successfully!")
+
+print("\nLoading test data...")
+xTe = jload('data/xTe_ES7_HeLa_ABC.pkl.gz')
+yTe = jload('data/yTe_ES7_HeLa_ABC.pkl.gz')
+
+num_samples = len(xTe[0]) if isinstance(xTe, list) else len(xTe)
+print(f"Number of test samples: {num_samples}")
+
+print("\nRunning predictions...")
+predictions = model.predict(xTe, verbose=0)
+
+print("\nResults:")
+print(f"Predictions shape: {predictions.shape}")
+print("\nFirst 10 predictions vs actual PSI values:")
+print("-" * 50)
+print(f"{'Predicted PSI':<15} {'Actual PSI':<15} {'Diff':<10}")
+print("-" * 50)
+for i in range(min(10, len(predictions))):
+    pred = predictions[i, 0]
+    actual = yTe[i]
+    diff = pred - actual
+    print(f"{pred:<15.4f} {actual:<15.4f} {diff:<10.4f}")
+
+# Calculate overall metrics
+from sklearn.metrics import mean_squared_error, r2_score
+mse = mean_squared_error(yTe, predictions)
+r2 = r2_score(yTe, predictions)
+correlation = np.corrcoef(yTe.flatten(), predictions.flatten())[0, 1]
+
+print("\nOverall Metrics:")
+print(f"  MSE: {mse:.6f}")
+print(f"  R2 Score: {r2:.4f}")
+print(f"  Correlation: {correlation:.4f}")
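The three metrics printed by the script can be computed with NumPy alone, which is handy for sanity-checking the sklearn numbers. A minimal sketch — `summarize` is a hypothetical helper, using the standard MSE, R², and Pearson-correlation formulas that match the sklearn calls above:

```python
import numpy as np

def summarize(y_true, y_pred):
    """Return (mse, r2, correlation) using the same definitions as
    sklearn's mean_squared_error, r2_score, and np.corrcoef."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    mse = float(np.mean((y_true - y_pred) ** 2))
    ss_res = float(np.sum((y_true - y_pred) ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    corr = float(np.corrcoef(y_true, y_pred)[0, 1])
    return mse, r2, corr
```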
webapp/TODO.md ADDED
@@ -0,0 +1,455 @@
+# Splicing Predictor Web Application - Remaining Work
+
+> **Current Status**: Core prediction functionality working. UI needs significant improvements.
+>
+> **Last Updated**: 2026-01-12
+
+---
+
+## Table of Contents
+
+1. [Completed Work](#completed-work)
+2. [UI/UX Improvements (HIGH PRIORITY)](#1-uiux-improvements-high-priority)
+3. [Missing Content](#2-missing-content)
+4. [Feature Gaps](#3-feature-gaps)
+5. [Technical Debt](#4-technical-debt)
+6. [Deployment](#5-deployment)
+7. [NAR Web Server Compliance](#6-nar-web-server-compliance)
+8. [Testing](#7-testing)
+
+---
+
+## Completed Work
+
+- [x] Model loading with TensorFlow 2.15 (Keras 2 compatibility)
+- [x] PSI prediction pipeline
+- [x] RNA secondary structure prediction (ViennaRNA integration)
+- [x] Force plot visualization (Plotly)
+- [x] Single sequence prediction API
+- [x] Batch prediction API
+- [x] CSV/JSON/TSV export with proper file downloads
+- [x] SQLite database for job storage
+- [x] Health check endpoint
+- [x] Example sequences endpoint
+- [x] Basic result page with force plot
+
+---
+
+## 1. UI/UX Improvements (HIGH PRIORITY)
+
+### Current Problems
+
+The current UI is a basic inline HTML fallback with no design system:
+
+- **No proper template system** - HTML is embedded in Python code (`webapp/app/main.py`)
+- **No CSS framework** - Using inline `<style>` tags
+- **No navigation** - Users can't easily move between pages
+- **No responsive design** - Doesn't work well on mobile
+- **No loading states** - No spinners or progress indicators
+- **No error message UI** - Errors show as basic alerts
+- **Inconsistent styling** - Each page styled separately
+
+### Required Improvements
+
+- [ ] **Move to Jinja2 templates** (`webapp/templates/`)
+  - [ ] `base.html` - Base template with navigation
+  - [ ] `index.html` - Home/prediction page
+  - [ ] `result.html` - Results display
+  - [ ] `about.html` - About the model
+  - [ ] `methodology.html` - Technical details
+  - [ ] `help.html` - User guide
+  - [ ] `batch.html` - Batch upload interface
+
+- [ ] **Add CSS framework** (Tailwind CSS recommended)
+  - [ ] Install Tailwind or use CDN
+  - [ ] Create consistent design system
+  - [ ] Add dark mode support (optional)
+
+- [ ] **Navigation header**
+  - [ ] Logo/branding
+  - [ ] Links: Home, About, Methodology, Help, API Docs
+  - [ ] Mobile hamburger menu
+
+- [ ] **Footer**
+  - [ ] Citation information
+  - [ ] Contact/feedback link
+  - [ ] Privacy policy link
+  - [ ] Funding acknowledgments
+
+- [ ] **Loading states**
+  - [ ] Spinner during prediction
+  - [ ] Progress bar for batch uploads
+  - [ ] Skeleton loaders for async content
+
+- [ ] **Error handling**
+  - [ ] Toast notifications for errors
+  - [ ] Inline validation messages
+  - [ ] Friendly error pages (404, 500)
+
+- [ ] **Responsive design**
+  - [ ] Mobile-friendly layout
+  - [ ] Touch-friendly buttons
+  - [ ] Readable text on all devices
+
+---
+
+## 2. Missing Content
+
+### About the Model
+
+The landing page has almost no information about what the model does. Need to add:
+
+- [ ] **What it predicts**
+  - PSI (Percent Spliced In) values
+  - Range: 0 (completely skipped) to 1 (completely included)
+  - Alternative splicing outcomes
+
+- [ ] **How it works (simplified)**
+  - Takes a 70nt exon sequence as input
+  - Adds flanking sequences
+  - Predicts RNA secondary structure
+  - Neural network predicts the splicing outcome
+
+- [ ] **Who should use it**
+  - Researchers studying RNA splicing
+  - Designing synthetic exons
+  - Understanding splicing regulation
+
+- [ ] **Limitations**
+  - Only works with 70nt exon sequences
+  - Trained on HeLa cell data (ES7 library)
+  - May not generalize to all cell types
+  - Does not consider cellular context
+
+### Model Architecture Page
+
+- [ ] **Input features**
+  - Sequence one-hot encoding (90×4)
+  - Structure one-hot encoding (90×3)
+  - Wobble pair indicators (90×1)
+
+- [ ] **Architecture diagram**
+  - Sequence branch: Conv1D (20 filters, width 6)
+  - Structure branch: Conv1D (8 filters, width 30)
+  - Position-specific biases
+  - Inclusion vs skipping energy computation
+  - Residual tuner MLP
+  - Sigmoid output
+
+- [ ] **Interpretability features**
+  - Position-specific bias visualization
+  - Separate inclusion/skipping branches
+  - Force plot explanation
+
+### Research Background
+
+- [ ] **Citation**
+  ```
+  Liao SE, Sudarshan M, and Regev O.
+  "Machine learning for discovery: deciphering RNA splicing logic."
+  bioRxiv (2022).
+  ```
+
+- [ ] **Link to paper** (bioRxiv)
+- [ ] **Link to GitHub** (original repo)
+- [ ] **Contact information** for authors
+
+### Training Data Information
+
+- [ ] **Dataset**: ES7_HeLa (A, B, C libraries)
+- [ ] **Size**: ~150,000 synthetic exons
+- [ ] **Cell type**: HeLa cells
+- [ ] **Experimental method**: MPRA (Massively Parallel Reporter Assay)
+
+### Performance Metrics
+
+- [ ] **Test R²**: ~0.85
+- [ ] **Test RMSE**: ~0.12
+- [ ] **Correlation**: ~0.92
+- [ ] **Binary KL Loss**: ~0.015-0.020
+
+---
+
+## 3. Feature Gaps
+
+### High Priority
+
+- [ ] **Batch file upload**
+  - [ ] Accept FASTA format
+  - [ ] Accept CSV format (one sequence per line)
+  - [ ] Validate all sequences before processing
+  - [ ] Show progress during batch processing
+  - [ ] Allow download of all results
+
+- [ ] **Improved force plot**
+  - [ ] Show sequence letters on x-axis
+  - [ ] Highlight key positions
+  - [ ] Add structure annotation
+  - [ ] Export as PNG/SVG
+
+- [ ] **Result sharing**
+  - [ ] Permalink to results (already have job IDs)
+  - [ ] Copy link button
+  - [ ] Social sharing (optional)
+
+### Medium Priority
+
+- [ ] **PDF export**
+  - [ ] Formatted report with all results
+  - [ ] Include force plot image
+  - [ ] Include input sequence
+  - [ ] Include methodology summary
+
+- [ ] **Sequence editor**
+  - [ ] Syntax highlighting for nucleotides
+  - [ ] Visual feedback for invalid characters
+  - [ ] Complement/reverse complement tools
+
+- [ ] **Multiple examples**
+  - [ ] Show all 3 examples in UI
+  - [ ] Explain what each demonstrates
+  - [ ] Allow users to modify and re-predict
+
+### Low Priority
+
+- [ ] **Email notifications**
+  - [ ] Send results when job completes
+  - [ ] Optional (don't require email)
+
+- [ ] **Job history**
+  - [ ] Show recent predictions
+  - [ ] Allow re-running previous jobs
+  - [ ] LocalStorage for client-side history
+
+- [ ] **API key management** (if needed for rate limiting)
+
+---
+
+## 4. Technical Debt
+
+### Code Quality
+
+- [ ] **Extract HTML to templates**
+  - Move all inline HTML from `main.py` to `templates/`
+  - Use Jinja2 template inheritance
+
+- [ ] **CSS refactoring**
+  - Move inline styles to `static/css/`
+  - Use CSS variables for theming
+  - Consider CSS framework
+
+- [ ] **JavaScript improvements**
+  - Move inline scripts to `static/js/`
+  - Use modern ES6+ syntax
+  - Consider Alpine.js or htmx for interactivity
+
+### API Improvements
+
+- [ ] **Rate limiting**
+  - Prevent abuse
+  - Per-IP limits
+  - Optional API keys for higher limits
+
+- [ ] **Request validation**
+  - Better error messages
+  - Sequence format validation
+  - Input sanitization
+
+- [ ] **Response caching**
+  - Cache identical predictions
+  - Reduce computation for repeated requests
+
+### Database
+
+- [ ] **Job cleanup**
+  - Scheduled task to delete old jobs
+  - Configurable retention period
+
+- [ ] **Indexes**
+  - Add indexes for common queries
+  - Optimize job lookup by ID
+
+### Logging
+
+- [ ] **Structured logging**
+  - JSON format for production
+  - Request/response logging
+  - Error tracking
+
+- [ ] **Monitoring**
+  - Request latency metrics
+  - Error rate tracking
+  - Model prediction time
+
+---
+
+## 5. Deployment
+
+### Docker Configuration
+
+- [ ] **Dockerfile**
+  ```dockerfile
+  FROM python:3.10-slim
+  # Install ViennaRNA
+  # Copy application
+  # Install dependencies
+  # Run with gunicorn
+  ```
+
+- [ ] **docker-compose.yml**
+  - Web service
+  - Volume for database
+  - Environment variables
+
+- [ ] **.dockerignore**
+  - Exclude venv, `__pycache__`, .git
+
+### Production Server
+
+- [ ] **Gunicorn configuration**
+  - Multiple workers
+  - Timeout settings
+  - Logging
+
+- [ ] **Nginx reverse proxy**
+  - SSL termination
+  - Static file serving
+  - Rate limiting
+
+- [ ] **SSL/HTTPS**
+  - Let's Encrypt certificate
+  - Auto-renewal
+
+### Environment Management
+
+- [ ] **Environment variables**
+  - Database path
+  - Debug mode
+  - Secret key
+  - SMTP settings
+
+- [ ] **.env.example**
+  - Document all variables
+  - Provide defaults
+
+### Cloud Deployment Options
+
+- [ ] **Option A: VPS (DigitalOcean, Linode)**
+  - Full control
+  - Manual setup required
+
+- [ ] **Option B: Platform as a Service**
+  - Railway, Render, Fly.io
+  - Easier deployment
+  - May have cold start issues
+
+- [ ] **Option C: Container service**
+  - Google Cloud Run
+  - AWS Fargate
+  - Auto-scaling
+
+---
+
+## 6. NAR Web Server Compliance
+
+For publication in the Nucleic Acids Research Web Server issue:
+
+### Required Pages
+
+- [ ] **Privacy policy**
+  - What data is collected
+  - How long it's stored
+  - Who has access
+
+- [ ] **Terms of service**
+  - Usage restrictions
+  - Disclaimer
+  - License
+
+- [ ] **Contact information**
+  - Email for support
+  - Issue reporting
+
+- [ ] **Funding acknowledgments**
+  - Grant numbers
+  - Institution
+
+### Accessibility (WCAG 2.1)
+
+- [ ] **Keyboard navigation**
+- [ ] **Screen reader support**
+- [ ] **Color contrast ratios**
+- [ ] **Alt text for images**
+- [ ] **Focus indicators**
+
+### Mobile Support
+
+- [ ] **Responsive layout**
+- [ ] **Touch-friendly targets**
+- [ ] **Readable font sizes**
+
+### Reliability
+
+- [ ] **99.9% uptime target**
+- [ ] **Monitoring and alerting**
+- [ ] **Backup strategy**
+- [ ] **Disaster recovery plan**
+
+---
+
+## 7. Testing
+
+### Unit Tests
+
+- [ ] **API endpoint tests**
+  - Test all routes
+  - Test error cases
+  - Test validation
+
+- [ ] **Model wrapper tests**
+  - Test prediction pipeline
+  - Test input preparation
+  - Test output format
+
+- [ ] **Database tests**
+  - Test job creation
+  - Test job retrieval
+  - Test job deletion
+
+### Integration Tests
+
+- [ ] **End-to-end prediction flow**
+- [ ] **Batch processing**
+- [ ] **Export functionality**
+
+### Load Testing
+
+- [ ] **Concurrent requests**
+- [ ] **Response time under load**
+- [ ] **Memory usage**
+
+---
+
+## Quick Start for Next Session
+
+To continue development:
+
+```bash
+# 1. Activate environment
+source venv310/bin/activate
+
+# 2. Start server
+python -m uvicorn webapp.app.main:app --reload --port 8000
+
+# 3. View app
+open http://localhost:8000
+```
+
+## Priority Order
+
+1. **UI/UX + Content** - Make it look professional and informative
+2. **Templates** - Move HTML out of Python code
+3. **Batch upload** - Key feature for usability
+4. **Docker** - For deployment
+5. **Testing** - For reliability
+6. **NAR compliance** - For publication
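The batch-upload item above calls for accepting FASTA input. A minimal sketch of the parsing step, assuming nothing about the eventual endpoint — `parse_fasta` is a hypothetical helper, not code that exists in the repo:

```python
def parse_fasta(text: str) -> dict:
    """Parse FASTA text into {record_id: sequence}.

    Record IDs are taken as the first whitespace-separated token
    after '>'; sequence lines are concatenated and upper-cased.
    """
    records, name, parts = {}, None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records[name] = "".join(parts)
            name, parts = line[1:].split()[0], []
        else:
            parts.append(line.upper())
    if name is not None:
        records[name] = "".join(parts)
    return records
```

Each parsed sequence would then go through the same 70nt validation the single-sequence endpoint uses before being queued.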
webapp/app/api/routes.py CHANGED
@@ -4,8 +4,10 @@ import uuid
 import json
 from datetime import datetime, timedelta
 from typing import Optional
-from fastapi import APIRouter, Depends, HTTPException, Query
+from fastapi import APIRouter, Depends, HTTPException, Query, Path
+from fastapi.responses import Response
 from sqlalchemy.orm import Session
+from sqlalchemy import text
 
 from webapp.app.database import get_db
 from webapp.app.models.job import Job
@@ -40,7 +42,7 @@ async def health_check(db: Session = Depends(get_db)):
 
     db_connected = False
     try:
-        db.execute("SELECT 1")
+        db.execute(text("SELECT 1"))
         db_connected = True
     except Exception:
         pass
@@ -319,7 +321,7 @@ async def get_example_sequences():
 @router.get("/export/{job_id}/{format}", tags=["export"])
 async def export_results(
     job_id: str,
-    format: str = Query(..., regex="^(csv|json|tsv)$"),
+    format: str = Path(..., pattern="^(csv|json|tsv)$"),
     db: Session = Depends(get_db),
 ):
     """
@@ -335,7 +337,12 @@ async def export_results(
         raise HTTPException(status_code=400, detail="Job not yet complete")
 
     if format == "json":
-        return job.to_dict()
+        content = json.dumps(job.to_dict(), indent=2)
+        return Response(
+            content=content,
+            media_type="application/json",
+            headers={"Content-Disposition": f'attachment; filename="result_{job_id}.json"'}
+        )
 
     elif format in ("csv", "tsv"):
         delimiter = "," if format == "csv" else "\t"
@@ -367,9 +374,11 @@ async def export_results(
         ]
         content = delimiter.join(header) + "\n" + delimiter.join(row)
 
-        return {
-            "content": content,
-            "filename": f"result_{job_id}.{format}",
-        }
+        media_type = "text/csv" if format == "csv" else "text/tab-separated-values"
+        return Response(
+            content=content,
+            media_type=media_type,
+            headers={"Content-Disposition": f'attachment; filename="result_{job_id}.{format}"'}
+        )
 
     raise HTTPException(status_code=400, detail=f"Unsupported format: {format}")
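The export fix boils down to pairing each format with a media type and a `Content-Disposition` header so browsers save a file instead of rendering JSON. That mapping can be factored out and unit-tested without spinning up the app; `export_headers` is a hypothetical helper sketching the logic used in the diff above:

```python
def export_headers(job_id: str, fmt: str):
    """Return (media_type, headers) for a download response.

    Mirrors the format -> media-type mapping in the export route;
    raises ValueError for formats the route would reject.
    """
    media_types = {
        "json": "application/json",
        "csv": "text/csv",
        "tsv": "text/tab-separated-values",
    }
    if fmt not in media_types:
        raise ValueError(f"Unsupported format: {fmt}")
    headers = {
        "Content-Disposition": f'attachment; filename="result_{job_id}.{fmt}"'
    }
    return media_types[fmt], headers
```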
webapp/app/config.py CHANGED
@@ -13,14 +13,17 @@ class Settings(BaseSettings):
     app_version: str = "1.0.0"
     debug: bool = False
 
-    # Paths
-    project_root: Path = Path(__file__).parent.parent.parent.parent
+    # Paths - computed at class definition time
+    # __file__ = webapp/app/config.py
+    # parent.parent.parent = interpretable-splicing-model/
+    project_root: Path = Path(__file__).parent.parent.parent
     model_path: Path = project_root / "output" / "custom_adjacency_regularizer_20210731_124_step3.h5"
     data_path: Path = project_root / "data"
     database_path: Path = Path(__file__).parent.parent / "splicing.db"
 
-    # Database
-    database_url: str = f"sqlite:///{database_path}"
+    @property
+    def database_url(self) -> str:
+        return f"sqlite:///{self.database_path}"
 
     # Job settings
     job_retention_days: int = 30
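The switch from a class-level `database_url` string to a property matters because the string would be frozen at class-definition time, while the property tracks any later override of `database_path`. A minimal sketch of the same pattern, pared down to plain Python (no pydantic) with a hypothetical path:

```python
from pathlib import Path

class Settings:
    # Class-level default, as in the real Settings
    database_path: Path = Path("webapp/app") / "splicing.db"

    @property
    def database_url(self) -> str:
        # Computed on access, so it always reflects the current database_path
        return f"sqlite:///{self.database_path}"
```

Overriding `settings.database_path` (e.g. in tests) now changes `database_url` automatically, which the frozen class-level string could not do.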
webapp/app/services/predictor.py CHANGED
@@ -6,26 +6,19 @@ import tensorflow as tf
 from typing import List, Tuple, Optional, Dict, Any
 from pathlib import Path
 import logging
+import sys
 
 from webapp.app.config import settings
 
 # Set up logging
 logger = logging.getLogger(__name__)
 
-# Import custom layers from model_training
-import sys
-sys.path.insert(0, str(settings.project_root))
-from model_training.model import (
-    binary_KL,
-    Selector,
-    ResidualTuner,
-    SumDiff,
-    RegularizedBiasLayer,
-    MultiRegularizer,
-    pos_reg,
-    adj_reg_fo,
-    adj_reg_so,
-)
+# Add figures directory to path - this auto-registers custom layers
+# via @register_keras_serializable decorators when quad_model is imported
+sys.path.insert(0, str(settings.project_root / 'figures'))
+from quad_model import *  # noqa: E402, F401, F403
+
+from tensorflow.keras.models import load_model
 
 
 class SplicingPredictor:
@@ -49,23 +42,13 @@ class SplicingPredictor:
         """Load the pre-trained TensorFlow model."""
         logger.info(f"Loading model from {settings.model_path}")
 
-        custom_objects = {
-            "binary_KL": binary_KL,
-            "Selector": Selector,
-            "ResidualTuner": ResidualTuner,
-            "SumDiff": SumDiff,
-            "RegularizedBiasLayer": RegularizedBiasLayer,
-            "MultiRegularizer": MultiRegularizer,
-            "pos_reg": pos_reg,
-            "adj_reg_fo": adj_reg_fo,
-            "adj_reg_so": adj_reg_so,
-        }
-
-        self._model = tf.keras.models.load_model(
-            str(settings.model_path),
-            custom_objects=custom_objects,
-        )
-        logger.info("Model loaded successfully")
+        try:
+            # Simple load - custom layers already registered via quad_model import
+            self._model = load_model(str(settings.model_path))
+            logger.info("Model loaded successfully")
+        except Exception as e:
+            logger.error(f"Failed to load model: {e}")
+            raise
 
     @property
     def model(self) -> tf.keras.Model:
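The predictor exposes the loaded model through a property, which suggests a lazy-load-and-cache pattern: the expensive `load_model` call happens once, on first access. A minimal sketch of that pattern in isolation, with TensorFlow swapped out for an injected loader callable (`LazyModel` is a hypothetical stand-in, not the repo's class):

```python
class LazyModel:
    """Defer an expensive load until first access, then cache the result."""

    def __init__(self, loader):
        self._loader = loader  # zero-arg callable, e.g. lambda: load_model(path)
        self._model = None

    @property
    def model(self):
        if self._model is None:
            self._model = self._loader()  # runs at most once
        return self._model
```

Injecting the loader also makes the wrapper trivially testable without a real model file on disk.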
webapp/requirements.txt CHANGED
@@ -8,10 +8,12 @@ sqlalchemy>=2.0.0
 aiosqlite>=0.19.0
 
 # Model & ML
-tensorflow>=2.15.0
-numpy>=1.26.0
+tensorflow==2.15.0  # Pin to 2.15 (last Keras 2 version) for model compatibility
+numpy>=1.26.0,<2.0
 joblib>=1.3.0
 scikit-learn>=1.4.0
+tqdm  # Required by figutils
+scipy  # Required by figutils
 
 # Visualization
 plotly>=5.18.0