PyShiny Visualization Specification
For: Visualization teammate assignment
Project: RNA Splicing Prediction Web Application
Date: 2026-01-12
Executive Summary
This document specifies all visualization work needed for the interpretable RNA splicing prediction webapp. The project uses PyShiny for interactive visualizations. Your task is to implement these visualizations to complement the existing FastAPI + Jinja2 frontend.
Critical path: Force Plot Backend → Force Plot Frontend
Table of Contents
- Project Context
- Current Architecture
- Visualization Tasks
- Reference Code
- Technical Details
- Getting Started
1. Project Context
What This App Does
This web application predicts PSI (Percent Spliced In) values for RNA exon sequences:
- Input: 70-nucleotide DNA sequence (exon)
- Output: PSI value (0-1) indicating how often the exon is included in mature mRNA
- PSI = 1: Exon always included
- PSI = 0: Exon always skipped
Why Visualizations Matter
The model is interpretable - it can show WHY it made a prediction by visualizing:
- Which positions in the sequence promote inclusion
- Which positions promote skipping
- How RNA secondary structure affects splicing
Current State
| Component | Status | Notes |
|---|---|---|
| Backend API | ✅ Complete | FastAPI, predictions work |
| HTML Templates | ✅ Complete | Jinja2 + Tailwind CSS |
| Basic Force Plot | ⚠️ Partial | Shows bars but data incomplete |
| Advanced Visualizations | ❌ Not started | Your task |
2. Current Architecture
File Structure
webapp/
├── app/
│   ├── main.py              # FastAPI app + routes
│   ├── api/routes.py        # API endpoints
│   └── services/
│       └── predictor.py     # Model wrapper (MODIFY THIS)
├── templates/
│   └── result.html          # Results page (force plot here)
├── static/
│   └── js/
│       └── result.js        # Current Plotly visualization
└── docs/
    └── PYSHINY_VISUALIZATION_SPEC.md  # This file
Data Flow
User submits sequence
    ↓
/api/predict endpoint
    ↓
SplicingPredictor.predict_single()
    ├── add_flanking(70nt → 90nt)
    ├── nts_to_vector() → one-hot encoding
    ├── get_structure() → ViennaRNA call
    ├── model.predict() → PSI value
    └── get_force_plot_data() → [INCOMPLETE - needs work]
    ↓
Store in database (Job model)
    ↓
/result/{job_id} page
    ↓
result.js fetches /api/result/{job_id}
    ↓
Plotly renders force plot
Current Force Plot Issue
The get_force_plot_data() method extracts raw neural network activations but doesn't:
- Cluster filters by behavior
- Aggregate into meaningful "forces"
- Apply the link function for PSI scale
This is the critical blocking task.
3. Visualization Tasks
TASK 1: Force Plot Backend (CRITICAL - Do First)
Location: webapp/app/services/predictor.py
Priority: BLOCKING - all other viz tasks depend on this
What You Need to Implement
def _compute_forces(self, sequence: str) -> dict:
    """
    Compute position-wise force contributions for the force plot.

    Returns:
        {
            "positions": [1, 2, ..., 90],
            "inclusion_forces": {
                "group_1": [force_at_pos_1, force_at_pos_2, ...],
                "group_2": [...],
                ...
            },
            "skipping_forces": {
                "group_1": [...],
                ...
            },
            "delta_force": [incl_1 - skip_1, incl_2 - skip_2, ...],
            "annotations": ["incl_seq_0", "skip_struct_1", ...],
            "psi_scale": {
                "midpoint": 0.5,
                "positions": [...]  # for secondary y-axis
            }
        }
    """
Steps to Implement
- Extract layer outputs (partially done):
  # Get intermediate layer outputs
  qc_incl = model.get_layer('qc_incl').output  # inclusion activations
  qc_skip = model.get_layer('qc_skip').output  # skipping activations
- Cluster filters (NEW - reference figures/force_plot.py:get_membership_dict()):
  # Group filters by correlation of their activations
  # Creates groups like: [filter_0, filter_3, filter_7] → "group_A"
- Aggregate activations (NEW):
  # Sum ReLU activations within each group
  # Result: one force value per position per group
- Apply link function (NEW - reference figures/force_plot.py:get_model_midpoint()):
  # Map force values to PSI scale
  # Find the midpoint where PSI = 0.5
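The clustering and aggregation steps can be sketched as follows. This is a stand-in, not the algorithm in `figures/force_plot.py`: it greedily groups filters whose activation profiles have a Pearson correlation above a hypothetical threshold, then sums ReLU activations within each group. `pearson`, `group_filters`, `aggregate_forces`, and the 0.8 threshold are assumptions made for illustration; check `get_membership_dict()` for the grouping the reference code actually uses.

```python
import math
from typing import Dict, List


def pearson(a: List[float], b: List[float]) -> float:
    """Pearson correlation; returns 0.0 if either series is constant."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def group_filters(activations: Dict[str, List[float]],
                  threshold: float = 0.8) -> Dict[str, str]:
    """Greedy grouping: a filter joins the first group whose seed it correlates with."""
    seeds: List[str] = []  # first filter assigned to each group
    membership: Dict[str, str] = {}
    for name, series in activations.items():
        for idx, seed in enumerate(seeds):
            if pearson(series, activations[seed]) >= threshold:
                membership[name] = f"group_{idx + 1}"
                break
        else:
            # No existing group matched, so this filter seeds a new one.
            seeds.append(name)
            membership[name] = f"group_{len(seeds)}"
    return membership


def aggregate_forces(activations: Dict[str, List[float]],
                     membership: Dict[str, str]) -> Dict[str, List[float]]:
    """Sum ReLU activations within each group: one force per position per group."""
    forces: Dict[str, List[float]] = {}
    for name, series in activations.items():
        group = forces.setdefault(membership[name], [0.0] * len(series))
        for pos, value in enumerate(series):
            group[pos] += max(value, 0.0)  # ReLU
    return forces


acts = {
    "filter_0": [0.0, 1.0, 2.0, 3.0],
    "filter_1": [0.0, 2.0, 4.0, 6.0],   # perfectly correlated with filter_0
    "filter_2": [3.0, 2.0, 1.0, -1.0],  # anti-correlated, so it forms its own group
}
membership = group_filters(acts)
forces = aggregate_forces(acts, membership)
```

Greedy seeding depends on filter order; a hierarchical clustering over the full correlation matrix would be order-independent, at the cost of an extra dependency.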
Reference Implementation
Study these files carefully:
- /figures/force_plot.py - Lines 100-250 have the clustering logic
- /figures/force_plot.py:draw_force_plot() - The full visualization pipeline
- /2022_03_11_figures/position_specific_activations.ipynb - Working examples
TASK 2: Enhanced Force Plot Frontend
Location: webapp/static/js/result.js OR new PyShiny component
Priority: HIGH (after Task 1)
Current Implementation (Basic)
// webapp/static/js/result.js
function createForcePlot(forceData) {
    // Simple bar chart with green/red colors
    // Doesn't show filter groups
    // Doesn't have secondary PSI axis
}
Target Implementation
Option A: Enhanced Plotly (Recommended for now)
- Stacked bar chart showing filter group contributions
- Color each segment by group
- Secondary y-axis showing PSI values
- Hover shows: position, nucleotide, structure, force breakdown
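One way to drive Option A from the backend is to emit a Plotly figure specification as plain JSON that the frontend can hand to Plotly. The sketch below assumes the payload shape from Task 1; `force_figure` is a hypothetical helper. The secondary PSI axis is only declared here, because mapping force values onto it depends on the link function.

```python
import json
from typing import Dict, List


def force_figure(positions: List[int],
                 inclusion: Dict[str, List[float]],
                 skipping: Dict[str, List[float]]) -> dict:
    """Build a Plotly figure dict with one stacked bar trace per filter group."""
    traces = []
    for label, groups, sign in (("Inclusion", inclusion, 1), ("Skipping", skipping, -1)):
        for name, forces in groups.items():
            traces.append({
                "type": "bar",
                "name": f"{label}: {name}",
                "x": positions,
                # Skipping forces are negated so they are drawn below the axis.
                "y": [sign * f for f in forces],
                "hovertemplate": "Position %{x}<br>Force %{y:.3f}",
            })
    layout = {
        # "relative" stacks positive bars upward and negative bars downward.
        "barmode": "relative",
        "xaxis": {"title": {"text": "Position"}},
        "yaxis": {"title": {"text": "Force"}},
        # Secondary axis reserved for PSI values (no trace is bound to it yet).
        "yaxis2": {"title": {"text": "PSI"}, "overlaying": "y",
                   "side": "right", "range": [0, 1]},
    }
    return {"data": traces, "layout": layout}


fig = force_figure([1, 2], {"group_1": [0.5, 0.0]}, {"group_1": [0.0, 0.25]})
fig_json = json.dumps(fig)  # serializable, ready to return from the API
```

Nucleotide and structure could be added to the hover text through each trace's `customdata` once the API returns them per position.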
Option B: PyShiny + Plotly
- Full PyShiny component with reactive updates
- Filter group selector
- Interactive highlighting
Visual Design
[Mockup: force plot. Stacked bars above the PSI = 0.5 midline show inclusion forces (by group); stacked bars below it show skipping forces (by group). The x-axis covers positions 1-90, split into 5' flank, exon, and 3' flank. A PSI axis marks 0.5 at the midline and 0.9 near the top.]
TASK 3: Position Saliency Heatmap
Location: New component in result page
Priority: HIGH
What It Shows
A heatmap showing which positions in the sequence are most important for the prediction.
[Mockup: heatmap grid with positions 1-90 as columns and Filter 1 through Filter 20 as rows. Cell shading distinguishes high activation (important) from low activation (less important).]
Data Needed
{
    "positions": [1, 2, ..., 90],
    "filters": ["filter_1", "filter_2", ...],
    "activations": [
        [pos1_f1, pos2_f1, ...],  # filter 1 activations
        [pos1_f2, pos2_f2, ...],  # filter 2 activations
        ...
    ],
    "filter_types": ["sequence", "structure", ...],
    "filter_roles": ["inclusion", "skipping", ...]
}
Implementation
- Plotly heatmap with custom colorscale
- Blue for inclusion filters, Red for skipping filters
- Click to highlight in force plot
- Hover to show exact values
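A heatmap along these lines could be produced from the payload above as a Plotly figure dict. In this sketch, skipping filters are negated so that a single diverging colorscale renders inclusion in blue and skipping in red. `saliency_heatmap` is a hypothetical helper, and the white midpoint color is an assumption (the blue and red values come from the color scheme in this document).

```python
from typing import List


def saliency_heatmap(positions: List[int],
                     filters: List[str],
                     activations: List[List[float]],
                     filter_roles: List[str]) -> dict:
    """Build a Plotly heatmap figure dict with signed activations."""
    if not (len(filters) == len(activations) == len(filter_roles)):
        raise ValueError("filters, activations and filter_roles must align")
    z = []
    for row, role in zip(activations, filter_roles):
        if len(row) != len(positions):
            raise ValueError("each activation row needs one value per position")
        sign = -1.0 if role == "skipping" else 1.0
        z.append([sign * v for v in row])
    # Symmetric color range so zero always maps to the midpoint color.
    peak = max((abs(v) for row in z for v in row), default=0.0) or 1.0
    trace = {
        "type": "heatmap",
        "x": positions,
        "y": filters,
        "z": z,
        "zmin": -peak,
        "zmax": peak,
        "colorscale": [[0.0, "#ef4444"], [0.5, "#ffffff"], [1.0, "#3b82f6"]],
        "hovertemplate": "Position %{x}<br>%{y}<br>Signed activation %{z:.3f}<extra></extra>",
    }
    layout = {"xaxis": {"title": {"text": "Position"}},
              "yaxis": {"title": {"text": "Filter"}}}
    return {"data": [trace], "layout": layout}


heat = saliency_heatmap(
    positions=[1, 2, 3],
    filters=["filter_1", "filter_2"],
    activations=[[0.5, 0.0, 1.0], [0.25, 2.0, 0.0]],
    filter_roles=["inclusion", "skipping"],
)
```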
TASK 4: RNA Structure Viewer
Location: Result page, below force plot
Priority: HIGH
Current State
Just text display of dot-bracket notation:
Structure: ...(((...)))...((((....))))...
MFE: -12.30 kcal/mol
Target: Option A - Styled Text (Simpler)
<div class="structure-viewer">
    <span class="unpaired">...</span>
    <span class="paired-left">(((</span>
    <span class="unpaired">...</span>
    <span class="paired-right">)))</span>
    ...
</div>
- Color-coded by pairing status
- Hover to highlight paired bases
- Show nucleotide sequence aligned below
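The styled-text markup can be generated server-side from the dot-bracket string. The sketch below groups consecutive identical symbols into one span and also builds a partner lookup that the frontend could use for hover highlighting. `structure_to_html` and `pair_table` are hypothetical names, and the sketch handles only `.`, `(`, and `)`.

```python
from html import escape
from itertools import groupby
from typing import Dict

CSS_CLASS = {".": "unpaired", "(": "paired-left", ")": "paired-right"}


def structure_to_html(structure: str) -> str:
    """Wrap each run of identical dot-bracket symbols in a classed span."""
    spans = []
    for symbol, run in groupby(structure):
        if symbol not in CSS_CLASS:
            raise ValueError(f"unsupported structure symbol: {symbol!r}")
        text = "".join(run)
        spans.append(f'<span class="{CSS_CLASS[symbol]}">{escape(text)}</span>')
    return '<div class="structure-viewer">' + "".join(spans) + "</div>"


def pair_table(structure: str) -> Dict[int, int]:
    """Map each paired index (0-based) to its partner index."""
    stack, pairs = [], {}
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure: extra ')'")
            j = stack.pop()
            pairs[i], pairs[j] = j, i
    if stack:
        raise ValueError("unbalanced structure: extra '('")
    return pairs


markup = structure_to_html("..((.))")
partners = pair_table("..((.))")
```

Emitting one span per run keeps the markup small; per-nucleotide spans would be needed if each base must be individually hoverable.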
Target: Option B - Interactive Diagram (More Complex)
Use Forna.js library to render actual 2D structure:
- Nucleotides as circles
- Base pairs as lines
- Stems and loops clearly visible
- Click to highlight positions
TASK 5: PSI Gauge/Indicator
Location: Result page, prominent display
Priority: MEDIUM
Current State
Just a colored number:
<p class="text-5xl font-bold text-green-600">0.963</p>
Target: Gauge Chart
[Mockup: dial gauge with "High Inclusion" at the top and "High Skipping" at the bottom; the example reading is 0.96.]
- Plotly gauge or indicator
- Color gradient: Red (0) → Yellow (0.5) → Green (1)
- Animated needle/indicator
- Clear labels for interpretation
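The gradient and gauge can be sketched as a Plotly indicator specification. The red and green values come from the color scheme in this document; the yellow midpoint (`#eab308`) is an assumption, as are the helper names `psi_color` and `psi_gauge`.

```python
from typing import Tuple

RED, YELLOW, GREEN = "#ef4444", "#eab308", "#22c55e"  # yellow is an assumed value


def _hex_to_rgb(color: str) -> Tuple[int, int, int]:
    return tuple(int(color[i:i + 2], 16) for i in (1, 3, 5))


def psi_color(psi: float) -> str:
    """Linearly interpolate red -> yellow -> green over PSI in [0, 1]."""
    if not 0.0 <= psi <= 1.0:
        raise ValueError("PSI must be between 0 and 1")
    if psi <= 0.5:
        start, end, t = RED, YELLOW, psi / 0.5
    else:
        start, end, t = YELLOW, GREEN, (psi - 0.5) / 0.5
    a, b = _hex_to_rgb(start), _hex_to_rgb(end)
    mixed = (round(x + (y - x) * t) for x, y in zip(a, b))
    return "#" + "".join(f"{channel:02x}" for channel in mixed)


def psi_gauge(psi: float) -> dict:
    """Build a Plotly indicator figure dict showing PSI on a 0-1 gauge."""
    trace = {
        "type": "indicator",
        "mode": "gauge+number",
        "value": psi,
        "number": {"valueformat": ".3f"},
        "gauge": {
            "axis": {"range": [0, 1]},
            "bar": {"color": psi_color(psi)},
        },
    }
    return {"data": [trace], "layout": {"title": {"text": "Predicted PSI"}}}


gauge = psi_gauge(0.963)
```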
TASK 6: Batch Results Visualization
Location: New batch results page
Priority: MEDIUM
What It Shows
When user submits multiple sequences:
[Mockup: "Batch Results (15 sequences)". A PSI distribution histogram (x-axis from 0 to 1) sits beside summary stats (Mean: 0.62, Std: 0.28, Min: 0.08, Max: 0.97), above a results table with a mini plot per row:]

| # | Sequence (first 20nt) | PSI | Category | Plot |
|---|---|---|---|---|
| 1 | GGTAGTACGCCAATTCGCC... | 0.963 | High | [mini plot] |
| 2 | CTACCACCTCCCAAGCTTA... | 0.487 | Variable | [mini plot] |
| 3 | ACACTCCGCAGCACACTCG... | 0.008 | Low | [mini plot] |
Components
- PSI histogram - Distribution of predictions
- Summary statistics - Mean, std, min, max
- Sortable table - Click headers to sort
- Mini force plots - Small inline visualization per row
- Click to expand - Full details for each sequence
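The histogram, summary statistics, and category column could be computed with the standard library alone. In this sketch the category cutoffs (0.2 and 0.8) are hypothetical, since this document does not define them, and the standard deviation is the sample standard deviation, which is also an assumption.

```python
import statistics
from typing import List


def categorize(psi: float, low: float = 0.2, high: float = 0.8) -> str:
    """Label a prediction; the cutoffs are illustrative, not project values."""
    if psi >= high:
        return "High"
    if psi <= low:
        return "Low"
    return "Variable"


def summarize(psis: List[float], n_bins: int = 10) -> dict:
    """Summary stats plus equal-width histogram counts over [0, 1]."""
    if not psis:
        raise ValueError("need at least one PSI value")
    counts = [0] * n_bins
    for psi in psis:
        # Clamp so that PSI == 1.0 falls into the last bin.
        counts[min(int(psi * n_bins), n_bins - 1)] += 1
    return {
        "mean": statistics.mean(psis),
        "std": statistics.stdev(psis) if len(psis) > 1 else 0.0,
        "min": min(psis),
        "max": max(psis),
        "histogram": counts,
    }


batch = [0.963, 0.487, 0.008]
summary = summarize(batch)
categories = [categorize(p) for p in batch]
```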
TASK 7: Activation Gallery (Advanced)
Location: New page at /methodology/activations
Priority: LOW (nice to have)
What It Shows
A gallery of what each neural network filter has learned:
[Mockup: "Filter Gallery - Understanding What the Model Learned". A grid of clickable cards, one per filter, each showing a sequence logo, a type, and a role:]

| Filter | Logo | Type | Role |
|---|---|---|---|
| Filter 1 | [Seq Logo] | Seq | Incl |
| Filter 2 | [Seq Logo] | Struct | Skip |
| Filter 3 | [Seq Logo] | Seq | Incl |
This is complex - only do if time permits.
4. Reference Code
Key Files to Study
| File | What to Learn |
|---|---|
| /figures/force_plot.py | CRITICAL - Main force plot algorithm |
| /figures/figutils.py | Data preparation utilities |
| /figures/quad_model.py | Model architecture, custom layers |
| /figures/sequence_logo.py | Sequence logo visualization |
| /2022_03_11_figures/position_specific_activations.ipynb | Working visualization examples |
| /2022_03_11_figures/figure_force_plots.ipynb | Force plot examples |
force_plot.py Key Functions
# Get filter groupings by correlation
get_membership_dict(model, activations) → {filter_id: group_id}

# Compute the PSI midpoint for scaling
get_model_midpoint(model) → float

# Main visualization function
draw_force_plot(
    sequences,            # List of 70nt sequences
    annotations,          # Labels for each sequence
    highlight_forces=[],  # Which forces to emphasize
    figsize=(20, 5),
    vertical=False,
    custom_model=model,
) → matplotlib figure
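The midpoint is the input at which the link function returns PSI = 0.5. If the link function is monotonically increasing, that input can be located by bisection. The sketch below uses a logistic curve as a stand-in for the learned link function, so it illustrates the idea rather than reproducing `get_model_midpoint()`; `find_midpoint`, `toy_link`, and the search bounds are hypothetical.

```python
import math
from typing import Callable


def find_midpoint(link: Callable[[float], float],
                  lo: float = -50.0,
                  hi: float = 50.0,
                  target: float = 0.5,
                  tol: float = 1e-9) -> float:
    """Bisection search for the input where a monotonic link hits `target`."""
    if not link(lo) <= target <= link(hi):
        raise ValueError("target is not bracketed by the search bounds")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if link(mid) < target:
            lo = mid  # midpoint lies to the right
        else:
            hi = mid  # midpoint lies to the left (or at mid)
    return (lo + hi) / 2


def toy_link(energy: float, shift: float = 1.5) -> float:
    """Stand-in logistic link; the real one is learned by the model."""
    return 1.0 / (1.0 + math.exp(-(energy - shift)))


midpoint = find_midpoint(toy_link)
```

With the real model, `link` would wrap a call into the trained link layer; only monotonicity is required for the search to be valid.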
Model Layer Names
# Key layers in the trained model
"qc_incl" # Inclusion branch convolution output
"qc_skip" # Skipping branch convolution output
"position_bias_incl" # Position-specific inclusion bias
"position_bias_skip" # Position-specific skipping bias
"energy_seq_struct" # Link function (energy to PSI)
5. Technical Details
Model Input/Output
Input (90 positions × 8 features):
sequence_onehot # Shape: (90, 4) - A, C, G, T
structure_onehot # Shape: (90, 3) - unpaired, left-pair, right-pair
wobble_indicator # Shape: (90, 1) - G-U wobble base pairs
Output:
psi # Shape: (1,) - float between 0 and 1
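To make the 90 × 8 layout concrete, the sketch below builds one 8-feature row per position from a sequence and its dot-bracket structure. The channel order follows the listing above, but it is still an assumption about the project's encoder, as is the choice to flag both partners of a G-T pair (the DNA-alphabet form of a G-U wobble). `encode_input` is a hypothetical name; the project's own `nts_to_vector()` is the authority.

```python
from typing import List

NUCLEOTIDES = "ACGT"   # sequence channels
STRUCTURE = ".()"      # unpaired, left-pair, right-pair


def encode_input(sequence: str, structure: str) -> List[List[float]]:
    """Return one row of 8 features (4 + 3 + 1) per position."""
    sequence = sequence.upper()
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")

    # Resolve base-pair partners with a stack.
    stack, partner = [], {}
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            j = stack.pop()
            partner[i], partner[j] = j, i
    if stack:
        raise ValueError("unbalanced structure")

    rows = []
    for i, (nt, symbol) in enumerate(zip(sequence, structure)):
        seq_onehot = [1.0 if nt == n else 0.0 for n in NUCLEOTIDES]
        struct_onehot = [1.0 if symbol == s else 0.0 for s in STRUCTURE]
        # Wobble flag: set when this base pairs G with T (assumed convention).
        is_wobble = i in partner and {nt, sequence[partner[i]]} == {"G", "T"}
        rows.append(seq_onehot + struct_onehot + [1.0 if is_wobble else 0.0])
    return rows


features = encode_input("GCAT", "(..)")
```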
Intermediate Activations
# After convolution, before aggregation
qc_incl_activations # Shape: (90-5, 20) for 20 filters, width 6
qc_skip_activations # Shape: (90-29, 8) for 8 filters, width 30
# After position bias
inclusion_energy # Shape: (1,) - summed inclusion forces
skipping_energy # Shape: (1,) - summed skipping forces
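The activation lengths above are consistent with an unpadded, stride-1 convolution, where a filter of width w over n positions yields n - w + 1 outputs. The sketch below checks that arithmetic and shows which 1-based sequence positions a given activation index covers. The helper names are hypothetical, and how the reference code aligns window activations to the 90 plot positions should be confirmed in `figures/force_plot.py`.

```python
def conv_output_length(n_positions: int, filter_width: int) -> int:
    """Output length of an unpadded, stride-1 convolution."""
    if filter_width > n_positions:
        raise ValueError("filter is wider than the input")
    return n_positions - filter_width + 1


def window_positions(index: int, filter_width: int) -> range:
    """1-based sequence positions covered by the activation at 0-based `index`."""
    start = index + 1
    return range(start, start + filter_width)


incl_len = conv_output_length(90, 6)    # inclusion filters, width 6
skip_len = conv_output_length(90, 30)   # skipping filters, width 30
first_window = list(window_positions(0, 6))
last_window = list(window_positions(incl_len - 1, 6))
```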
Color Scheme
| Element | Color | Hex |
|---|---|---|
| Inclusion (positive) | Green | #22c55e |
| Skipping (negative) | Red | #ef4444 |
| Neutral | Gray | #9ca3af |
| Primary (buttons) | Blue | #3b82f6 |
| Background | Light gray | #f9fafb |
6. Getting Started
Setup Environment
# 1. Navigate to project
cd /path/to/interpretable-splicing-model
# 2. Activate virtual environment
source venv310/bin/activate
# 3. Install dependencies (if not done)
pip install -r webapp/requirements.txt
# 4. Start the server
python -m uvicorn webapp.app.main:app --reload --port 8000
# 5. Open browser
open http://localhost:8000
Test a Prediction
# Submit a test sequence
curl -X POST http://localhost:8000/api/predict \
-H "Content-Type: application/json" \
-d '{"sequence": "GGTAGTACGCCAATTCGCCGGTGCCGCGAGCCAGAGGCTACCAAAACTTGACAAGCCTACATATACTACT"}'
# Response includes job_id, use it to view results
open http://localhost:8000/result/{job_id}
Run Research Notebooks
# Start Jupyter
cd 2022_03_11_figures
jupyter notebook
# Open position_specific_activations.ipynb to see working visualizations
Development Workflow
- Understand the data - Run notebooks to see what visualizations look like
- Modify backend - Update predictor.py to compute forces correctly
- Test API - Verify /api/result/{job_id} returns proper force data
- Update frontend - Modify result.js or add PyShiny components
- Test end-to-end - Full flow from input to visualization
Questions?
If you have questions about:
- Model architecture: Check /figures/quad_model.py and doc files in /01-10_*.md
- Visualization logic: Check /figures/force_plot.py and research notebooks
- API structure: Check /webapp/app/api/routes.py
- Frontend: Check /webapp/templates/result.html and /webapp/static/js/result.js
Success Criteria
Your work is complete when:
- Force plot shows correct stacked bar visualization with filter groups
- Hovering shows position, nucleotide, structure, and force breakdown
- Position saliency heatmap renders correctly
- Structure viewer shows colored dot-bracket notation
- PSI gauge provides intuitive visual feedback
- All visualizations work on Chrome, Firefox, Safari
- Mobile responsive (readable on 375px+ screens)
- No console errors
- Loading states during data fetch
This is important work that will make the model's predictions interpretable and useful for researchers.