
mRNA Scoring Models

This directory contains built-in mRNA scoring models for the mRNA Design Studio.

Available Models

1. RNAstructure MFE Scorer (rna_structure_scorer.py)

Purpose: Predicts the minimum free energy (MFE) of mRNA secondary structure.

Method: Uses the ViennaRNA RNAfold algorithm to compute the thermodynamic stability of the RNA secondary structure. A more negative MFE indicates a more stable, more strongly folded structure.

Score Range: 0-100

  • 0-40: Weak/unstable secondary structure (may be too unstructured)
  • 40-70: Moderate secondary structure (optimal range for translation)
  • 70-100: Strong secondary structure (may inhibit translation)

Dependencies:

  • ViennaRNA Python package (optional)
  • If ViennaRNA is not installed, the scorer falls back to a proxy score based on GC content
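
The optional-dependency pattern described above can be sketched as follows. This is a minimal illustration, not the code in rna_structure_scorer.py: the function names gc_fraction and mfe_or_gc_proxy are hypothetical, and the sketch returns the raw MFE or GC fraction rather than guessing how the scorer maps either value onto the 0-100 scale. The only ViennaRNA call used is RNA.fold, which returns a dot-bracket structure and the MFE in kcal/mol.

```python
def gc_fraction(sequence: str) -> float:
    """Return the fraction of G and C bases in the sequence (0.0 to 1.0)."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def mfe_or_gc_proxy(sequence: str):
    """Hypothetical helper: return ("mfe", kcal/mol) when ViennaRNA is
    importable, otherwise ("gc", GC fraction) as a stand-in signal."""
    try:
        import RNA  # ViennaRNA Python bindings (optional dependency)
    except ImportError:
        return "gc", gc_fraction(sequence)
    # RNA.fold returns (dot-bracket structure, minimum free energy in kcal/mol)
    _structure, mfe = RNA.fold(sequence.upper().replace("T", "U"))
    return "mfe", mfe
```

Returning a tag alongside the value keeps the two signals from being confused downstream, since an MFE in kcal/mol and a GC fraction are not on the same scale.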

Usage:

from models import RNAStructureMFEScorer

scorer = RNAStructureMFEScorer()
score = scorer.score(sequence)

Interpretation:

  • Target moderate scores (40-70) for optimal translation efficiency
  • Very low scores suggest the mRNA may be prone to degradation
  • Very high scores suggest strong secondary structures that may block ribosome access

2. mRNA Stability Scorer (mrna_stability_scorer.py)

Purpose: Composite stability prediction based on multiple sequence features.

Method: Combines five established mRNA design principles:

  1. GC Content (30% weight) - Optimal: 50-60%
  2. Codon Adaptation Index (CAI) (25% weight) - Codon optimization for host organism
  3. Homopolymer Detection (20% weight) - Penalizes long runs of identical nucleotides
  4. 5' UTR Structure (15% weight) - Moderate stability preferred
  5. Kozak Consensus (10% weight) - Translation initiation efficiency
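
As a rough sketch of how the five weights listed above could combine into one composite score, assuming each component is already scored on a 0-100 scale: the names WEIGHTS and combine_scores are hypothetical and are not taken from mrna_stability_scorer.py, and a plain weighted sum is an assumption about how the weights are applied.

```python
from typing import Mapping

# Weights copied from the list above; they sum to 1.0.
WEIGHTS = {
    "gc_content": 0.30,
    "cai": 0.25,
    "homopolymer": 0.20,
    "utr_structure": 0.15,
    "kozak": 0.10,
}


def combine_scores(components: Mapping[str, float]) -> float:
    """Weighted sum of 0-100 component scores, clamped to the 0-100 range."""
    missing = WEIGHTS.keys() - components.keys()
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    total = sum(WEIGHTS[name] * components[name] for name in WEIGHTS)
    return max(0.0, min(100.0, total))


example = {
    "gc_content": 80.0,
    "cai": 60.0,
    "homopolymer": 100.0,
    "utr_structure": 50.0,
    "kozak": 100.0,
}
composite = combine_scores(example)  # 24 + 15 + 20 + 7.5 + 10 = 76.5
```

Because GC content carries the largest weight, a 10-point change in that component moves the composite by 3 points, while the same change in the Kozak component moves it by only 1.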

Score Range: 0-100

  • 0-40: Poor stability/translation efficiency
  • 40-70: Acceptable design
  • 70-100: Excellent design

Dependencies:

  • BioPython (optional, for advanced CAI calculation)
  • ViennaRNA (optional, for UTR structure analysis)

Parameters:

  • organism (default: "human") - Target organism for codon optimization

Usage:

from models import mRNAStabilityScorer

scorer = mRNAStabilityScorer(organism="human")
score = scorer.score(sequence)

Individual Component Scores:

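For detailed analysis, the component scores are available through underscore-prefixed helper methods. These are private by Python convention, so their names and signatures may change: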

scorer = mRNAStabilityScorer()

# Individual component scores, each on a 0-100 scale
gc_score = scorer._score_gc_content(sequence)
cai_score = scorer._score_cai(sequence)
homopoly_score = scorer._score_homopolymers(sequence)
utr_score = scorer._score_utr_structure(sequence)
kozak_score = scorer._score_kozak(sequence)
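
To give a feel for the raw sequence features behind two of these components, here is a standalone sketch that does not use the scorer. Both function names are hypothetical, and the way the scorer turns such features into 0-100 scores is not shown. The Kozak check looks only at the two most influential positions of the consensus: a purine at -3 and a G at +4, counting the A of AUG as +1.

```python
from itertools import groupby


def longest_homopolymer(sequence: str) -> int:
    """Length of the longest run of one repeated nucleotide (0 if empty)."""
    runs = (sum(1 for _ in run) for _, run in groupby(sequence.upper()))
    return max(runs, default=0)


def kozak_core_matches(sequence: str, start: int) -> int:
    """Count matches (0-2) at the -3 and +4 Kozak positions around the
    AUG whose A sits at index `start`."""
    seq = sequence.upper().replace("T", "U")
    if seq[start:start + 3] != "AUG":
        raise ValueError("no AUG start codon at the given index")
    matches = 0
    # Position -3: three bases upstream of the A, ideally a purine (A or G)
    if start >= 3 and seq[start - 3] in "AG":
        matches += 1
    # Position +4: the base immediately after AUG, ideally G
    if start + 3 < len(seq) and seq[start + 3] == "G":
        matches += 1
    return matches
```

For example, longest_homopolymer("AUGAAAAACC") reports a run of 5, and kozak_core_matches("GCCACCAUGG", 6) reports 2 because both key positions match.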

Interpretation:

  • 70+: Well-designed mRNA suitable for production
  • 40-70: Moderate quality, may benefit from optimization
  • <40: Significant design issues, optimization strongly recommended

Model Registry Integration

Both models implement the ScoringModel interface and can be loaded into the ModelRegistry:

import pandas as pd

from models import ModelRegistry, RNAStructureMFEScorer, mRNAStabilityScorer

registry = ModelRegistry()

# Register built-in models (note: _register is an underscore-prefixed, private method)
registry._register(RNAStructureMFEScorer(), "scoring", "builtin", "models/rna_structure_scorer.py")
registry._register(mRNAStabilityScorer(), "scoring", "builtin", "models/mrna_stability_scorer.py")

# Run scoring; `sequences` holds the sequences to score and must be defined beforehand
results = registry.run_scoring("RNAstructure MFE", sequences)

Testing

Run tests for both models:

pytest tests/test_models.py::TestRNAStructureMFEScorer -v
pytest tests/test_models.py::TestmRNAStabilityScorer -v

Adding Custom Models

To add your own scoring model:

  1. Create a new Python file in this directory
  2. Import and subclass ScoringModel:
from models.base import ScoringModel
from core.models.sequence import mRNASequence

class MyCustomScorer(ScoringModel):
    @property
    def name(self) -> str:
        return "My Custom Scorer"

    @property
    def description(self) -> str:
        return "Description of what this model does"

    def score(self, sequence: mRNASequence, metadata=None) -> float:
        # Your scoring logic here
        return 0.0  # Return score 0-100
  3. Load it via the ModelRegistry:
models = registry.load_local("path/to/your_model.py")
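
For a fuller picture of what a custom scorer might contain, the sketch below fills in the scoring logic with a simple GC-window rule. It makes several assumptions. The ScoringModel class here is a stand-in abstract base so the example runs outside the project; in a real model file you would import it from models.base. The sequence is treated as a plain string rather than an mRNASequence. The 50-60% window comes from the GC content guideline in the mRNA Stability Scorer section, while the penalty of 2 points per percentage point outside it is made up for illustration.

```python
from abc import ABC, abstractmethod


class ScoringModel(ABC):
    """Stand-in for models.base.ScoringModel (assumed interface)."""

    @property
    @abstractmethod
    def name(self) -> str: ...

    @property
    @abstractmethod
    def description(self) -> str: ...

    @abstractmethod
    def score(self, sequence, metadata=None) -> float: ...


class GCWindowScorer(ScoringModel):
    """Hypothetical scorer: full marks inside a 50-60% GC window."""

    @property
    def name(self) -> str:
        return "GC Window Scorer (example)"

    @property
    def description(self) -> str:
        return "Penalizes GC content outside the 50-60% window"

    def score(self, sequence, metadata=None) -> float:
        seq = str(sequence).upper()
        if not seq:
            return 0.0
        gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / len(seq)
        # Percentage points outside the 50-60% window (0 when inside it)
        distance = max(50.0 - gc_percent, gc_percent - 60.0, 0.0)
        # Illustrative penalty: 2 points per percentage point, floored at 0
        return max(0.0, 100.0 - 2.0 * distance)
```

Keeping the returned value clamped to 0-100 matches the score range used by the built-in models, so results from a custom scorer stay comparable with theirs.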

References

RNAstructure MFE Scorer

  • Lorenz et al. (2011). "ViennaRNA Package 2.0." Algorithms for Molecular Biology, 6:26.
  • Turner & Mathews (2010). "NNDB: the nearest neighbor parameter database for predicting stability of nucleic acid secondary structure." Nucleic Acids Research, 38:D280-282.

mRNA Stability Scorer

  • Mauro & Edelman (2002). "The ribosome filter hypothesis." PNAS, 99(19):12031-12036. (Kozak sequence)
  • Sharp & Li (1987). "The codon adaptation index—a measure of directional synonymous codon usage bias." Nucleic Acids Research, 15(3):1281-1295.
  • Kudla et al. (2009). "Coding-sequence determinants of gene expression in Escherichia coli." Science, 324(5924):255-258.
  • Presnyak et al. (2015). "Codon optimality is a major determinant of mRNA stability." Cell, 160(6):1111-1124.