“JoeyRiepsaame”
---
title: "Codon Optimizer (codon-optimizer)"
emoji: 🧬
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.10.0
python_version: "3.11"
app_file: app.py
pinned: false
license: mit
---
# Codon Optimizer
A multi-objective codon optimization tool based on the GenScript GenSmart algorithm described in patent WO2020024917A1.
## Features
- **NSGA-III Optimization**: Uses the NSGA-III multi-objective genetic algorithm to optimize all indices together
- **Three Optimization Indices**:
  - Harmony Index: Match codon usage to that of highly expressed genes
  - Codon Context Index: Optimize codon pair preferences
  - Outlier Index: Minimize adverse sequence features (GC content, repeats, motifs)
- **10 Expression Hosts**: E. coli, Human, CHO, Yeast, Mouse, Insect, and more
- **Restriction Site Exclusion**: Avoid common restriction enzyme cut sites
- **Comprehensive Metrics**: CAI, GC content, and optimization indices
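
To make the reported metrics and the restriction-site check concrete, here is a minimal sketch of how GC content, CAI, and a site scan can be computed. It is not taken from this tool's `app.py`: the function names, the `TOY_WEIGHTS` table, and its values are hypothetical, and a real CAI calculation needs a full codon weight table derived from a reference gene set for the chosen host.

```python
import math

# Hypothetical relative-adaptiveness weights (w = codon frequency divided by the
# frequency of the most-used synonymous codon). Made-up values for illustration.
TOY_WEIGHTS = {
    "ATG": 1.0,                # Met
    "AAA": 1.0, "AAG": 0.25,   # Lys
    "GAA": 1.0, "GAG": 0.5,    # Glu
}

# Recognition sequences for two common enzymes. Both are palindromic, so
# scanning a single strand is enough for these two.
RESTRICTION_SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC"}


def split_codons(dna):
    """Split a DNA string into complete codons, dropping any trailing bases."""
    dna = dna.upper()
    return [dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3)]


def gc_content(dna):
    """Fraction of bases that are G or C."""
    if not dna:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in dna.upper()) / len(dna)


def cai(dna, weights):
    """Geometric mean of the weights of all codons that have a weight."""
    logs = [math.log(weights[c]) for c in split_codons(dna) if c in weights]
    if not logs:
        raise ValueError("no scorable codons")
    return math.exp(sum(logs) / len(logs))


def find_sites(dna, sites):
    """Map each enzyme name to the 0-based start positions of its site."""
    dna = dna.upper()
    hits = {}
    for name, site in sites.items():
        positions = [i for i in range(len(dna) - len(site) + 1)
                     if dna.startswith(site, i)]
        if positions:
            hits[name] = positions
    return hits


seq = "ATGAAAGAGGAATTC"
print(gc_content(seq), cai(seq, TOY_WEIGHTS), find_sites(seq, RESTRICTION_SITES))
```

Codons without an entry in the weight table are skipped here rather than treated as errors, which is a design choice of this sketch and may differ from what the tool does.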
## Usage
1. Paste your protein or DNA sequence
2. Select the target expression organism
3. Optionally select restriction sites to exclude
4. Click "Optimize Sequence"
5. Copy the optimized DNA sequence
## Algorithm
The optimization algorithm is based on GenScript's patent WO2020024917A1, which describes three indices:
1. **Harmony Index (H)**: Measures how closely codon usage frequencies match those of highly expressed reference genes
2. **Codon Context Index (CC)**: Scores how well adjacent codon pairs match preferred codon pair usage
3. **Outlier Index (OI)**: Penalizes adverse features such as extreme GC content, repeats, and splice sites
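
The idea of a frequency match can be illustrated with a small sketch. This is not the patent's Harmony Index formula, which is not reproduced in this README. It is a stand-in that compares a sequence's codon frequency distribution to a reference distribution using total variation distance. The function names and the `TOY_REFERENCE` values are hypothetical.

```python
from collections import Counter

# Hypothetical reference codon frequencies (must sum to 1). Made-up values.
TOY_REFERENCE = {"AAA": 0.75, "AAG": 0.25}


def codon_frequencies(dna):
    """Relative frequency of each complete codon in the sequence."""
    dna = dna.upper()
    codons = [dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3)]
    if not codons:
        raise ValueError("sequence has no complete codon")
    counts = Counter(codons)
    return {codon: n / len(codons) for codon, n in counts.items()}


def frequency_match(dna, reference):
    """Return 1 minus the total variation distance to the reference.

    1.0 means the observed distribution equals the reference;
    0.0 means the two distributions share no codons.
    """
    observed = codon_frequencies(dna)
    codons = set(observed) | set(reference)
    distance = 0.5 * sum(
        abs(observed.get(c, 0.0) - reference.get(c, 0.0)) for c in codons
    )
    return 1.0 - distance


print(frequency_match("AAAAAAAAAAAG", TOY_REFERENCE))  # three AAA, one AAG
print(frequency_match("AAGAAGAAGAAG", TOY_REFERENCE))  # four AAG
```

The sketch pools all codons into one distribution. A per-amino-acid comparison of synonymous codons would be a closer fit to codon usage tables, at the cost of a longer example.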
The NSGA-III algorithm optimizes all three objectives simultaneously to find Pareto-optimal solutions.
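
As a rough illustration of what "Pareto-optimal" means here, the sketch below filters candidate score triples down to the non-dominated ones. It assumes each candidate is scored as a hypothetical `(H, CC, OI)` tuple in which higher is better for every entry, which may not match the sign conventions used by the tool. NSGA-III itself layers reference-point-based selection on top of non-dominated sorting, and that part is not shown.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points):
    """Keep the points that no other point dominates, preserving input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (H, CC, OI) scores for four candidate sequences.
candidates = [
    (0.9, 0.5, 0.7),
    (0.8, 0.6, 0.7),
    (0.7, 0.4, 0.6),  # worse than the first candidate on every objective
    (0.9, 0.5, 0.6),  # tied with the first on two objectives, worse on the third
]

print(pareto_front(candidates))
```

The first two candidates trade off against each other (one has the better first score, the other the better second score), so both remain on the front and the choice between them is left to the user or a later selection step.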
## References
- [GenScript GenSmart Codon Optimization](https://www.genscript.com/tools/gensmart-codon-optimization)
- [Patent WO2020024917A1](https://patents.google.com/patent/WO2020024917A1/en)
- Deb, K., & Jain, H. (2014). An evolutionary many-objective optimization algorithm using reference-point-based nondominated sorting approach, Part I: Solving problems with box constraints. IEEE Transactions on Evolutionary Computation, 18(4), 577-601.