metadata
language:
- en
tags:
- biology
- genomics
- codon-optimization
- p-adic-math
- hyperbolic-geometry
- ddg-prediction
license: other
metrics:
- spearmanr
Ternary Codon Encoder: P-adic Hyperbolic Embeddings
The Ternary Codon Encoder is a neural embedding model that maps the 64 genetic codons into a 16-dimensional hyperbolic space. It is the first model to explicitly use 3-adic valuation as a mathematical prior to organize the genetic code's hierarchical structure.
Model Description
- Architecture: MLP-based encoder (12-dim one-hot input $\rightarrow$ 16-dim hyperbolic output).
- Mathematical Foundation: Leverages 3-adic mathematics to represent the discrete hierarchy of the codon table.
- Latent Space: Poincaré ball where radial distance encodes 3-adic valuation (conservation/variability).
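As a reminder of the quantity the latent space is organized around, the sketch below computes the 3-adic valuation $v_3(n)$ (the largest $k$ such that $3^k$ divides $n$) and the induced 3-adic distance $|a - b|_3 = 3^{-v_3(a - b)}$. The function names `v3` and `d3` are illustrative. How codons are mapped to integers before the valuation is taken is not stated in this card, so the integers here are plain example inputs, not the model's actual codon labels.

```python
def v3(n: int) -> float:
    """3-adic valuation: the largest k such that 3**k divides n.

    By convention the valuation of 0 is +infinity.
    """
    if n == 0:
        return float("inf")
    n = abs(n)
    k = 0
    while n % 3 == 0:
        n //= 3
        k += 1
    return k


def d3(a: int, b: int) -> float:
    """3-adic distance |a - b|_3 = 3 ** (-v3(a - b)); zero when a == b."""
    if a == b:
        return 0.0
    return 3.0 ** (-v3(a - b))


# Integers that differ by a high power of 3 are 3-adically close.
print(v3(54))      # 54 = 2 * 3**3
print(d3(14, 41))  # 41 - 14 = 27 = 3**3
```

Because the 3-adic distance is an ultrametric, it naturally describes tree-like hierarchies, which is the usual motivation for pairing it with a hyperbolic latent space.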
Key Discoveries
- Physics Dimension: Latent dimension 13 correlates strongly ($\rho = -0.70$) with molecular mass, volume, and force constants ($k$).
- Linear Stability Manifold: Provides high-quality feature vectors for sequence-only protein stability ($\Delta\Delta G$) prediction.
- Synonymous Cohesion: Synonymous codons cluster together in hyperbolic space while maintaining clear boundaries between amino acid groups.
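One way to inspect synonymous cohesion is to compare geodesic distances between embeddings. The sketch below implements the standard Poincaré ball distance, $d(u, v) = \operatorname{arcosh}\left(1 + \frac{2\lVert u - v\rVert^2}{(1 - \lVert u\rVert^2)(1 - \lVert v\rVert^2)}\right)$, in plain Python. It assumes unit curvature ($c = 1$), which this card does not specify, and the function name `poincare_distance` and the example points are illustrative only.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points inside the unit Poincaré ball.

    Assumes curvature -1; both points must have Euclidean norm < 1.
    """
    sq_norm_u = sum(a * a for a in u)
    sq_norm_v = sum(b * b for b in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v))
    return math.acosh(arg)


# Made-up 2-D points for illustration; real embeddings are 16-dimensional.
near_origin = [0.10, 0.00]
near_boundary = [0.95, 0.00]
print(poincare_distance([0.0, 0.0], near_origin))
print(poincare_distance([0.0, 0.0], near_boundary))
```

With embeddings from the encoder converted to lists (for example via `z_hyp[0].tolist()`), distances within a set of synonymous codons can be compared against distances across amino acid groups.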
Performance
- DDG Spearman $\rho$: 0.614 (sequence-only benchmarking on diverse datasets).
- Improvement: +105% over baseline p-adic embedding models.
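For readers reproducing the metric, Spearman's $\rho$ is the Pearson correlation of the rank-transformed values. The following is a minimal standard-library sketch with average ranks for ties; in practice `scipy.stats.spearmanr` is the usual choice. The helper names and the numbers in the example are made up for illustration and are not results from this model.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical predicted vs. measured stability changes (illustrative only).
predicted = [0.3, -1.2, 0.8, 2.1, -0.4]
measured = [0.1, -0.9, 1.5, 1.1, -0.2]
print(spearman_rho(predicted, measured))
```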
Usage
import torch
from trainable_codon_encoder import TrainableCodonEncoder
# Load model
encoder = TrainableCodonEncoder(latent_dim=16, hidden_dim=64)
checkpoint = torch.load("pytorch_model.bin", map_location="cpu")
encoder.load_state_dict(checkpoint["model_state_dict"])
encoder.eval()
# Get the embedding for a single codon (e.g., ATG at index 14)
codon_idx = torch.tensor([14])
with torch.no_grad():
    z_hyp = encoder(codon_idx)
print(f"Hyperbolic Embedding: {z_hyp}")
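The usage snippet refers to ATG as index 14. That value is consistent with reading a codon as a base-4 number over the alphabetical nucleotide order A, C, G, T ($0 \cdot 16 + 3 \cdot 4 + 2 = 14$). The helper below is a sketch under that assumption. The ordering, the names `codon_index` and `codon_one_hot`, and the one-hot layout (3 positions times 4 nucleotides) are inferred rather than confirmed by this card, so check them against the model's own codon table before relying on them.

```python
NUCLEOTIDES = "ACGT"  # assumed ordering, inferred from ATG -> 14


def codon_index(codon):
    """Map a 3-letter DNA codon to an integer in [0, 63] (base-4, ACGT order)."""
    codon = codon.upper()
    if len(codon) != 3:
        raise ValueError("a codon must have exactly 3 nucleotides")
    idx = 0
    for base in codon:
        # str.index raises ValueError for characters outside ACGT.
        idx = idx * 4 + NUCLEOTIDES.index(base)
    return idx


def codon_one_hot(codon):
    """Return an assumed 12-dim one-hot vector: 3 positions x 4 nucleotides."""
    codon = codon.upper()
    if len(codon) != 3:
        raise ValueError("a codon must have exactly 3 nucleotides")
    vec = [0.0] * 12
    for pos, base in enumerate(codon):
        vec[pos * 4 + NUCLEOTIDES.index(base)] = 1.0
    return vec


print(codon_index("ATG"))
print(codon_one_hot("ATG"))
```

The resulting integer can be wrapped as `torch.tensor([codon_index("ATG")])` and passed to the encoder in place of the hard-coded index.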
Citation
@software{ternary_codon_2026,
author = {AI Whisperers},
title = {Ternary Codon Encoder: P-adic Hyperbolic Embeddings},
year = {2026},
url = {https://huggingface.co/ai-whisperers/ternary-codon-encoder}
}