---
library_name: pytorch
license: mit
pipeline_tag: text-generation
tags:
  - protein-sequence-generation
  - flow-matching
  - bioinformatics
  - protein-language-models
  - pfam
---

LineageFlow RP55 Checkpoint

LineageFlow is a Dirichlet flow-matching model designed for high-fidelity, family-aware protein sequence generation. It initializes generation from lineage priors derived from ancestral sequence reconstruction (ASR), turning generation into structured mutation from an evolved scaffold.

Model Description

Current discrete generative models for proteins often start from uniform or masked-token noise, which can discard position-specific constraints induced by evolution. LineageFlow addresses this by using phylogeny-informed priors to maintain family validity and structural confidence while exploring within-family diversity. Across diverse protein families, LineageFlow achieves family validity close to that of natural sequences and improves predicted structural confidence over uniform-initialized and mask-initialized baselines.
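To make the contrast between the two starting points concrete, here is a minimal sketch that is not taken from the LineageFlow code. It assumes the initial state is one point on the 20-residue probability simplex per sequence position, and it compares a flat Dirichlet start with one concentrated around a per-position ancestral posterior. The function names and the `strength` and `floor` parameters are hypothetical and chosen only for illustration.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def sample_dirichlet(alpha, rng):
    """Draw one simplex point from Dirichlet(alpha) via normalized Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def uniform_start(length, rng):
    """Position-agnostic start: every position uses the flat Dirichlet(1, ..., 1)."""
    flat = [1.0] * len(AMINO_ACIDS)
    return [sample_dirichlet(flat, rng) for _ in range(length)]


def lineage_start(ancestral_probs, rng, strength=50.0, floor=0.05):
    """Prior-informed start (illustrative assumption, not the authors' method).

    Each position is drawn from a Dirichlet whose concentration follows that
    position's ancestral posterior, so conserved positions start near their
    reconstructed residue and variable positions stay diffuse.
    """
    return [
        sample_dirichlet([floor + strength * p for p in probs], rng)
        for probs in ancestral_probs
    ]


rng = random.Random(0)
# Toy ASR posterior for one conserved position that strongly favors methionine.
conserved = [0.9 if aa == "M" else 0.1 / 19 for aa in AMINO_ACIDS]
start = lineage_start([conserved], rng)
print("mass on M at the conserved position:", round(start[0][AMINO_ACIDS.index("M")], 3))
```

In this toy setup the flat start carries no information about which residues a position tolerates, while the prior-informed start already places most of its mass on the conserved residue.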

Usage

Download Checkpoint

You can download the checkpoint using the Hugging Face CLI:

pip install -U "huggingface_hub[cli]"

hf download jinxbye/LineageFlow \
  lineageflow-rp55.ckpt \
  --local-dir checkpoints
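
If you prefer to fetch the file from Python, the same download can be sketched with `hf_hub_download` from `huggingface_hub`. The repository ID, filename, and target directory are copied from the CLI command above; the wrapper function name is a hypothetical convenience.

```python
REPO_ID = "jinxbye/LineageFlow"
FILENAME = "lineageflow-rp55.ckpt"


def download_checkpoint(local_dir="checkpoints"):
    """Download the checkpoint into `local_dir` and return its local path."""
    # Imported lazily so this module loads even when huggingface_hub is absent.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=FILENAME, local_dir=local_dir)


# Example (requires network access):
# ckpt_path = download_checkpoint()
```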

Batch Generation

To generate a batch of sequences using the official inference script, run:

python inference/batch_generate.py \
  --config config/generation.json \
  --ckpt checkpoints/lineageflow-rp55.ckpt \
  --num-samples 512 \
  --gpus all \
  --out outputs/lineageflow_samples.fasta
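
Once the run finishes, a quick sanity check on the output can be useful. The sketch below assumes the file is standard FASTA, as the `.fasta` extension suggests, and uses only the standard library; `parse_fasta` is a hypothetical helper and not part of the repository.

```python
def parse_fasta(lines):
    """Parse FASTA-formatted lines into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Example, using the output path from the command above:
# with open("outputs/lineageflow_samples.fasta") as handle:
#     samples = parse_fasta(handle)
# print(len(samples), "sequences,", len({seq for _, seq in samples}), "unique")
```

Comparing the total count with the number of unique sequences gives a rough first look at sample diversity before running any heavier evaluation.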

For more detailed instructions on installation and single-family generation, please refer to the GitHub repository.

Citation

@inproceedings{liang2026lineageflow,
  title     = {LineageFlow: Flow Matching for High-Fidelity Family-Aware Protein Sequence Generation},
  author    = {Liang, Langzhang and Yang, Ming and Feng, Yi and Li, Junfan and Pan, Shirui and Xu, Yinghui and Ying, Tianlei and Zheng, Yizhen and Xu, Zenglin},
  booktitle = {International Conference on Machine Learning},
  year      = {2026}
}