---
library_name: pytorch
license: mit
pipeline_tag: text-generation
tags:
- protein-sequence-generation
- flow-matching
- bioinformatics
- protein-language-models
- pfam
---

# LineageFlow RP55 Checkpoint

LineageFlow is a Dirichlet flow-matching model designed for high-fidelity, family-aware protein sequence generation. It initializes generation from lineage priors derived from ancestral sequence reconstruction (ASR), turning generation into structured mutation from an evolved scaffold.

- **Paper:** [LineageFlow: Flow Matching for High-Fidelity Family-Aware Protein Sequence Generation](https://huggingface.co/papers/2605.22252)
- **Code:** [GitHub Repository](https://github.com/Jinx-byebye/LineageFlow)

## Model Description

Current discrete generative models for proteins often start from uniform or masked-token noise, which can discard position-specific constraints induced by evolution. LineageFlow addresses this by using phylogeny-informed priors to maintain family validity and structural confidence while exploring within-family diversity. Across diverse protein families, LineageFlow achieves family validity close to natural sequences and improves predicted structural confidence over uniform or mask-initialized baselines.

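To make the contrast between uniform and lineage-informed starting points concrete, here is a toy sketch using only the Python standard library. It is not the LineageFlow implementation: the function names, the `strength` and `eps` parameters, and the way the ASR distribution is turned into Dirichlet concentrations are all assumptions made for illustration. See the paper and repository for the actual parameterization.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues


def sample_dirichlet(alphas, rng):
    """Draw one point on the simplex by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for alpha in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def uniform_start(length, rng):
    """Baseline: every position starts from a flat Dirichlet(1, ..., 1)."""
    flat = [1.0] * len(AMINO_ACIDS)
    return [sample_dirichlet(flat, rng) for _ in range(length)]


def lineage_start(ancestral_probs, rng, strength=50.0, eps=1e-3):
    """Illustrative lineage prior (hypothetical parameterization).

    ancestral_probs: one probability vector per position, e.g. per-site
    posteriors from an ASR tool. Larger `strength` keeps the sample closer
    to the ancestral distribution; `eps` keeps every concentration positive.
    """
    return [
        sample_dirichlet([strength * p + eps for p in probs], rng)
        for probs in ancestral_probs
    ]
```

With a strongly conserved ancestral position (say 0.9 on one residue), `lineage_start` returns a vector that already favors that residue, whereas `uniform_start` carries no position-specific information.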
## Usage

### Download Checkpoint

You can download the checkpoint using the Hugging Face CLI:

```bash
pip install -U "huggingface_hub[cli]"

hf download jinxbye/LineageFlow \
  lineageflow-rp55.ckpt \
  --local-dir checkpoints
```
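If you prefer to fetch the file from Python, the same download can be sketched with `hf_hub_download` from `huggingface_hub`. The repository ID, filename, and target directory are taken from the CLI command above; the wrapper function `fetch_checkpoint` and its `downloader` argument are illustrative helpers, not part of the LineageFlow codebase.

```python
from pathlib import Path

REPO_ID = "jinxbye/LineageFlow"
CKPT_NAME = "lineageflow-rp55.ckpt"


def fetch_checkpoint(local_dir="checkpoints", downloader=None):
    """Download the RP55 checkpoint and return its local path.

    `downloader` exists only so the function can be exercised offline;
    by default it resolves to huggingface_hub.hf_hub_download.
    """
    if downloader is None:
        # Imported lazily so the module loads even without huggingface_hub.
        from huggingface_hub import hf_hub_download
        downloader = hf_hub_download
    path = downloader(
        repo_id=REPO_ID,
        filename=CKPT_NAME,
        local_dir=str(local_dir),
    )
    return Path(path)
```

Calling `fetch_checkpoint()` should place the file under `checkpoints/`, matching the path expected by the batch generation command.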

### Batch Generation

To generate a batch of sequences using the official inference script, run:

```bash
python inference/batch_generate.py \
  --config config/generation.json \
  --ckpt checkpoints/lineageflow-rp55.ckpt \
  --num-samples 512 \
  --gpus all \
  --out outputs/lineageflow_samples.fasta
```
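After generation, a quick sanity check on the output can be useful. The sketch below assumes the script writes standard FASTA (suggested by the `.fasta` extension but not confirmed here); the header format is unknown, so headers are returned verbatim. `read_fasta` and `summarize` are illustrative helpers, not functions from the repository.

```python
from typing import Iterator, Tuple


def read_fasta(path) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA file."""
    header = None
    chunks = []
    with open(path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header = line[1:]
                chunks = []
            elif header is not None:
                chunks.append(line)  # sequences may wrap over several lines
    if header is not None:
        yield header, "".join(chunks)


def summarize(path):
    """Count the records and report the shortest and longest sequence."""
    lengths = [len(seq) for _, seq in read_fasta(path)]
    return {
        "num_sequences": len(lengths),
        "min_length": min(lengths, default=0),
        "max_length": max(lengths, default=0),
    }
```

For the command above, `summarize("outputs/lineageflow_samples.fasta")` would be expected to report as many sequences as were requested with `--num-samples`, provided the script writes one record per sample.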

For more detailed instructions on installation and single-family generation, please refer to the [GitHub repository](https://github.com/Jinx-byebye/LineageFlow).

## Citation

```bibtex
@inproceedings{liang2026lineageflow,
  title = {LineageFlow: Flow Matching for High-Fidelity Family-Aware Protein Sequence Generation},
  author = {Liang, Langzhang and Yang, Ming and Feng, Yi and Li, Junfan and Pan, Shirui and Xu, Yinghui and Ying, Tianlei and Zheng, Yizhen and Xu, Zenglin},
  booktitle = {International Conference on Machine Learning},
  year = {2026}
}
```