---
library_name: pytorch
license: mit
pipeline_tag: text-generation
tags:
- protein-sequence-generation
- flow-matching
- bioinformatics
- protein-language-models
- pfam
---

# LineageFlow RP55 Checkpoint

LineageFlow is a Dirichlet flow-matching model designed for high-fidelity, family-aware protein sequence generation. It initializes generation from lineage priors derived from ancestral sequence reconstruction (ASR), turning generation into structured mutation from an evolved scaffold.

- **Paper:** [LineageFlow: Flow Matching for High-Fidelity Family-Aware Protein Sequence Generation](https://huggingface.co/papers/2605.22252)
- **Code:** [GitHub Repository](https://github.com/Jinx-byebye/LineageFlow)

## Model Description

Current discrete generative models for proteins often start from uniform or masked-token noise, which can discard position-specific constraints induced by evolution. LineageFlow addresses this by using phylogeny-informed priors to maintain family validity and structural confidence while exploring within-family diversity.

Across diverse protein families, LineageFlow achieves family validity close to natural sequences and improves predicted structural confidence over uniform or mask-initialized baselines.

## Usage

### Download Checkpoint

You can download the checkpoint using the Hugging Face CLI:

```bash
pip install -U "huggingface_hub[cli]"
hf download jinxbye/LineageFlow \
  lineageflow-rp55.ckpt \
  --local-dir checkpoints
```

### Batch Generation

To generate a batch of sequences using the official inference script, run:

```bash
python inference/batch_generate.py \
  --config config/generation.json \
  --ckpt checkpoints/lineageflow-rp55.ckpt \
  --num-samples 512 \
  --gpus all \
  --out outputs/lineageflow_samples.fasta
```

For more detailed instructions on installation and single-family generation, please refer to the [GitHub repository](https://github.com/Jinx-byebye/LineageFlow).

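
After batch generation finishes, the file passed to `--out` can be sanity-checked with a few lines of Python. The sketch below is not part of the LineageFlow repository: it assumes the output is standard FASTA (header lines starting with `>`, sequences possibly wrapped over several lines), and `read_fasta` and `summarize` are hypothetical helper names introduced only for illustration.

```python
from pathlib import Path
from typing import Dict, Iterator, List, Optional, Tuple


def read_fasta(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA file (assumed format)."""
    header: Optional[str] = None
    chunks: List[str] = []
    with open(path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.strip()
            if not line:
                continue  # ignore blank lines
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line)  # wrapped sequence line
    if header is not None:
        yield header, "".join(chunks)


def summarize(path: str) -> Dict[str, int]:
    """Count records, unique sequences, and the length range."""
    records = list(read_fasta(path))
    lengths = [len(seq) for _, seq in records]
    return {
        "num_sequences": len(records),
        "num_unique": len({seq for _, seq in records}),
        "min_length": min(lengths) if lengths else 0,
        "max_length": max(lengths) if lengths else 0,
    }


if __name__ == "__main__":
    # Path taken from the --out argument of the batch generation command.
    fasta_path = Path("outputs/lineageflow_samples.fasta")
    if fasta_path.exists():
        print(summarize(str(fasta_path)))
```

Comparing `num_sequences` against the `--num-samples` value and checking `num_unique` gives a quick signal of whether the run completed and how many duplicate samples it produced; it says nothing about family validity or structural confidence, which need the evaluation tooling described in the repository.
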
## Citation

```bibtex
@inproceedings{liang2026lineageflow,
  title     = {LineageFlow: Flow Matching for High-Fidelity Family-Aware Protein Sequence Generation},
  author    = {Liang, Langzhang and Yang, Ming and Feng, Yi and Li, Junfan and Pan, Shirui and Xu, Yinghui and Ying, Tianlei and Zheng, Yizhen and Xu, Zenglin},
  booktitle = {International Conference on Machine Learning},
  year      = {2026}
}
```