---
language: en
library_name: transformers
pipeline_tag: text-generation
tags:
- t5
- molecule-to-protein
- smiles
- protein-generation
- binder
- ligand
license: apache-2.0
datasets:
- contributor-anonymous/Mol2Pro-Binder-Dataset
---

# Mol2Pro-base

## Model description

- **Architecture:** T5-efficient-base https://huggingface.co/google/t5-efficient-base
- **Tokenization:** https://huggingface.co/contributor-anonymous/Mol2Pro-tokenizer
- **Code:** https://github.com/contributor-anonymous/Mol2Pro-tools
- **Training data:** https://huggingface.co/datasets/contributor-anonymous/Mol2Pro-Binder-Dataset

## How to use

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import torch

model_id = "contributor-anonymous/Mol2Pro-base"
tokenizer_id = "contributor-anonymous/Mol2Pro-tokenizer"

# Load tokenizers
tokenizer_mol = AutoTokenizer.from_pretrained(tokenizer_id, subfolder="smiles")
tokenizer_aa = AutoTokenizer.from_pretrained(tokenizer_id, subfolder="aa")

# Load model
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
```

## Intended use

Research use only. The model generates candidate sequences conditioned on small-molecule inputs; it does not guarantee binding or function and must be validated experimentally.
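
## Example: sampling candidate sequences

The loading snippet under "How to use" stops before generation. The sketch below shows one way the loaded `model`, `tokenizer_mol`, and `tokenizer_aa` might be combined to sample candidates for a single SMILES string. The helper name `generate_candidates`, the sampling settings (`top_p`, `num_candidates`, `max_new_tokens`), and the whitespace stripping after decoding are illustrative assumptions, not values or steps documented for this model; the Mol2Pro-tools repository may use a different procedure.

```python
def generate_candidates(model, tokenizer_mol, tokenizer_aa, smiles,
                        num_candidates=4, max_new_tokens=256):
    """Sample candidate protein sequences for one SMILES string (hypothetical helper)."""
    # Encode the molecule with the SMILES tokenizer.
    inputs = tokenizer_mol(smiles, return_tensors="pt")

    # Sampling settings here are assumptions for illustration only.
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs.get("attention_mask"),
        do_sample=True,
        top_p=0.9,
        num_return_sequences=num_candidates,
        max_new_tokens=max_new_tokens,
    )

    # Decode generated token ids with the amino-acid tokenizer.
    decoded = tokenizer_aa.batch_decode(output_ids, skip_special_tokens=True)

    # Assumption: residues may decode separated by spaces, so remove them.
    return [seq.replace(" ", "") for seq in decoded]
```

With the objects from "How to use" in scope, a call could look like `generate_candidates(model, tokenizer_mol, tokenizer_aa, "CC(=O)OC1=CC=CC=C1C(=O)O")`, where the aspirin SMILES is only an example input. The returned strings are candidates and still require experimental validation.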