| # Model documentation & parameters | |
| **Algorithm Version**: Which model version to use. | |
| **Property goals**: One or multiple properties that will be optimized. | |
| **Protein target**: The amino acid sequence (AAS) of a protein target used for conditioning. Leave blank unless you use `affinity` as a property goal. | |
| **Decoding temperature**: The temperature parameter in the SMILES/SELFIES decoder. Higher values lead to more explorative choices; lower values can lead to mode collapse. | |
| **Maximal sequence length**: The maximal number of SMILES tokens in the generated molecule. | |
| **Number of samples**: How many samples should be generated (between 1 and 50). | |
| **Limit**: Hypercube limits in the latent space. | |
| **Number of steps**: Number of steps in a GP optimization round. Larger values increase the runtime. Must be at least `Number of initial points`. | |
| **Number of initial points**: Number of initial points evaluated. Larger values increase the runtime. | |
| **Number of optimization rounds**: Maximum number of optimization rounds. | |
| **Sampling variance**: Variance of the Gaussian noise applied during sampling from the optimal point. | |
| **Samples for evaluation**: Number of samples averaged for each minimization function evaluation. | |
| **Max. sampling steps**: Maximum number of sampling steps in an optimization round. | |
| **Seed**: The random seed used for initialization. | |
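
To make a few of the parameters above more concrete, here is a minimal, illustrative sketch in plain Python. It is not the PaccMann<sup>GP</sup> or GT4SD implementation: the function names (`softmax_with_temperature`, `perturb_latent`, `check_settings`) and all example values are hypothetical. It shows how a decoding temperature reshapes token probabilities, how Gaussian noise with a given sampling variance could be applied around an optimal latent point, and how the documented constraints on the number of samples, steps and initial points could be checked.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw decoder scores into probabilities at a given temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    shift = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def perturb_latent(point, variance, rng):
    """Add zero-mean Gaussian noise with the given variance to each coordinate."""
    std = math.sqrt(variance)  # random.gauss expects a standard deviation
    return [z + rng.gauss(0.0, std) for z in point]


def check_settings(number_of_samples, number_of_steps, number_of_initial_points):
    """Raise ValueError if the settings violate the documented constraints."""
    if not 1 <= number_of_samples <= 50:
        raise ValueError("Number of samples must be between 1 and 50")
    if number_of_steps < number_of_initial_points:
        raise ValueError(
            "Number of steps must be at least Number of initial points"
        )


# Hypothetical example values, chosen only for illustration.
logits = [2.0, 1.0, 0.1]
cold = softmax_with_temperature(logits, 0.5)  # sharper distribution
hot = softmax_with_temperature(logits, 2.0)   # flatter, more explorative
noisy = perturb_latent([0.0, 0.0, 0.0], variance=0.1, rng=random.Random(42))
check_settings(number_of_samples=10, number_of_steps=32, number_of_initial_points=16)
```

At a low temperature most of the probability mass concentrates on the highest-scoring token, which is why very small values tend to produce near-identical molecules, while higher temperatures spread the mass across more tokens.
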
| # Model card -- PaccMannGP | |
| **Model Details**: [PaccMann<sup>GP</sup>](https://github.com/PaccMann/paccmann_gp) is a language-based variational autoencoder (VAE) coupled with a Gaussian process (GP) for controlled sampling. The model systematically explores the latent space of a trained molecular VAE. | |
| **Developers**: Jannis Born, Matteo Manica and colleagues from IBM Research. | |
| **Distributors**: Original authors' code wrapped and distributed by GT4SD Team (2023) from IBM Research. | |
| **Model date**: Published in 2022. | |
| **Model version**: A molecular VAE trained on 1.5M molecules from ChEMBL. | |
| **Model type**: A language-based molecular generative model that can be explored with Gaussian Processes to generate molecules with desired properties. | |
| **Information about training algorithms, parameters, fairness constraints or other applied approaches, and features**: | |
| Described in the [original paper](https://pubs.acs.org/doi/10.1021/acs.jcim.1c00889). | |
| **Paper or other resource for more information**: | |
| [Active Site Sequence Representations of Human Kinases Outperform Full Sequence Representations for Affinity Prediction and Inhibitor Generation: 3D Effects in a 1D Model (2022; *Journal of Chemical Information & Modeling*)](https://pubs.acs.org/doi/10.1021/acs.jcim.1c00889). | |
| **License**: MIT | |
| **Where to send questions or comments about the model**: Open an issue on [GT4SD repository](https://github.com/GT4SD/gt4sd-core). | |
| **Intended Use. Use cases that were envisioned during development**: Chemical research, in particular drug discovery. | |
| **Primary intended uses/users**: Researchers and computational chemists using the model for model comparison or research exploration purposes. | |
| **Out-of-scope use cases**: Production-level inference, producing molecules with harmful properties. | |
| **Factors**: Not applicable. | |
| **Metrics**: High reward on generating molecules with desired properties. | |
| **Datasets**: ChEMBL. | |
| **Ethical Considerations**: Unclear, please consult with original authors in case of questions. | |
| **Caveats and Recommendations**: Unclear, please consult with original authors in case of questions. | |
| Model card prototype inspired by [Mitchell et al. (2019)](https://dl.acm.org/doi/abs/10.1145/3287560.3287596?casa_token=XD4eHiE2cRUAAAAA:NL11gMa1hGPOUKTAbtXnbVQBDBbjxwcjGECF_i-WC_3g1aBgU1Hbz_f2b4kI_m1in-w__1ztGeHnwHs) | |
| ## Citation | |
| ```bib | |
| @article{born2022active, | |
| author = {Born, Jannis and Huynh, Tien and Stroobants, Astrid and Cornell, Wendy D. and Manica, Matteo}, | |
| title = {Active Site Sequence Representations of Human Kinases Outperform Full Sequence Representations for Affinity Prediction and Inhibitor Generation: 3D Effects in a 1D Model}, | |
| journal = {Journal of Chemical Information and Modeling}, | |
| volume = {62}, | |
| number = {2}, | |
| pages = {240-257}, | |
| year = {2022}, | |
| doi = {10.1021/acs.jcim.1c00889}, | |
| note = {PMID: 34905358}, | |
| URL = {https://doi.org/10.1021/acs.jcim.1c00889} | |
| } | |
| ``` |