---
license: apache-2.0
tags:
- biology
- protein
- molecule
- dna
- rna
- graph-neural-network
---

# Cuttlefish-Encoder

Graph encoder component of [Cuttlefish](https://huggingface.co/zihaojing/Cuttlefish), pretrained with masked reconstruction on all-atom structures (proteins, molecules, DNA, RNA).

## Usage

```python
from huggingface_hub import snapshot_download

encoder_dir = snapshot_download("zihaojing/Cuttlefish-Encoder")
# Load via the Cuttlefish codebase
# See https://github.com/your-repo/cuttlefish for full usage
```

## Pretraining data

Pretrained on **[Cuttlefish-Encoder-Data](https://huggingface.co/datasets/zihaojing/Cuttlefish-Encoder-Data)**, covering:

- Molecules (SMILES → 3D graph)
- Proteins (PDB/CIF → all-atom graph)
- DNA and RNA sequences

## Model details

- Architecture: All-atom graph encoder with masked reconstruction pretraining
- Encoder hidden dim: 256
- Modalities: molecule, protein, dna, rna

## Related resources

| Resource | Link |
|---|---|
| Full Cuttlefish LLM | [zihaojing/Cuttlefish](https://huggingface.co/zihaojing/Cuttlefish) |
| SFT instruction data | [zihaojing/Cuttlefish-SFT-Data](https://huggingface.co/datasets/zihaojing/Cuttlefish-SFT-Data) |
| Encoder pretraining data | [zihaojing/Cuttlefish-Encoder-Data](https://huggingface.co/datasets/zihaojing/Cuttlefish-Encoder-Data) |
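
## Inspecting the downloaded snapshot

The usage snippet downloads the encoder files but leaves loading to the Cuttlefish codebase. Before wiring the encoder into that code, it can help to check what the snapshot directory actually contains. The following is a minimal sketch that assumes only that `snapshot_download` returns a local directory path. `list_snapshot_files` is a hypothetical helper written for illustration, not part of `huggingface_hub` or Cuttlefish, and it makes no assumption about which file names the repository ships.

```python
from pathlib import Path


def list_snapshot_files(snapshot_dir):
    """Return (relative_path, size_in_bytes) for every file under snapshot_dir.

    Hypothetical helper for illustration only. Symlinked files (as used by
    the Hugging Face cache) are resolved when checking type and size.
    """
    root = Path(snapshot_dir)
    files = []
    for path in root.rglob("*"):
        if path.is_file():
            # Report paths relative to the snapshot root, with forward slashes.
            files.append((path.relative_to(root).as_posix(), path.stat().st_size))
    # Sort by relative path so the listing is stable across runs.
    return sorted(files)
```

Passing the `encoder_dir` returned by `snapshot_download("zihaojing/Cuttlefish-Encoder")` to `list_snapshot_files` gives a sorted list of files and sizes, which can then be matched against whatever the Cuttlefish loading code expects.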