Commit eda8a86 by zihaojing · verified · 1 Parent(s): 7db7b43

Add model card

Files changed (1)
  1. README.md +45 -0
README.md ADDED
@@ -0,0 +1,45 @@
+ ---
+ license: apache-2.0
+ tags:
+ - biology
+ - protein
+ - molecule
+ - dna
+ - rna
+ - graph-neural-network
+ ---
+
+ # Cuttlefish-Encoder
+
+ Graph encoder component of [Cuttlefish](https://huggingface.co/zihaojing/Cuttlefish), pretrained with masked reconstruction on all-atom structures (proteins, molecules, DNA, RNA).
+
+ ## Usage
+
+ ```python
+ from huggingface_hub import snapshot_download
+ encoder_dir = snapshot_download("zihaojing/Cuttlefish-Encoder")
+
+ # Load via the Cuttlefish codebase
+ # See https://github.com/your-repo/cuttlefish for full usage
+ ```
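Before wiring the snapshot into the Cuttlefish codebase, it can help to check which files were actually downloaded. The following is a minimal sketch using only the standard library. The helper name `list_snapshot_files` is hypothetical (it is not part of Cuttlefish or `huggingface_hub`), and it makes no assumption about which checkpoint or config files the repository contains.

```python
from pathlib import Path


def list_snapshot_files(snapshot_dir):
    """Return (relative_path, size_in_bytes) for every file under snapshot_dir.

    Hypothetical helper for illustration; not part of the Cuttlefish codebase.
    """
    root = Path(snapshot_dir)
    files = []
    for path in sorted(root.rglob("*")):
        # is_file() and stat() follow symlinks, so linked cache files are counted too
        if path.is_file():
            files.append((path.relative_to(root).as_posix(), path.stat().st_size))
    return files
```

Calling `list_snapshot_files(encoder_dir)` with the directory returned by `snapshot_download` above would list the files in the local snapshot.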
+
+ ## Pretraining data
+
+ Pretrained on **[Cuttlefish-Encoder-Data](https://huggingface.co/datasets/zihaojing/Cuttlefish-Encoder-Data)**, covering:
+ - Molecules (SMILES → 3D graph)
+ - Proteins (PDB/CIF → all-atom graph)
+ - DNA and RNA sequences
+
+ ## Model details
+
+ - Architecture: All-atom graph encoder with masked reconstruction pretraining
+ - Encoder hidden dim: 256
+ - Modalities: molecule, protein, dna, rna
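To make "masked reconstruction" concrete, here is a toy sketch of the general idea on a list of atom labels: hide a random subset of atoms, then score a model only on how well it recovers the hidden ones. Everything specific here is an assumption for illustration, including the `<mask>` placeholder, the 15% default mask ratio, the accuracy-based score, and the function names. The model card does not describe the actual Cuttlefish masking scheme or loss.

```python
import random

MASK_TOKEN = "<mask>"  # hypothetical placeholder, not taken from the Cuttlefish code


def mask_atoms(atom_types, mask_ratio=0.15, seed=0):
    """Hide a random subset of atom labels; return (corrupted, masked_indices)."""
    if not atom_types:
        return [], []
    rng = random.Random(seed)
    # Always mask at least one atom so there is something to reconstruct
    n_mask = max(1, round(len(atom_types) * mask_ratio))
    masked = sorted(rng.sample(range(len(atom_types)), n_mask))
    corrupted = list(atom_types)
    for i in masked:
        corrupted[i] = MASK_TOKEN
    return corrupted, masked


def reconstruction_accuracy(predicted, original, masked):
    """Fraction of masked positions where the prediction matches the original."""
    if not masked:
        return 0.0
    correct = sum(predicted[i] == original[i] for i in masked)
    return correct / len(masked)
```

In an actual pretraining loop, the encoder would see the corrupted graph and a prediction head would be trained on the masked positions. Here the two helpers only show which positions are hidden and which positions are scored.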
+
+ ## Related resources
+
+ | Resource | Link |
+ |---|---|
+ | Full Cuttlefish LLM | [zihaojing/Cuttlefish](https://huggingface.co/zihaojing/Cuttlefish) |
+ | SFT instruction data | [zihaojing/Cuttlefish-SFT-Data](https://huggingface.co/datasets/zihaojing/Cuttlefish-SFT-Data) |
+ | Encoder pretraining data | [zihaojing/Cuttlefish-Encoder-Data](https://huggingface.co/datasets/zihaojing/Cuttlefish-Encoder-Data) |