supanthadey1 committed on
Commit
60a4581
·
verified ·
1 Parent(s): 6a5be1f

Document ESM-C embedding requirement

Files changed (1)
  1. README.md +23 -0
README.md CHANGED
@@ -35,6 +35,24 @@ Single protein-glycan pair or batch CSV. Glycans should be WURCS strings. Protei

  Affinose expects per-residue ESM-C 300M embeddings with shape `[L, 960]`. Do not use mean-pooled protein embeddings as model input.

+ ESM-C is a separate EvolutionaryScale protein model. The ESM-C weights are not included in this repository. Users should install the `esm` package and let it download ESM-C 300M from Hugging Face/EvolutionaryScale into their own runtime cache.
+
+ ```python
+ from esm.models.esmc import ESMC
+ from esm.sdk.api import ESMProtein, LogitsConfig
+
+ model = ESMC.from_pretrained("esmc_300m").to("cuda") # or "cpu"
+ protein = ESMProtein(sequence="MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ")
+ protein_tensor = model.encode(protein)
+ out = model.logits(
+     protein_tensor,
+     LogitsConfig(sequence=True, return_embeddings=True),
+ )
+ protein_embeddings = out.embeddings # per-residue ESM-C 300M embeddings
+ ```
+
+ If Hugging Face requests authentication for ESM-C, users should authenticate with their own Hugging Face account/token and accept any required EvolutionaryScale terms. Bertose/Affinose tokens are not required once these repositories are public.
+
  ## Output

  A scalar protein-glycan score from the trained Affinose head.
@@ -42,3 +60,8 @@ A scalar protein-glycan score from the trained Affinose head.
  ## Draft Notes

  The public model card still needs final license, citation, intended-use, and limitation wording.
+
+ ## References
+
+ - EvolutionaryScale ESM package: https://github.com/evolutionaryscale/esm
+ - ESM-C 300M Hugging Face model: https://huggingface.co/EvolutionaryScale/esmc-300m-2024-12
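
The README text in this change requires per-residue embeddings of shape `[L, 960]` and rules out mean-pooled vectors. A minimal sketch of a shape check that could run before calling Affinose is shown here. The helper name `check_per_residue_shape` and its `seq_len` parameter are hypothetical and not part of Affinose or the `esm` package. The tensor returned by the snippet above may also carry a leading batch dimension and special-token rows; whether Affinose expects those removed is not stated in this diff, so the check only reports the mismatch.

```python
EMBED_DIM = 960  # ESM-C 300M embedding width, as stated in the README


def check_per_residue_shape(shape, seq_len=None):
    """Return the shape as a tuple if it is [L, 960]; raise ValueError otherwise.

    Hypothetical helper for illustration only.
    """
    shape = tuple(int(d) for d in shape)
    if len(shape) == 1:
        # A single vector is what mean pooling over residues produces.
        raise ValueError(f"got {shape}: looks mean-pooled, expected [L, {EMBED_DIM}]")
    if len(shape) != 2:
        raise ValueError(f"got {shape}: expected two dimensions [L, {EMBED_DIM}]")
    if shape[1] != EMBED_DIM:
        raise ValueError(f"got width {shape[1]}, expected {EMBED_DIM}")
    if seq_len is not None and shape[0] != seq_len:
        # Extra rows may come from special tokens added during encoding.
        raise ValueError(f"got {shape[0]} rows for a sequence of length {seq_len}")
    return shape
```

With the snippet from the README, a call such as `check_per_residue_shape(protein_embeddings.shape)` would flag a tensor that is not two-dimensional before it reaches the model.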