Commit a5ca4b0 · verified · committed by u8sand
1 Parent(s): ccd7396

Revert README.md

  1. README.md +39 -4
README.md CHANGED
@@ -7,7 +7,42 @@ tags:
  - pytorch_model_hub_mixin
  ---
 
- This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
- - Code: [More Information Needed]
- - Paper: [More Information Needed]
- - Docs: [More Information Needed]
+ # GSFM
+
+ Trained on millions of gene sets automatically extracted from literature and raw RNA-seq data, GSFM learns to recover held-out genes from gene sets. The resulting model exhibits state-of-the-art performance on gene function prediction.
+
+ ## Website
+
+ <https://gsfm.maayanlab.cloud/>
+
+ ## Usage
+
+ ```bash
+ # install the gsfm Python library from its source on Hugging Face
+ GIT_LFS_SKIP_SMUDGE=1 pip install git+https://huggingface.co/maayanlab/gsfm
+ ```
+
+ ```python
+ import torch
+ from gsfm import Vocab, GSFM
+
+ # load gsfm vocabulary and model weights
+ vocab = Vocab.from_pretrained('maayanlab/gsfm')
+ gsfm = GSFM.from_pretrained('maayanlab/gsfm')
+
+ # convert gene symbols into token ids
+ token_ids = torch.tensor(vocab(['ACE1', 'ACE2']))[None, :]
+
+ # use model to predict missing genes from the set
+ logits = torch.squeeze(gsfm(token_ids))
+ top_10 = sorted(zip(logits, vocab.vocab))[-10:]
+ top_10
+
+ # get gene embedding
+ gene_embeddings = gsfm.embedding(token_ids)
+ gene_embeddings
+
+ # get model middle layer
+ gene_set_encoding = gsfm.encode(token_ids)
+ gene_set_encoding
+ ```
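
The ranking step in the usage snippet (`sorted(zip(logits, vocab.vocab))[-10:]`) can be illustrated without loading the model. The following is a minimal sketch, assuming the scores form a flat sequence aligned one-to-one with the vocabulary order. `top_k_genes` is a hypothetical helper that is not part of the gsfm library, and the symbols and scores are made-up placeholders, not GSFM output. Unlike the snippet above, it returns the best candidates first and can skip the genes that were already in the input set.

```python
import heapq


def top_k_genes(scores, symbols, k=10, exclude=()):
    """Return the k highest-scoring (symbol, score) pairs, best first.

    Hypothetical helper: `scores` and `symbols` are assumed to be
    aligned, i.e. scores[i] is the score of symbols[i].
    """
    if len(scores) != len(symbols):
        raise ValueError("scores and symbols must have the same length")
    excluded = set(exclude)
    # Pair every symbol with its score, dropping the excluded input genes.
    candidates = (
        (symbol, float(score))
        for symbol, score in zip(symbols, scores)
        if symbol not in excluded
    )
    # nlargest avoids sorting the full vocabulary when k is small.
    return heapq.nlargest(k, candidates, key=lambda pair: pair[1])


# Placeholder stand-ins for the vocabulary and the squeezed model output.
symbols = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"]
scores = [2.5, 1.75, 3.0, 0.5, -1.0]

ranked = top_k_genes(scores, symbols, k=3, exclude=["GENE_A"])
print(ranked)
```

With real model output, a 1-D `logits` tensor could presumably be passed as `logits.tolist()`; on tensors, `torch.topk(logits, 10)` is another way to get the highest scores and their indices directly.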