Commit ae8d2a3 by aletlvl · verified · 1 Parent(s): 8ca5e21

README updated

  1. README.md +15 -16
README.md CHANGED
@@ -1,12 +1,3 @@
- ---
- license: mit
- base_model:
- - aletlvl/Nicheformer
- tags:
- - single-cell
- - transcriptomics
- - biology
- ---
  # Nicheformer
 
  Nicheformer is a transformer-based model designed for understanding and predicting cellular niches and their interactions. The model uses masked language modeling to learn representations of cellular contexts and their relationships.
@@ -41,8 +32,13 @@ from transformers import AutoModelForMaskedLM, AutoTokenizer
  import anndata as ad
 
  # Load model and tokenizer
- model = AutoModelForMaskedLM.from_pretrained("aletlvl/Nicheformer")
- tokenizer = AutoTokenizer.from_pretrained("aletlvl/Nicheformer")
+ model = AutoModelForMaskedLM.from_pretrained("aletlvl/Nicheformer", trust_remote_code=True)
+ tokenizer = AutoTokenizer.from_pretrained("aletlvl/Nicheformer", trust_remote_code=True)
+
+ # Set technology mean for HF tokenizer
+ technology_mean_path = 'technology_mean.npy'
+ technology_mean = np.load(technology_mean_path)
+ tokenizer._load_technology_mean(technology_mean)
 
  # Load your single-cell data
  adata = ad.read_h5ad("your_data.h5ad")
@@ -50,8 +46,13 @@ adata = ad.read_h5ad("your_data.h5ad")
  # Tokenize the data
  inputs = tokenizer(adata)
 
- # Get predictions
- outputs = model(**inputs)
+ # Get embeddings
+ embeddings = model.get_embeddings(
+ input_ids=inputs["input_ids"],
+ attention_mask=inputs["attention_mask"],
+ layer=-1,
+ with_context=False
+ )
  ```
 
  ## Training Data
@@ -74,7 +75,6 @@ The model was trained on single-cell gene expression data from various tissues a
  - Performance may vary depending on the quality and type of input data
  - The model works best with data from supported species and technologies
 
-
  ## License
 
  This model is released under the MIT License. See the LICENSE file for more details.
@@ -89,7 +89,6 @@ This is the official repository for **Nicheformer: a foundation model for single
 
  [![Preprint](https://img.shields.io/badge/preprint-available-brightgreen)](https://www.biorxiv.org/content/10.1101/2024.04.15.589472v1)  
 
-
  ## Citation
 
  If you use our tool or build upon our concepts in your own work, please cite it as
@@ -130,4 +129,4 @@ We provide the Nicheformer pretraining weights on Mendeley data, they can be dow
  For questions and help requests, you can reach out (preferably) on GitHub or email to the corresponding author.
 
 
- [issue-tracker]: https://github.com/theislab/nicheformer/issues
+ [issue-tracker]: https://github.com/theislab/nicheformer/issues
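
Reviewer note: the README text in the first hunk says the model "uses masked language modeling to learn representations of cellular contexts". For readers unfamiliar with that objective, here is a minimal sketch of how a masked-language-modeling training example is typically built from a token sequence. It is not taken from the Nicheformer code base: `MASK_ID`, `IGNORE_INDEX`, the `mask_tokens` helper, the masking probability, and the token ids are all hypothetical and chosen only for illustration.

```python
import random

MASK_ID = 0          # hypothetical id reserved for the mask token
IGNORE_INDEX = -100  # hypothetical label for positions that are not scored


def mask_tokens(token_ids, mask_prob=0.15, seed=None):
    """Return (corrupted_ids, labels) for one masked-language-modeling example.

    Each token is independently replaced by MASK_ID with probability
    mask_prob. Labels keep the original id at masked positions and
    IGNORE_INDEX everywhere else, so a loss would only be computed
    where the model has to recover a hidden token.
    """
    rng = random.Random(seed)
    corrupted, labels = [], []
    for tok in token_ids:
        if rng.random() < mask_prob:
            corrupted.append(MASK_ID)    # hide the token from the model
            labels.append(tok)           # target: the original token
        else:
            corrupted.append(tok)        # token stays visible
            labels.append(IGNORE_INDEX)  # no loss at this position
    return corrupted, labels


# Made-up token ids standing in for one tokenized cell.
cell_tokens = [17, 42, 7, 99, 23, 5, 61, 12]
corrupted, labels = mask_tokens(cell_tokens, mask_prob=0.5, seed=0)
```

The `-100` label follows a convention common in PyTorch-based training code; whether Nicheformer uses the same value, masking rate, or masking strategy is not stated in this diff.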