Yuto2007 committed on
Commit 27f1bc1 · verified · 1 Parent(s): 8517f78

Upload folder using huggingface_hub

Files changed (4)
  1. README.md +37 -89
  2. config.json +0 -1
  3. model.safetensors +2 -2
  4. tokenizer_config.json +0 -1
README.md CHANGED
@@ -1,6 +1,5 @@
 ---
 license: apache-2.0
-language: en
 tags:
 - biology
 - genomics
@@ -8,14 +7,12 @@ tags:
 library_name: transformers
 ---
 
-# TXModel - Standalone Version
+# TXModel - Hub-Ready Version
 
-**Zero external dependencies!** This model requires only:
-- `transformers`
-- `torch`
-- `safetensors`
-
-No llmfoundry, composer, or other libraries needed!
+**Zero-hassle deployment!** Requires ONLY:
+```bash
+pip install transformers torch safetensors
+```
 
 ## 🚀 Quick Start
 
@@ -23,18 +20,17 @@ No llmfoundry, composer, or other libraries needed!
 from transformers import AutoModel
 import torch
 
-# Load model (downloads automatically from Hub)
+# Load from Hub (one command!)
 model = AutoModel.from_pretrained(
-    "your-username/tx-model-standalone",
+    "your-username/model-name",
     trust_remote_code=True
 )
 
-# Prepare inputs
+# Use immediately
 genes = torch.randint(0, 100, (2, 10))
 values = torch.rand(2, 10)
 masks = torch.ones(2, 10).bool()
 
-# Inference
 model.eval()
 with torch.no_grad():
     output = model(genes=genes, values=values, gen_masks=masks)
@@ -42,116 +38,68 @@ with torch.no_grad():
 print(output.last_hidden_state.shape)  # [2, 10, d_model]
 ```
 
+## ✨ Features
+
+- ✅ **Single file** - all code in `modeling.py`
+- ✅ **Zero dependencies** (except transformers + torch)
+- ✅ **Works with AutoModel** out of the box
+- ✅ **No import errors** - everything self-contained
+
 ## 📦 Installation
 
 ```bash
 pip install transformers torch safetensors
 ```
 
-That's it! No other dependencies required.
+That's it!
 
 ## 🎯 Usage
 
-The model works exactly like any other HuggingFace model:
+### Basic Inference
 
 ```python
 from transformers import AutoModel
+import torch
 
-# Load from Hub
 model = AutoModel.from_pretrained(
-    "your-username/tx-model-standalone",
+    "your-username/model-name",
     trust_remote_code=True
 )
 
-# Or load locally
-model = AutoModel.from_pretrained(
-    "./path/to/model",
-    trust_remote_code=True
-)
-
-# Move to GPU
+# Move to GPU if available
 device = "cuda" if torch.cuda.is_available() else "cpu"
 model = model.to(device)
-model.eval()
-
-# Your inference code here
-```
-
-## ⚡ Features
-
-- ✅ **Zero external dependencies** (only transformers + torch)
-- ✅ **Works with AutoModel** out of the box
-- ✅ **Hub-ready** - upload and share easily
-- ✅ **Same architecture** as original model
-- ✅ **Full compatibility** with existing weights
-
-## 📊 Model Details
-
-| Property | Value |
-|----------|-------|
-| Parameters | ~70M |
-| Architecture | Transformer Encoder |
-| Hidden Size | 512 |
-| Layers | 12 |
-| Attention Heads | 8 |
-
-## 🔧 Advanced Usage
-
-### Accessing Model Internals
-
-```python
-# Access the TXModel directly
-tx_model = model.tx_model
-
-# Get cell embeddings
-output = model(genes, values, masks)
-cell_emb = output.last_hidden_state[:, 0, :]  # CLS token
-
-# Get gene embeddings
-tx_output = tx_model(genes, values, masks, key_padding_mask=~genes.eq(0))
-gene_embs = tx_output["gene_embeddings"]  # If return_gene_embeddings=True
 ```
 
 ### Batch Processing
 
 ```python
-from torch.utils.data import DataLoader
-
-# Your dataloader
-dataloader = DataLoader(dataset, batch_size=32)
-
-results = []
-for batch in dataloader:
-    with torch.no_grad():
-        output = model(
-            genes=batch['genes'],
-            values=batch['values'],
-            gen_masks=batch['masks']
-        )
-    results.append(output.last_hidden_state)
+# Your data
+batch = {
+    'genes': torch.randint(0, 1000, (32, 100)),
+    'values': torch.rand(32, 100),
+    'gen_masks': torch.ones(32, 100).bool()
+}
+
+# Process
+model.eval()
+with torch.no_grad():
+    output = model(**batch)
 ```
 
-## 🆚 vs Original Version
+## 📊 Model Details
 
-This standalone version:
-- ✅ Removes dependencies on llmfoundry and composer
-- ✅ Uses only PyTorch and Transformers components
-- ✅ Works with standard HuggingFace tools
-- ✅ Maintains same model architecture and weights
-- ✅ Easier to install and deploy
+- **Parameters**: ~70M
+- **Architecture**: Transformer Encoder
+- **Hidden Size**: 512
+- **Layers**: 12
+- **Heads**: 8
 
 ## 📝 Citation
 
-If you use this model, please cite the original work:
-
 ```bibtex
 @article{tahoe2024,
-    title={Tahoe-x1: Foundation Model for Genomics},
+    title={Tahoe-x1},
     author={...},
     year={2024}
 }
 ```
-
-## 📄 License
-
-Apache 2.0
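As a sanity check on the "~70M parameters" figure in the new Model Details list, a rough estimate from the listed dimensions (hidden size 512, 12 layers, and the 62,720-token vocabulary in tokenizer_config.json) lands close to 70M. This is a hypothetical breakdown assuming a standard 4x feed-forward expansion and ignoring biases, LayerNorms, and any extra value/mask encoders the real TXModel may have:

```python
# Rough transformer-encoder parameter estimate (illustrative breakdown only;
# the actual TXModel architecture may include additional components).
d_model, n_layers, vocab_size = 512, 12, 62720

attn = 4 * d_model**2          # Q, K, V, and output projections
ffn = 8 * d_model**2           # up + down projections with 4x expansion
embedding = vocab_size * d_model

total = n_layers * (attn + ffn) + embedding
print(f"~{total / 1e6:.1f}M parameters")  # ~69.9M
```

The estimate agrees with the stated ~70M, which suggests the table values are internally consistent.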
 
 
 
 
config.json CHANGED
@@ -50,7 +50,6 @@
     "query_activation": "sigmoid",
     "scaled_dot_product": true
   },
-  "chemical_encoder_config": null,
   "auto_map": {
     "AutoConfig": "modeling.TXConfig",
     "AutoModel": "modeling.TXModelForHF",
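The `auto_map` block (unchanged here) is what lets `AutoModel.from_pretrained(..., trust_remote_code=True)` find the custom classes: each value is a `module.ClassName` path resolved against the repo's own files. A minimal sketch of that lookup, using plain string handling rather than the actual transformers dynamic-module loader:

```python
# Sketch of how an auto_map entry names a class in the repo's modeling.py.
# (Illustrative only; transformers' real loader downloads and imports the module.)
auto_map = {
    "AutoConfig": "modeling.TXConfig",
    "AutoModel": "modeling.TXModelForHF",
}

module_name, class_name = auto_map["AutoModel"].rsplit(".", 1)
print(module_name, class_name)  # modeling TXModelForHF
```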
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:217637af5a4d12f3fe2d2648fb9d4d1404b53eea587336c62cfcfbfb26088efd
-size 284008108
+oid sha256:305a80c5941f512526a35c05b8e0f6d3dc930fdb01616546f1494ed76961600b
+size 284014476
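What changed here is the Git LFS pointer file that stands in for the real weights in the git history: `oid` is the SHA-256 of the blob and `size` its byte count (the new file is 6,368 bytes larger). The pointer format is simple key-value text, which can be parsed as shown below using the new values from this diff:

```python
# Parse a Git LFS pointer file (values taken from the new side of this diff).
pointer = (
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:305a80c5941f512526a35c05b8e0f6d3dc930fdb01616546f1494ed76961600b\n"
    "size 284014476\n"
)

# Each line is "key value"; split on the first space only.
fields = dict(line.split(" ", 1) for line in pointer.strip().splitlines())
print(fields["size"])  # 284014476
```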
tokenizer_config.json CHANGED
@@ -1,5 +1,4 @@
 {
   "tokenizer_class": "PreTrainedTokenizerFast",
-  "model_max_length": 1000000000000000019884624838656,
   "vocab_size": 62720
 }
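The removed `model_max_length` was never a real limit: it appears to be the `int(1e30)` sentinel that transformers uses when no maximum length is configured. The odd trailing digits come from `1e30` being a float, so the serialized integer is the nearest IEEE-754 double to 10**30:

```python
# 1e30 is a float; converting it to int yields the nearest double to 10**30,
# which is the exact value that was serialized into this tokenizer config.
sentinel = int(1e30)
print(sentinel)  # 1000000000000000019884624838656
```

Dropping the field is harmless, since a sentinel that large imposes no practical constraint anyway.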