ctheodoris/Geneformer
Fill-Mask, Transformers, Safetensors, ctheodoris/Genecorpus-30M, bert, single-cell, genomics
License: apache-2.0
refs/pr/552
Geneformer/geneformer
11.5 MB, 19 contributors, History: 159 commits
Latest commit fc7a1c0 (verified) by IchigoJiken, 5 months ago: Update geneformer/evaluation_utils.py to be compatible with different versions of the datasets package.
Name                               Size           Updated           Last commit
gene_dictionaries_30m              -              over 1 year ago   Update geneformer/tokenizer.py (#415)
mtl                                -              8 months ago      move import wandb to conditional
__init__.py                        1.65 kB        7 months ago      update with V2 models
classifier.py                      66.6 kB        7 months ago      move V1 autoformatting to after validate_options
classifier_utils.py                23.6 kB        7 months ago      add option for relabeling data from prior label class dict, update dict paths in manifest
collator_for_classification.py     31.7 kB        7 months ago      silence tensor copy warning
emb_extractor.py                   33.3 kB        6 months ago      plot umap for all labels in same view
ensembl_mapping_dict_gc104M.pkl    3.96 MB (xet)  7 months ago      add V2 models
evaluation_utils.py                10.1 kB        5 months ago      Update geneformer/evaluation_utils.py to be compatible with different versions of the datasets package.
gene_median_dictionary_gc104M.pkl  1.51 MB (xet)  7 months ago      add V2 models
gene_name_id_dict_gc104M.pkl       1.66 MB (xet)  7 months ago      add V2 models
in_silico_perturber.py             67.5 kB        7 months ago      update V1 token dict usage to self attr
in_silico_perturber_stats.py       45.7 kB        7 months ago      update with V2 models
mtl_classifier.py                  14.5 kB        8 months ago      fully qualified imports to resolve name-space conflicts (#532)
perturber_utils.py                 32.1 kB        7 months ago      fix to properly move model after checking device
pretrainer.py                      29.5 kB        about 1 year ago  remove unused imports while no longer using distributed sampler
token_dictionary_gc104M.pkl        426 kB (xet)   7 months ago      add V2 models
tokenizer.py                       34.3 kB        6 months ago      add input_identifier to tokenize specific matched files