yuccaaa/nas
TensorBoard
Safetensors
License: apache-2.0
nas/pretrain_data (107 GB) at commit fa2a8a7
1 contributor
History: 72 commits
Latest commit: fa2a8a7 (verified) by yuccaaa, 8 months ago: Upload pretrain_data/nan/code/train-00000-of-00011.jsonl with huggingface_hub
Name                                          Size      Storage  Updated       Last commit
cot/                                          -         -        8 months ago  Upload pretrain_data/cot/science/train-00003-of-00004.parquet with huggingface_hub
instruct/                                     -         -        8 months ago  Upload pretrain_data/instruct/cot.jsonl with huggingface_hub
nan/                                          -         -        8 months ago  Upload pretrain_data/nan/code/train-00000-of-00011.jsonl with huggingface_hub
.gitattributes                                2.6 kB    -        8 months ago  Upload pretrain_data/.gitattributes with huggingface_hub
README.md                                     24 Bytes  -        8 months ago  Upload pretrain_data/README.md with huggingface_hub
clean_OntoProtein.jsonl                       509 MB    xet      8 months ago  Upload pretrain_data/clean_OntoProtein.jsonl with huggingface_hub
clean_bio.jsonl                               321 MB    xet      8 months ago  Upload pretrain_data/clean_bio.jsonl with huggingface_hub
clean_pmc_full_text.jsonl                     11.5 GB   xet      8 months ago  Upload pretrain_data/clean_pmc_full_text.jsonl with huggingface_hub
clean_pmc_full_text_small.jsonl               1.15 GB   xet      8 months ago  Upload pretrain_data/clean_pmc_full_text_small.jsonl with huggingface_hub
clean_pubmed_abstract_part1.jsonl             12.2 GB   xet      8 months ago  Upload pretrain_data/clean_pubmed_abstract_part1.jsonl with huggingface_hub
clean_pubmed_abstract_part1_small.jsonl       1.22 GB   xet      8 months ago  Upload pretrain_data/clean_pubmed_abstract_part1_small.jsonl with huggingface_hub
clean_pubmed_abstract_part1_small1.jsonl      1.16 GB   xet      8 months ago  Upload pretrain_data/clean_pubmed_abstract_part1_small1.jsonl with huggingface_hub
clean_pubmed_abstract_part1_small1_new.jsonl  1.15 GB   xet      8 months ago  Upload pretrain_data/clean_pubmed_abstract_part1_small1_new.jsonl with huggingface_hub
clean_pubmed_abstract_part1_small_new.jsonl   1.21 GB   xet      8 months ago  Upload pretrain_data/clean_pubmed_abstract_part1_small_new.jsonl with huggingface_hub
clean_seq_in_text.jsonl                       183 MB    xet      8 months ago  Upload pretrain_data/clean_seq_in_text.jsonl with huggingface_hub
clean_seq_in_text_new.jsonl                   149 MB    xet      8 months ago  Upload pretrain_data/clean_seq_in_text_new.jsonl with huggingface_hub
clean_swissProt2Text.jsonl                    913 MB    xet      8 months ago  Upload pretrain_data/clean_swissProt2Text.jsonl with huggingface_hub
clean_swissProt2Text_new.jsonl                907 MB    xet      8 months ago  Upload pretrain_data/clean_swissProt2Text_new.jsonl with huggingface_hub
pmc_full_text.json                            11.6 GB   xet      8 months ago  Upload pretrain_data/pmc_full_text.json with huggingface_hub
pmc_full_text.jsonl                           11.7 GB   xet      8 months ago  Upload pretrain_data/pmc_full_text.jsonl with huggingface_hub
pubmed_abstract_part1.json                    13.5 GB   xet      8 months ago  Upload pretrain_data/pubmed_abstract_part1.json with huggingface_hub
pubmed_abstract_part2.json                    13.5 GB   xet      8 months ago  Upload pretrain_data/pubmed_abstract_part2.json with huggingface_hub
seq_in_text.json                              13.3 GB   xet      8 months ago  Upload pretrain_data/seq_in_text.json with huggingface_hub
swissProt2Text.json                           983 MB    xet      8 months ago  Upload pretrain_data/swissProt2Text.json with huggingface_hub