Vjeong/LLM-1B-Lab
Tags: Safetensors, English, llm-1b-lab, llama, decoder-only, educational, pretrained
Dataset: HuggingFaceFW/fineweb-edu
License: apache-2.0
Branch: main
Path: LLM-1B-Lab/llm_lab/data (29 kB)
2 contributors
History: 8 commits
Latest commit: Vjeong, "Fix dead split parameter in PackedStreamingDataset._load_dataset" (0cd5689), 1 day ago
__init__.py (551 Bytes, Safe): Add Code CPT pipeline for injecting Python code capability, 7 days ago
dataset.py (14.4 kB): Fix dead split parameter in PackedStreamingDataset._load_dataset, 1 day ago
diagnostics.py (5.65 kB, Safe): docs: translate all Korean comments and docstrings to English, about 1 month ago
pipeline.py (5.53 kB, Safe): Add Code CPT pipeline for injecting Python code capability, 7 days ago
tokenizer.py (2.94 kB, Safe): Remove unused tokenizer training code (train_bpe, load_sentencepiece, load_trained_hf), 14 days ago