Orion
Orion-zhen
AI & ML interests
Eco-friendly training using Tesla P4. Prefers (FSDP+)QLoRA.
Recent Activity
liked a model 11 days ago: Liang013/wk2.2
updated a Space 24 days ago: Orion-zhen/gguf-api
liked a dataset about 1 month ago: allenai/olmOCR-bench
Organizations
Qwen3-Dense-AWQ
AWQ quantizations of the Qwen3 Dense series, available on day 0!
🤯Emoji datasets
All for emoji!
Calibration datasets
Datasets used for various calibrations
Unalignments
Datasets used to unalign models
Free Spaces
Powerful apps built on free HF Spaces
Qwen2.5 Series
Llama3-Orion
My llama3 models
- Orion-zhen/Llama3-70B-Orion-Chinese
  Text Generation • 71B • Updated • 15 • 14
- Orion-zhen/Llama3-70B-Orion-Chinese-SE
  Text Generation • 71B • Updated • 4
- Orion-zhen/Llama3-70B-Orion-Chinese-Plus
  Text Generation • 71B • Updated • 4
- Orion-zhen/Llama3-70B-Orion-Chinese-Ultra
  Text Generation • 71B • Updated • 4 • 1
Reasoning
Datasets focused on reasoning
Code4Pretrain
Code datasets for pretraining