Instructions for using NucleusAI/nucleus-22B-token-500B with libraries, notebooks, and local apps.

## Libraries

### Transformers

How to use NucleusAI/nucleus-22B-token-500B with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="NucleusAI/nucleus-22B-token-500B")
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("NucleusAI/nucleus-22B-token-500B")
model = AutoModelForCausalLM.from_pretrained("NucleusAI/nucleus-22B-token-500B")
```

## Notebooks

- Google Colab
- Kaggle

## Local Apps

### vLLM

How to use NucleusAI/nucleus-22B-token-500B with vLLM.

Install from pip and serve the model:

```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "NucleusAI/nucleus-22B-token-500B"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "NucleusAI/nucleus-22B-token-500B",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

### SGLang

How to use NucleusAI/nucleus-22B-token-500B with SGLang.

Install from pip and serve the model:

```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "NucleusAI/nucleus-22B-token-500B" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "NucleusAI/nucleus-22B-token-500B",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

Or use the Docker image:

```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "NucleusAI/nucleus-22B-token-500B" \
    --host 0.0.0.0 \
    --port 30000
```

### Docker Model Runner

How to use NucleusAI/nucleus-22B-token-500B with Docker Model Runner:

```shell
docker model run hf.co/NucleusAI/nucleus-22B-token-500B
```
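The curl calls above can also be made from Python. The sketch below uses only the standard library and assumes a vLLM server is already listening on localhost:8000; the endpoint, model name, and sampling parameters simply mirror the curl example, and `build_completion_request` is a hypothetical helper introduced here for illustration.

```python
import json
import urllib.request

def build_completion_request(base_url, model, prompt, max_tokens=512, temperature=0.5):
    """Build an OpenAI-compatible /v1/completions request as (url, headers, body)."""
    body = json.dumps({
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return f"{base_url}/v1/completions", headers, body

url, headers, body = build_completion_request(
    "http://localhost:8000",
    "NucleusAI/nucleus-22B-token-500B",
    "Once upon a time,",
)

# Sending the request requires a running server (e.g. `vllm serve ...` above):
# req = urllib.request.Request(url, data=body, headers=headers, method="POST")
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["choices"][0]["text"])
```

The same payload works against the SGLang server on port 30000, since both expose the OpenAI-compatible completions API.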
# Nucleus-22B-token-500B

Nucleus-22B-token-500B is a 22B-parameter causal decoder-only model built by Nucleus.AI and trained on 500B tokens of RefinedWeb along with curated corpora. It is made available under the MIT license.

A 1T-token model is coming soon.
## Why use Nucleus-22B-token-500B?

- It performs well compared to open-source models of similar size (e.g., MPT-7B, StableLM, RedPajama), thanks to being trained on 500B tokens of RefinedWeb enhanced with curated corpora. See the OpenLLM Leaderboard.
- It is made available under the MIT license.
- It was trained by a small team of four people passionate about open source.
⚠️ This is a raw, pretrained model, which should be further finetuned for most use cases.
## Model Card for Nucleus-22B-token-500B

### Model Details

#### Model Description

- Developed by: NucleusAI
- Model type: Causal decoder-only
- Language(s) (NLP): English
- License: MIT
#### Model Source

- Paper: coming soon.
### Uses

#### Direct Use

Research on large language models, and use as a foundation for further specialization and finetuning for specific use cases (e.g., summarization, text generation, chatbots).

#### Out-of-Scope Use

Production use without adequate assessment of risks and mitigations; any use cases which may be considered irresponsible or harmful.
### Bias, Risks, and Limitations

Nucleus-22B-token-500B is trained on English data only and will not generalize appropriately to other languages. Furthermore, as it is trained on large-scale corpora representative of the web, it will carry the stereotypes and biases commonly encountered online.

#### Recommendations

We recommend that users of Nucleus-22B-token-500B finetune it for their specific tasks of interest, and put guardrails and appropriate precautions in place for any production use.
### How to Get Started with the Model
### Training Details

#### Training Data

Nucleus-22B-token-500B was trained on 500B tokens of RefinedWeb, along with other curated corpora.
| Data source | Fraction | Tokens | Sources |
|---|---|---|---|
| RefinedWeb-English | 75% | 200B | massive web crawl |
| Books | 7% | 21B | |
| Code | 7% | 21B | Big Code, CodeNet |
| Technical | 6% | 19B | arXiv |
| Math | 5% | 17B | Mathematica, Khan Academy |
The data was tokenized with a tokenizer similar to that of Llama-7B.
#### Training Procedure

Nucleus-22B-token-500B was trained on 256 A100 80GB GPUs, using FSDP (Fully Sharded Data Parallel).
#### Training Hyperparameters

| Hyperparameter | Value | Comment |
|---|---|---|
| Precision | bfloat16 | |
| Optimizer | AdamW | |
| Learning rate | 2e-4 | 8B tokens warm-up, cosine decay to 1e-5 |
| Weight decay | 1e-1 | |
| Batch size | 2048 | constant |
| Context length | 2048 | constant |
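The learning-rate schedule in the table (linear warm-up over the first 8B tokens to a peak of 2e-4, then cosine decay to 1e-5) can be sketched as a function of tokens seen. The following is an illustrative reconstruction from the table values, not the team's actual training code; with a batch of 2048 sequences of 2048 tokens, each step consumes about 4.2M tokens, so 500B tokens is roughly 119k steps.

```python
import math

PEAK_LR = 2e-4        # peak learning rate, from the table
MIN_LR = 1e-5         # cosine decay floor, from the table
WARMUP_TOKENS = 8e9   # 8B-token warm-up, from the table
TOTAL_TOKENS = 500e9  # total training tokens

def learning_rate(tokens_seen):
    """Linear warm-up followed by cosine decay, parameterized by tokens seen."""
    if tokens_seen < WARMUP_TOKENS:
        return PEAK_LR * tokens_seen / WARMUP_TOKENS
    # Progress through the decay phase, clamped to [0, 1].
    progress = (tokens_seen - WARMUP_TOKENS) / (TOTAL_TOKENS - WARMUP_TOKENS)
    progress = min(progress, 1.0)
    return MIN_LR + 0.5 * (PEAK_LR - MIN_LR) * (1 + math.cos(math.pi * progress))

tokens_per_step = 2048 * 2048  # batch size x context length, about 4.19M tokens

print(learning_rate(0))      # 0.0 at the start of warm-up
print(learning_rate(8e9))    # peak, about 2e-4, at the end of warm-up
print(learning_rate(500e9))  # floor, about 1e-5, at the end of training
```

Keeping the schedule a pure function of tokens seen (rather than optimizer steps) makes it independent of any batch-size changes, which is a common convenience in large-scale pretraining setups.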
#### Speeds, Sizes, Times

Training took place in early August 2023 and lasted about two weeks.
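A rough back-of-the-envelope estimate of the compute budget from the figures above (the 14-day figure is an assumption derived from "about two weeks"):

```python
gpus = 256        # A100 80GB GPUs, from the training procedure above
days = 14         # "about two weeks" -- approximate
gpu_hours = gpus * days * 24
print(gpu_hours)  # 86016, i.e. roughly 86k A100-hours
```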