Instructions to use MediaStreamAI/MOTHER_CORE_V2 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use MediaStreamAI/MOTHER_CORE_V2 with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-generation", model="MediaStreamAI/MOTHER_CORE_V2", trust_remote_code=True)

    # Load model directly
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained("MediaStreamAI/MOTHER_CORE_V2", trust_remote_code=True, dtype="auto")
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use MediaStreamAI/MOTHER_CORE_V2 with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "MediaStreamAI/MOTHER_CORE_V2"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "MediaStreamAI/MOTHER_CORE_V2",
            "prompt": "Once upon a time,",
            "max_tokens": 512,
            "temperature": 0.5
        }'
Use Docker
docker model run hf.co/MediaStreamAI/MOTHER_CORE_V2
- SGLang
How to use MediaStreamAI/MOTHER_CORE_V2 with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
        --model-path "MediaStreamAI/MOTHER_CORE_V2" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "MediaStreamAI/MOTHER_CORE_V2",
            "prompt": "Once upon a time,",
            "max_tokens": 512,
            "temperature": 0.5
        }'
Use Docker images
    docker run --gpus all \
        --shm-size 32g \
        -p 30000:30000 \
        -v ~/.cache/huggingface:/root/.cache/huggingface \
        --env "HF_TOKEN=<secret>" \
        --ipc=host \
        lmsysorg/sglang:latest \
        python3 -m sglang.launch_server \
            --model-path "MediaStreamAI/MOTHER_CORE_V2" \
            --host 0.0.0.0 \
            --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/completions" \
        -H "Content-Type: application/json" \
        --data '{
            "model": "MediaStreamAI/MOTHER_CORE_V2",
            "prompt": "Once upon a time,",
            "max_tokens": 512,
            "temperature": 0.5
        }'
- Docker Model Runner
How to use MediaStreamAI/MOTHER_CORE_V2 with Docker Model Runner:
docker model run hf.co/MediaStreamAI/MOTHER_CORE_V2
Correct training hyperparameters: SEQ=4096 (not 2048), GRAD_ACCUM_STEPS=32 (not 8). Training is at full architecture context length; no RoPE extrapolation needed for inference.
README.md CHANGED

    @@ -298,8 +298,8 @@ Forward return:
     |---|---|
     | Learning rate | 1e-5 |
     | Gradient clip | 10.0 |
    -| Effective batch size |
    -| Sequence length (training) |
    +| Effective batch size | 32 (BATCH_PHYSICAL=1 × GRAD_ACCUM_STEPS=32) |
    +| Sequence length (training) | 4096 |
     | Optimiser | AdamW (β₁=0.9, β₂=0.95) |
     | Weight decay | 0.1 |
     | Warmup steps | 100 |
    @@ -307,7 +307,7 @@ Forward return:
     | Hardware | NVIDIA GB10 Blackwell (Grace–Blackwell unified memory, 128GB) |
     | Training site | MSAI Wright Avenue, Dundee — sovereign UK infrastructure |

    -Training was performed at sequence length **
    +Training was performed at the full architecture sequence length of **4096** using physical microbatches of 1 with gradient accumulation of 32 (effective batch = 32). Because training and inference share the same context length, no RoPE extrapolation is required for 4096-token inference. Long-context behaviour at full 4096 has been exposed during training but not formally benchmarked at this checkpoint.

     ---

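The corrected figures can be checked with a little arithmetic. The sketch below is illustrative only and is not the project's training code: `TrainConfig`, `grad_mse` and `accumulated_grad` are hypothetical names, and the one-parameter toy model stands in for the real network. It assumes the usual convention that each microbatch gradient is scaled by `1 / GRAD_ACCUM_STEPS`, so that 32 microbatches of 1 give the same gradient as one batch of 32.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container; the values come from the corrected README table."""
    batch_physical: int = 1      # BATCH_PHYSICAL
    grad_accum_steps: int = 32   # GRAD_ACCUM_STEPS
    seq_len: int = 4096          # SEQ

    @property
    def effective_batch(self) -> int:
        # Sequences contributing to one optimiser step.
        return self.batch_physical * self.grad_accum_steps

    @property
    def tokens_per_step(self) -> int:
        # Tokens seen per optimiser step at full sequence length.
        return self.effective_batch * self.seq_len


def grad_mse(w: float, batch: list) -> float:
    """Gradient of the mean squared error of y ~ w * x over a batch of (x, y)."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w: float, samples: list, cfg: TrainConfig) -> float:
    """Sum microbatch gradients, each scaled by 1 / grad_accum_steps (assumed convention)."""
    total = 0.0
    for i in range(cfg.grad_accum_steps):
        start = i * cfg.batch_physical
        micro = samples[start:start + cfg.batch_physical]
        total += grad_mse(w, micro) / cfg.grad_accum_steps
    return total


cfg = TrainConfig()
# Toy data: one (x, y) pair per sequence in the effective batch.
samples = [(float(i), 3.0 * i + 1.0) for i in range(cfg.effective_batch)]

full = grad_mse(0.5, samples)                # one batch of 32
accum = accumulated_grad(0.5, samples, cfg)  # 32 microbatches of 1

print(cfg.effective_batch)   # 32
print(cfg.tokens_per_step)   # 131072
print(math.isclose(full, accum, rel_tol=1e-9))
```

Under these assumptions each optimiser step covers 32 × 4096 = 131072 tokens, and the accumulated gradient matches the full-batch gradient up to floating-point rounding.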