Instructions for using lightonai/ArabicWeb24-ablation-model-v5 with libraries and local apps.

Libraries

How to use lightonai/ArabicWeb24-ablation-model-v5 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="lightonai/ArabicWeb24-ablation-model-v5")

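# Load model directly
# AutoModelForCausalLM includes the language-modeling head used for text generation;
# plain AutoModel returns only the backbone.
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("lightonai/ArabicWeb24-ablation-model-v5", dtype="auto")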
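
The pipeline shown above can be wrapped in a small helper for Arabic text continuation. This is a minimal sketch: the helper names (`build_generation_kwargs`, `generate`) and the sampling values are illustrative assumptions, not settings recommended by the model authors. The `transformers` import is deferred so the argument helper can be used without loading the model.

```python
MODEL_ID = "lightonai/ArabicWeb24-ablation-model-v5"


def build_generation_kwargs(max_new_tokens=64, temperature=0.5):
    """Return sampling arguments for the text-generation pipeline (assumed values)."""
    if max_new_tokens <= 0:
        raise ValueError("max_new_tokens must be positive")
    return {
        "max_new_tokens": max_new_tokens,
        "do_sample": True,
        "temperature": temperature,
    }


def generate(prompt, **overrides):
    """Generate a continuation of `prompt`; downloads the model on first use."""
    from transformers import pipeline  # deferred: heavy import and model download

    pipe = pipeline("text-generation", model=MODEL_ID)
    outputs = pipe(prompt, **build_generation_kwargs(**overrides))
    # The pipeline returns a list of dicts, one per generated sequence.
    return outputs[0]["generated_text"]
```

Calling `generate("كان يا ما كان")` would ask the model to continue an Arabic opening line. Because the model is not instruction-tuned, prompts work best as text to be continued rather than as questions or commands.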

Local Apps

vLLM

How to use lightonai/ArabicWeb24-ablation-model-v5 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "lightonai/ArabicWeb24-ablation-model-v5"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "lightonai/ArabicWeb24-ablation-model-v5",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
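
The same request can be sent from Python. The sketch below mirrors the curl call using only the standard library; the function names and the `base_url` parameter are hypothetical, and it assumes the server started above is reachable on localhost.

```python
import json
import urllib.request

MODEL_ID = "lightonai/ArabicWeb24-ablation-model-v5"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build the POST request that mirrors the curl call."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the text of the first completion."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # OpenAI-compatible completion responses list results under "choices".
    return body["choices"][0]["text"]
```

Passing `base_url="http://localhost:30000"` should target an SGLang server instead, since it is called through the same `/v1/completions` path.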

Use Docker

docker model run hf.co/lightonai/ArabicWeb24-ablation-model-v5

SGLang

How to use lightonai/ArabicWeb24-ablation-model-v5 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "lightonai/ArabicWeb24-ablation-model-v5" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "lightonai/ArabicWeb24-ablation-model-v5",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "lightonai/ArabicWeb24-ablation-model-v5" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "lightonai/ArabicWeb24-ablation-model-v5",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use lightonai/ArabicWeb24-ablation-model-v5 with Docker Model Runner:
```
docker model run hf.co/lightonai/ArabicWeb24-ablation-model-v5
```

Model summary

This model was trained on version 5 (V5) of the ArabicWeb24 dataset, on 25B tokens, using the AraGPT-2 tokenizer. It has 900 million parameters, a context length of 1024 tokens, and uses the Mamba2 architecture.
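
Given the 1024-token context length, it can help to keep the prompt plus the requested continuation within that budget. The sketch below is one possible approach, assuming token IDs have already been produced by the tokenizer; the function name `fit_to_context` and the choice to keep the most recent tokens are illustrative assumptions.

```python
CONTEXT_LENGTH = 1024  # context length stated in the model summary


def fit_to_context(token_ids, max_new_tokens, context_length=CONTEXT_LENGTH):
    """Keep the most recent prompt tokens so prompt + generation fits the context."""
    if not 0 < max_new_tokens < context_length:
        raise ValueError("max_new_tokens must be between 1 and context_length - 1")
    budget = context_length - max_new_tokens
    # Slicing from the end drops the oldest tokens; short prompts pass through unchanged.
    return list(token_ids[-budget:])
```

Dropping the oldest tokens keeps the text closest to the point of continuation, which is usually what matters most for a base language model.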

License: odc-by
Languages: Arabic

Model Description

The ArabicWeb Ablation Model V5 is trained on a diverse corpus of Arabic text, including news articles, art and entertainment, and encyclopedia entries. This makes it suitable for a variety of Arabic text generation tasks. For more details, you can read the blog post.

Model Type: Language Model
Architecture: Mamba2
Training Data: ArabicWeb24 dataset
Training Objective: Text generation

Usage

This model was trained primarily to assess the quality of the ArabicWeb24 dataset and is designed for Arabic text generation. It is an ablation model and was not instruction-tuned. Its main intended use is comparison against other models trained with the same configuration on different versions of the dataset.
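
One way such a comparison could be run is to score each ablation model on the same held-out text and rank the models by perplexity. This is only a sketch: the model card does not state which metric the authors used, so perplexity is an assumption here, and the function names are hypothetical. The inputs are per-token negative log-likelihoods (in nats) that would be computed separately for each model.

```python
import math


def perplexity(token_nlls):
    """Perplexity = exp(mean negative log-likelihood per token, in nats)."""
    if not token_nlls:
        raise ValueError("need at least one token")
    return math.exp(sum(token_nlls) / len(token_nlls))


def rank_models(nlls_by_model):
    """Return (model_name, perplexity) pairs, lowest perplexity first."""
    scores = {name: perplexity(nlls) for name, nlls in nlls_by_model.items()}
    return sorted(scores.items(), key=lambda item: item[1])
```

Because the ablation models share one tokenizer and configuration, per-token perplexities on identical text are directly comparable across them.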

Training

Model

Architecture: Mamba2 model
Pretraining tokens: 25B
Scheduler: Cosine
d_model: 2304
d_intermediate: 0
n_layer: 18
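
The listed hyperparameters allow a rough size check. The sketch below counts only the dominant input and output projection weights of each Mamba2 block. It assumes an expansion factor of 2, which the model card does not state, and takes `d_intermediate: 0` to mean there is no separate MLP sub-block. Embeddings, convolution weights, SSM parameters, and norms are left out, so the result is expected to sit below the 900 million total. The class name `Mamba2Sketch` is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mamba2Sketch:
    d_model: int = 2304       # from the model card
    d_intermediate: int = 0   # from the model card; assumed to mean no MLP sub-block
    n_layer: int = 18         # from the model card
    expand: int = 2           # ASSUMPTION: not stated in the model card


def projection_params(cfg):
    """Count only the main input/output projection weights across all blocks."""
    d_inner = cfg.expand * cfg.d_model
    in_proj = cfg.d_model * 2 * d_inner   # z (gate) and x branches
    out_proj = d_inner * cfg.d_model
    return cfg.n_layer * (in_proj + out_proj)


print(f"{projection_params(Mamba2Sketch()):,}")
```

Under these assumptions the projections alone come to about 573 million parameters; the remainder would have to come from the terms omitted here.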

Hardware

Platform: HPE Cray node
Hardware: 8 NVIDIA H100 GPUs
Cloud Provider: Orange Cloud Avenue
