|
|
--- |
|
|
library_name: transformers |
|
|
license: apache-2.0 |
|
|
pipeline_tag: text-generation |
|
|
language: |
|
|
- ar |
|
|
- en |
|
|
--- |
|
|
|
|
|
<p align="center"> |
|
|
<picture> |
|
|
<!-- Dark mode --> |
|
|
<source media="(prefers-color-scheme: dark)" srcset="https://cdn-uploads.huggingface.co/production/uploads/65604648d69284e31fed02b0/iDADKNbWL17MTB-bB34gV.png"> |
|
|
<!-- Light mode --> |
|
|
<source media="(prefers-color-scheme: light)" srcset="https://cdn-uploads.huggingface.co/production/uploads/65604648d69284e31fed02b0/AO9bjIkbM0zFs67oE3-it.png"> |
|
|
<!-- Fallback --> |
|
|
<img src="https://cdn-uploads.huggingface.co/production/uploads/65604648d69284e31fed02b0/AO9bjIkbM0zFs67oE3-it.png" alt="Jais2 Logo" width="400"> |
|
|
</picture> |
|
|
</p> |
|
|
|
|
|
# Jais-2: The Next Generation of Arabic Frontier LLMs |
|
|
|
|
|
## Model Overview |
|
|
Jais-2-70B-Chat is a high-capacity bilingual Arabic–English language model developed by MBZUAI, Inception, and Cerebras. |
|
|
Trained from scratch on Arabic and English data and powered by a custom Arabic-centric vocabulary, the model efficiently captures Modern Standard Arabic, regional dialects, and mixed Arabic–English code-switching.
|
|
The model is openly available under the Apache 2.0 license and is also deployed as a fast, production-ready chat experience running on Cerebras hardware.
|
|
Visit the [Jais-2 Web App](https://jaischat.ai). |
|
|
|
|
|
## Key Technical Specifications |
|
|
- **Model Developers**: MBZUAI, Inception, Cerebras. |
|
|
- **Languages**: Arabic (MSA & dialects) and English |
|
|
- **Architecture**: Transformer-based, decoder-only architecture with multi-head self-attention.
|
|
- **Parameters**: 70 Billion |
|
|
- **Context Length**: 8,192 tokens
|
|
- **Vocabulary Size**: 150,272 |
|
|
- **Training Infrastructure**: Optimized for Cerebras CS-2 and Condor Galaxy clusters |
|
|
- **Key Design Choices**: Rotary Position Embeddings (RoPE), Squared-ReLU activation, custom μP parameterization, and 8:1 filter-to-hidden size ratio. |
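
For intuition, the sketch below shows how two of these design choices fit together: the feed-forward (filter) width follows from the 8:1 filter-to-hidden-size ratio, with squared-ReLU applied between the up- and down-projections. The `hidden_size` here is a small placeholder for illustration, not the model's actual width.

```python
import torch

# Illustrative sketch of the feed-forward design notes above.
# hidden_size is a small placeholder, NOT the model's actual width.
hidden_size = 512
filter_size = 8 * hidden_size  # 8:1 filter-to-hidden-size ratio

def squared_relu(x: torch.Tensor) -> torch.Tensor:
    return torch.relu(x) ** 2  # squared-ReLU: max(x, 0)^2

up = torch.nn.Linear(hidden_size, filter_size)    # up-projection
down = torch.nn.Linear(filter_size, hidden_size)  # down-projection

x = torch.randn(2, 16, hidden_size)  # (batch, sequence, hidden)
y = down(squared_relu(up(x)))
print(y.shape)  # torch.Size([2, 16, 512])
```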
|
|
|
|
|
--- |
|
|
|
|
|
|
|
|
## How to Use the Model |
|
|
|
|
|
### Using Transformers
|
|
#### 1. Clone the Jais 2–compatible Transformers fork
|
|
|
|
|
```bash |
|
|
# Pending PR merge to the official package |
|
|
git clone --branch jais2 --single-branch \ |
|
|
https://github.com/inceptionai-abudhabi/transformers.git |
|
|
cd transformers |
|
|
uv pip install -e . |
|
|
``` |
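
After installation, a quick import check confirms that Python resolves `transformers` to the cloned fork rather than a previously installed copy (a minimal sanity check, assuming only that the editable install succeeded):

```python
# The reported path should point inside the cloned "transformers" directory.
import transformers
print(transformers.__version__)
print(transformers.__file__)
```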
|
|
#### 2. Load the Model and Run Inference
|
|
```python |
|
|
from transformers import AutoTokenizer, AutoModelForCausalLM |
|
|
|
|
|
# Load the model and tokenizer |
|
|
model_name = "inceptionai/Jais-2-70B-Chat" |
|
|
tokenizer = AutoTokenizer.from_pretrained(model_name) |
|
|
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")  # load in the checkpoint's native precision
|
|
|
|
|
# Example Arabic prompt |
|
|
system_prompt = "أجب باللغة العربية بطريقة رسمية وواضحة."  # "Answer in Arabic in a formal and clear manner."
|
|
user_input = "ما هي عاصمة الإمارات؟"  # "What is the capital of the UAE?"
|
|
|
|
|
# Apply the chat template (always required for chat inference)
|
|
chat_text = tokenizer.apply_chat_template( |
|
|
[ |
|
|
{"role": "system", "content": system_prompt}, |
|
|
{"role": "user", "content": user_input} |
|
|
], |
|
|
tokenize=False, |
|
|
add_generation_prompt=True |
|
|
) |
|
|
|
|
|
# Tokenize and generate |
|
|
inputs = tokenizer(chat_text, return_tensors="pt").to(model.device) |
|
|
outputs = model.generate(**inputs, max_new_tokens=8192, do_sample=False)  # greedy decoding; the 8,192-token context includes the prompt
|
|
|
|
|
# Decode and print only the newly generated tokens (skip the echoed prompt)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
# عاصمة الإمارات العربية المتحدة هي أبوظبي. ("The capital of the United Arab Emirates is Abu Dhabi.")
|
|
``` |
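
The same chat-template flow extends to multi-turn conversations. The follow-up turn below is an illustrative addition (not from the original example), reusing the `tokenizer` and `model` loaded above:

```python
# Multi-turn chat: append the assistant's reply, then ask a follow-up.
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_input},
    {"role": "assistant", "content": "عاصمة الإمارات العربية المتحدة هي أبوظبي."},
    {"role": "user", "content": "ما هو عدد سكانها؟"},  # "What is its population?"
]
chat_text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(chat_text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=False)

# Print only the newly generated tokens
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```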
|
|
### Using vLLM
|
|
|
|
|
#### 1. Clone the Jais 2–compatible vLLM fork
|
|
|
|
|
```bash |
|
|
# Pending PR merge to the official package |
|
|
git clone --branch jais2 --single-branch \ |
|
|
https://github.com/inceptionai-abudhabi/vllm.git |
|
|
cd vllm |
|
|
uv pip install -e .
# Note: if you install vLLM after transformers, re-install transformers from the
# Jais 2 fork: https://github.com/inceptionai-abudhabi/transformers.git
|
|
``` |
|
|
|
|
|
#### 2. Load the Model and Run Inference
|
|
```python |
|
|
from vllm import LLM, SamplingParams |
|
|
|
|
|
# Load model and tokenizer |
|
|
model_name = "inceptionai/Jais-2-70B-Chat" |
|
|
llm = LLM(model=model_name, tensor_parallel_size=2)  # set to your GPU count; a 70B model needs multiple GPUs
|
|
tokenizer = llm.get_tokenizer() |
|
|
|
|
|
# Example Arabic prompt |
|
|
system_prompt = "أجب باللغة العربية بطريقة رسمية وواضحة."  # "Answer in Arabic in a formal and clear manner."
|
|
user_input = "ما هي عاصمة الإمارات؟"  # "What is the capital of the UAE?"
|
|
|
|
|
# Apply the chat template (always required for chat inference)
|
|
chat_text = tokenizer.apply_chat_template( |
|
|
[ |
|
|
{"role": "system", "content": system_prompt}, |
|
|
{"role": "user", "content": user_input} |
|
|
], |
|
|
tokenize=False, |
|
|
add_generation_prompt=True |
|
|
) |
|
|
|
|
|
# Run generation |
|
|
sampling_params = SamplingParams(max_tokens=8192, temperature=0)  # temperature=0 selects greedy decoding
|
|
outputs = llm.generate([chat_text], sampling_params) |
|
|
|
|
|
# Print the output
|
|
print(outputs[0].outputs[0].text) |
|
|
# عاصمة الإمارات العربية المتحدة هي أبوظبي. ("The capital of the United Arab Emirates is Abu Dhabi.")
|
|
``` |
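
Recent vLLM releases also provide an `LLM.chat` helper that applies the chat template internally; assuming the Jais 2 fork tracks such a release, the prompt-construction step can be dropped:

```python
# Equivalent generation via vLLM's chat helper (assumes the fork
# includes LLM.chat, which applies the chat template internally).
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_input},
]
chat_outputs = llm.chat(messages, sampling_params)
print(chat_outputs[0].outputs[0].text)
```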
|
|
|
|
|
Alternatively, serve the model from the command line (CLI):
|
|
|
|
|
```shell |
|
|
vllm serve inceptionai/Jais-2-70B-Chat \ |
|
|
--served-model-name inceptionai/Jais-2-70B-Chat-Local --dtype bfloat16 \ |
|
|
--tensor-parallel-size 2 --max-model-len 8192 --max-num-seqs 256 \ |
|
|
--host 0.0.0.0 --port 8042 --api-key "Optional" |
|
|
``` |
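
The server exposes an OpenAI-compatible API, so it can be queried with the official `openai` Python client; the `base_url`, API key, and served model name below mirror the flags in the command above:

```python
# Query the vLLM server via its OpenAI-compatible endpoint.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8042/v1", api_key="Optional")
response = client.chat.completions.create(
    model="inceptionai/Jais-2-70B-Chat-Local",
    messages=[
        {"role": "system", "content": "أجب باللغة العربية بطريقة رسمية وواضحة."},
        {"role": "user", "content": "ما هي عاصمة الإمارات؟"},
    ],
    temperature=0,
)
print(response.choices[0].message.content)
```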
|
|
--- |
|
|
## Evaluation |
|
|
### Performance Overview |
|
|
We evaluate **Jais-2-70B** on two key benchmarks that capture both *instruction-following* and *generative* Arabic ability: **IFEval** (English and Arabic) and **AraGen-12-24 (3C3H)**.
|
|
|
|
|
### IFEval Results (Strict 0-shot) |
|
|
|
|
|
| Model Name | En-Strict-Prompt-lvl | En-Strict-Instruction-lvl | Ar-Strict-Prompt-lvl | Ar-Strict-Instruction-lvl | |
|
|
|------------|-----------------------|----------------------------|------------------------|----------------------------| |
|
|
| **Qwen2.5-72B-Instruct** | 83.53 | 88.51 | **67.33** | 74.05 |
|
|
| **Llama-3.3-70B-Instruct** | **88.20** | **92.10** | 58.17 | 63.13 | |
|
|
| **Jais-2-70B (ours)** | 70.78 | 78.93 | 66.58 | **74.53** |
|
|
|
|
|
--- |
|
|
|
|
|
### AraGen-12-24 (3C3H) Results |
|
|
|
|
|
| Model Name | 3C3H Score (%) | Correctness | Completeness | Conciseness | Helpfulness | Honesty | Harmlessness | |
|
|
|------------|----------------|-------------|--------------|-------------|-------------|---------|-------------- | |
|
|
| **Qwen2.5-72B-Instruct** | 62.58 | 71.92 | 71.80 | 19.06 | 69.86 | 70.94 | 71.92 | |
|
|
| **Llama-3.3-70B-Instruct** | 61.29 | 68.58 | 65.11 | **34.50** | 63.50 | 67.47 | 68.58 | |
|
|
| **Jais-2-70B (ours)** | **70.71** | **80.53** | **79.09** | 25.48 | **78.43** | **80.23** | **80.53** | |
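
As a sanity check on the table, each reported 3C3H score matches the arithmetic mean of its six dimension scores:

```python
# Reproduce the 3C3H column as the mean of the six dimension scores.
rows = {
    "Qwen2.5-72B-Instruct":   [71.92, 71.80, 19.06, 69.86, 70.94, 71.92],
    "Llama-3.3-70B-Instruct": [68.58, 65.11, 34.50, 63.50, 67.47, 68.58],
    "Jais-2-70B (ours)":      [80.53, 79.09, 25.48, 78.43, 80.23, 80.53],
}
for name, dims in rows.items():
    print(f"{name}: {sum(dims) / len(dims):.2f}")
# -> 62.58, 61.29, 70.71, matching the 3C3H Score column
```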
|
|
|
|
|
|
|
|
Overall, our results show that: |
|
|
- Jais-2-70B delivers competitive Arabic and English instruction-following performance across IFEval metrics. |
|
|
- Jais-2-70B achieves the highest scores across nearly all AraGen metrics, outperforming Qwen2.5-72B and Llama-3.3-70B on Arabic generative tasks. |
|
|
|
|
|
--- |
|
|
|
|
|
## Intended Use |
|
|
|
|
|
### Target Audiences |
|
|
- **Academics**: Researchers focusing on Arabic NLP, multilingual modeling, or cultural alignment |
|
|
- **Businesses**: Companies targeting Arabic-speaking markets |
|
|
- **Developers and ML Engineers**: Integrating Arabic language capabilities into applications and workflows |
|
|
|
|
|
### Appropriate Use Cases |
|
|
- **Research**: |
|
|
- Natural language understanding and generation tasks |
|
|
- Conducting interpretability or cross-lingual alignment analyses |
|
|
- Investigating Arabic linguistic or cultural patterns |
|
|
|
|
|
- **Commercial Use**: |
|
|
- Building chat assistants for Arabic-speaking audiences |
|
|
- Performing sentiment and market analysis in regional contexts |
|
|
- Summarizing or processing bilingual Arabic–English documents |
|
|
- Creating culturally resonant Arabic marketing and entertainment content for regional audiences |
|
|
|
|
|
### Inappropriate Use Cases |
|
|
- **Harmful or Malicious Use**: |
|
|
- Producing hate speech, extremist content, or discriminatory language |
|
|
- Creating or spreading misinformation or deceptive content |
|
|
- Engaging in or promoting illegal activities |
|
|
|
|
|
- **Sensitive Information**: |
|
|
- Handling or generating personal, confidential, or sensitive information |
|
|
- Attempting to infer, reconstruct, or guess sensitive information about individuals or organizations |
|
|
|
|
|
- **Language Limitations**: |
|
|
- Applications requiring strong performance in languages other than Arabic or English
|
|
|
|
|
- **High-Stakes Decisions**: |
|
|
- Making medical, legal, financial, or safety-critical decisions without human oversight |
|
|
|
|
|
## Citation |
|
|
|
|
|
If you find our work helpful, please cite it:
|
|
|
|
|
```bibtex
|
|
@techreport{jais2_2025, |
|
|
title = {Jais 2: {A} Family of {A}rabic-Centric Open Large Language Models}, |
|
|
author = { |
|
|
Anwar, Mohamed and |
|
|
Freihat, Abdelhakim and |
|
|
Ibrahim, George and |
|
|
Awad, Mostafa and |
|
|
Sadallah, Abdelrahman Atef Mohamed Ali and |
|
|
Gosal, Gurpreet and |
|
|
Ramakrishnan, Gokul and |
|
|
Hestness, Joel and |
|
|
Mishra, Biswajit and |
|
|
Joshi, Rituraj and |
|
|
Chandran, Sarath and |
|
|
Frikha, Ahmed and |
|
|
Goffinet, Etienne and |
|
|
Maiti, Abhishek and |
|
|
El Filali, Ali and |
|
|
Al Barri, Sarah and |
|
|
Ghosh, Samujjwal and |
|
|
Pal, Rahul and |
|
|
Mullah, Parvez and |
|
|
Shukla, Awantika and |
|
|
Siddiki, Sajid and |
|
|
Kamboj, Samta and |
|
|
Pandit, Onkar and |
|
|
Sahu, Sunil and |
|
|
El Badawy, Abelrahman and |
|
|
Mohamed, Amr and |
|
|
Chamma, Ahmad and |
|
|
Dufraisse, Evan and |
|
|
Bounhar, Abdelaziz and |
|
|
Bouch, Dani and |
|
|
Abdine, Hadi and |
|
|
Shang, Guokan and |
|
|
Koto, Fajri and |
|
|
Wang, Yuxia and |
|
|
Xie, Zhuohan and |
|
|
Mekky, Ali and |
|
|
Elbadry, Rania Hossam Elmohamady and |
|
|
Ahmad, Sarfraz and |
|
|
Ahsan, Momina and |
|
|
El-Herraoui, Omar Emad Mohamed and |
|
|
Orel, Daniil and |
|
|
Iqbal, Hasan and |
|
|
Elzeky, Kareem Mohamed Naguib Abdelmohsen Fahmy and |
|
|
Abassy, Mervat and |
|
|
Ali, Kareem and |
|
|
Eletter, Saadeldine and |
|
|
Atif, Farah and |
|
|
Mukhituly, Nurdaulet and |
|
|
Li, Haonan and |
|
|
Han, Xudong and |
|
|
Singh, Aaryamonvikram and |
|
|
Quraishi, Zain and |
|
|
Sengupta, Neha and |
|
|
Murray, Larry and |
|
|
Sheinin, Avraham and |
|
|
Vassilieva, Natalia and |
|
|
Ren, Hector and |
|
|
Liu, Zhengzhong and |
|
|
Vazirgiannis, Michalis and |
|
|
Nakov, Preslav |
|
|
}, |
|
|
institution = {IFM}, |
|
|
type = {Technical Report}, |
|
|
year = {2025}, |
|
|
month = dec, |
|
|
day = {09}, |
|
|
} |
|
|
``` |