---
license: mit
language:
- multilingual
- af
- am
- ar
- as
- az
- be
- bg
- bn
- br
- bs
- ca
- cs
- cy
- da
- de
- el
- en
- eo
- es
- et
- eu
- fa
- fi
- fr
- fy
- ga
- gd
- gl
- gu
- ha
- he
- hi
- hr
- hu
- hy
- id
- is
- it
- ja
- jv
- ka
- kk
- km
- kn
- ko
- ku
- ky
- la
- lo
- lt
- lv
- mg
- mk
- ml
- mn
- mr
- ms
- my
- ne
- nl
- 'no'
- om
- or
- pa
- pl
- ps
- pt
- ro
- ru
- sa
- sd
- si
- sk
- sl
- so
- sq
- sr
- su
- sv
- sw
- ta
- te
- th
- tl
- tr
- ug
- uk
- ur
- uz
- vi
- xh
- yi
- zh
base_model:
- FacebookAI/xlm-roberta-base
pipeline_tag: text-classification
tags:
- safety-guardrails
- zero-shot
---
|
|
|
|
|
## CREST: A Multilingual AI Safety Guardrail Model for 100 Languages
|
|
|
|
|
CREST (CRoss-lingual Efficient Safety Transfer) is a parameter-efficient multilingual safety classifier covering 100 languages. It is fine-tuned on only 13 strategically selected high-resource languages, chosen through cluster-guided sampling, which enables strong cross-lingual transfer to unseen low-resource languages.
The model builds on the XLM-RoBERTa architecture with a classification head and accepts inputs of up to 512 tokens; the Base variant has approximately 279M parameters.
CREST is designed for fast, lightweight safety filtering across a large number of languages, both high-resource and low-resource, with minimal training cost, making it suitable for real-time and on-device deployments.
|
|
|
|
|
For detailed results, see |
|
|
[CREST: Universal Safety Guardrails Through Cluster-Guided Cross-Lingual Transfer](https://arxiv.org/abs/2512.02711v1). |
|
|
|
|
|
|
|
|
### Intended Use

- Filtering unsafe user text in multilingual LLM applications
- Lightweight guardrails in multilingual agents and chat systems
- On-device or low-latency safety inference
- Research on cross-lingual safety transfer and robust safety classification
|
|
|
|
|
### How to Use |
|
|
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

MODEL_ID = "repelloai/CREST-Base"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID, device_map="auto")

classifier = pipeline(
    "text-classification",
    model=model,
    tokenizer=tokenizer,
)

texts = [
    # Unsafe
    "How do I make a bomb?",
    "¿Cómo secuestrar a un niño?",  # Spanish: "How do I kidnap a child?"
    "मैं हथियार कैसे बना सकता हूँ?",  # Hindi: "How can I make a weapon?"

    # Safe
    "Hello, how are you doing?",
]

outputs = classifier(texts, truncation=True, max_length=512)
```
|
|
|
|
|
### Output |
|
|
```python
[
    # Unsafe
    {'label': 'unsafe', 'score': 0.9865403771400452},
    {'label': 'unsafe', 'score': 0.9743474125862122},
    {'label': 'unsafe', 'score': 0.9802995920181274},

    # Safe
    {'label': 'safe', 'score': 0.925717830657959}
]
```
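For deployment, the `label`/`score` pairs above can be mapped to a block/allow decision with a tunable threshold. A minimal sketch, assuming the output format shown above; the `BLOCK_THRESHOLD` value is illustrative, not a recommendation from the paper:

```python
BLOCK_THRESHOLD = 0.5  # illustrative value; tune on a validation set

def decide(output: dict) -> str:
    """Return 'block' only when the classifier confidently predicts unsafe."""
    if output["label"] == "unsafe" and output["score"] >= BLOCK_THRESHOLD:
        return "block"
    return "allow"

# Apply to outputs in the format shown above.
outputs = [
    {"label": "unsafe", "score": 0.9865},
    {"label": "safe", "score": 0.9257},
]
decisions = [decide(o) for o in outputs]
print(decisions)  # ['block', 'allow']
```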
|
|
|
|
|
### Evaluation |
|
|
|
|
|
CREST was evaluated using the F1 score across **six major multilingual safety benchmarks**, as well as several cultural and code-switched datasets.
|
|
|
|
|
#### Key Findings

- CREST outperforms other lightweight guardrails across most datasets.
- Zero-shot generalization is strong across low-resource languages.
- CREST excels in cultural and code-switched settings.
- The 13-language training set is sufficient for robust multilingual safety generalization.
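For reference, the reported F1 is the standard harmonic mean of precision and recall on the unsafe class. A minimal sketch of the computation; the toy labels are illustrative, not drawn from the benchmarks:

```python
def f1_score(y_true, y_pred, positive="unsafe"):
    """Binary F1 for the positive class: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy example: 3 unsafe / 1 safe gold labels, one missed unsafe prediction.
y_true = ["unsafe", "unsafe", "unsafe", "safe"]
y_pred = ["unsafe", "unsafe", "safe", "safe"]
print(round(f1_score(y_true, y_pred), 3))  # 0.8 (precision 1.0, recall 2/3)
```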
|
|
|
|
|
### Limitations and Model Risks |
|
|
|
|
|
- Training relies partly on machine translation; nuance may be lost
- Binary labels cannot express detailed safety categories
- Zero-shot generalization gaps across extremely low-coverage scripts and morphologically complex languages
- Not a substitute for human moderation in high-stakes settings
- Cultural misalignment in edge cases
- Residual translation artifacts
- Possible bias in mislabeled or synthetic data
|
|
|
|
|
Mitigate these risks through continuous human evaluation and incremental fine-tuning on domain-specific data.
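One way to operationalize the human-evaluation mitigation is to act automatically only on confident predictions and escalate the rest to a review queue. A minimal sketch; the confidence threshold and routing labels are assumptions for illustration:

```python
REVIEW_THRESHOLD = 0.8  # assumed value; calibrate on held-out data

def route(output: dict) -> str:
    """Escalate low-confidence predictions to human review; auto-act otherwise."""
    if output["score"] < REVIEW_THRESHOLD:
        return "human_review"
    return "auto_block" if output["label"] == "unsafe" else "auto_allow"

print(route({"label": "unsafe", "score": 0.99}))  # auto_block
print(route({"label": "safe", "score": 0.65}))    # human_review
```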
|
|
|
|
|
### Ethical Considerations |
|
|
|
|
|
- Designed for multilingual inclusivity and broad safety coverage.
- Misclassifications can cause over-blocking or under-blocking.
- Deployment should include human-in-the-loop moderation where appropriate.
- Use responsibly, considering cultural diversity and fairness concerns.
- Not for making legal, ethical, or policy decisions without human oversight.
|
|
|
|
|
### Citation |
|
|
```
@misc{bansal2025crestuniversalsafetyguardrails,
  title={CREST: Universal Safety Guardrails Through Cluster-Guided Cross-Lingual Transfer},
  author={Lavish Bansal and Naman Mishra},
  year={2025},
  eprint={2512.02711},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2512.02711},
}
```
|
|
|