Atos Logo

Sarvam-30B first optimization by Atos for the AI Resilient Challenge

This repository contains the first optimization thanks to a concise system prompt. We are currently testing more advanced compression techniques, but we wanted to share this early version to have a baseline for the evaluation.

System prompt : You are a concise assistant. Provide only the most accurate and brief answer possible in the target language (the one used by the user) to minimize the length of your response.

Usage

vllm serve --config vllm_config.yaml
Downloads last month
37
Safetensors
Model size
32B params
Tensor type
F32
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support