Instructions to use issoufzousko07/BABA-IA-2B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use issoufzousko07/BABA-IA-2B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="issoufzousko07/BABA-IA-2B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("issoufzousko07/BABA-IA-2B")
model = AutoModelForCausalLM.from_pretrained("issoufzousko07/BABA-IA-2B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use issoufzousko07/BABA-IA-2B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "issoufzousko07/BABA-IA-2B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "issoufzousko07/BABA-IA-2B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
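
The same call can be made from Python. The following is a minimal sketch using only the standard library; it assumes the server started above is listening on `http://localhost:8000` and returns the usual OpenAI-style response shape (`choices[0].message.content`). The helper names `build_chat_request` and `chat` are illustrative and not part of vLLM.

```python
import json
import urllib.request


def build_chat_request(prompt, model="issoufzousko07/BABA-IA-2B",
                       base_url="http://localhost:8000"):
    """Build (without sending) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the request and return the assistant's reply text.

    Assumes an OpenAI-compatible response body; requires a running server.
    """
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Example (only works while the server is running):
# print(chat("What is the capital of France?"))
```

Passing `base_url="http://localhost:30000"` should let the same helper target a server on a different port.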

Use Docker

docker model run hf.co/issoufzousko07/BABA-IA-2B

SGLang

How to use issoufzousko07/BABA-IA-2B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "issoufzousko07/BABA-IA-2B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "issoufzousko07/BABA-IA-2B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "issoufzousko07/BABA-IA-2B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "issoufzousko07/BABA-IA-2B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use issoufzousko07/BABA-IA-2B with Docker Model Runner:
```
docker model run hf.co/issoufzousko07/BABA-IA-2B
```

BABA-IA-2B

Model description

BABA-IA-2B is a lightweight conversational language model (2 billion parameters) optimized for general assistance. It is designed to be deployable on modest infrastructure (consumer-grade CPUs and GPUs) while offering robust performance for chat dialogue.
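
To give a rough sense of what "modest infrastructure" means, the weights-only memory footprint can be estimated as parameter count times bytes per parameter. The sketch below assumes the nominal figure of 2 billion parameters stated above and ignores activations, the KV cache, and framework overhead, so real usage will be higher. The function name `weight_memory_gb` is illustrative.

```python
# Bytes needed to store one parameter in each floating-point format.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}


def weight_memory_gb(num_params, dtype):
    """Approximate memory for the weights alone, in decimal gigabytes."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Assumption: nominal parameter count of 2e9.
for dtype in ("float32", "float16"):
    print(f"{dtype}: ~{weight_memory_gb(2e9, dtype):.0f} GB")
# float32: ~8 GB
# float16: ~4 GB
```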

This model powers the BABA chatbot, an initiative aimed at making general assistance more accessible.

Developed by: [ElephMind Ivoire]
Model type: Causal Language Model (AutoModelForCausalLM)
Language(s): French, English
License: Apache 2.0
Fine-tuned or trained from: [trained from scratch]

Uses

Direct use

The model is designed to generate text conversationally. It can answer questions, engage in dialogue, and provide contextual information.
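
A multi-turn dialogue is typically represented as a list of role/content messages, the same shape passed to `apply_chat_template` elsewhere on this page. The sketch below shows one way to accumulate that history; the helper `add_turn` is hypothetical, and only the `user` and `assistant` roles are accepted because the card does not say whether the chat template supports other roles.

```python
def add_turn(history, role, content):
    """Return a new history list with one more message appended."""
    if role not in ("user", "assistant"):
        raise ValueError(f"unexpected role: {role!r}")
    return history + [{"role": role, "content": content}]


history = []
history = add_turn(history, "user", "Who are you?")
# In a real session this string would be the text generated by the model.
history = add_turn(history, "assistant", "<model reply goes here>")
history = add_turn(history, "user", "What can you help me with?")
```

The resulting `history` list can then be passed in place of a single-message `messages` list so the model sees the earlier turns.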

How to get started with the model

You can use this model directly with the Hugging Face transformers library.

Prerequisites

pip install transformers torch accelerate

Example code (Python)

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "issoufzousko07/BABA-IA-2B"

# Detect the hardware (GPU or CPU)
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

print(f"Loading the model on {device}...")

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Without a device_map the model is loaded on the CPU by default
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=dtype,
    device_map="auto" if device == "cuda" else None,
)

# Prepare the message (a French prompt: "Hello, what is your name?")
messages = [
    {"role": "user", "content": "Bonjour, comment t'appelles-tu ?"}
]

# Apply the chat template and tokenize in one step
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

# Generate the response
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)

# Decode only the newly generated tokens, not the prompt
prompt_length = inputs["input_ids"].shape[-1]
response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
print(response)
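
The generation call above samples with `temperature=0.7` and `top_p=0.9`. As a rough illustration of what these two settings do, the toy function below applies temperature scaling and nucleus (top-p) filtering to a short list of logits in plain Python. It is a simplified sketch for intuition only, not the transformers implementation, and the name `top_p_filter` is hypothetical.

```python
import math


def top_p_filter(logits, temperature=0.7, top_p=0.9):
    """Return {token_index: probability} for the tokens kept by top-p sampling."""
    # Temperature scaling followed by a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the most likely tokens until their cumulative mass reaches top_p.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break

    # Renormalise so the kept probabilities sum to 1.
    return {i: probs[i] / mass for i in kept}


nucleus = top_p_filter([2.0, 1.0, 0.0, -5.0])
print(sorted(nucleus))  # indices of the tokens that remain eligible
```

A lower temperature sharpens the distribution toward the most likely token, and a lower `top_p` shrinks the set of tokens the sampler may pick from.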

Training details

Training data and procedure: [To be completed with the specific details of your fine-tuning].

Bias, risks, and limitations

Although BABA-IA-2B is optimized to provide useful information, like any language model it can hallucinate or produce inaccurate information.

General-purpose use: This model is a conversational chatbot designed to give users general assistance.
Bias: The model may reflect biases present in its training data.

Citation

If you use this model, please cite it as follows:

@misc{baba-ia-2b,
  author = {Zousko Nicanor/Elephmind IA},
  title = {BABA-IA-2B: A Lightweight Medical Chatbot Model},
  year = {2026},
  publisher = {Hugging Face},
  journal = {Hugging Face Model Hub},
  howpublished = {\url{https://huggingface.co/issoufzousko07/BABA-IA-2B}}
}

Downloads last month: 9

Safetensors
Model size: 3B params
Tensor type: BF16
