
API Usage Examples

Table of Contents

  1. Authentication
  2. Completion
  3. Transcription
  4. Models and Agents
  5. WebSocket
  6. Advanced examples

Authentication

Get a JWT token

Request:

curl -X POST http://localhost:7860/auth/token \
  -H "Content-Type: application/json"

Response:

{
  "access_token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJ1c2VyIiwidHlwZSI6ImFjY2VzcyIsImV4cCI6MTcwNjEyMzQ1Nn0.abc123...",
  "token_type": "bearer",
  "expires_in": 3600
}

Verify a token

Request:

curl -X GET http://localhost:7860/auth/verify \
  -H "Authorization: Bearer <your-token>"

Response:

{
  "valid": true,
  "user": {
    "sub": "user",
    "type": "access",
    "exp": 1706123456
  }
}
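
Since the token response reports `expires_in` (3600 s above), a client can cache the token and refresh it shortly before expiry. A minimal sketch; the 60-second safety margin is an arbitrary choice, not part of the API:

```python
import time

class TokenCache:
    """Caches a bearer token and reports when it needs refreshing."""

    def __init__(self, margin_s=60.0):
        self.margin_s = margin_s  # refresh this many seconds before expiry
        self.token = None
        self.expires_at = 0.0

    def store(self, access_token, expires_in, now=None):
        """Record a token and its absolute expiry time."""
        now = time.time() if now is None else now
        self.token = access_token
        self.expires_at = now + expires_in

    def needs_refresh(self, now=None):
        """True if no token is cached or expiry falls within the margin."""
        now = time.time() if now is None else now
        return self.token is None or now >= self.expires_at - self.margin_s
```

Pass the `access_token` and `expires_in` fields from the `/auth/token` response to `store()`, and call `needs_refresh()` before each request.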

Completion

Simple completion (non-streaming)

Request:

curl -X POST http://localhost:7860/completion \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Explique-moi la théorie de la relativité en 2 phrases",
    "model": "gpt-4o",
    "agent_type": "simple",
    "stream": false,
    "temperature": 0.7
  }'

Response:

{
  "response": "La théorie de la relativité d'Einstein comprend deux parties: la relativité restreinte (1905) qui établit que la vitesse de la lumière est constante et que le temps et l'espace sont relatifs, et la relativité générale (1915) qui décrit la gravitation comme une courbure de l'espace-temps causée par la masse et l'énergie. Ces théories ont révolutionné notre compréhension de l'univers et sont confirmées par de nombreuses expériences.",
  "model": "gpt-4o",
  "agent_type": "simple",
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 98,
    "total_tokens": 123
  },
  "metadata": {
    "message_count": 2
  }
}

Streaming completion (SSE)

Request:

curl -N -X POST http://localhost:7860/completion \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Raconte-moi une courte histoire",
    "model": "gpt-3.5-turbo",
    "stream": true
  }'

Response (Server-Sent Events):

data: {"content": "Il", "done": false, "metadata": {"model": "gpt-3.5-turbo", "agent_type": "simple"}}

data: {"content": " était", "done": false, "metadata": {"model": "gpt-3.5-turbo", "agent_type": "simple"}}

data: {"content": " une", "done": false, "metadata": {"model": "gpt-3.5-turbo", "agent_type": "simple"}}

...

data: {"content": "", "done": true, "metadata": {"model": "gpt-3.5-turbo", "agent_type": "simple"}}
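
On the client side, each `data:` line carries a JSON chunk; concatenating the `content` fields until `done` is true rebuilds the full answer. A minimal parser over already-received lines (transport handling omitted):

```python
import json

def assemble_sse(lines):
    """Concatenate `content` from SSE `data:` lines until a `done` event."""
    parts = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank separator lines between events
        event = json.loads(line[len("data: "):])
        if event.get("done"):
            break  # final event carries no content, only metadata
        parts.append(event.get("content", ""))
    return "".join(parts)
```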

Carbon footprint, latency, pricing, and equivalence fields

Responses now include carbon-impact metrics computed with ecologits.

  • Non-stream (metadata field):
{
  "metadata": {
    "message_count": 4,
    "latency_s": 1.23,
    "emissions_kgCO2eq": 0.00042,
    "emissions_gCO2eq": 0.42,
    "pricing": {
      "currency": "EUR",
      "total_cost": 0.0031,
      "by_model": {
        "mistral-large-latest": {"input": 0.0005, "output": 0.0026, "total": 0.0031}
      }
    },
    "equivalences": {
      "water_liters": 0.3,
      "car_km": 0.002,
      "tgv_km": 0.01,
      "smartphone_charges": 0.04
    }
  }
}
  • Stream (final event, metadata field):
{
  "content": "",
  "done": true,
  "metadata": {
    "model": "mistral-large-latest",
    "agent_type": "simple",
    "usage": {"input_tokens":123, "output_tokens":456, "total_tokens":579},
    "usage_by_model": {
      "mistral-large-latest": {"input_tokens":123, "output_tokens":456, "total_tokens":579}
    },
    "latency_s": 1.23,
    "emissions_kgCO2eq": 0.00042,
    "emissions_gCO2eq": 0.42,
    "pricing": {
      "currency": "EUR",
      "total_cost": 0.0031,
      "by_model": {
        "mistral-large-latest": {"input": 0.0005, "output": 0.0026, "total": 0.0031}
      }
    },
    "equivalences": {
      "water_liters": 0.3,
      "car_km": 0.002,
      "tgv_km": 0.01,
      "smartphone_charges": 0.04
    }
  }
}
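
Judging from the sample payloads, `emissions_gCO2eq` is `emissions_kgCO2eq` × 1000 and `pricing.total_cost` is the sum of the per-model `total` entries; these relationships are inferred, not documented guarantees. A client-side sanity check could look like:

```python
def check_metadata(metadata, tol=1e-9):
    """Sanity-check the carbon and pricing fields of a completion response."""
    kg, g = metadata["emissions_kgCO2eq"], metadata["emissions_gCO2eq"]
    assert abs(kg * 1000 - g) < tol, "kg/g emissions mismatch"
    pricing = metadata["pricing"]
    summed = sum(m["total"] for m in pricing["by_model"].values())
    assert abs(summed - pricing["total_cost"]) < tol, "pricing totals mismatch"
    return True
```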

Completion with conversation history

Request:

curl -X POST http://localhost:7860/completion \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Et en Python?",
    "model": "gpt-4o",
    "stream": false,
    "conversation_history": [
      {
        "role": "user",
        "content": "Comment faire une boucle en JavaScript?"
      },
      {
        "role": "assistant",
        "content": "En JavaScript, vous pouvez utiliser: for (let i = 0; i < 10; i++) { console.log(i); }"
      }
    ]
  }'
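
Maintaining the history across turns can be sketched with a small wrapper that appends each user message and assistant reply (the `Conversation` class is illustrative, not part of the API):

```python
class Conversation:
    """Accumulates turns in the format expected by `conversation_history`."""

    def __init__(self):
        self.history = []

    def build_payload(self, message, model="gpt-4o"):
        """Request body for the next turn, carrying all past turns."""
        return {
            "message": message,
            "model": model,
            "stream": False,
            "conversation_history": list(self.history),
        }

    def record(self, user_message, assistant_reply):
        """Append a completed exchange to the history."""
        self.history.append({"role": "user", "content": user_message})
        self.history.append({"role": "assistant", "content": assistant_reply})
```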

Using Mistral AI

Request:

curl -X POST http://localhost:7860/completion \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Quelle est la capitale de la France?",
    "model": "mistral-large-latest",
    "stream": false
  }'

Transcription

Transcribe an audio file

Request:

curl -X POST http://localhost:7860/transcription \
  -H "Authorization: Bearer <token>" \
  -F "file=@audio.mp3"

Response:

{
  "text": "Bonjour, ceci est un test de transcription audio avec Whisper.",
  "language": "fr",
  "duration": 3.5,
  "model": "whisper-1"
}

Transcribe with a specified language

Request:

curl -X POST "http://localhost:7860/transcription?language=en" \
  -H "Authorization: Bearer <token>" \
  -F "file=@english_audio.wav"

Supported audio formats

Request:

curl -X GET http://localhost:7860/transcription/supported-formats \
  -H "Authorization: Bearer <token>"

Response:

{
  "supported_formats": ["mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"],
  "max_file_size_mb": 25,
  "model": "whisper-1",
  "languages": "Auto-detection or specify ISO-639-1 code"
}
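
A client can pre-check a file against `supported_formats` and `max_file_size_mb` before uploading, avoiding a round trip for files the server would reject. A sketch, with the limits taken from the sample response above:

```python
import os

SUPPORTED_FORMATS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"}
MAX_FILE_SIZE_MB = 25

def validate_audio(path, size_bytes=None):
    """Return (ok, reason); size may be passed explicitly for testing."""
    ext = os.path.splitext(path)[1].lstrip(".").lower()
    if ext not in SUPPORTED_FORMATS:
        return False, f"unsupported format: {ext or '(none)'}"
    size = os.path.getsize(path) if size_bytes is None else size_bytes
    if size > MAX_FILE_SIZE_MB * 1024 * 1024:
        return False, f"file exceeds {MAX_FILE_SIZE_MB} MB"
    return True, "ok"
```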

Models and Agents

List available models

Request:

curl -X GET http://localhost:7860/models \
  -H "Authorization: Bearer <token>"

Response:

{
  "models": [
    {
      "name": "gpt-4o",
      "provider": "openai",
      "description": "GPT-4 Omni - Most capable model",
      "supports_streaming": true,
      "context_window": 128000
    },
    {
      "name": "mistral-large-latest",
      "provider": "mistralai",
      "description": "Mistral Large - Top-tier reasoning",
      "supports_streaming": true,
      "context_window": 32000
    }
  ],
  "total": 8
}

List available agents

Request:

curl -X GET http://localhost:7860/agents \
  -H "Authorization: Bearer <token>"

Response:

{
  "agents": [
    {
      "type": "simple",
      "name": "Simple",
      "description": "Simple conversational agent without tools or memory",
      "available": true
    },
    {
      "type": "rag",
      "name": "Rag",
      "description": "Agent with Retrieval Augmented Generation (not yet implemented)",
      "available": false
    }
  ],
  "total": 4
}
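
Since some agent types are declared but not yet implemented (note `"available": false` for `rag`), a client should filter on `available` before offering agents to users. A minimal helper:

```python
def available_agents(agents_response):
    """Agent types usable today, extracted from the /agents payload."""
    return [a["type"] for a in agents_response["agents"] if a["available"]]
```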

Health Check

Request:

curl -X GET http://localhost:7860/health

Response:

{
  "status": "healthy",
  "version": "1.0.0",
  "title": "CAPL Routeur IA API",
  "environment": "development",
  "timestamp": "2024-01-24T10:30:00.000000"
}

WebSocket

WebSocket connection

JavaScript:

const ws = new WebSocket('ws://localhost:7860/realtime/ws');

ws.onopen = () => {
  console.log('Connected');
};

ws.onmessage = (event) => {
  const data = JSON.parse(event.data);
  console.log('Received:', data);
};

ws.onerror = (error) => {
  console.error('WebSocket error:', error);
};

ws.onclose = () => {
  console.log('Disconnected');
};

Send a message

ws.send(JSON.stringify({
  type: 'message',
  payload: {
    text: 'Hello from client!'
  }
}));

Ping/Pong (keep-alive)

// Send a ping every 30 seconds
setInterval(() => {
  ws.send(JSON.stringify({
    type: 'ping',
    payload: {}
  }));
}, 30000);

WebRTC Signaling (example)

// Send a WebRTC offer
ws.send(JSON.stringify({
  type: 'offer',
  payload: {
    sdp: 'v=0\r\no=- ...',
    type: 'offer'
  }
}));

Advanced examples

Python with requests

import requests

class RouterIAClient:
    def __init__(self, base_url="http://localhost:7860"):
        self.base_url = base_url
        self.token = None
    
    def authenticate(self):
        response = requests.post(f"{self.base_url}/auth/token")
        self.token = response.json()["access_token"]
        return self.token
    
    def complete(self, message, model="gpt-4o", stream=False):
        headers = {"Authorization": f"Bearer {self.token}"}
        data = {
            "message": message,
            "model": model,
            "stream": stream
        }
        response = requests.post(
            f"{self.base_url}/completion",
            headers=headers,
            json=data,
            stream=stream
        )
        
        # A function body containing `yield` is always a generator, so a
        # `return response.json()` in the same body would never reach the
        # caller. Delegate streaming to a separate generator instead.
        if stream:
            return self._iter_stream(response)
        return response.json()
    
    def _iter_stream(self, response):
        for line in response.iter_lines():
            if line:
                yield line.decode('utf-8')
    
    def transcribe(self, audio_file_path):
        headers = {"Authorization": f"Bearer {self.token}"}
        with open(audio_file_path, 'rb') as f:
            files = {'file': f}
            response = requests.post(
                f"{self.base_url}/transcription",
                headers=headers,
                files=files
            )
        return response.json()

# Usage
client = RouterIAClient()
client.authenticate()

# Simple completion
result = client.complete("Bonjour!")
print(result["response"])

# Streaming
for chunk in client.complete("Compte de 1 à 5", stream=True):
    print(chunk)

# Transcription
transcription = client.transcribe("audio.mp3")
print(transcription["text"])

JavaScript/TypeScript with fetch

class RouterIAClient {
  private baseUrl: string;
  private token: string | null = null;

  constructor(baseUrl: string = 'http://localhost:7860') {
    this.baseUrl = baseUrl;
  }

  async authenticate(): Promise<string> {
    const response = await fetch(`${this.baseUrl}/auth/token`, {
      method: 'POST'
    });
    const data = await response.json();
    this.token = data.access_token;
    // Return the local value: `this.token` is typed `string | null`,
    // which would not satisfy the declared `Promise<string>`.
    return data.access_token;
  }

  async complete(
    message: string,
    model: string = 'gpt-4o',
    stream: boolean = false
  ) {
    const response = await fetch(`${this.baseUrl}/completion`, {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${this.token}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({ message, model, stream })
    });

    if (stream) {
      return this.handleStreamResponse(response);
    } else {
      return await response.json();
    }
  }

  private async *handleStreamResponse(response: Response) {
    const reader = response.body?.getReader();
    const decoder = new TextDecoder();

    if (!reader) return;

    while (true) {
      const { done, value } = await reader.read();
      if (done) break;

      const chunk = decoder.decode(value);
      const lines = chunk.split('\n');

      for (const line of lines) {
        if (line.startsWith('data: ')) {
          const data = JSON.parse(line.slice(6));
          yield data;
        }
      }
    }
  }

  async transcribe(audioFile: File): Promise<any> {
    const formData = new FormData();
    formData.append('file', audioFile);

    const response = await fetch(`${this.baseUrl}/transcription`, {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${this.token}`
      },
      body: formData
    });

    return await response.json();
  }
}

// Usage
const client = new RouterIAClient();
await client.authenticate();

// Completion
const result = await client.complete('Bonjour!');
console.log(result.response);

// Streaming
for await (const chunk of await client.complete('Compte de 1 à 5', 'gpt-4o', true)) {
  console.log(chunk.content);
}

Error handling

import requests
from requests.exceptions import RequestException

token = "<token>"  # obtained from /auth/token

try:
    response = requests.post(
        "http://localhost:7860/completion",
        headers={"Authorization": f"Bearer {token}"},
        json={"message": "Test", "model": "gpt-4o"}
    )
    response.raise_for_status()
    result = response.json()
    print(result["response"])
    
except requests.exceptions.HTTPError as e:
    if e.response.status_code == 401:
        print("Invalid or expired token")
    elif e.response.status_code == 400:
        print("Invalid request:", e.response.json())
    else:
        print(f"HTTP error {e.response.status_code}")
        
except RequestException as e:
    print(f"Connection error: {e}")

Rate Limiting (not yet implemented)

Recommendations for clients:

  • Implement retries with exponential backoff
  • Honor the X-RateLimit-* headers (coming soon)
  • Cache responses whenever possible
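
The retry-with-exponential-backoff recommendation can be sketched as follows; the base delay and cap are arbitrary choices, and jitter is omitted so the schedule stays deterministic:

```python
import time

def backoff_delays(retries, base_s=1.0, cap_s=30.0):
    """Delay before each retry attempt: base * 2**attempt, capped."""
    return [min(base_s * 2 ** attempt, cap_s) for attempt in range(retries)]

def with_retries(call, retries=5, base_s=1.0, cap_s=30.0, sleep=time.sleep):
    """Invoke `call` until it succeeds, sleeping with exponential backoff."""
    for attempt in range(retries):
        try:
            return call()
        except Exception:
            if attempt == retries - 1:
                raise  # out of attempts: surface the last error
            sleep(min(base_s * 2 ** attempt, cap_s))
```

For example, `with_retries(lambda: client.complete("Bonjour!"))` retries transient network failures; a production version would retry only on specific status codes and add random jitter.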

Best practices

  1. Security: never expose your token in client-side code
  2. Token management: refresh the token before it expires
  3. Streaming: use streaming for long responses
  4. Timeouts: configure appropriate timeouts
  5. Retries: implement retry logic for network errors
  6. Logging: log client-side errors for debugging