feat: live streaming in qa_cli + layered diagnostic + Windows UTF-8 fix

Problem: with llama3.2:3b on CPU, each inference takes 60-120s.
The CLI stayed silent during that time, which looked like a hang.

- tools/jdm_agent.py: stream(agent, q, on_event) consumes agent.stream()
  in "updates" mode and emits one dict per message (AIMessage / ToolMessage).
- apps/qa_cli.py: uses stream() with --verbose to display each tool call
  and each result live (with timing).
- apps/diagnose.py (jdm-diag): tests the 5 layers sequentially
  (client, tools, ollama, bare inference, 1 agent round) with timings.
  Helps isolate where things actually block.
- apps/_console.py: forces sys.stdout/stderr to UTF-8 on Windows
  (cp1252 crashed on ✓ ─ ⏱ and other Unicode characters). Imported at the top
  of each app entrypoint.
- pyproject: new jdm-diag script.
- README: "Troubleshooting" section with a symptoms/causes/fixes table,
  and an explicit warning about CPU inference times.

Live diagnostic (Hani's machine, CPU only):
- layer 1 (JDM HTTP): 0.7s; disk cache 0.2ms (3500x speedup)
- layer 2 (tools): 0.5s
- layer 3 (Ollama tags): 2.1s
- layer 4 (bare inference "OUI"): 94.7s (model load + 1 token)
- layer 5 (1 full agent round): 271s (2 successive inferences)

Tests: 30/30 passing.

- README.md +29 -3
- pyproject.toml +1 -0
- src/jdm_agent/apps/_console.py +34 -0
- src/jdm_agent/apps/diagnose.py +191 -0
- src/jdm_agent/apps/qa_cli.py +40 -7
- src/jdm_agent/apps/qa_eval.py +2 -0
- src/jdm_agent/tools/jdm_agent.py +40 -0

README.md
@@ -48,12 +48,27 @@ python -m jdm_agent.apps.qa_cli

 # With local Ollama (tool-calling-capable model)
 ollama pull llama3.2:3b
-python -m jdm_agent.apps.qa_cli --provider ollama --model llama3.2:3b
+python -m jdm_agent.apps.qa_cli --provider ollama --model llama3.2:3b --verbose

-# Single question
-python -m jdm_agent.apps.qa_cli -q "synonymes de voiture"
+# Single question with step streaming
+python -m jdm_agent.apps.qa_cli --provider ollama --model llama3.2:3b -q "synonymes de voiture" --verbose
 ```

+> ⏱️ On a CPU without a GPU, llama3.2:3b takes **30–120 s per turn** (model loading
+> on the first call + each agent round ≈ 60–90 s). Use `--verbose` to watch the agent
+> work live (each tool call and each result is printed as it happens).
+
+### Layered diagnostic
+
+If something seems frozen, isolate the step that blocks:
+
+```bash
+python -m jdm_agent.apps.diagnose --provider ollama --model llama3.2:3b
+```
+
+Tests sequentially: JDM client → LangChain tools → Ollama server → bare LLM
+inference → one full agent round, with timing at each step.
+
 ### Evaluation bench

 ```bash
@@ -99,6 +114,17 @@ les synonymes de voiture »* ou *« avec JDM, quels sont les sens du mot avocat
 pytest
 ```

+## Troubleshooting
+
+| Symptom | Likely cause | Fix |
+|---|---|---|
+| The CLI seems frozen for 60-120 s | Ollama inference on CPU (initial model load + each LLM turn) | Use `--verbose` to watch the agent work live. The first call loads the model (~10-30 s); later calls are faster. |
+| `UnicodeEncodeError: 'charmap'` | Windows cp1252 console | Already fixed (`apps/_console.py` forces UTF-8 at startup). If you still see the error, run `chcp 65001` before launching Python. |
+| `could not connect to a running Ollama instance` | Ollama daemon not started | Run `ollama serve` in another terminal (or restart the Ollama app). |
+| Answer contains a made-up `r_001 \| terme1 \| ...` | Small model hallucinating triples | Switch to a more capable model (`ollama pull qwen2.5:7b` or `llama3.1:8b`), or plug in a real API (Anthropic/OpenAI). |
+| `min_weight: Input should be a valid number` | Old bug, already fixed | Update: `git pull && pip install -e .` |
+| Output shows `?` instead of accented characters (`syst?me`) | Non-UTF-8 console encoding | `chcp 65001` then `set PYTHONIOENCODING=utf-8` before starting Python |
+
 ## Roadmap

 - [x] Phase 0 — Bootstrap

pyproject.toml
@@ -37,6 +37,7 @@ dev = [
 jdm-qa = "jdm_agent.apps.qa_cli:main"
 jdm-eval = "jdm_agent.apps.qa_eval:main"
 jdm-mcp = "jdm_agent.mcp.server:main"
+jdm-diag = "jdm_agent.apps.diagnose:main"

 [tool.hatch.build.targets.wheel]
 packages = ["src/jdm_agent"]

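The new `jdm-diag` line maps a command name to a `module:function` reference. As a rough illustration of how such a reference can be resolved at run time, here is a minimal sketch. `resolve_entry_point` is a hypothetical helper written for this note, not part of the project or of any packaging tool, and the demo resolves the standard-library target `json:dumps` because `jdm_agent` is not assumed to be installed.

```python
import importlib


def resolve_entry_point(spec: str):
    """Resolve a 'package.module:attr' reference to the object it names.

    Hypothetical helper for illustration only.
    """
    module_name, _, attr_path = spec.partition(":")
    obj = importlib.import_module(module_name)
    # Walk dotted attributes after the colon, if any (e.g. "Class.method").
    if attr_path:
        for part in attr_path.split("."):
            obj = getattr(obj, part)
    return obj


# Demo with a stdlib target; the project's real reference would be
# "jdm_agent.apps.diagnose:main".
dumps = resolve_entry_point("json:dumps")
print(dumps({"ok": True}))
```

Generated console-script wrappers typically call the resolved function and pass its return value to `sys.exit`, which is why `main()` in `diagnose.py` returns an integer exit code.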
src/jdm_agent/apps/_console.py
@@ -0,0 +1,34 @@
+"""Force console output to UTF-8 (Windows cp1252 otherwise crashes on ✓ ─ ⏱ …).
+
+Import this FIRST in every app entrypoint.
+"""
+from __future__ import annotations
+
+import io
+import sys
+
+
+def setup_console() -> None:
+    for stream_name in ("stdout", "stderr"):
+        stream = getattr(sys, stream_name, None)
+        if stream is None:
+            continue
+        # Python 3.7+: TextIOWrapper.reconfigure(encoding=...)
+        reconf = getattr(stream, "reconfigure", None)
+        if reconf is not None:
+            try:
+                reconf(encoding="utf-8", errors="replace")
+                continue
+            except Exception:
+                pass
+        # Fallback: wrap the raw buffer.
+        buffer = getattr(stream, "buffer", None)
+        if buffer is not None:
+            try:
+                setattr(sys, stream_name, io.TextIOWrapper(buffer, encoding="utf-8",
+                                                           errors="replace", line_buffering=True))
+            except Exception:
+                pass
+
+
+setup_console()

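To see why the reconfiguration matters, the failure can be reproduced without a Windows console by writing the same symbols through an in-memory text wrapper. This is only a sketch: `SAMPLE`, `encode_through` and `reconfigure_then_write` are names invented for the illustration, and a `BytesIO` buffer stands in for the real console stream.

```python
import io
from typing import Optional

SAMPLE = "✓ ─ ⏱"  # the symbols named in the module docstring


def encode_through(encoding: str, errors: str = "strict") -> Optional[bytes]:
    """Write SAMPLE through a text wrapper; return the bytes, or None on failure."""
    raw = io.BytesIO()
    wrapper = io.TextIOWrapper(raw, encoding=encoding, errors=errors)
    try:
        wrapper.write(SAMPLE)
        wrapper.flush()
    except UnicodeEncodeError:
        return None
    return raw.getvalue()


def reconfigure_then_write() -> bytes:
    """Start as cp1252, switch to UTF-8 with reconfigure(), then write."""
    raw = io.BytesIO()
    wrapper = io.TextIOWrapper(raw, encoding="cp1252")
    wrapper.reconfigure(encoding="utf-8", errors="replace")
    wrapper.write(SAMPLE)
    wrapper.flush()
    return raw.getvalue()


# Print byte reprs only, so the demo itself is safe on any console.
print("cp1252 strict :", encode_through("cp1252"))
print("cp1252 replace:", encode_through("cp1252", errors="replace"))
print("utf-8         :", encode_through("utf-8"))
print("reconfigured  :", reconfigure_then_write())
```

With strict cp1252 the write fails outright, with `errors="replace"` the symbols degrade to `?`, and after `reconfigure(encoding="utf-8")` they are preserved, which is the path `setup_console()` tries first.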
src/jdm_agent/apps/diagnose.py
@@ -0,0 +1,191 @@
+"""Layered diagnostic of the JDM Agent system.
+
+Tests sequentially:
+  1. Raw JDMClient (HTTP + cache)
+  2. LangChain tools (no LLM)
+  3. Ollama connectivity (if selected)
+  4. LLM inference alone (1 short prompt, no tools)
+  5. Agent with a single tool call
+
+Prints the timing of each step to identify where things block.
+Usage:
+    python -m jdm_agent.apps.diagnose
+    python -m jdm_agent.apps.diagnose --provider ollama --model llama3.2:3b
+"""
+from __future__ import annotations
+
+from jdm_agent.apps import _console  # noqa: F401 — force stdout UTF-8 (Windows)
+
+import argparse
+import os
+import sys
+import time
+import urllib.request
+import urllib.error
+
+from jdm_agent.client import JDMClient
+from jdm_agent.tools.jdm_tools import set_default_client, get_synonyms
+
+
+GREEN = "\033[92m"
+RED = "\033[91m"
+YELLOW = "\033[93m"
+RESET = "\033[0m"
+
+
+def _step(label: str):
+    print(f"\n── {label} ──", flush=True)
+    return time.time()
+
+
+def _ok(t0: float, extra: str = "") -> None:
+    dt = time.time() - t0
+    print(f" {GREEN}✓{RESET} {dt:5.2f}s {extra}", flush=True)
+
+
+def _fail(t0: float, err: Exception) -> None:
+    dt = time.time() - t0
+    print(f" {RED}✗{RESET} {dt:5.2f}s {type(err).__name__}: {err}", flush=True)
+
+
+def _warn(msg: str) -> None:
+    print(f" {YELLOW}!{RESET} {msg}", flush=True)
+
+
+def check_jdm_client() -> bool:
+    t = _step("1. JDMClient HTTP")
+    try:
+        c = JDMClient()
+        n = c.node_by_name("chat")
+        _ok(t, f"node chat id={n.id} w={n.w}")
+        t2 = time.time()
+        n2 = c.node_by_name("chat")
+        dt = time.time() - t2
+        _ok(t2, f"2nd call (disk cache) in {dt*1000:.1f}ms")
+        return True
+    except Exception as e:
+        _fail(t, e)
+        return False
+
+
+def check_tools() -> bool:
+    t = _step("2. LangChain tools (no LLM)")
+    try:
+        c = JDMClient()
+        set_default_client(c)
+        syns = get_synonyms.invoke({"term": "voiture", "min_weight": 50, "limit": 3})
+        for s in syns:
+            print(f" · {s['target']} (w={s['w']})")
+        _ok(t)
+        return True
+    except Exception as e:
+        _fail(t, e)
+        return False
+
+
+def check_ollama(model: str) -> bool:
+    t = _step(f"3. Ollama (model {model})")
+    url = os.environ.get("OLLAMA_BASE_URL", "http://localhost:11434") + "/api/tags"
+    try:
+        with urllib.request.urlopen(url, timeout=3) as r:
+            import json
+            data = json.loads(r.read())
+        tags = [m["name"] for m in data.get("models", [])]
+        print(f" · server reachable; installed models: {tags or '(none)'}")
+        if not any(m.startswith(model.split(':')[0]) for m in tags):
+            _warn(f"model {model!r} is NOT installed. Run: ollama pull {model}")
+            return False
+        _ok(t)
+        return True
+    except urllib.error.URLError as e:
+        _fail(t, e)
+        _warn("Ollama is not running. Start it with: ollama serve")
+        return False
+    except Exception as e:
+        _fail(t, e)
+        return False
+
+
+def check_llm_inference(provider: str, model: str) -> bool:
+    t = _step(f"4. Bare LLM inference ({provider}:{model}), may take 10-60s")
+    try:
+        from jdm_agent.tools.llm_factory import get_llm
+        from langchain_core.messages import HumanMessage
+
+        llm = get_llm(provider=provider, model=model)
+        # No tools involved: just a plain-text reply.
+        out = llm.invoke([HumanMessage(content="Réponds en un seul mot : OUI")])
+        _ok(t, f"output: {(out.content or '').strip()[:80]!r}")
+        return True
+    except Exception as e:
+        _fail(t, e)
+        return False
+
+
+def check_agent_one_round(provider: str, model: str) -> bool:
+    t = _step(f"5. Full agent: 1 simple question ({provider}:{model})")
+    try:
+        from jdm_agent.tools.jdm_agent import build_jdm_agent, stream
+
+        client = JDMClient()
+        set_default_client(client)
+        from jdm_agent.tools.llm_factory import get_llm
+        llm = get_llm(provider=provider, model=model)
+        agent = build_jdm_agent(client=client, llm=llm)
+
+        # Stream to show each step.
+        def on_event(ev):
+            dt = time.time() - t
+            kind = ev["kind"]
+            tcs = ev.get("tool_calls") or []
+            if kind == "AIMessage" and tcs:
+                for tc in tcs:
+                    print(f" [{dt:5.1f}s] → call {tc['name']}({tc.get('args')})")
+            elif kind == "ToolMessage":
+                content = (ev.get("content") or "")[:80].replace("\n", " ")
+                print(f" [{dt:5.1f}s] ← tool {ev.get('name')}: {content}…")
+            elif kind == "AIMessage":
+                preview = (ev.get("content") or "").strip().replace("\n", " ")[:80]
+                if preview:
+                    print(f" [{dt:5.1f}s] ← final answer ({len(ev['content'])} chars)")
+
+        out = stream(agent, "Donne-moi 2 synonymes de voiture.", on_event=on_event)
+        print(f" answer: {out['answer'][:200]}…")
+        _ok(t)
+        return True
+    except Exception as e:
+        _fail(t, e)
+        return False
+
+
+def main() -> int:
+    p = argparse.ArgumentParser(description="Layered diagnostic for JDM Agent.")
+    p.add_argument("--provider", default=os.environ.get("LLM_PROVIDER", "ollama"))
+    p.add_argument("--model", default=os.environ.get("LLM_MODEL", "llama3.2:3b"))
+    p.add_argument("--skip-llm", action="store_true", help="Skip steps 3-5.")
+    args = p.parse_args()
+
+    print(f"{'='*60}\n JDM Agent — diagnostic\n provider={args.provider}, model={args.model}\n{'='*60}")
+
+    if not check_jdm_client():
+        return 1
+    if not check_tools():
+        return 1
+    if args.skip_llm:
+        print("\n[--skip-llm]: steps 3-5 skipped.")
+        return 0
+    if args.provider == "ollama":
+        if not check_ollama(args.model):
+            return 2
+    if not check_llm_inference(args.provider, args.model):
+        return 3
+    if not check_agent_one_round(args.provider, args.model):
+        return 4
+
+    print(f"\n{GREEN}Everything works.{RESET} You can now run:")
+    print(f" python -m jdm_agent.apps.qa_cli --provider {args.provider} --model {args.model} --verbose")
+    return 0
+
+
+if __name__ == "__main__":
+    sys.exit(main())

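The control flow of `main()` above (run each layer in order, time it, stop at the first failure and report a layer-specific exit code) can be condensed into a small generic runner. The following is a sketch under assumptions, not the author's implementation: `run_layers` and the `Check` tuple are hypothetical names, the checks are stand-in lambdas rather than real network calls, and `time.perf_counter` is used here where the module uses `time.time`.

```python
import time
from typing import Callable, List, Tuple

# (label, check function, exit code to return if the check fails)
Check = Tuple[str, Callable[[], bool], int]


def run_layers(checks: List[Check]) -> int:
    """Run checks in order; return 0 if all pass, else the failing layer's code."""
    for label, check, exit_code in checks:
        t0 = time.perf_counter()
        detail = ""
        try:
            ok = check()
        except Exception as err:  # a crashing layer counts as a failed layer
            ok = False
            detail = f" {type(err).__name__}: {err}"
        dt = time.perf_counter() - t0
        print(f"{'ok  ' if ok else 'FAIL'} {dt:6.2f}s  {label}{detail}")
        if not ok:
            return exit_code
    return 0


# Stand-in layers: the third one fails, so later layers never run.
demo = [
    ("1. client", lambda: True, 1),
    ("2. tools", lambda: True, 1),
    ("3. server", lambda: False, 2),
    ("4. inference", lambda: True, 3),
]
code = run_layers(demo)
print("exit code:", code)
```

Stopping at the first failing layer keeps the output short and points directly at the lowest broken component, which is the purpose stated in the module docstring.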
src/jdm_agent/apps/qa_cli.py
@@ -11,13 +11,15 @@ Environment variables (alternatives to flags):
 """
 from __future__ import annotations

+from jdm_agent.apps import _console  # noqa: F401 — force stdout UTF-8 (Windows)
+
 import argparse
 import os
 import sys
 from typing import Optional

 from jdm_agent.client import JDMClient
-from jdm_agent.tools.jdm_agent import ask, build_jdm_agent
+from jdm_agent.tools.jdm_agent import ask, build_jdm_agent, stream
 from jdm_agent.tools.llm_factory import get_llm


@@ -56,6 +58,34 @@ def _print_tool_calls(tool_calls: list[dict]) -> None:
         print(f" • {tc['name']}({args_str})")


+def _stream_printer(verbose: bool):
+    """Print one event per agent step, to show that progress is being made."""
+    import time
+    t0 = [time.time()]
+
+    def on_event(ev: dict) -> None:
+        dt = time.time() - t0[0]
+        kind = ev["kind"]
+        if kind == "AIMessage":
+            tcs = ev.get("tool_calls") or []
+            if tcs:
+                for tc in tcs:
+                    args = ", ".join(f"{k}={v!r}" for k, v in (tc.get("args") or {}).items())
+                    print(f" ⏱ {dt:5.1f}s → call {tc['name']}({args})", flush=True)
+            else:
+                # Final answer from the model.
+                preview = (ev.get("content") or "").strip().replace("\n", " ")[:80]
+                if preview:
+                    print(f" ⏱ {dt:5.1f}s ← model reply ({len(ev['content'])} chars)", flush=True)
+        elif kind == "ToolMessage":
+            content = ev.get("content") or ""
+            preview = content[:100].replace("\n", " ")
+            print(f" ⏱ {dt:5.1f}s ← tool {ev.get('name')} returned {len(content)} chars: {preview}…", flush=True)
+        t0[0] = time.time()
+
+    return on_event if verbose else None
+
+
 def run_repl(provider: Optional[str], model: Optional[str], verbose: bool) -> int:
     print(BANNER)
     print(f"Provider : {provider or os.environ.get('LLM_PROVIDER', 'anthropic')}")
@@ -99,15 +129,15 @@ def run_repl(provider: Optional[str], model: Optional[str], verbose: bool) -> in
             continue

         try:
-
+            print("(thinking…)", flush=True)
+            on_event = _stream_printer(show_tools)
+            out = stream(agent, q, on_event=on_event)
         except Exception as e:
             print(f"[error] {e}", file=sys.stderr)
             continue

         print()
         print(out["answer"])
-        if show_tools:
-            _print_tool_calls(out["tool_calls"])
         print()

     client.close()
@@ -126,10 +156,13 @@ def main() -> int:
         client = JDMClient()
         llm = get_llm(provider=args.provider, model=args.model)
         agent = build_jdm_agent(client=client, llm=llm)
-        out = ask(agent, args.question)
-        print(out["answer"])
         if args.verbose:
-
+            print("(thinking…)", flush=True)
+            out = stream(agent, args.question, on_event=_stream_printer(True))
+        else:
+            out = ask(agent, args.question)
+        print()
+        print(out["answer"])
         client.close()
         return 0


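`_stream_printer` measures the time since the previous event by keeping the last timestamp in a one-element list that the closure mutates. The same per-step timing can be written with `nonlocal` and an injectable clock, which makes it easy to test. This is a sketch of that alternative, not the code in the commit: `make_step_timer` and `lap` are hypothetical names, and `time.monotonic` is an assumed default chosen because it is not affected by system clock adjustments.

```python
import time
from typing import Callable


def make_step_timer(clock: Callable[[], float] = time.monotonic) -> Callable[[], float]:
    """Return a function that reports the seconds elapsed since its previous call."""
    last = clock()

    def lap() -> float:
        nonlocal last  # rebind the enclosing variable instead of mutating a list
        now = clock()
        elapsed = now - last
        last = now
        return elapsed

    return lap


# Deterministic demo with a fake clock that returns scripted timestamps.
ticks = iter([0.0, 1.5, 4.0])
lap = make_step_timer(lambda: next(ticks))
first = lap()   # time between creation (0.0) and the first event (1.5)
second = lap()  # time between the first event (1.5) and the second (4.0)
print(first, second)
```

In a callback such as `on_event`, calling `lap()` once per event gives the duration of each step rather than the time since the question was asked.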
src/jdm_agent/apps/qa_eval.py
@@ -9,6 +9,8 @@ Usage:
 """
 from __future__ import annotations

+from jdm_agent.apps import _console  # noqa: F401 — force stdout UTF-8 (Windows)
+
 import argparse
 import time
 from typing import Optional

src/jdm_agent/tools/jdm_agent.py
@@ -73,3 +73,43 @@ def ask(agent, question: str) -> dict:
         for tc in getattr(m, "tool_calls", []) or []:
             tool_calls.append({"name": tc.get("name"), "args": tc.get("args")})
     return {"answer": answer, "messages": msgs, "tool_calls": tool_calls}
+
+
+def stream(agent, question: str, on_event=None):
+    """Stream the agent's intermediate steps (LangGraph events).
+
+    Emits one event per message produced (AIMessage / ToolMessage).
+    If `on_event` is provided, it is called for each message with
+    a dict {kind, node, name, content, tool_calls}.
+
+    Returns the final dict {"answer", "messages", "tool_calls"}.
+    """
+    from langchain_core.messages import AIMessage, ToolMessage
+
+    final_msgs = []
+    tool_calls_acc: list[dict] = []
+    for chunk in agent.stream({"messages": [HumanMessage(content=question)]},
+                              stream_mode="updates"):
+        # chunk = dict {node_name: {"messages": [msg, ...]}}
+        for node_name, payload in chunk.items():
+            msgs = (payload or {}).get("messages") or []
+            for m in msgs:
+                final_msgs.append(m)
+                ev = {
+                    "kind": type(m).__name__,
+                    "node": node_name,
+                    "name": getattr(m, "name", None),
+                    "content": getattr(m, "content", ""),
+                    "tool_calls": getattr(m, "tool_calls", None) or [],
+                }
+                for tc in ev["tool_calls"]:
+                    tool_calls_acc.append({"name": tc.get("name"), "args": tc.get("args")})
+                if on_event is not None:
+                    on_event(ev)
+
+    answer = ""
+    for m in reversed(final_msgs):
+        if isinstance(m, AIMessage) and not getattr(m, "tool_calls", None):
+            answer = m.content
+            break
+    return {"answer": answer, "messages": final_msgs, "tool_calls": tool_calls_acc}
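
The flattening step inside `stream()` (one event dict per message found in each `{node_name: {"messages": [...]}}` chunk) can be exercised without LangGraph or an LLM. The sketch below makes several assumptions: `FakeAIMessage` and `FakeToolMessage` are stand-in dataclasses, not the LangChain message classes; the node names `"agent"` and `"tools"` and the sample contents are invented for the demo; `flatten` is a hypothetical helper that reproduces only the event-building loop.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, Iterator, List, Optional


@dataclass
class FakeAIMessage:  # stand-in, not langchain_core.messages.AIMessage
    content: str
    tool_calls: List[Dict[str, Any]] = field(default_factory=list)


@dataclass
class FakeToolMessage:  # stand-in, not langchain_core.messages.ToolMessage
    content: str
    name: Optional[str] = None


def fake_updates() -> Iterator[Dict[str, Dict[str, list]]]:
    """Yield chunks shaped like the 'updates' payload described in the comment above."""
    yield {"agent": {"messages": [
        FakeAIMessage("", [{"name": "get_synonyms", "args": {"term": "voiture"}}])]}}
    yield {"tools": {"messages": [FakeToolMessage("automobile, auto", name="get_synonyms")]}}
    yield {"agent": {"messages": [FakeAIMessage("automobile and auto")]}}


def flatten(chunks: Iterable[Dict[str, Any]]) -> List[dict]:
    """Turn per-node update chunks into a flat list of event dicts."""
    events = []
    for chunk in chunks:
        for node_name, payload in chunk.items():
            for m in (payload or {}).get("messages") or []:
                events.append({
                    "kind": type(m).__name__,
                    "node": node_name,
                    "name": getattr(m, "name", None),
                    "content": getattr(m, "content", ""),
                    "tool_calls": getattr(m, "tool_calls", None) or [],
                })
    return events


events = flatten(fake_updates())
for ev in events:
    print(ev["node"], ev["kind"], ev["name"], len(ev["tool_calls"]))
```

The `getattr(..., None) or []` pattern is what lets one loop handle both message types: tool messages have no `tool_calls` attribute and AI messages usually have no `name`.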