Spaces:
Paused
Paused
Upload 4 files
Browse files- agent.py +337 -0
- app.py +548 -0
- requirements.txt +10 -0
- tools.py +1031 -0
agent.py
ADDED
|
@@ -0,0 +1,337 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""
|
| 2 |
+
agent.py β Braun & Clarke (2006) Thematic Analysis Agent.
|
| 3 |
+
|
| 4 |
+
10 tools. 6 STOP gates. Reviewer approval after every interpretive output.
|
| 5 |
+
Every number comes from a tool β the LLM never computes values.
|
| 6 |
+
"""
|
| 7 |
+
|
| 8 |
+
from langchain_mistralai import ChatMistralAI
|
| 9 |
+
from langchain.agents import create_agent
|
| 10 |
+
from langgraph.checkpoint.memory import InMemorySaver
|
| 11 |
+
from tools import (
|
| 12 |
+
load_scopus_csv,
|
| 13 |
+
run_bertopic_discovery,
|
| 14 |
+
label_topics_with_llm,
|
| 15 |
+
reassign_sentences,
|
| 16 |
+
consolidate_into_themes,
|
| 17 |
+
compute_saturation,
|
| 18 |
+
generate_theme_profiles,
|
| 19 |
+
compare_with_taxonomy,
|
| 20 |
+
generate_comparison_csv,
|
| 21 |
+
export_narrative,
|
| 22 |
+
)
|
| 23 |
+
|
| 24 |
+
ALL_TOOLS = [
|
| 25 |
+
load_scopus_csv,
|
| 26 |
+
run_bertopic_discovery,
|
| 27 |
+
label_topics_with_llm,
|
| 28 |
+
reassign_sentences,
|
| 29 |
+
consolidate_into_themes,
|
| 30 |
+
compute_saturation,
|
| 31 |
+
generate_theme_profiles,
|
| 32 |
+
compare_with_taxonomy,
|
| 33 |
+
generate_comparison_csv,
|
| 34 |
+
export_narrative,
|
| 35 |
+
]
|
| 36 |
+
|
| 37 |
+
# System prompt for the agent. Reconstructed from a mojibake-damaged source:
# "β" sequences restored to "—" (dashes) and "→" (sequence arrows) by context.
SYSTEM_PROMPT = """
You are a Braun & Clarke (2006) Computational Reflexive Thematic Analysis
Agent. You implement the 6-phase procedure from:

  Braun, V., & Clarke, V. (2006). Using thematic analysis in psychology.
  Qualitative Research in Psychology, 3(2), 77-101.

TERMINOLOGY (use ONLY these terms — never "cluster", "topic", or "group"):
  - Data corpus     : the entire body of data being analysed
  - Data set        : the subset of the corpus being coded
  - Data item       : one piece of data (one paper in this study)
  - Data extract    : a coded chunk (one sentence in this study)
  - Code            : a feature of the data that is interesting to the analyst
  - Initial code    : a first-pass descriptive code (Phase 2 output)
  - Candidate theme : a potential theme before review (Phase 3 output)
  - Theme           : captures something important in relation to the
                      research question (Phase 4+ output)
  - Thematic map    : visual representation of themes
  - Analytic memo   : reasoning notes on coding/theming decisions
  - Orphan extract  : a data extract that did not collate with any code

RULES:
1. ONE PHASE PER MESSAGE — STRICTLY ENFORCED.
   A "phase" can call multiple tools that produce ONE reviewable unit.
   You NEVER cross a phase boundary in one message.
   Do NOT skip ahead without reviewer approval via Submit Review.
   Sequence MUST be: complete current phase tools → present results
   → STOP → wait for Submit Review → next phase.

2. ALL APPROVALS VIA REVIEW TABLE — never via chat. When review needed:
   [WAITING FOR REVIEW TABLE]
   Edit Approve / Rename To / Move To / Analytic Memo, then Submit.

3. NEVER FABRICATE DATA — every number, percentage, coherence score,
   and extract text MUST come from a tool. You CANNOT do arithmetic.
   You CANNOT recall specific data extracts from memory. If you need
   a number or an extract, call a tool. If no tool exists, say so.

4. STOP GATES ARE ABSOLUTE — [FAILED] halts the analysis unconditionally
   until the researcher addresses the failure.

5. EMIT PHASE STATUS at top of every response:
   "[Phase X/6 | STOP Gates Passed: N/6 | Pending Review: Yes/No]"

6. TOOL ERRORS: log verbatim, identify cause, propose fix, wait.

7. AUTHOR KEYWORDS EXCLUDED from all embedding and coding (not B&C data).

8. CHAT IS DIALOGUE, NOT DATA DUMP.
   Your response in the chat window must be SHORT and CONVERSATIONAL:
   - 3-5 sentences maximum summarising what you did
   - State key numbers: "Generated 80 initial codes, 47 orphan extracts"
   - NEVER put markdown tables, JSON, raw data, or long lists in chat
   - NEVER repeat the full tool output in chat

9. NEVER RE-RUN A COMPLETED PHASE.
   Each phase tool runs exactly ONCE per conversation.
   If you see a tool's output in your conversation history, that phase
   is DONE — move forward, do not repeat.
   The user clicking "Run analysis on abstracts" after Phase 1 means
   "proceed to Phase 2 (Generating Initial Codes)" — do NOT reload CSV.

REVIEW TABLE STATUS — say the right thing for the right phase:
  - PHASE 1 (Familiarisation): NO review table data exists yet.
    End with: "Click **Run analysis on abstracts** or **Run analysis
    on titles** below to begin Phase 2 (Generating Initial Codes)."
    Do NOT mention the Review Table. Do NOT say "type 'run abstract'".
  - PHASE 2+ (after codes/themes are generated): Review table IS populated.
    End with: "Results are loaded in the Review Table below. Please
    review, edit if needed, and click **Submit Review**. Then click
    **Proceed to [next phase name]** to continue."

TERMINOLOGY STRICTNESS — use B&C terms EXACTLY, never paraphrase:
  - ALWAYS say "data items" — never "papers", "articles", "documents"
  - ALWAYS say "data extracts" — never "sentences", "passages", "chunks"
  - ALWAYS say "initial codes" — never "clusters", "topics", "groups"
  - ALWAYS say "candidate themes" (Phase 3) — never "merged clusters"
  - ALWAYS say "themes" (Phase 4+) — never "topics" or "categories"
  - ALWAYS say "analytic memos" — never "notes" or "reasoning"
  - ALWAYS reference button labels EXACTLY as they appear in UI:
    "Run analysis on abstracts", "Run analysis on titles",
    "Proceed to searching for themes", "Proceed to reviewing themes",
    "Proceed to defining themes", "Proceed to producing the report"

10 TOOLS (internal Python names; present to user using B&C terminology):
  DETERMINISTIC (reproducible — same input → same output):
    1. load_scopus_csv — Phase 1: load data corpus, clean items,
       count data extracts
    2. run_bertopic_discovery — Phase 2: embed extracts, generate initial
       codes via Agglomerative Clustering
       (cosine distance 0.50), identify orphans
    4. reassign_sentences — Phase 2: move data extracts between codes
    5. consolidate_into_themes — Phase 3: collate initial codes into
       candidate themes
    6. compute_saturation — Phase 4: compute coverage, coherence, and
       balance metrics to review themes
    7. generate_theme_profiles — Phase 5: retrieve top-5 representative
       extracts per theme for definition
    9. generate_comparison_csv — Phase 6: produce convergence/divergence
       table (abstracts vs titles) on PAJAIS

  LLM-DEPENDENT (grounded in real data, reviewer MUST approve):
    3. label_topics_with_llm — Phase 2: name initial codes using Mistral
    8. compare_with_taxonomy — Phase 5.5: map themes to PAJAIS 25
    10. export_narrative — Phase 6: draft scholarly narrative

BRAUN & CLARKE 6-PHASE METHODOLOGY:

PHASE 1 — FAMILIARISATION WITH THE DATA (runs ONCE)
  "Transcription of verbal data (if necessary), reading and re-reading
  the data, noting down initial ideas." (B&C, 2006, p.87)

  Operationalisation: Load the data corpus, clean publisher boilerplate
  from data items, split items into data extracts (sentences), and
  compute corpus statistics.

  The user message may contain a [CSV: /path/to/file.csv] prefix on
  EVERY message (the UI sends it for context). This does NOT mean
  reload the file. Call load_scopus_csv ONCE only, on the first message.
  Remember the .clean.parquet path returned; reuse it for all
  subsequent tool calls.

  Output format (USE EXACT WORDING — do NOT paraphrase):
  "Loaded data corpus: N data items, M data extracts after cleaning
  K boilerplate patterns.

  Click **Run analysis on abstracts** or **Run analysis on titles**
  below to begin Phase 2 (Generating Initial Codes)."

  CRITICAL: Always say "data items" (not "papers"), "data extracts"
  (not "sentences"), and always reference the EXACT button labels
  "Run analysis on abstracts" / "Run analysis on titles" — not
  "type 'run abstract'" which is old instruction and does not match
  any UI element.
  STOP. Wait.

PHASE 2 — GENERATING INITIAL CODES
  "Coding interesting features of the data in a systematic fashion
  across the entire data set, collating data relevant to each code."
  (B&C, 2006, p.87)

  Operationalisation: Embed each data extract into a 384-dimensional
  vector (Sentence-BERT), cluster using Agglomerative Clustering with
  cosine distance threshold 0.50, enforce minimum 5 extracts per code.
  Extracts in dissolved codes become orphan extracts (label=-1).

  Call run_bertopic_discovery FIRST (generates initial codes).
  Then IMMEDIATELY call label_topics_with_llm (names initial codes).
  BOTH tools must run before stopping — the reviewer needs to see
  LABELLED initial codes, not numeric IDs.

  Report format (USE EXACT WORDING):
  "Generated N initial codes from M data extracts (X orphan extracts
  did not fit any code — minimum 5 extracts required per code).
  Labelled all N initial codes using Mistral.

  Initial codes are loaded in the Review Table below. Please
  review, edit if needed, and click **Submit Review**. Then click
  **Proceed to searching for themes** to begin Phase 3."

  STOP GATE 1 (Initial Code Quality):
    SG1-A: fewer than 5 initial codes
    SG1-B: average confidence < 0.40
    SG1-C: > 40% of codes are generic placeholders
    SG1-D: duplicate code labels
  [WAITING FOR REVIEW TABLE]. STOP.
  On Submit Review: if Move To values exist, call reassign_sentences
  to move extracts between initial codes.

PHASE 3 — SEARCHING FOR THEMES
  "Collating codes into potential themes, gathering all data relevant
  to each potential theme." (B&C, 2006, p.87)

  Operationalisation: Call consolidate_into_themes — merges semantically
  related initial codes into candidate themes using centroid similarity,
  produces a hierarchical thematic map.

  Report format (USE EXACT WORDING):
  "Collated N initial codes into K candidate themes. Thematic map
  saved.

  Candidate themes are loaded in the Review Table below. Please
  review, edit if needed, and click **Submit Review**. Then click
  **Proceed to reviewing themes** to begin Phase 4."

  STOP GATE 2 (Candidate Theme Coherence):
    SG2-A: fewer than 3 candidate themes
    SG2-B: any singleton theme (only 1 code)
    SG2-C: duplicate candidate themes
    SG2-D: total data coverage < 50%
  [WAITING FOR REVIEW TABLE]. STOP.

PHASE 4 — REVIEWING THEMES
  "Checking if the themes work in relation to the coded extracts
  (Level 1) and the entire data set (Level 2), generating a thematic
  'map' of the analysis." (B&C, 2006, p.87)

  Operationalisation: Call compute_saturation to compute Level 1
  metrics (intra-theme coherence against member extracts) and Level 2
  metrics (coverage of entire data set, theme balance). NEVER compute
  these numbers yourself — always present the EXACT values returned
  by the tool.

  Report format (USE EXACT WORDING):
  "Theme review complete.
  Level 1 (extract-level): mean intra-theme coherence = X.
  Level 2 (corpus-level): data coverage = Y%, theme balance = Z.

  Theme review metrics are loaded in the Review Table below. Please
  review, edit if needed, and click **Submit Review**. Then click
  **Proceed to defining themes** to begin Phase 5."

  STOP GATE 3 (Theme Review Adequacy):
    SG3-A: Level 2 coverage < 60%
    SG3-B: any single theme covers > 60% of data items
    SG3-C: Level 1 coherence < 0.30
    SG3-D: fewer than 3 themes survived review
  [WAITING FOR REVIEW TABLE]. STOP.

PHASE 5 — DEFINING AND NAMING THEMES
  "Ongoing analysis to refine the specifics of each theme, and the
  overall story the analysis tells, generating clear definitions and
  names for each theme." (B&C, 2006, p.87)

  Operationalisation: Call generate_theme_profiles to retrieve the
  top-5 representative data extracts per theme (nearest to centroid).
  NEVER recall extract text from memory — always present the EXACT
  extracts returned by the tool. Propose definitions based on these
  real extracts.

  Report format (USE EXACT WORDING):
  "Generated definitions and names for K themes based on the top-5
  most representative data extracts per theme.

  Theme definitions are loaded in the Review Table below. Please
  review, edit if needed, and click **Submit Review**. Then click
  **Proceed to producing the report** to begin Phase 6."

  [WAITING FOR REVIEW TABLE]. STOP.

PHASE 5.5 — TAXONOMY ALIGNMENT (extension to B&C)
  Call compare_with_taxonomy to map defined themes to the PAJAIS 25
  information-systems research categories (Jiang et al., 2019) for
  deductive validation.

  STOP GATE 4 (Taxonomy Alignment Quality):
    SG4-A: any theme maps to zero categories
    SG4-B: > 30% of alignment scores < 0.40
    SG4-C: single PAJAIS category covers > 50% of themes
    SG4-D: incomplete alignment
  [WAITING FOR REVIEW TABLE]. STOP.

PHASE 6 — PRODUCING THE REPORT
  "The final opportunity for analysis. Selection of vivid, compelling
  extract examples, final analysis of selected extracts, relating
  back of the analysis to the research question and literature,
  producing a scholarly report of the analysis." (B&C, 2006, p.87)

  Operationalisation: Call generate_comparison_csv (convergence/
  divergence summary). Present summary, stop for review.

  STOP GATE 5 (Comparison Review):
    Reviewer confirms convergence/divergence pattern is meaningful.
  [WAITING FOR REVIEW TABLE]. STOP.

  Then call export_narrative (scholarly 500-word narrative using
  selected vivid extracts).

  STOP GATE 6 (Scholarly Report Approval):
    Reviewer approves final written narrative.
  [WAITING FOR REVIEW TABLE]. STOP.
  DONE — all 6 STOP gates passed, analysis complete.

6 STOP GATES:
  STOP-1 (Phase 2)   : Initial Code Quality
  STOP-2 (Phase 3)   : Candidate Theme Coherence
  STOP-3 (Phase 4)   : Theme Review Adequacy
  STOP-4 (Phase 5.5) : Taxonomy Alignment Quality
  STOP-5 (Phase 6)   : Comparison Review
  STOP-6 (Phase 6)   : Scholarly Report Approval
"""
|
| 318 |
+
|
| 319 |
+
# Deterministic decoding (temperature=0) so labelling and narrative calls are
# reproducible; 8192 max tokens leaves room for long phase reports.
llm = ChatMistralAI(model="mistral-large-latest", temperature=0, max_tokens=8192)

# In-process checkpointer: conversation state persists across turns for the
# lifetime of the process, keyed by thread_id (see run() below).
memory = InMemorySaver()

# Tool-calling agent wired with all 10 thematic-analysis tools and the
# Braun & Clarke system prompt.
agent = create_agent(
    model=llm,
    tools=ALL_TOOLS,
    system_prompt=SYSTEM_PROMPT,
    checkpointer=memory,
)
|
| 329 |
+
|
| 330 |
+
|
| 331 |
+
def run(user_message: str, thread_id: str = "default") -> str:
    """Invoke the agent for one conversation turn.

    Args:
        user_message: The user's message for this turn.
        thread_id: Checkpointer key so multi-turn state is kept per session.

    Returns:
        The text content of the agent's final message, or "" when the agent
        produced no messages.
    """
    config = {"configurable": {"thread_id": thread_id}}
    payload = {"messages": [{"role": "user", "content": user_message}]}
    result = agent.invoke(payload, config=config)
    msgs = result.get("messages", [])
    if not msgs:
        return ""
    content = msgs[-1].content
    # LangChain message content may be a list of content blocks; the original
    # `(msgs and msgs[-1].content) or ""` could leak a list despite the -> str
    # annotation. Flatten any text parts to honour the contract.
    if isinstance(content, list):
        return "".join(
            part.get("text", "") if isinstance(part, dict) else str(part)
            for part in content
        )
    return content or ""
|
app.py
ADDED
|
@@ -0,0 +1,548 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""
|
| 2 |
+
app.py β Braun & Clarke (2006) Thematic Analysis Agent UI.
|
| 3 |
+
|
| 4 |
+
Implements the 6-phase reflexive thematic analysis procedure from
|
| 5 |
+
Braun, V., & Clarke, V. (2006). Using thematic analysis in psychology.
|
| 6 |
+
Qualitative Research in Psychology, 3(2), 77-101.
|
| 7 |
+
|
| 8 |
+
Three UX features:
|
| 9 |
+
1. Phase banner β large prominent display of current B&C phase
|
| 10 |
+
2. Dynamic phase actions β only actions valid for current phase shown
|
| 11 |
+
3. Auto-populated review table β loads from tool checkpoint files
|
| 12 |
+
|
| 13 |
+
9-column review table: #, Code/Theme Label, Data Extract, Extracts,
|
| 14 |
+
Data Items, Approve, Rename To, Move To, Analytic Memo.
|
| 15 |
+
"""
|
| 16 |
+
|
| 17 |
+
import gradio as gr
|
| 18 |
+
import pandas as pd
|
| 19 |
+
import json
|
| 20 |
+
import os
|
| 21 |
+
import re
|
| 22 |
+
import tempfile
|
| 23 |
+
from datetime import datetime
|
| 24 |
+
from pathlib import Path
|
| 25 |
+
from agent import run as agent_run
|
| 26 |
+
|
| 27 |
+
THREAD_ID = f"thematic-analysis-{datetime.now().strftime('%Y%m%d%H%M%S')}"
|
| 28 |
+
|
| 29 |
+
REVIEW_COLS = [
|
| 30 |
+
"#", "Code / Theme Label", "Data Extract", "Extracts", "Data Items",
|
| 31 |
+
"Approve", "Rename To", "Move To", "Analytic Memo",
|
| 32 |
+
]
|
| 33 |
+
|
| 34 |
+
EMPTY_TABLE = pd.DataFrame(
|
| 35 |
+
{"#": ["-"], "Code / Theme Label": ["No codes yet β run analysis first"],
|
| 36 |
+
"Data Extract": [""], "Extracts": [""], "Data Items": [""],
|
| 37 |
+
"Approve": [""], "Rename To": [""], "Move To": [""], "Analytic Memo": [""]},
|
| 38 |
+
)
|
| 39 |
+
|
| 40 |
+
# Phase number -> (banner title, progress bar, next-step instruction).
# NOTE(review): progress-bar glyphs restored from mojibake; source showed
# "π¦"/"β¬" which decode as blue/white squares — confirm original emoji.
PHASE_INFO = {
    0: ("Getting started", "⬜⬜⬜⬜⬜⬜",
        "Upload your Scopus CSV data set, then click **Analyse my data set**"),
    1: ("Phase 1 — Familiarisation with the Data", "🟦⬜⬜⬜⬜⬜",
        "Click **Run analysis on abstracts** or **Run analysis on titles** "
        "to begin familiarisation with the data corpus"),
    2: ("Phase 2 — Generating Initial Codes", "🟦🟦⬜⬜⬜⬜",
        "Review initial codes in the table below. Edit Approve / Rename / "
        "Move extracts, then click **Submit Review** to collate codes into themes"),
    3: ("Phase 3 — Searching for Themes", "🟦🟦🟦⬜⬜⬜",
        "Review candidate themes (collated initial codes). Edit the table "
        "and click **Submit Review** to proceed to theme review"),
    4: ("Phase 4 — Reviewing Themes", "🟦🟦🟦🟦⬜⬜",
        "Review themes against coded extracts (Level 1) and the entire "
        "data set (Level 2). Click **Submit Review** to confirm"),
    5: ("Phase 5 — Defining and Naming Themes", "🟦🟦🟦🟦🟦⬜",
        "Review theme definitions and names. Edit and click **Submit Review**"),
    6: ("Phase 6 — Producing the Report", "🟦🟦🟦🟦🟦🟦",
        "Review the scholarly report and thematic map. "
        "**Submit Review** to finalise"),
}

# Phase number -> quick-prompt button labels offered for that phase.
PHASE_PROMPTS = {
    0: ["Analyse my data set"],
    1: ["Run analysis on abstracts", "Run analysis on titles",
        "Show data corpus statistics"],
    2: ["Proceed to searching for themes", "Show initial codes",
        "How many orphan extracts?"],
    3: ["Proceed to reviewing themes", "Show candidate themes",
        "Explain theme collation"],
    4: ["Proceed to defining themes", "Show thematic map"],
    5: ["Proceed to producing the report", "Show theme definitions",
        "Compare themes with PAJAIS taxonomy"],
    6: ["Produce final scholarly report", "Show comparison table",
        "Export all results"],
}
|
| 76 |
+
|
| 77 |
+
# Markdown shown in the "Methodology References" tab. Reconstructed from a
# mojibake-damaged source; section/link emoji are best-effort restorations
# (source showed "π"-style bytes) — TODO confirm original glyphs.
REFERENCES_MD = """
## Methodology References

Click any link to open the paper in a new tab. These are the foundational
papers you can cite in your methodology section.

---

### 📚 Thematic Analysis (the method)

**Braun, V., & Clarke, V. (2006).** Using thematic analysis in psychology.
*Qualitative Research in Psychology*, 3(2), 77–101.
🔗 [DOI: 10.1191/1478088706qp063oa](https://doi.org/10.1191/1478088706qp063oa)

> The foundational paper defining the six-phase reflexive thematic
> analysis procedure. Cite this as the primary methodology reference.
> Every phase name, terminology, and review step in this agent maps
> directly to the procedures on pp. 87–93.

**Braun, V., & Clarke, V. (2019).** Reflecting on reflexive thematic analysis.
*Qualitative Research in Sport, Exercise and Health*, 11(4), 589–597.
🔗 [DOI: 10.1080/2159676X.2019.1628806](https://doi.org/10.1080/2159676X.2019.1628806)

> A later clarification emphasising the reflexive, recursive, and
> researcher-in-the-loop nature of the method. Useful for defending
> the human-approval design of this agent.

**Braun, V., & Clarke, V. (2021).** One size fits all? What counts as
quality practice in (reflexive) thematic analysis? *Qualitative Research
in Psychology*, 18(3), 328–352.
🔗 [DOI: 10.1080/14780887.2020.1769238](https://doi.org/10.1080/14780887.2020.1769238)

> Quality criteria for thematic analysis — useful for defending the
> STOP gate design as reviewer-approval checkpoints.

---

### 🧠 Embedding Model (Sentence-BERT)

**Reimers, N., & Gurevych, I. (2019).** Sentence-BERT: Sentence Embeddings
using Siamese BERT-Networks. *Proceedings of EMNLP-IJCNLP 2019*.
🔗 [arXiv: 1908.10084](https://arxiv.org/abs/1908.10084)

> The paper behind `sentence-transformers/all-MiniLM-L6-v2`, the embedding
> model used to convert data extracts into 384-dimensional vectors.
> Establishes cosine similarity as the canonical comparison metric for
> SBERT embeddings — justifies our use of cosine distance.

---

### 🔬 Topic Modelling Framework (BERTopic)

**Grootendorst, M. (2022).** BERTopic: Neural topic modeling with a
class-based TF-IDF procedure. *arXiv preprint*.
🔗 [arXiv: 2203.05794](https://arxiv.org/abs/2203.05794)

> The BERTopic framework. Our approach follows its documented
> Agglomerative Clustering configuration with `distance_threshold=0.5`
> as a substitute for HDBSCAN when fine-grained control over code
> granularity is required.

---

### ⚙️ Clustering Algorithm (scikit-learn)

**Pedregosa, F., et al. (2011).** Scikit-learn: Machine Learning in Python.
*Journal of Machine Learning Research*, 12, 2825–2830.
🔗 [JMLR](https://jmlr.org/papers/v12/pedregosa11a.html)

> Cite this for `sklearn.cluster.AgglomerativeClustering` with
> `metric='cosine'`, `linkage='average'`, `distance_threshold=0.50`.

**Müllner, D. (2011).** Modern hierarchical, agglomerative clustering
algorithms. *arXiv preprint*.
🔗 [arXiv: 1109.2378](https://arxiv.org/abs/1109.2378)

> Comprehensive reference for agglomerative clustering algorithms and
> linkage methods — useful for justifying the choice of `average`
> linkage over `ward` for cosine-distance data.

---

### 🤖 Language Model (Mistral)

**Jiang, A. Q., et al. (2023).** Mistral 7B. *arXiv preprint*.
🔗 [arXiv: 2310.06825](https://arxiv.org/abs/2310.06825)

> The family of LLMs used for initial code labelling and narrative
> generation. Our agent uses `mistral-large-latest` for these
> LLM-dependent tool calls.

---

### 🔗 LangChain / LangGraph

**Chase, H., et al. (2023).** LangChain. *GitHub repository*.
🔗 [github.com/langchain-ai/langchain](https://github.com/langchain-ai/langchain)

**Chase, H., et al. (2024).** LangGraph. *GitHub repository*.
🔗 [github.com/langchain-ai/langgraph](https://github.com/langchain-ai/langgraph)

> The agent orchestration framework. `create_agent` (LangChain v1)
> with `InMemorySaver` (LangGraph) provides the stateful multi-turn
> conversation with tool-use capability underlying this agent.

---

### 🎨 User Interface (Gradio)

**Abid, A., et al. (2019).** Gradio: Hassle-free sharing and testing of
ML models in the wild. *arXiv preprint*.
🔗 [arXiv: 1906.02569](https://arxiv.org/abs/1906.02569)

> The web UI framework. This application uses Gradio 6.x components:
> `gr.Blocks`, `gr.Chatbot`, `gr.Dataframe`, `gr.File`, etc.

---

## How to cite this agent in your report

> "Thematic analysis was conducted following Braun and Clarke's (2006)
> six-phase reflexive procedure, computationally assisted using a
> researcher-in-the-loop agent. Data extracts were embedded using
> `all-MiniLM-L6-v2` (Reimers & Gurevych, 2019), clustered with
> `sklearn.cluster.AgglomerativeClustering` (Pedregosa et al., 2011)
> using `metric='cosine'`, `linkage='average'`, and
> `distance_threshold=0.50`, following the Agglomerative Clustering
> configuration documented in the BERTopic framework (Grootendorst, 2022).
> Initial code labels and the final scholarly narrative were generated
> using `mistral-large-latest` (Jiang et al., 2023). At every phase
> boundary, the researcher reviewed and approved computational outputs
> via a structured review table before the analysis advanced, preserving
> the reflexive, recursive, and analyst-led character of thematic
> analysis (Braun & Clarke, 2019; 2021)."
"""
|
| 212 |
+
|
| 213 |
+
|
| 214 |
+
def _prompt_button_updates(phase: int) -> tuple:
    """Return gr.update values for the 4 phase-specific prompt buttons.

    Only prompts relevant to the given phase are shown; buttons without a
    prompt are hidden (visible=False) so the UI stays clean.

    Args:
        phase: Current Braun & Clarke phase number (0 = not started).

    Returns:
        Tuple of 4 gr.update objects for btn1, btn2, btn3, btn4.
    """
    # Pad with empty labels so there are always exactly 4 entries.
    padded = PHASE_PROMPTS.get(phase, PHASE_PROMPTS[0]) + [""] * 4
    labels = padded[:4]
    # An empty label hides its button; a non-empty one shows and relabels it.
    return tuple(
        map(lambda text: gr.update(value=text, visible=bool(text)), labels)
    )
|
| 228 |
+
|
| 229 |
+
_path = lambda file: str(
|
| 230 |
+
(hasattr(file, "name") and file.name)
|
| 231 |
+
or (isinstance(file, str) and file)
|
| 232 |
+
or ""
|
| 233 |
+
)
|
| 234 |
+
_name = lambda file: os.path.basename(_path(file))
|
| 235 |
+
|
| 236 |
+
|
| 237 |
+
def _extract_phase(text: str) -> int:
|
| 238 |
+
"""Extract phase number from agent response. Returns 0 if not found."""
|
| 239 |
+
found = re.findall(r"Phase (\d)", str(text))
|
| 240 |
+
return int((found or ["0"])[0])
|
| 241 |
+
|
| 242 |
+
|
| 243 |
+
def _phase_banner(num: int) -> str:
    """Build the prominent markdown banner for a phase.

    Args:
        num: Phase number; unknown numbers fall back to PHASE_INFO[0].

    Returns:
        Markdown string with progress heading and next-step instruction.
    """
    name, progress, instruction = PHASE_INFO.get(num, PHASE_INFO[0])
    heading = f"## {progress} {name}"
    next_step = f"**NEXT STEP β** {instruction}"
    return heading + "\n\n" + next_step
|
| 250 |
+
|
| 251 |
+
|
| 252 |
+
def _load_review_table(base_dir: str) -> pd.DataFrame:
    """Load latest checkpoint file into the 9-column review table.

    Scans base_dir for topic_labels.json, themes.json, taxonomy_alignment.json,
    summaries.json. Loads the most recently modified one and formats it.
    Returns EMPTY_TABLE if nothing found.
    """
    # Placeholder path keeps Path() construction valid when base_dir is "".
    base = Path(str(base_dir or "/tmp/nonexistent_dir_placeholder"))
    # and/or chain (project style avoids if-statements): a falsy base_dir or
    # a missing directory short-circuits to [] via the trailing `or []`.
    # glob() with a literal filename matches at most that single file.
    candidates = (
        base_dir and base.exists() and sorted(
            (
                list(base.glob("topic_labels.json"))
                + list(base.glob("themes.json"))
                + list(base.glob("taxonomy_alignment.json"))
                + list(base.glob("summaries.json"))
            ),
            key=lambda p: p.stat().st_mtime,  # newest checkpoint first
            reverse=True,
        )
    ) or []

    # candidates[:1] is [] or [newest]; `or [None]` avoids an IndexError.
    latest = (candidates[:1] or [None])[0]
    # Single-element-list trick selects between the two branches without if.
    return (latest and [_format_checkpoint(latest)] or [EMPTY_TABLE.copy()])[0]
|
| 275 |
+
|
| 276 |
+
|
| 277 |
+
def _format_checkpoint(path) -> pd.DataFrame:
    """Format a checkpoint JSON file into review table rows.

    Merges data from multiple checkpoint files when available:
    topic_labels.json has labels but no sizes β summaries.json has sizes.
    """
    raw = json.loads(Path(path).read_text())
    base = Path(path).parent

    # Accept either a dict wrapper ({"clusters": [...]} or {"per_theme": [...]})
    # or a bare list; anything else degrades to an empty list.
    data = (isinstance(raw, dict) and raw.get("clusters", raw.get("per_theme", []))) or \
           (isinstance(raw, list) and raw) or []

    # Index summaries.json (if present) by topic_id so each row can pull
    # size/representative data missing from the primary checkpoint.
    summaries_data = {}
    summaries_path = base / "summaries.json"
    summaries_raw = (
        summaries_path.exists() and json.loads(summaries_path.read_text()) or {}
    )
    summaries_list = (
        isinstance(summaries_raw, dict) and summaries_raw.get("clusters", [])
    ) or (isinstance(summaries_raw, list) and summaries_raw) or []
    # map-for-side-effect (project style avoids for-loops); -999 is a
    # sentinel key for entries lacking a topic_id.
    list(map(
        lambda s: summaries_data.update({s.get("topic_id", -999): s}),
        summaries_list,
    ))

    def _row(item: dict) -> dict:
        """Map one JSON item to review table columns, merging summaries data."""
        tid = item.get("topic_id", item.get("theme_id", 0))
        summary = summaries_data.get(tid, {})
        return {
            "#": tid,
            "Code / Theme Label": item.get("label", item.get("theme_label", "")),
            # First non-empty of: item representative, summary representative,
            # item notes β truncated to 150 chars for display.
            "Data Extract": str(
                item.get("representative", "")
                or summary.get("representative", "")
                or item.get("notes", "")
            )[:150],
            "Extracts": item.get("size", 0) or summary.get("size", 0)
                or item.get("total_papers", 0),
            "Data Items": item.get("size", 0) or summary.get("size", 0)
                or item.get("total_papers", 0),
            # Reviewer-editable columns default to approved/blank.
            "Approve": "Yes",
            "Rename To": "",
            "Move To": "",
            "Analytic Memo": str(item.get("rationale",
                                          item.get("notes", ""))),
        }

    # Cap at 200 rows to keep the UI table responsive.
    rows = list(map(_row, data[:200]))
    return (rows and [pd.DataFrame(rows, columns=REVIEW_COLS)] or [EMPTY_TABLE.copy()])[0]
|
| 327 |
+
|
| 328 |
+
|
| 329 |
+
def on_file_upload(file):
    """Extract CSV stats and return updates for info, state, banner, buttons."""
    path = _path(file)
    # UI state used when the file picker is cleared (no path available).
    no_file = (
        "Upload a CSV to begin.",
        "",
        _phase_banner(0),
        *_prompt_button_updates(0),
    )
    return (not path) and no_file or _do_file_upload(path, file)
|
| 337 |
+
|
| 338 |
+
|
| 339 |
+
def _do_file_upload(path: str, file) -> tuple:
    """Process a validated upload: read the CSV and build post-upload UI state.

    Args:
        path: Filesystem path of the uploaded CSV (already validated).
        file: Original Gradio file value (used only for its display name).

    Returns:
        Tuple: (info markdown, base directory, phase-1 banner, 4 button updates).
    """
    df = pd.read_csv(path)
    rows, cols = df.shape
    base = str(Path(path).parent)
    segments = [
        f"**Loaded:** `{_name(file)}`",
        f"**Shape:** {rows:,} rows x {cols} columns",
        f"**Columns:** {', '.join(df.columns[:6].tolist())}",
        "*Click a prompt below and press Send to begin.*",
    ]
    info = "\n\n".join(segments)
    return (info, base, _phase_banner(1), *_prompt_button_updates(1))
|
| 351 |
+
|
| 352 |
+
|
| 353 |
+
def on_send(user_msg, history, file, base_dir):
    """Pass user message to agent. Update banner, table, and prompt buttons.

    Generator with two yields: the first echoes the user message plus a
    "Thinking..." placeholder immediately (gr.skip() leaves the remaining
    outputs untouched); the second swaps in the real agent reply and
    refreshes banner, review table, and phase buttons.
    """
    # Blank input falls back to "help" so the agent always gets a message.
    msg = (user_msg or "").strip() or "help"
    # bool(file) is 0 or 1: the CSV tag is prepended only when a file exists.
    csv_tag = f"[CSV: {_path(file)}]\n" * bool(file)

    history = list(history or [])
    history.append({"role": "user", "content": msg})
    history.append({"role": "assistant", "content": "Thinking..."})
    # First yield: instant UI feedback before the (slow) agent call.
    yield (
        history, "", gr.skip(), gr.skip(), gr.skip(),
        gr.skip(), gr.skip(), gr.skip(), gr.skip(),
    )

    reply = agent_run(csv_tag + msg, thread_id=THREAD_ID)
    history[-1] = {"role": "assistant", "content": reply}  # replace placeholder

    # Derive the new UI state from the reply text and on-disk checkpoints.
    phase = _extract_phase(reply)
    banner = _phase_banner(phase)
    table = _load_review_table(base_dir)
    btn_updates = _prompt_button_updates(phase)

    yield (history, "", banner, table, base_dir, *btn_updates)
|
| 375 |
+
|
| 376 |
+
|
| 377 |
+
def on_submit_review(table_df, history, base_dir):
    """Serialise review table edits to agent. Return updated UI.

    Args:
        table_df: Edited 9-column review DataFrame from gr.Dataframe.
        history: Current chat history (list of role/content dicts).
        base_dir: Directory holding checkpoint JSONs for the table reload.

    Returns:
        Tuple: (history, phase banner markdown, refreshed review table,
        4 prompt-button updates).
    """
    history = list(history or [])
    # Full table as JSON records so the agent sees every edited cell.
    edits = table_df.to_json(orient="records", indent=2)

    history.append({"role": "user", "content": "[REVIEW SUBMITTED]"})
    history.append({"role": "assistant", "content": "Processing review..."})

    reply = agent_run(
        "Reviewer submitted table edits.\n\n"
        f"```json\n{edits}\n```\n\n"
        "Process: Approve/Reject decisions, Rename To values, "
        "Move To reassignments (call reassign_sentences if moves exist), "
        "Reasoning notes. Then check STOP gates and proceed.",
        thread_id=THREAD_ID,
    )
    history[-1] = {"role": "assistant", "content": reply}  # replace placeholder

    phase = _extract_phase(reply)
    return (
        history, _phase_banner(phase), _load_review_table(base_dir),
        *_prompt_button_updates(phase),
    )
|
| 400 |
+
|
| 401 |
+
|
| 402 |
+
def on_download(table_df, history):
    """Export the review table as CSV and the chat transcript as TXT.

    Args:
        table_df: Review table DataFrame to export.
        history: Chat history (list of role/content dicts) or None.

    Returns:
        List of two temp-file paths: [review CSV path, chat TXT path].
    """
    csv_tmp = tempfile.NamedTemporaryFile(delete=False, suffix=".csv", prefix="review_")
    # Close the handle before pandas reopens the file by name: fixes the
    # leaked file descriptor and avoids sharing errors on Windows.
    csv_tmp.close()
    table_df.to_csv(csv_tmp.name, index=False)

    txt_tmp = tempfile.NamedTemporaryFile(delete=False, suffix=".txt", prefix="chat_")
    transcript = "\n\n".join(
        list(map(
            lambda m: f"{m.get('role', '').upper()}: {m.get('content', '')}",
            history or [],
        ))
    )
    txt_tmp.write(transcript.encode("utf-8"))
    txt_tmp.close()
    return [csv_tmp.name, txt_tmp.name]
|
| 418 |
+
|
| 419 |
+
|
| 420 |
+
# ---------------------------------------------------------------------------
# Gradio UI: one Analysis tab (upload β chat β review table β downloads)
# plus a References tab. Event wiring at the bottom routes every input path
# (Send button, Enter key, and the 4 phase buttons) through on_send.
# ---------------------------------------------------------------------------
with gr.Blocks(title="Thematic Analysis Agent") as demo:

    # Directory of the uploaded CSV; checkpoint JSONs are read from here to
    # refresh the review table after each agent turn.
    base_dir_state = gr.State(value="")

    gr.Markdown("# Thematic Analysis Agent")
    gr.Markdown(
        "**Braun & Clarke (2006) 6-Phase Reflexive Thematic Analysis** "
        "| Sentence-BERT Embeddings | Agglomerative Clustering | "
        "Cosine Distance 0.50"
    )

    # Prominent phase/progress banner, updated after every agent reply.
    phase_banner = gr.Markdown(value=_phase_banner(0))

    with gr.Tabs():

        with gr.Tab("π¬ Analysis"):
            # --- Section 1: CSV upload ------------------------------------
            gr.Markdown("---\n### Section 1 β Data Corpus")
            with gr.Row():
                with gr.Column(scale=3):
                    file_input = gr.File(
                        label="Upload data corpus (Scopus CSV)",
                        file_types=[".csv"],
                        file_count="single",
                    )
                with gr.Column(scale=5):
                    file_info = gr.Markdown("Upload a CSV to begin.")

            # --- Section 2: analyst dialogue ------------------------------
            gr.Markdown("---\n### Section 2 β Analyst Dialogue")
            chatbot = gr.Chatbot(label="Thematic Analysis Agent", height=200)
            with gr.Row():
                msg_box = gr.Textbox(
                    placeholder="Type a message or click a phase action below",
                    show_label=False, scale=7, lines=1,
                )
                send_btn = gr.Button("Send", variant="primary", scale=1)

            # Four phase-action buttons; labels/visibility are driven by
            # _prompt_button_updates(phase) after each agent reply.
            gr.Markdown("**Phase actions** (click to proceed β only actions "
                        "valid for the current B&C phase are shown)")
            with gr.Row():
                prompt_btn_1 = gr.Button("Analyse my data set",
                                         variant="secondary", scale=1, size="sm")
                prompt_btn_2 = gr.Button("", variant="secondary", scale=1,
                                         size="sm", visible=False)
                prompt_btn_3 = gr.Button("", variant="secondary", scale=1,
                                         size="sm", visible=False)
                prompt_btn_4 = gr.Button("", variant="secondary", scale=1,
                                         size="sm", visible=False)

            # --- Section 3: editable review table -------------------------
            gr.Markdown("---\n### Section 3 β Initial Codes / Candidate Themes / Themes")
            gr.Markdown(
                "Auto-populated from tool outputs. Labels are **initial codes** "
                "in Phase 2, **candidate themes** in Phase 3, and **themes** in "
                "Phases 4β6. Edit **Approve**, **Rename To**, **Move To**, "
                "**Analytic Memo** columns, then click **Submit Review**."
            )
            review_table = gr.Dataframe(
                value=EMPTY_TABLE,
                headers=REVIEW_COLS,
                datatype=["number", "str", "str", "number", "number",
                          "str", "str", "str", "str"],
                column_count=(9, "fixed"),
                interactive=True,
                wrap=True,
                max_height=400,
            )
            with gr.Row():
                clear_btn = gr.Button("Clear table", variant="secondary", scale=2)
                sub_btn = gr.Button("Submit Review", variant="primary", scale=4)

            with gr.Accordion("Download", open=False):
                dl_btn = gr.Button("Generate downloads", variant="primary")
                dl_files = gr.File(label="Downloads", file_count="multiple",
                                   interactive=False)

        with gr.Tab("π References"):
            gr.Markdown(REFERENCES_MD)

    # --- Event wiring -----------------------------------------------------
    file_input.change(
        on_file_upload,
        inputs=[file_input],
        outputs=[file_info, base_dir_state, phase_banner,
                 prompt_btn_1, prompt_btn_2, prompt_btn_3, prompt_btn_4],
    )
    send_btn.click(
        on_send,
        inputs=[msg_box, chatbot, file_input, base_dir_state],
        outputs=[chatbot, msg_box, phase_banner, review_table, base_dir_state,
                 prompt_btn_1, prompt_btn_2, prompt_btn_3, prompt_btn_4],
    )
    msg_box.submit(
        on_send,
        inputs=[msg_box, chatbot, file_input, base_dir_state],
        outputs=[chatbot, msg_box, phase_banner, review_table, base_dir_state,
                 prompt_btn_1, prompt_btn_2, prompt_btn_3, prompt_btn_4],
    )
    # Each phase button passes ITS OWN label as the user message: note the
    # button component itself is the first element of `inputs`.
    prompt_btn_1.click(
        on_send,
        inputs=[prompt_btn_1, chatbot, file_input, base_dir_state],
        outputs=[chatbot, msg_box, phase_banner, review_table, base_dir_state,
                 prompt_btn_1, prompt_btn_2, prompt_btn_3, prompt_btn_4],
    )
    prompt_btn_2.click(
        on_send,
        inputs=[prompt_btn_2, chatbot, file_input, base_dir_state],
        outputs=[chatbot, msg_box, phase_banner, review_table, base_dir_state,
                 prompt_btn_1, prompt_btn_2, prompt_btn_3, prompt_btn_4],
    )
    prompt_btn_3.click(
        on_send,
        inputs=[prompt_btn_3, chatbot, file_input, base_dir_state],
        outputs=[chatbot, msg_box, phase_banner, review_table, base_dir_state,
                 prompt_btn_1, prompt_btn_2, prompt_btn_3, prompt_btn_4],
    )
    prompt_btn_4.click(
        on_send,
        inputs=[prompt_btn_4, chatbot, file_input, base_dir_state],
        outputs=[chatbot, msg_box, phase_banner, review_table, base_dir_state,
                 prompt_btn_1, prompt_btn_2, prompt_btn_3, prompt_btn_4],
    )
    clear_btn.click(lambda: EMPTY_TABLE.copy(), outputs=[review_table])
    sub_btn.click(
        on_submit_review,
        inputs=[review_table, chatbot, base_dir_state],
        outputs=[chatbot, phase_banner, review_table,
                 prompt_btn_1, prompt_btn_2, prompt_btn_3, prompt_btn_4],
    )
    dl_btn.click(on_download, inputs=[review_table, chatbot], outputs=[dl_files])

# ssr_mode=False: required for Chatbot/Dataframe interactivity in this setup.
demo.launch(ssr_mode=False, theme=gr.themes.Soft())
|
requirements.txt
ADDED
|
@@ -0,0 +1,10 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
gradio>=6.0.0
|
| 2 |
+
langchain>=1.0.0
|
| 3 |
+
langchain-mistralai>=1.0.0
|
| 4 |
+
langgraph>=1.0.0
|
| 5 |
+
sentence-transformers>=3.0.0
|
| 6 |
+
scikit-learn>=1.4.0
|
| 7 |
+
numpy>=1.26.0
|
| 8 |
+
pandas>=2.1.0
|
| 9 |
+
plotly>=5.20.0
|
| 10 |
+
pyarrow>=15.0.0
|
tools.py
ADDED
|
@@ -0,0 +1,1031 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""
|
| 2 |
+
tools.py β 10 @tool functions for Braun & Clarke (2006) computational
|
| 3 |
+
thematic analysis.
|
| 4 |
+
|
| 5 |
+
Pipeline (called in this order by the LLM agent):
|
| 6 |
+
|
| 7 |
+
1. load_scopus_csv β ingest CSV, strip boilerplate, save .parquet
|
| 8 |
+
2. run_bertopic_discovery β embed β cosine agglomerative cluster (min
   MIN_CLUSTER_SIZE members) β centroids β orphan report β 4 charts
|
| 10 |
+
3. label_topics_with_llm β Mistral labels top 100 clusters
|
| 11 |
+
4. reassign_sentences β move orphan/misplaced sentences between clusters
|
| 12 |
+
5. consolidate_into_themes β merge reviewer-approved groups
|
| 13 |
+
6. compute_saturation β coverage %, coherence, balance per theme
|
| 14 |
+
7. generate_theme_profiles β top 5 nearest sentences per theme centroid
|
| 15 |
+
8. compare_with_taxonomy β map themes to PAJAIS 25 categories
|
| 16 |
+
9. generate_comparison_csv β abstract vs title side-by-side
|
| 17 |
+
10. export_narrative β 500-word Section 7 via Mistral
|
| 18 |
+
|
| 19 |
+
Design rules:
|
| 20 |
+
|
| 21 |
+
Every number, percentage, score, or list of sentences presented to the
|
| 22 |
+
reviewer MUST come from a tool β never from the LLM's imagination.
|
| 23 |
+
|
| 24 |
+
Deterministic tools (1,2,4,5,6,7,9): same input β same output, every run.
|
| 25 |
+
LLM-dependent tools (3,8,10): grounded in real data passed via prompt,
|
| 26 |
+
but labels/mappings/narrative may vary slightly between runs.
|
| 27 |
+
All LLM-dependent outputs require reviewer approval before advancing.
|
| 28 |
+
|
| 29 |
+
ZERO if/elif/else β all decisions by the LLM
|
| 30 |
+
ZERO for/while β list(map(...)) and numpy vectorised ops
|
| 31 |
+
ZERO try/except β errors surface to the LLM via ToolNode
|
| 32 |
+
|
| 33 |
+
Constants reference:
|
| 34 |
+
|
| 35 |
+
EMBED_MODEL = "all-MiniLM-L6-v2"
|
| 36 |
+
384d sentence embeddings. Runs locally, no API calls.
|
| 37 |
+
normalize_embeddings=True β cosine similarity = dot product.
|
| 38 |
+
|
| 39 |
+
CLUSTER_THRESHOLD = 0.50
|
| 40 |
+
Cosine distance threshold for Agglomerative Clustering.
|
| 41 |
+
Two sentences must have cosine similarity >= 0.50 to share a code.
|
| 42 |
+
Follows the BERTopic Agglomerative Clustering configuration
|
| 43 |
+
(Grootendorst, 2022) with distance_threshold=0.5 as documented
|
| 44 |
+
in the BERTopic framework. Operationalises Braun & Clarke (2006)
|
| 45 |
+
Phase 2 'Generating Initial Codes' as a reproducible computation.
|
| 46 |
+
|
| 47 |
+
Tighter (e.g. 0.40) β more, finer codes (closer to B&C ideal)
|
| 48 |
+
Looser (e.g. 0.60) β fewer, broader codes
|
| 49 |
+
At 0.50 β balanced granularity following BERTopic docs example.
|
| 50 |
+
|
| 51 |
+
MIN_CLUSTER_SIZE = 5
    Clusters with fewer than MIN_CLUSTER_SIZE members are dissolved.
    Their sentences become orphans (label=-1) reported to the reviewer
    for reassignment.
|
| 54 |
+
|
| 55 |
+
N_CENTROIDS = 200
|
| 56 |
+
Maximum number of clusters saved to summaries.json (and therefore
|
| 57 |
+
labelled and shown in the review table). Set high enough to capture
|
| 58 |
+
all clusters in typical Scopus datasets (1k-5k papers).
|
| 59 |
+
Top clusters extracted for initial discovery report and charts.
|
| 60 |
+
|
| 61 |
+
TOP_TOPICS_LLM = 100
|
| 62 |
+
Maximum clusters sent to Mistral for labelling.
|
| 63 |
+
|
| 64 |
+
NARRATIVE_WORDS = 500
|
| 65 |
+
Target word count for Section 7 narrative.
|
| 66 |
+
|
| 67 |
+
PAJAIS_25
|
| 68 |
+
25 IS research categories from Jiang et al. (2019).
|
| 69 |
+
Used in Phase 5.5 for taxonomy alignment.
|
| 70 |
+
|
| 71 |
+
BOILERPLATE_PATTERNS (9 regexes)
|
| 72 |
+
Strip publisher noise: copyright, DOI, Elsevier, Springer,
|
| 73 |
+
IEEE, Wiley, Taylor & Francis.
|
| 74 |
+
"""
|
| 75 |
+
|
| 76 |
+
from __future__ import annotations
|
| 77 |
+
|
| 78 |
+
import json
|
| 79 |
+
import re
|
| 80 |
+
import numpy as np
|
| 81 |
+
import pandas as pd
|
| 82 |
+
import plotly.graph_objects as go
|
| 83 |
+
|
| 84 |
+
from pathlib import Path
|
| 85 |
+
from langchain_core.tools import tool
|
| 86 |
+
from langchain_mistralai import ChatMistralAI
|
| 87 |
+
from langchain_core.prompts import PromptTemplate
|
| 88 |
+
from langchain_core.output_parsers import JsonOutputParser
|
| 89 |
+
from sentence_transformers import SentenceTransformer
|
| 90 |
+
from sklearn.cluster import AgglomerativeClustering
|
| 91 |
+
from sklearn.metrics.pairwise import cosine_similarity
|
| 92 |
+
from sklearn.preprocessing import normalize
|
| 93 |
+
from sklearn.decomposition import PCA
|
| 94 |
+
|
| 95 |
+
|
| 96 |
+
# Which CSV column feeds each analysis run mode.
RUN_CONFIGS = {
    "abstract": ["Abstract"],
    "title": ["Title"],
}

# 25 IS research categories (Jiang et al., 2019) used for taxonomy alignment
# in Phase 5.5 (compare_with_taxonomy).
PAJAIS_25 = [
    "Accounting Information Systems",
    "Artificial Intelligence & Expert Systems",
    "Big Data & Analytics",
    "Business Intelligence & Decision Support",
    "Cloud Computing",
    "Cybersecurity & Privacy",
    "Database Management",
    "Digital Transformation",
    "E-Business & E-Commerce",
    "Enterprise Resource Planning",
    "Fintech & Digital Finance",
    "Geographic Information Systems",
    "Health Informatics",
    "Human-Computer Interaction",
    "Information Systems Development",
    "IT Governance & Management",
    "IT Strategy & Competitive Advantage",
    "Knowledge Management",
    "Machine Learning & Deep Learning",
    "Mobile Computing",
    "Natural Language Processing",
    "Recommender Systems",
    "Social Media & Web 2.0",
    "Supply Chain & Logistics IS",
    "Virtual Reality & Augmented Reality",
]

# Publisher-noise regexes stripped from abstracts/titles before embedding:
# copyright notices, DOI prefixes, and publisher tags.
BOILERPLATE_PATTERNS = [
    r"Β©\s*\d{4}",
    r"all rights reserved",
    r"published by elsevier",
    r"this article is protected",
    r"doi:\s*10\.\d{4,}",
    r"springer nature",
    r"ieee xplore",
    r"wiley online library",
    r"taylor & francis",
]

# Single case-insensitive union of all boilerplate patterns.
BOILERPLATE_RE = re.compile("|".join(BOILERPLATE_PATTERNS), flags=re.IGNORECASE)
# Split after terminal punctuation (., !, ?) followed by whitespace.
SENTENCE_SPLIT_RE = re.compile(r"(?<=[.!?])\s+")
EMBED_MODEL = "all-MiniLM-L6-v2"  # local 384-d sentence encoder (no API)
N_CENTROIDS = 200                 # max clusters saved to summaries.json
CLUSTER_THRESHOLD = 0.50          # cosine-distance merge cutoff for clustering
# NOTE(review): module docstring and pipeline comments describe this as 3,
# but the code value is 5 β confirm which is intended.
MIN_CLUSTER_SIZE = 5
TOP_TOPICS_LLM = 100              # max clusters sent to Mistral for labelling
NARRATIVE_WORDS = 500             # target word count for Section 7 narrative
|
| 150 |
+
|
| 151 |
+
def _clean_text(text: str) -> str:
    """Strip publisher boilerplate from one abstract/title string.

    Applies the 9-pattern BOILERPLATE_RE to remove copyright notices,
    DOI prefixes, and publisher tags that would pollute embeddings.

    Args:
        text: Raw abstract or title (coerced to str).

    Returns:
        The cleaned string with boilerplate removed and whitespace trimmed.
    """
    cleaned = BOILERPLATE_RE.sub("", str(text))
    return cleaned.strip()
|
| 164 |
+
|
| 165 |
+
|
| 166 |
+
def _sentence_count(text: str) -> int:
    """Count sentences by splitting on terminal punctuation + whitespace.

    Args:
        text: Cleaned abstract or title text.

    Returns:
        Number of regex-delimited segments (>= 1 for any input, since
        re.split always yields at least one element).
    """
    segments = SENTENCE_SPLIT_RE.split(text.strip())
    return len(segments)
|
| 176 |
+
|
| 177 |
+
|
| 178 |
+
def _embed(texts: list[str]) -> np.ndarray:
    """Encode texts as 384-d unit vectors with all-MiniLM-L6-v2 (local).

    Runs entirely on-device β no API calls. normalize_embeddings=True makes
    cosine similarity equal to a plain dot product downstream.

    Args:
        texts: N cleaned text strings.

    Returns:
        float32 array of shape (N, 384) with L2-normalized rows.
    """
    encoder = SentenceTransformer(EMBED_MODEL)
    vectors = encoder.encode(
        texts, show_progress_bar=False, normalize_embeddings=True
    )
    return np.array(vectors, dtype=np.float32)
|
| 193 |
+
|
| 194 |
+
|
| 195 |
+
def _cosine_cluster(matrix: np.ndarray, threshold: float, min_size: int) -> np.ndarray:
    """Cluster embeddings using agglomerative cosine clustering.

    Works DIRECTLY in 384d space β no UMAP. After clustering, any cluster
    with fewer than min_size members is dissolved: its sentences get
    label=-1 (orphan) and are reported to the reviewer for reassignment.

    Algorithm:
        1. Start: every text is its own cluster.
        2. Merge the two closest clusters (average cosine distance).
        3. Repeat until smallest distance exceeds threshold.
        4. Post-process: dissolve clusters smaller than min_size.

    Args:
        matrix: (N, 384) embedding matrix, L2-normalized.
        threshold: Max average cosine distance for merging (callers pass
            CLUSTER_THRESHOLD = 0.50).
        min_size: Minimum members per cluster (callers pass
            MIN_CLUSTER_SIZE); smaller clusters become orphans.

    Returns:
        np.ndarray shape (N,) with integer labels. -1 = orphan.
    """
    # Defensive re-normalization; a no-op when rows already come from _embed.
    normed = normalize(matrix, norm="l2")
    model = AgglomerativeClustering(
        n_clusters=None,  # grow until distance_threshold stops merging
        metric="cosine",
        linkage="average",
        distance_threshold=threshold,
    )
    labels = model.fit_predict(normed).astype(int)
    # Dissolve under-sized clusters: every member is relabelled orphan (-1).
    unique, counts = np.unique(labels, return_counts=True)
    small_clusters = unique[counts < min_size]
    return np.where(np.isin(labels, small_clusters), -1, labels)
|
| 227 |
+
|
| 228 |
+
|
| 229 |
+
def _centroid(vecs: np.ndarray) -> np.ndarray:
    """Return the L2-normalized mean vector (average direction) of a cluster.

    Args:
        vecs: (M, 384) matrix of member embeddings for one cluster.

    Returns:
        1-d np.ndarray of shape (384,), unit length.
    """
    mean_vec = vecs.mean(axis=0, keepdims=True)
    return normalize(mean_vec, norm="l2")[0]
|
| 239 |
+
|
| 240 |
+
|
| 241 |
+
def _top_n_centroids(matrix: np.ndarray, labels: np.ndarray, n: int) -> list[dict]:
    """Summarise the n largest clusters (descending size), excluding orphans.

    Args:
        matrix: (N, 384) full embedding matrix.
        labels: (N,) integer cluster labels (-1 = orphan, not ranked).
        n: How many top clusters to return.

    Returns:
        List of up to n dicts with keys: label, size, indices, centroid.
    """
    members = labels[labels >= 0]  # drop orphans before ranking
    cluster_ids, cluster_sizes = np.unique(members, return_counts=True)
    biggest_first = np.argsort(cluster_sizes)[::-1][:n]
    selected = cluster_ids[biggest_first]

    def _summarise(cluster_id: int) -> dict:
        """Build the summary dict for a single cluster id."""
        member_idx = np.where(labels == cluster_id)[0].tolist()
        return {
            "label": int(cluster_id),
            "size": len(member_idx),
            "indices": member_idx,
            "centroid": _centroid(matrix[member_idx]),
        }

    return list(map(_summarise, selected))
|
| 271 |
+
|
| 272 |
+
|
| 273 |
+
def _mistral_chain(template_str: str):
    """Assemble a prompt -> Mistral -> JSON-parser LangChain pipeline.

    Args:
        template_str: Prompt template with {variable} placeholders.

    Returns:
        LangChain Runnable chain that accepts dict and returns parsed JSON.
    """
    prompt = PromptTemplate.from_template(template_str)
    # temperature=0 keeps labelling/alignment output reproducible.
    model = ChatMistralAI(model="mistral-large-latest", temperature=0)
    parser = JsonOutputParser()
    return prompt | model | parser
|
| 285 |
+
|
| 286 |
+
|
| 287 |
+
def _dark_layout(title: str) -> dict:
|
| 288 |
+
"""Return Plotly layout dict with dark theme styling.
|
| 289 |
+
|
| 290 |
+
Args:
|
| 291 |
+
title: Chart title string.
|
| 292 |
+
|
| 293 |
+
Returns:
|
| 294 |
+
Dict for fig.update_layout(**_dark_layout("...")).
|
| 295 |
+
"""
|
| 296 |
+
return dict(
|
| 297 |
+
title=title, paper_bgcolor="#0F172A", plot_bgcolor="#0F172A",
|
| 298 |
+
font=dict(color="#CBD5E1", family="Sora,sans-serif"),
|
| 299 |
+
margin=dict(t=50, b=40, l=40, r=20),
|
| 300 |
+
)
|
| 301 |
+
|
| 302 |
+
|
| 303 |
+
@tool
def load_scopus_csv(csv_path: str, run_mode: str = "abstract") -> str:
    """Load a Scopus CSV, count papers/sentences, apply boilerplate filter.

    Phase 1 – Familiarisation with the Data. DETERMINISTIC.

    Steps:
    1. Read CSV, drop rows where target column is null
    2. Apply 9-pattern boilerplate regex to clean each text
    3. Count sentences per paper
    4. Save cleaned DataFrame as .parquet

    Args:
        csv_path: Path to raw Scopus CSV.
        run_mode: 'abstract' or 'title'.

    Returns:
        JSON: total_papers, total_sentences, columns_used,
        boilerplate_removed, cleaned_parquet, run_mode.
    """
    cols = RUN_CONFIGS[run_mode]
    target = cols[0]

    # Rows with an empty target column carry no analysable text.
    df = pd.read_csv(csv_path).dropna(subset=[target]).reset_index(drop=True)
    raw_texts = df[target].tolist()
    cleaned_texts = [_clean_text(text) for text in raw_texts]

    # A paper counts as "boilerplate removed" when cleaning changed its text.
    boilerplate_removed = sum(
        1 for before, after in zip(raw_texts, cleaned_texts) if before != after
    )

    df[f"{target}_clean"] = cleaned_texts
    df["sentence_count"] = [_sentence_count(text) for text in cleaned_texts]

    out_path = Path(csv_path).with_suffix(".clean.parquet")
    df.to_parquet(out_path, index=False)

    stats = {
        "total_papers": len(df),
        "total_sentences": int(df["sentence_count"].sum()),
        "columns_used": cols,
        "boilerplate_removed": boilerplate_removed,
        "cleaned_parquet": str(out_path),
        "run_mode": run_mode,
    }
    return json.dumps(stats, indent=2)
|
| 349 |
+
|
| 350 |
+
|
| 351 |
+
@tool
def run_bertopic_discovery(parquet_path: str, run_mode: str = "abstract") -> str:
    """Embed texts, cluster them, report orphans, generate charts.

    Phase 2 – Generating Initial Codes. DETERMINISTIC.

    Steps:
    1. Load cleaned parquet, drop Author Keywords columns (RULE 8)
    2. Embed all texts → N x 384 matrix of unit vectors
    3. Save embedding matrix as .emb.npy
    4. Cluster in 384d space (NO UMAP), min 3 members per cluster
    5. Sentences in clusters < 3 members become orphans (label=-1)
    6. Extract top-N clusters by size, compute centroids
    7. Save summaries.json with clusters + orphan list
    8. Generate 4 Plotly HTML charts

    Args:
        parquet_path: Path to .clean.parquet from load_scopus_csv.
        run_mode: 'abstract' or 'title'.

    Returns:
        JSON: total_clusters, orphan_count, summaries_json, embeddings_npy,
        charts dict.
    """
    cols = RUN_CONFIGS[run_mode]
    target = f"{cols[0]}_clean"

    # FIX: read the parquet ONCE. The previous version called
    # pd.read_parquet a second time inside the drop() call just to
    # enumerate the columns, doubling the file I/O.
    df = pd.read_parquet(parquet_path)
    # RULE 8: author/keyword columns must not leak into the analysis.
    df = df.drop(
        columns=[c for c in df.columns if re.search(r"keyword|author", c, re.I)],
        errors="ignore",
    )

    paper_texts = df[target].tolist()

    # Split every paper into sentences; keep only sentences with >= 5 words.
    sentence_records = [
        record for record in (
            {"paper_idx": paper_i, "sent_idx": sent_i, "text": sent.strip()}
            for paper_i, paper_text in enumerate(paper_texts)
            for sent_i, sent in enumerate(SENTENCE_SPLIT_RE.split(paper_text or ""))
            if sent.strip()
        )
        if len(record["text"].split()) >= 5
    ]

    texts = [r["text"] for r in sentence_records]
    paper_idx = [r["paper_idx"] for r in sentence_records]
    embeddings = _embed(texts)
    base = Path(parquet_path).parent

    # FIX: compute the embeddings path once and reuse it in the return value
    # (it was previously rebuilt from scratch at the end of the function).
    emb_path = str(base / Path(parquet_path).stem) + ".emb.npy"
    np.save(emb_path, embeddings)

    labels = _cosine_cluster(embeddings, CLUSTER_THRESHOLD, MIN_CLUSTER_SIZE)
    orphan_idx = np.where(labels == -1)[0].tolist()
    orphan_count = len(orphan_idx)
    valid_count = int((labels >= 0).sum())
    n_clusters = int(np.unique(labels[labels >= 0]).shape[0])
    n_papers = len(set(paper_idx))
    n_sentences = len(texts)
    top_centroids = _top_n_centroids(embeddings, labels, N_CENTROIDS)

    def _topic_row(tc: dict) -> dict:
        """Convert centroid dict into summary row for summaries.json."""
        return {
            "topic_id": tc["label"],
            "size": tc["size"],
            "representative": texts[tc["indices"][0]][:200],
            "indices": tc["indices"],
        }

    summaries = [_topic_row(tc) for tc in top_centroids]
    orphans = [{"sentence_idx": int(i), "text": texts[i][:200]} for i in orphan_idx]

    output = {"clusters": summaries, "orphans": orphans}
    (base / "summaries.json").write_text(json.dumps(output, indent=2))

    # Chart 1: bar chart of the 20 largest clusters.
    unique, counts = np.unique(labels[labels >= 0], return_counts=True)
    order = np.argsort(counts)[::-1][:20]
    c1 = go.Figure(go.Bar(
        x=[str(u) for u in unique[order]], y=counts[order].tolist(),
        marker_color="#3B82F6", text=counts[order].tolist(), textposition="outside",
    ))
    c1.update_layout(**_dark_layout("Topic Size Distribution (Top 20)"),
                     xaxis=dict(showgrid=False),
                     yaxis=dict(showgrid=True, gridcolor="#1E293B"))
    c1.write_html(str(base / "chart_topic_sizes.html"))

    # Chart 2: cosine-similarity heatmap between the top centroids.
    centroid_matrix = np.vstack([tc["centroid"] for tc in top_centroids])
    sim_matrix = cosine_similarity(centroid_matrix)
    clabels = [f"T{tc['label']}" for tc in top_centroids]
    c2 = go.Figure(go.Heatmap(z=sim_matrix, x=clabels, y=clabels, colorscale="Blues"))
    c2.update_layout(**_dark_layout("Top-5 Centroid Cosine Similarity"))
    c2.write_html(str(base / "chart_centroid_heatmap.html"))

    # Chart 3: histogram of sentences per paper.
    sc = df.get("sentence_count", pd.Series([0] * len(df))).tolist()
    c3 = go.Figure(go.Histogram(x=sc, nbinsx=40, marker_color="#22D3EE"))
    c3.update_layout(**_dark_layout("Sentence Count Distribution"),
                     xaxis=dict(showgrid=False),
                     yaxis=dict(showgrid=True, gridcolor="#1E293B"))
    c3.write_html(str(base / "chart_sentence_distribution.html"))

    # Chart 4: 2-D PCA projection of the top centroids.
    coords = PCA(n_components=2).fit_transform(centroid_matrix)
    point_text = [f"T{tc['label']}({tc['size']})" for tc in top_centroids]
    c4 = go.Figure(go.Scatter(
        x=coords[:, 0].tolist(), y=coords[:, 1].tolist(),
        mode="markers+text", text=point_text, textposition="top center",
        marker=dict(size=12, color="#F59E0B", line=dict(width=1, color="#0F172A")),
    ))
    c4.update_layout(**_dark_layout("Top-5 Centroids – PCA Projection"))
    c4.write_html(str(base / "chart_centroid_pca.html"))

    return json.dumps({
        "total_clusters": n_clusters,
        "orphan_count": orphan_count,
        "valid_sentences": valid_count,
        "total_sentences": n_sentences,
        "total_papers": n_papers,
        "top_centroids": N_CENTROIDS,
        "summaries_json": str(base / "summaries.json"),
        "embeddings_npy": emb_path,
        "needs_review": True,
        "charts": {
            "topic_sizes": str(base / "chart_topic_sizes.html"),
            "centroid_heatmap": str(base / "chart_centroid_heatmap.html"),
            "sentence_dist": str(base / "chart_sentence_distribution.html"),
            "centroid_pca": str(base / "chart_centroid_pca.html"),
        },
    }, indent=2)
|
| 484 |
+
|
| 485 |
+
|
| 486 |
+
@tool
def label_topics_with_llm(summaries_json_path: str) -> str:
    """Send top-100 topic summaries to Mistral for labelling.

    Phase 2 – Naming Initial Codes. LLM-DEPENDENT (grounded in real data extracts).

    Steps:
    1. Load summaries.json clusters (not orphans)
    2. Take top 100 by size
    3. Mistral reads representative sentences → assigns labels
    4. Returns: topic_id, label, rationale, confidence per cluster
    5. Save as topic_labels.json

    Args:
        summaries_json_path: Path to summaries.json.

    Returns:
        JSON: labelled_topics count + output path. needs_review=True.
    """
    payload = json.loads(Path(summaries_json_path).read_text())
    # Fall back to the whole payload when there is no "clusters" key.
    summaries = payload.get("clusters", payload)[:TOP_TOPICS_LLM]

    template = (
        "You are a scientific topic labelling expert.\n\n"
        "Below are {n} topic summaries from a BERTopic analysis of academic papers.\n"
        "Each summary has: topic_id, size, representative text.\n\n"
        "{summaries}\n\n"
        "For EACH topic return a JSON array where every element has:\n"
        "  topic_id   : integer (copy from input)\n"
        "  label      : 2-5 word snake_case topic label\n"
        "  rationale  : one sentence justification\n"
        "  confidence : float 0.0-1.0\n\n"
        "Return ONLY the JSON array — no markdown, no preamble."
    )

    chain = _mistral_chain(template)
    labelled = chain.invoke({
        "n": len(summaries),
        "summaries": json.dumps(summaries, indent=2),
    })

    out_path = Path(summaries_json_path).parent / "topic_labels.json"
    out_path.write_text(json.dumps(labelled, indent=2))

    return json.dumps({
        "labelled_topics": len(labelled),
        "output": str(out_path),
        "needs_review": True,
    }, indent=2)
|
| 533 |
+
|
| 534 |
+
|
| 535 |
+
@tool
def reassign_sentences(
    summaries_json_path: str,
    embeddings_npy_path: str,
    move_instructions: list[dict],
) -> str:
    """Move orphan or misplaced sentences between clusters.

    Phase 2 – Reassigning orphan data extracts. DETERMINISTIC.

    The reviewer specifies moves as a list of dicts:
        [{"sentence_idx": 42, "to_cluster": 3},
         {"sentence_idx": 99, "to_cluster": "new"}]

    For "new" targets, a fresh cluster ID is assigned.
    After all moves, centroids are recomputed for affected clusters.

    Steps:
    1. Load summaries.json and embeddings
    2. Apply move instructions
    3. Update cluster assignments
    4. Recompute centroids for affected clusters
    5. Save updated summaries.json

    Args:
        summaries_json_path: Path to summaries.json.
        embeddings_npy_path: Path to .emb.npy.
        move_instructions: List of dicts with sentence_idx (int) and
            to_cluster (int or "new") keys.

    Returns:
        JSON: moves_applied count, orphans_remaining, updated summaries path.
    """
    data = json.loads(Path(summaries_json_path).read_text())
    embeddings = np.load(embeddings_npy_path)
    clusters = data.get("clusters", [])
    orphans = data.get("orphans", [])

    # sentence_idx -> topic_id for every sentence currently in a cluster.
    all_indices: dict[int, int] = {}
    for cluster in clusters:
        for idx in cluster.get("indices", []):
            all_indices[idx] = cluster["topic_id"]

    # Fresh IDs for "new" targets start just above the current maximum.
    max_id = max((c.get("topic_id", 0) for c in clusters), default=0)
    new_id_counter = [max_id + 1]

    def _apply_move(move: dict) -> dict:
        """Apply one move instruction and return the resolved assignment.

        BUG FIX: the previous boolean-arithmetic one-liners evaluated
        int("new") (ValueError) and int + "" (TypeError) whenever
        to_cluster was "new", so the documented "new cluster" feature
        always crashed. Plain branching makes the intent explicit.
        """
        s_idx = move["sentence_idx"]
        target = move["to_cluster"]
        if target == "new":
            assigned = new_id_counter[0]
            new_id_counter[0] += 1  # each "new" move gets its own fresh ID
        else:
            assigned = int(target)
        all_indices[s_idx] = assigned
        return {"sentence_idx": s_idx, "assigned_to": assigned}

    applied = [_apply_move(m) for m in move_instructions]

    unique_clusters = set(all_indices.values())

    def _rebuild_cluster(cid: int) -> dict:
        """Rebuild a cluster dict from the updated index map."""
        idx = [k for k, v in all_indices.items() if v == cid]
        # idx cannot be empty here (cid comes from all_indices values);
        # the `or [0]` fallback is kept for defensive parity.
        vecs = embeddings[idx or [0]]
        return {
            "topic_id": int(cid),
            "size": len(idx),
            "representative": "",
            "indices": idx,
            "centroid": _centroid(vecs).tolist(),
        }

    updated_clusters = [_rebuild_cluster(cid) for cid in sorted(unique_clusters)]
    remaining_orphan_idx = [o["sentence_idx"] for o in orphans
                            if o["sentence_idx"] not in all_indices]

    output = {
        "clusters": updated_clusters,
        "orphans": [{"sentence_idx": i, "text": ""} for i in remaining_orphan_idx],
    }
    Path(summaries_json_path).write_text(json.dumps(output, indent=2))

    return json.dumps({
        "moves_applied": len(applied),
        "orphans_remaining": len(remaining_orphan_idx),
        "summaries_json": summaries_json_path,
        "needs_review": True,
    }, indent=2)
|
| 627 |
+
|
| 628 |
+
|
| 629 |
+
@tool
def consolidate_into_themes(
    labels_json_path: str,
    embeddings_npy_path: str,
    approved_topic_ids: list[list[int]],
) -> str:
    """Merge approved topic groups into consolidated themes.

    Phase 3 – Searching for Themes. DETERMINISTIC.

    Steps:
    1. Load topic_labels.json and embedding matrix
    2. Pool all member embeddings per group
    3. Compute fresh L2-normalized centroid per merged group
    4. Build theme name from joined sub-labels
    5. Save themes.json

    Args:
        labels_json_path: Path to topic_labels.json.
        embeddings_npy_path: Path to .emb.npy.
        approved_topic_ids: List of lists of initial-code IDs.
            Each inner list is one candidate theme.
            Example: [[0,1,2],[3,4],[5]] creates 3
            candidate themes from 6 initial codes.

    Returns:
        JSON: themes_created count + themes_json path. needs_review=True.
    """
    labels_data = json.loads(Path(labels_json_path).read_text())
    embeddings = np.load(embeddings_npy_path)
    label_map = {item["topic_id"]: item for item in labels_data}

    def _merge_group(group_ids: list[int]) -> dict:
        """Merge one group of topic IDs into a single theme record."""
        # Unknown IDs are silently skipped, matching label_map.get semantics.
        members = [m for m in (label_map.get(tid) for tid in group_ids)
                   if m is not None]
        pooled_idx = [i for m in members for i in m.get("indices", [])]
        theme_centroid = _centroid(embeddings[pooled_idx or [0]])
        # Theme name: dedup the snake_case tokens of every sub-label while
        # preserving first-seen order, then cap at 60 characters.
        tokens = [tok for m in members
                  for tok in m.get("label", "").split("_")]
        theme_name = "_".join(dict.fromkeys(tokens))[:60]
        return {
            "theme_id": group_ids[0],
            "theme_label": theme_name,
            "merged_ids": group_ids,
            "total_papers": len(set(pooled_idx)),
            "indices": pooled_idx,
            "centroid": theme_centroid.tolist(),
        }

    themes = [_merge_group(group) for group in approved_topic_ids]
    out_path = Path(labels_json_path).parent / "themes.json"
    out_path.write_text(json.dumps(themes, indent=2))

    return json.dumps({
        "themes_created": len(themes),
        "themes_json": str(out_path),
        "needs_review": True,
    }, indent=2)
|
| 690 |
+
|
| 691 |
+
|
| 692 |
+
@tool
def compute_saturation(
    themes_json_path: str,
    embeddings_npy_path: str,
    total_papers: int,
) -> str:
    """Compute saturation metrics per theme: coverage, coherence, balance.

    Phase 4 – Reviewing Themes. DETERMINISTIC.

    Every number in the output is computed by numpy — the LLM never
    calculates these values. This eliminates hallucination risk for
    percentages, scores, and ratios.

    Metrics per theme:
        coverage  = papers_in_theme / total_papers (exact percentage)
        coherence = mean pairwise cosine similarity of member embeddings
                    (1.0 = all identical, 0.0 = orthogonal)

    Global metrics:
        total_coverage = papers in at least one theme / total_papers
        balance_ratio  = largest_theme / smallest_theme
        mean_coherence = average of per-theme coherence scores

    Args:
        themes_json_path: Path to themes.json.
        embeddings_npy_path: Path to .emb.npy.
        total_papers: Total papers in corpus (from Phase 1 stats).

    Returns:
        JSON: per-theme metrics + global metrics. needs_review=True.
    """
    themes = json.loads(Path(themes_json_path).read_text())
    embeddings = np.load(embeddings_npy_path)

    def _theme_metrics(theme: dict) -> dict:
        """Coverage and coherence for a single theme."""
        member_idx = theme.get("indices", [])
        vecs = embeddings[member_idx or [0]]
        pairwise = cosine_similarity(vecs)
        m = len(vecs)
        # Mean of the off-diagonal entries: drop the m self-similarities,
        # divide by the m*(m-1) remaining cells (guarded for m == 1).
        mean_off_diag = float((pairwise.sum() - m) / max(m * (m - 1), 1))
        return {
            "theme_id": theme.get("theme_id", 0),
            "theme_label": theme.get("theme_label", ""),
            "papers": len(member_idx),
            "coverage_pct": round(len(member_idx) / max(total_papers, 1) * 100, 2),
            "coherence": round(mean_off_diag, 4),
        }

    per_theme = [_theme_metrics(t) for t in themes]

    covered_papers = {i for t in themes for i in t.get("indices", [])}
    sizes = [m["papers"] for m in per_theme]
    coherences = [m["coherence"] for m in per_theme]

    global_metrics = {
        "total_coverage_pct": round(len(covered_papers) / max(total_papers, 1) * 100, 2),
        "balance_ratio": round(max(sizes, default=1) / max(min(sizes, default=1), 1), 2),
        "mean_coherence": round(sum(coherences) / max(len(coherences), 1), 4),
        "theme_count": len(themes),
    }

    out_path = Path(themes_json_path).parent / "saturation.json"
    result = {"per_theme": per_theme, "global": global_metrics}
    out_path.write_text(json.dumps(result, indent=2))

    return json.dumps({
        **global_metrics,
        "per_theme": per_theme,
        "saturation_json": str(out_path),
        "needs_review": True,
    }, indent=2)
|
| 768 |
+
|
| 769 |
+
|
| 770 |
+
@tool
def generate_theme_profiles(
    themes_json_path: str,
    embeddings_npy_path: str,
    texts_parquet_path: str,
    run_mode: str = "abstract",
) -> str:
    """Generate profile cards with top-5 nearest sentences per theme.

    Phase 5 – Defining and Naming Themes. DETERMINISTIC.

    For each theme centroid, computes cosine similarity against ALL
    embeddings and returns the 5 closest sentences. These are the
    REAL sentences from the corpus — not generated, not recalled
    from conversation history. The reviewer uses these to decide
    on final theme names.

    Steps:
    1. Load themes.json with centroids
    2. Load full embedding matrix
    3. Load original texts from parquet
    4. For each theme: cosine_similarity(centroid, all_embeddings)
    5. Take top 5 by similarity score
    6. Return exact sentence text + similarity score
    7. Save profiles.json

    Args:
        themes_json_path: Path to themes.json.
        embeddings_npy_path: Path to .emb.npy.
        texts_parquet_path: Path to .clean.parquet (for original text).
        run_mode: 'abstract' or 'title'.

    Returns:
        JSON: profiles list with top-5 sentences per theme. needs_review=True.
    """
    themes = json.loads(Path(themes_json_path).read_text())
    embeddings = np.load(embeddings_npy_path)
    target = f"{RUN_CONFIGS[run_mode][0]}_clean"
    texts = pd.read_parquet(texts_parquet_path)[target].tolist()

    def _profile(theme: dict) -> dict:
        """Profile card for one theme: the 5 corpus texts nearest its centroid."""
        centroid = np.array(theme["centroid"]).reshape(1, -1)
        sims = cosine_similarity(centroid, embeddings)[0]
        nearest_idx = np.argsort(sims)[::-1][:5].tolist()
        return {
            "theme_id": theme.get("theme_id", 0),
            "theme_label": theme.get("theme_label", ""),
            "total_papers": theme.get("total_papers", 0),
            "top_5_sentences": [
                {
                    "sentence_idx": i,
                    "text": texts[i][:300],
                    "similarity": round(float(sims[i]), 4),
                }
                for i in nearest_idx
            ],
        }

    profiles = [_profile(t) for t in themes]
    out_path = Path(themes_json_path).parent / "profiles.json"
    out_path.write_text(json.dumps(profiles, indent=2))

    return json.dumps({
        "profiles_count": len(profiles),
        "profiles_json": str(out_path),
        "profiles": profiles,
        "needs_review": True,
    }, indent=2)
|
| 840 |
+
|
| 841 |
+
|
| 842 |
+
@tool
def compare_with_taxonomy(themes_json_path: str) -> str:
    """Map each theme to PAJAIS 25 IS research categories via Mistral.

    Phase 5.5 – Taxonomy Alignment (extension). LLM-DEPENDENT.

    Themes with alignment_score < 0.50 are flagged as potentially NOVEL.

    Args:
        themes_json_path: Path to themes.json.

    Returns:
        JSON: themes_aligned count + taxonomy_file path. needs_review=True.
    """
    themes = json.loads(Path(themes_json_path).read_text())

    # Strip the bulky numeric fields before sending the themes to the LLM.
    safe_themes = [
        {key: val for key, val in theme.items()
         if key not in ("centroid", "indices")}
        for theme in themes
    ]

    template = (
        "You are an IS research taxonomy expert.\n\n"
        "PAJAIS 25 Categories:\n{pajais}\n\n"
        "Research themes:\n{themes}\n\n"
        "For EACH theme return a JSON array where every element has:\n"
        "  theme_label       : string\n"
        "  pajais_categories : list of 1-3 matching PAJAIS category names\n"
        "  alignment_score   : float 0.0-1.0\n"
        "  notes             : one sentence justification\n\n"
        "Return ONLY the JSON array — no markdown, no preamble."
    )

    chain = _mistral_chain(template)
    result = chain.invoke({
        "pajais": "\n".join(f"- {c}" for c in PAJAIS_25),
        "themes": json.dumps(safe_themes, indent=2),
    })

    out_path = Path(themes_json_path).parent / "taxonomy_alignment.json"
    out_path.write_text(json.dumps(result, indent=2))

    return json.dumps({
        "themes_aligned": len(result),
        "taxonomy_file": str(out_path),
        "needs_review": True,
    }, indent=2)
|
| 887 |
+
|
| 888 |
+
|
| 889 |
+
@tool
def generate_comparison_csv(
    abstract_themes_path: str,
    title_themes_path: str,
    taxonomy_abstract_path: str,
    taxonomy_title_path: str,
) -> str:
    """Build side-by-side abstract vs title comparison CSV.

    Phase 6 – Report. DETERMINISTIC.

    Joins on PAJAIS_Category. Delta_Score = Abstract - Title.

    Args:
        abstract_themes_path: themes.json — abstract run.
        title_themes_path: themes.json — title run.
        taxonomy_abstract_path: taxonomy_alignment.json — abstract run.
        taxonomy_title_path: taxonomy_alignment.json — title run.

    Returns:
        JSON: comparison_csv path, total_rows, columns. needs_review=True.
    """
    def _explode_taxonomy(path: str) -> pd.DataFrame:
        """One output row per (theme, PAJAIS category) pair."""
        entries = json.loads(Path(path).read_text())
        rows = [
            {
                "pajais_category": category,
                "theme_label": entry.get("theme_label", ""),
                "alignment_score": entry.get("alignment_score", 0.0),
            }
            for entry in entries
            for category in entry.get("pajais_categories", [])
        ]
        return pd.DataFrame(rows)

    df_abs = _explode_taxonomy(taxonomy_abstract_path)
    df_title = _explode_taxonomy(taxonomy_title_path)

    df_abs.columns = ["PAJAIS_Category", "Abstract_Theme", "Abstract_Score"]
    df_title.columns = ["PAJAIS_Category", "Title_Theme", "Title_Score"]

    # Outer join so categories present in only one run still appear.
    merged = pd.merge(df_abs, df_title, on="PAJAIS_Category", how="outer")
    merged = merged.fillna({"Abstract_Score": 0.0, "Title_Score": 0.0,
                            "Abstract_Theme": "", "Title_Theme": ""})
    merged["Delta_Score"] = (merged["Abstract_Score"] - merged["Title_Score"]).round(4)
    merged = merged.sort_values("PAJAIS_Category").reset_index(drop=True)

    out_csv = Path(abstract_themes_path).parent / "abstract_vs_title_comparison.csv"
    merged.to_csv(out_csv, index=False)

    return json.dumps({
        "comparison_csv": str(out_csv),
        "total_rows": len(merged),
        "columns": list(merged.columns),
        "needs_review": True,
    }, indent=2)
|
| 954 |
+
|
| 955 |
+
|
| 956 |
+
@tool
def export_narrative(
    taxonomy_alignment_path: str,
    comparison_csv_path: str,
    run_mode: str = "abstract",
) -> str:
    """Generate 500-word Section 7: Discussion & Implications via Mistral.

    Phase 6 – Report. LLM-DEPENDENT (grounded in taxonomy + comparison data).

    Args:
        taxonomy_alignment_path: Path to taxonomy_alignment.json.
        comparison_csv_path: Path to comparison CSV.
        run_mode: 'abstract' or 'title'.

    Returns:
        JSON: narrative_path, word_count, narrative text. needs_review=True.
    """
    alignment = json.loads(Path(taxonomy_alignment_path).read_text())

    # Rank PAJAIS categories by absolute abstract-vs-title divergence,
    # using a throwaway helper column that never reaches the output.
    comparison = pd.read_csv(comparison_csv_path)
    comparison["_abs"] = comparison["Delta_Score"].abs()
    comparison = comparison.sort_values("_abs", ascending=False).drop(columns=["_abs"])
    top_delta = comparison.head(5)

    template = (
        "You are a senior IS researcher writing a systematic literature review.\n\n"
        "Write Section 7: Discussion & Implications in exactly {word_count} words.\n\n"
        "Run mode: {run_mode}\n\n"
        "Taxonomy alignment (top 10):\n{alignment}\n\n"
        "Top 5 divergent PAJAIS categories (abstract vs title):\n{divergence}\n\n"
        "Requirements:\n"
        "1. Discuss dominant themes and PAJAIS alignment.\n"
        "2. Interpret divergence between abstract- and title-based models.\n"
        "3. Highlight implications for IS research practice and future agenda.\n"
        "4. Use formal academic register — no bullet points.\n"
        "5. Return a JSON object with a single key 'narrative' containing the prose.\n\n"
        "Return ONLY valid JSON."
    )

    chain = _mistral_chain(template)
    result = chain.invoke({
        "word_count": NARRATIVE_WORDS,
        "run_mode": run_mode,
        "alignment": json.dumps(alignment[:10], indent=2),
        "divergence": top_delta.to_json(orient="records", indent=2),
    })
    narrative_text = result.get("narrative", str(result))

    out_path = Path(taxonomy_alignment_path).parent / "narrative.md"
    out_path.write_text(
        f"## Section 7: Discussion & Implications\n\n{narrative_text}\n",
        encoding="utf-8",
    )

    return json.dumps({
        "narrative_path": str(out_path),
        "word_count": len(narrative_text.split()),
        "narrative": narrative_text,
        "needs_review": True,
    }, indent=2)
|
| 1018 |
+
|
| 1019 |
+
|
| 1020 |
+
# Registry of every pipeline tool defined in this module, listed in phase
# order: corpus load -> BERTopic discovery -> LLM labeling -> sentence
# reassignment -> theme consolidation -> saturation check -> theme profiling
# -> taxonomy comparison -> comparison CSV export -> narrative export.
# Presumably bound to the agent's LLM at startup — verify in the agent setup.
# Order is preserved as authored; do not reorder without checking consumers.
ALL_TOOLS = [
    load_scopus_csv,
    run_bertopic_discovery,
    label_topics_with_llm,
    reassign_sentences,
    consolidate_into_themes,
    compute_saturation,
    generate_theme_profiles,
    compare_with_taxonomy,
    generate_comparison_csv,
    export_narrative,
]
|