Mituvinci committed
Commit 7428575 · 1 Parent(s): 6d671f9

Two-model setup: GPT-4o-mini examines, Claude answers

Files changed (2)
  1. README.md +20 -12
  2. src/graph/nodes.py +3 -2
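
The per-role model split this commit introduces can be summarized in a small sketch before reading the diffs. This is a hypothetical illustration: the `RoleConfig` dataclass and `ROLES` mapping are not part of the repo; only the model identifiers and temperatures come from this commit.

```python
# Hypothetical summary of the per-role model setup in this commit.
# RoleConfig and ROLES are illustrative names, not part of the repo.
from dataclasses import dataclass

@dataclass(frozen=True)
class RoleConfig:
    provider: str       # API vendor backing the role
    model: str          # model identifier passed to the chat client
    temperature: float  # sampling temperature used for the role

ROLES = {
    "examine":  RoleConfig("openai", "gpt-4o-mini", 0.7),                  # question generation
    "evaluate": RoleConfig("openai", "gpt-4o-mini", 0.0),                  # deterministic grading
    "answer":   RoleConfig("anthropic", "claude-sonnet-4-20250514", 0.3),  # RAG answering
}

for role, cfg in ROLES.items():
    print(f"{role}: {cfg.provider} / {cfg.model} @ T={cfg.temperature}")
```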
README.md CHANGED
@@ -12,11 +12,18 @@ private: true
 
 # Adaptive Study Agent
 
-A **LLM self-examination simulation** built with **LangGraph** and **Claude (Anthropic)**. The agent reads any document you provide, then runs a fully autonomous study loop — the LLM generates its own comprehension questions, retrieves context from ChromaDB to answer them, and evaluates its own answers. The user does not answer any questions. The purpose is to **probe where the LLM's understanding of the document breaks down** — which topics it answers confidently versus where it scores low and needs to re-read.
-
-The output is a structured session report revealing the LLM's weak areas within your document. This is useful for identifying conceptually dense or underrepresented sections in any text.
-
-This project can be applied to **any domain** — machine learning papers, medical literature, legal documents, textbooks — anything in PDF or TXT format.
+A **two-model LLM self-examination simulation** built with **LangGraph**, **Claude (Anthropic)**, and **GPT-4o-mini (OpenAI)**. The agent reads any document you provide and runs a fully autonomous study loop — no human answers anything.
+
+**How the two models collaborate:**
+- **GPT-4o-mini** generates comprehension questions from document chunks (temperature 0.7 — creative) and evaluates the answers against the source material (temperature 0.0 — deterministic)
+- **Claude Sonnet** answers the questions using RAG retrieval from ChromaDB (temperature 0.3 — balanced)
+- **OpenAI text-embedding-3-small** handles document chunking and embedding into ChromaDB only — not used for reasoning
+
+The purpose is to **probe where Claude's understanding of the document breaks down** — GPT acts as the examiner, Claude as the student. When Claude scores below the mastery threshold, the agent re-reads the weak chunk and tries again.
+
+The output is a structured session report revealing Claude's weak areas within your document — useful for identifying conceptually dense or underrepresented sections in any text.
+
+Applicable to **any domain** — ML papers, medical literature, legal documents, textbooks — anything in PDF or TXT format.
 
 ---
 
@@ -38,15 +45,16 @@ The agent operates as a LangGraph state machine with conditional branching. Afte
 
 ## Tech Stack
 
-| Component        | Technology                    | Purpose                                      |
-|------------------|-------------------------------|----------------------------------------------|
-| Agent framework  | LangGraph                     | Stateful loops with conditional branching    |
-| LLM              | Claude Sonnet 4 (Anthropic)   | Question generation, answering, evaluation   |
-| Embeddings       | OpenAI text-embedding-3-small | Text chunk embeddings                        |
-| Vector store     | ChromaDB (local, embedded)    | No Docker required                           |
-| Document parsing | PyMuPDF (fitz)                | PDF support                                  |
-| UI               | Gradio                        | Web interface and Hugging Face Spaces deploy |
-| Package manager  | uv                            | Dependency management                        |
+| Component        | Technology                    | Purpose                                            |
+|------------------|-------------------------------|----------------------------------------------------|
+| Agent framework  | LangGraph                     | Stateful loops with conditional branching          |
+| Examiner LLM     | GPT-4o-mini (OpenAI)          | Question generation (0.7) + evaluation (0.0)       |
+| Student LLM      | Claude Sonnet 4 (Anthropic)   | Answering questions via RAG (0.3)                  |
+| Embeddings       | OpenAI text-embedding-3-small | Document chunking and embedding into ChromaDB only |
+| Vector store     | ChromaDB (local, embedded)    | No Docker required                                 |
+| Document parsing | PyMuPDF (fitz)                | PDF support                                        |
+| UI               | Gradio                        | Web interface and Hugging Face Spaces deploy       |
+| Package manager  | uv                            | Dependency management                              |
 
 ---
 
src/graph/nodes.py CHANGED
@@ -2,6 +2,7 @@ import random
 import re
 
 from langchain_anthropic import ChatAnthropic
+from langchain_openai import ChatOpenAI
 from langchain_core.messages import HumanMessage
 
 from src.graph.state import StudyState
@@ -45,7 +46,7 @@ def generate_question_node(state: StudyState) -> dict:
     else:
         passage = random.choice(chunks)
 
-    llm = ChatAnthropic(model="claude-sonnet-4-20250514", temperature=0.7)
+    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.7)
     prompt = QUESTION_PROMPT.format(passage=passage)
     response = llm.invoke([HumanMessage(content=prompt)])
     question = response.content.strip()
@@ -81,7 +82,7 @@ def evaluate_node(state: StudyState) -> dict:
     source_chunks = retrieve_chunks(vectorstore, question, top_k=1)
     source = source_chunks[0] if source_chunks else ""
 
-    llm = ChatAnthropic(model="claude-sonnet-4-20250514", temperature=0.0)
+    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.0)
    prompt = EVALUATE_PROMPT.format(question=question, answer=answer, source=source)
     response = llm.invoke([HumanMessage(content=prompt)])
     result = response.content.strip()
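
The control flow these nodes feed into can be sketched end to end with stub callables standing in for the real LLM calls. The mastery threshold, retry count, and helper names below are assumptions for illustration, not values taken from the repo.

```python
# Sketch of the examiner/student loop: the examiner asks and grades,
# the student answers; a low score triggers a re-read of the same chunk.
# All names and values here are illustrative, not from the repo.

def study_chunk(chunk, ask, answer, grade, mastery=0.7, max_retries=2):
    """Run one chunk through the ask -> answer -> grade loop.

    Returns (final_score, attempts_used)."""
    question = ask(chunk)                      # examiner (creative, T=0.7)
    score = 0.0
    for attempt in range(1, max_retries + 2):
        reply = answer(question, chunk)        # student (balanced, T=0.3)
        score = grade(question, reply, chunk)  # examiner (deterministic, T=0.0)
        if score >= mastery:
            break                              # mastered; move to next chunk
    return score, attempt

# Demo with deterministic stubs: the student fails once, then recovers.
calls = []
def stub_answer(question, chunk):
    calls.append(question)
    return "correct" if len(calls) > 1 else "wrong"

score, attempts = study_chunk(
    chunk="LangGraph builds stateful agent loops.",
    ask=lambda chunk: "What does LangGraph build?",
    answer=stub_answer,
    grade=lambda q, a, chunk: 1.0 if a == "correct" else 0.2,
)
print(score, attempts)  # 1.0 2
```

In the real graph this re-read decision is a LangGraph conditional edge rather than a Python for loop; the sketch only shows the data flow between the two models.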