Mituvinci committed
Commit · 7428575 · Parent(s): 6d671f9
Two-model setup: GPT-4o-mini examines, Claude answers

Files changed:
- README.md +20 -12
- src/graph/nodes.py +3 -2
README.md
CHANGED
@@ -12,11 +12,18 @@ private: true
 
 # Adaptive Study Agent
 
-A **LLM self-examination simulation** built with **LangGraph**
+A **two-model LLM self-examination simulation** built with **LangGraph**, **Claude (Anthropic)**, and **GPT-4o-mini (OpenAI)**. The agent reads any document you provide and runs a fully autonomous study loop: no human answers anything.
 
+**How the two models collaborate:**
+- **GPT-4o-mini** generates comprehension questions from document chunks (temperature 0.7, creative) and evaluates the answers against the source material (temperature 0.0, deterministic)
+- **Claude Sonnet** answers the questions using RAG retrieval from ChromaDB (temperature 0.3, balanced)
+- **OpenAI text-embedding-3-small** handles document chunking and embedding into ChromaDB only; it is not used for reasoning
 
+The purpose is to **probe where Claude's understanding of the document breaks down**: GPT acts as the examiner, Claude as the student. When Claude scores below the mastery threshold, the agent re-reads the weak chunk and tries again.
+
+The output is a structured session report revealing Claude's weak areas within your document, useful for identifying conceptually dense or underrepresented sections in any text.
+
+Applicable to **any domain** (ML papers, medical literature, legal documents, textbooks), anything in PDF or TXT format.
 
 ---
 
@@ -38,15 +45,16 @@ The agent operates as a LangGraph state machine with conditional branching. Afte
 
 ## Tech Stack
 
-| Component        | Technology                    | Purpose
-|------------------|-------------------------------|----------------------------------------------|
-| Agent framework  | LangGraph                     | Stateful loops with conditional branching
-| LLM
+| Component        | Technology                    | Purpose                                              |
+|------------------|-------------------------------|------------------------------------------------------|
+| Agent framework  | LangGraph                     | Stateful loops with conditional branching            |
+| Examiner LLM     | GPT-4o-mini (OpenAI)          | Question generation (0.7) + evaluation (0.0)         |
+| Student LLM      | Claude Sonnet 4 (Anthropic)   | Answering questions via RAG (0.3)                    |
+| Embeddings       | OpenAI text-embedding-3-small | Document chunking and embedding into ChromaDB only   |
+| Vector store     | ChromaDB (local, embedded)    | No Docker required                                   |
+| Document parsing | PyMuPDF (fitz)                | PDF support                                          |
+| UI               | Gradio                        | Web interface and Hugging Face Spaces deploy         |
+| Package manager  | uv                            | Dependency management                                |
 
 ---
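The README's mastery loop ("when Claude scores below the mastery threshold, the agent re-reads the weak chunk") reduces to a small routing decision. A minimal sketch of that logic; the function name, state fields, and the 0.8 threshold are illustrative assumptions, not taken from this repository:

```python
# Illustrative sketch of the mastery loop described in the README.
# MASTERY_THRESHOLD and all names here are assumptions, not repo code.
MASTERY_THRESHOLD = 0.8

def route_after_evaluation(state: dict) -> str:
    """Pick the next node after the examiner has scored the student's answer."""
    if state["score"] < MASTERY_THRESHOLD:
        return "reread_chunk"    # below mastery: study the same chunk again
    return "pick_next_chunk"     # mastery reached: move on to new material

print(route_after_evaluation({"score": 0.5}))  # reread_chunk
print(route_after_evaluation({"score": 0.9}))  # pick_next_chunk
```

In LangGraph terms this would be the condition function behind a conditional edge, which is how the stateful re-read loop in the table above is realized.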
src/graph/nodes.py
CHANGED
@@ -2,6 +2,7 @@ import random
 import re
 
 from langchain_anthropic import ChatAnthropic
+from langchain_openai import ChatOpenAI
 from langchain_core.messages import HumanMessage
 
 from src.graph.state import StudyState

@@ -45,7 +46,7 @@ def generate_question_node(state: StudyState) -> dict:
     else:
         passage = random.choice(chunks)
 
-    llm =
+    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.7)
     prompt = QUESTION_PROMPT.format(passage=passage)
     response = llm.invoke([HumanMessage(content=prompt)])
     question = response.content.strip()

@@ -81,7 +82,7 @@ def evaluate_node(state: StudyState) -> dict:
     source_chunks = retrieve_chunks(vectorstore, question, top_k=1)
     source = source_chunks[0] if source_chunks else ""
 
-    llm =
+    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.0)
     prompt = EVALUATE_PROMPT.format(question=question, answer=answer, source=source)
     response = llm.invoke([HumanMessage(content=prompt)])
     result = response.content.strip()