Siddhesh-Ai9797 committed on
Commit
5cc7a6c
·
1 Parent(s): 6ef4315

Add README config

Files changed (1)
  1. README.md +8 -75
README.md CHANGED
@@ -1,78 +1,11 @@
- # 🤖 Self-Correcting Data Validation Agent
- ### Schema-Enforced, No-Hallucination Agentic Data Pipeline
-
- A production-style AI system that converts messy employee text into schema-valid JSON using:
-
- - LangGraph (state machine orchestration)
- - OpenAI LLMs (structured extraction)
- - Pydantic v2 (strict schema validation)
- - Pandas (deterministic execution)
- - Streamlit (interactive UI)
-
- ---
-
- ## 🧠 Problem
-
- LLMs hallucinate missing fields and fabricate identifiers.
-
- This project enforces:
-
- - Strict schema validation
- - Bounded self-correction retries
- - Deterministic query execution
- - Explicit rejection handling
- - Zero fabrication of required fields
-
  ---
-
- ## 🏗 Architecture
-
- Messy Text
- → LLM Extraction
- → Schema Validation
- → Self-Correction Loop (if needed)
- → Final Valid JSON
-
- Records missing required fields are rejected, never hallucinated.
-
  ---

- ## 🔁 Agent Flow
-
- extract → validate
- if fail → correct → validate → repeat
- finalize
-
- Retry attempts are limited and controlled.
-
- ---
-
- ## 💬 Deterministic Query Engine
-
- 1. LLM generates structured query plan
- 2. Pandas executes it
- 3. LLM summarizes computed results
-
- No synthetic answers.
-
- ---
-
- ## 🚀 Run Locally
-
- ```bash
- git clone git@github.com:Siddhesh-Ai9797/self-correcting-data-validation-agent.git
- cd self-correcting-data-validation-agent
-
- python -m venv .venv
- source .venv/bin/activate
- pip install -r requirements.txt
-
- export OPENAI_API_KEY="your_key_here"
- streamlit run app.py
-
- ---
-
- # Stress Testing
- python -m src.eval.run_agent_suite
-
-
  ---
+ title: Self Correcting Data Validation Agent
+ emoji: 🤖
+ colorFrom: green
+ colorTo: blue
+ sdk: docker
+ pinned: false
  ---

+ # Self-Correcting Data Validation Agent
+ Production-style LLM agent using LangGraph, Pydantic v2 and self-correction loops.
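
The README text in this commit describes an extract, validate, correct loop with bounded retries, in which records that are still missing required fields are rejected and never filled in. A minimal sketch of that control flow is shown below. It uses only the Python standard library as a stand-in: the project itself names LangGraph and Pydantic v2, neither of which is used here, and `REQUIRED_FIELDS`, `validate`, `run_agent`, and `coerce_types` are hypothetical names chosen for illustration, not taken from the repository.

```python
from typing import Callable, Optional

# Hypothetical schema: field name -> expected type.
# This stands in for a Pydantic model; it is not the project's schema.
REQUIRED_FIELDS = {"employee_id": str, "name": str, "salary": float}


def validate(record: dict) -> list[str]:
    """Return a list of schema errors; an empty list means the record is valid."""
    errors = []
    for field, expected in REQUIRED_FIELDS.items():
        if record.get(field) is None:
            errors.append(f"missing required field: {field}")
        elif not isinstance(record[field], expected):
            errors.append(f"wrong type for {field}: expected {expected.__name__}")
    return errors


def run_agent(
    record: dict,
    correct: Callable[[dict, list[str]], dict],
    max_retries: int = 2,
) -> Optional[dict]:
    """Validate, then correct and re-validate at most `max_retries` times.

    Returns the valid record, or None if it is still invalid (rejected).
    """
    errors = validate(record)
    attempts = 0
    while errors and attempts < max_retries:
        record = correct(record, errors)  # in the real system this would be an LLM call
        errors = validate(record)
        attempts += 1
    return record if not errors else None


def coerce_types(record: dict, errors: list[str]) -> dict:
    """Hypothetical corrector: repairs formatting only, never invents missing fields."""
    fixed = dict(record)
    salary = fixed.get("salary")
    if isinstance(salary, str):
        try:
            fixed["salary"] = float(salary.replace(",", ""))
        except ValueError:
            pass  # leave the value as-is; validation will keep failing
    return fixed


accepted = run_agent({"employee_id": "E1", "name": "Ana", "salary": "52,000"}, coerce_types)
rejected = run_agent({"name": "Bo", "salary": "100"}, coerce_types)
```

Here `accepted` is the corrected record, while `rejected` is `None` because no amount of retrying supplies the missing `employee_id`. Keeping the retry count as an explicit parameter is one way to make the "limited and controlled" retries visible at the call site.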