Ani14 committed on
Commit
192155e
·
verified ·
1 Parent(s): b789735

Upload 7 files

Agentic Honey-Pot for Scam Detection & Intelligence Extraction.md ADDED
@@ -0,0 +1,19 @@
# Agentic Honey-Pot for Scam Detection & Intelligence Extraction

This project implements the solution for Problem Statement 2: **Agentic Honey-Pot for Scam Detection & Intelligence Extraction**.

## Technology Stack
* **Agentic Framework:** [LangGraph](https://langchain-ai.github.io/langgraph/tutorials/introduction/) for stateful, cyclical conversation management.
* **LLM:** [Qwen 2.5 3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct) (open-source, optimized for resource-constrained deployment).
* **Backend:** [FastAPI](https://fastapi.tiangolo.com/) for the low-latency REST API.
* **Deployment:** Hugging Face Space (self-hosted on the free tier).

## API Endpoint
The main endpoint for the honeypot is:
`POST /api/honeypot-detection`

## Authentication
The API requires an `x-api-key` header for authentication. The key is set via a Space Secret.

## Development Notes
The core logic is implemented in `agent.py` using LangGraph to manage the multi-turn conversation state. The model is loaded with 4-bit quantization (`bitsandbytes`) for efficient use of the free-tier hardware.
Dockerfile ADDED
@@ -0,0 +1,26 @@
# Use a base image with Python and CUDA for GPU support (recommended for Qwen 2.5 3B).
# This image includes Python, CUDA, and common ML libraries.
FROM nvcr.io/nvidia/pytorch:24.01-py3

# Set environment variables
ENV PYTHONUNBUFFERED=1
# Hugging Face Spaces uses port 7860 by default for web applications
ENV PORT=7860

# Set working directory
WORKDIR /app

# Copy requirements and install Python dependencies.
# --no-cache-dir keeps the image size small.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY . .

# Expose the port
EXPOSE 7860

# Command to run the application (matches the Procfile logic).
# The explicit port 7860 is required by HF Spaces.
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]
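The Dockerfile above can also be exercised locally before pushing to the Space. A minimal sketch, assuming Docker is installed; the image tag `honeypot-agent` is illustrative, and the key must match what the app validates against:

```shell
# Build the image (tag name is illustrative)
docker build -t honeypot-agent .

# Run locally on the same port the Space expects,
# passing the API key the app validates against
docker run --rm -p 7860:7860 \
  -e HONEYPOT_API_KEY="YOUR_SECRET_API_KEY_FOR_AUTH" \
  honeypot-agent
```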
Implementation and Deployment Guide for Agentic Honey-Pot.md ADDED
@@ -0,0 +1,96 @@
# Implementation and Deployment Guide for Agentic Honey-Pot

This guide provides instructions for setting up, running, and deploying the provided Python codebase for the Agentic Honey-Pot solution.

## 1. Code Structure

The solution is modularized into the following files:

| File | Purpose |
| :--- | :--- |
| `requirements.txt` | Lists all necessary Python dependencies (LangGraph, FastAPI, Qwen model dependencies). |
| `models.py` | Contains all Pydantic schemas for API input/output, LangGraph state, and structured intelligence extraction. |
| `agent.py` | Contains the core **LangGraph** state machine logic, the **Qwen 2.5 3B-Instruct** model loading, and the node functions (`detect_scam`, `agent_persona_response`, `extract_intelligence`, `final_callback`). |
| `app.py` | The **FastAPI** application that exposes the `/api/honeypot-detection` endpoint and integrates with the LangGraph agent. |
| `Procfile` | Configuration file for the Hugging Face Space to run the FastAPI application using Uvicorn. (Only needed if not using the Dockerfile.) |
| `README.md` | A brief description for the Hugging Face Space repository. |
| `Dockerfile` | Defines the environment and dependencies for a robust Docker-based deployment on Hugging Face Spaces. |

## 2. Local Setup and Testing

### Step 2.1: Setup Environment

1. **Install Dependencies:**
   ```bash
   pip install -r requirements.txt
   ```
   *Note: The `bitsandbytes` library requires a compatible CUDA setup for GPU usage. If running on CPU, you may need to adjust the model loading in `agent.py` to remove the quantization configuration.*

2. **Set Environment Variables:**
   The `app.py` and `agent.py` files rely on an environment variable for the API key.
   ```bash
   export HONEYPOT_API_KEY="YOUR_SECRET_API_KEY_FOR_AUTH"
   ```
   *Note: Replace the placeholder with your actual key.*

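For CPU-only machines, the quantized load in `agent.py` can be replaced with a plain load. A minimal sketch, assuming the same model ID as `agent.py`; note this downloads several GB of weights and needs roughly 12 GB of RAM in float32:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "Qwen/Qwen2.5-3B-Instruct"

# CPU fallback: no BitsAndBytesConfig, full-precision weights (slower than GPU)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float32,  # bfloat16 also works on recent CPUs
    device_map={"": "cpu"},     # pin every module to the CPU
)
```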
### Step 2.2: Run Locally

1. **Start the FastAPI Server:**
   ```bash
   uvicorn app:app --host 0.0.0.0 --port 8000
   ```
2. **Test the Endpoint:**
   Use a tool such as `curl` or Postman to send a request to `http://localhost:8000/api/honeypot-detection`.

   **Example cURL Request (Initial Message):**
   ```bash
   curl -X POST "http://localhost:8000/api/honeypot-detection" \
     -H "accept: application/json" \
     -H "x-api-key: YOUR_SECRET_API_KEY_FOR_AUTH" \
     -H "Content-Type: application/json" \
     -d '{
       "sessionId": "test-session-123",
       "message": {
         "sender": "scammer",
         "text": "Your account is blocked. Click this link immediately: http://malicious-link.example",
         "timestamp": "2026-01-28T10:00:00Z"
       },
       "conversationHistory": [],
       "metadata": {
         "channel": "SMS",
         "language": "English",
         "locale": "IN"
       }
     }'
   ```
*Send subsequent messages by including the previous conversation in the `conversationHistory` field.*

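Threading the history into follow-up requests can be scripted. A sketch of a helper that builds the next payload (field names follow the request schema above; the HTTP call itself is omitted so the helper stays network-free, and all message texts are dummies):

```python
from typing import Any, Dict, List

def build_next_request(session_id: str,
                       history: List[Dict[str, str]],
                       scammer_text: str,
                       timestamp: str) -> Dict[str, Any]:
    """Assemble the next /api/honeypot-detection payload for an ongoing session."""
    return {
        "sessionId": session_id,
        "message": {"sender": "scammer", "text": scammer_text, "timestamp": timestamp},
        # All previous messages travel in conversationHistory
        "conversationHistory": list(history),
        "metadata": {"channel": "SMS", "language": "English", "locale": "IN"},
    }

# First turn: empty history
turn1 = build_next_request("test-session-123", [],
                           "Your account is blocked.", "2026-01-28T10:00:00Z")
# Second turn: include turn 1's message and the honeypot's reply in the history
history = [turn1["message"],
           {"sender": "user", "text": "Which account?", "timestamp": "2026-01-28T10:01:00Z"}]
turn2 = build_next_request("test-session-123", history,
                           "Share your UPI ID now.", "2026-01-28T10:02:00Z")
```

Each payload can then be POSTed with any HTTP client, exactly as in the cURL example above.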
## 3. Deployment on Hugging Face Space (Recommended Strategy)

This strategy bypasses the limitations of the Hugging Face Inference API free tier by self-hosting the model.

### Step 3.1: Create a Hugging Face Space

1. Go to [Hugging Face Spaces](https://huggingface.co/spaces).
2. Click **"Create new Space"**.
3. **Name:** Choose a name (e.g., `my-honeypot-agent`).
4. **License:** Select a license.
5. **Space SDK:** Select **`Docker`** for maximum control over the environment, or **`Gradio`** if you want a simple UI for monitoring. *For a pure API, Docker is the most robust choice.*
6. **Hardware:** Select the **Free CPU** or **Free T4 Medium GPU** (if available). **A T4 GPU is highly recommended for better latency.**

### Step 3.2: Configure Environment and Upload Files

1. **Set Secrets:** In your Space settings, go to **"Secrets"** and add the following:
   * **Name:** `HONEYPOT_API_KEY`
   * **Value:** `YOUR_SECRET_API_KEY_FOR_AUTH` (this is the key your API will validate against).

2. **Upload Code:** Upload all the provided files (`requirements.txt`, `models.py`, `agent.py`, `app.py`, `Procfile`, `README.md`) to your Space repository.

3. **Upload Dockerfile:** The provided `Dockerfile` is optimized for a GPU-enabled environment (recommended for the Qwen 2.5 3B model). Ensure this file is uploaded to the root of your Space repository.

### Step 3.3: Final Testing

1. Once the Space builds successfully, the public URL will be your API base URL (e.g., `https://[your-user]-[your-space].hf.space`).
2. Test the live endpoint using the cURL command from Step 2.2, replacing `http://localhost:8000` with your Space URL.

This setup keeps the agent running in a dedicated, free environment, providing the stability and performance required for the competition.
agent.py ADDED
@@ -0,0 +1,329 @@
import os
import json
import requests
import torch
from typing import Any, Dict, List, Optional
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from langgraph.graph import StateGraph, END, START
from langgraph.checkpoint.base import BaseCheckpointSaver
from langgraph.checkpoint.memory import MemorySaver
from pydantic import ValidationError

from models import AgentState, Message, ExtractedIntelligence, ScamClassification

# --- Configuration ---
MODEL_ID = "Qwen/Qwen2.5-3B-Instruct"
# Placeholder for the final evaluation endpoint
CALLBACK_URL = "https://hackathon.guvi.in/api/updateHoneyPotFinalResult"
# Placeholder for the honeypot's own API key (for the callback)
HONEYPOT_API_KEY = os.environ.get("HONEYPOT_API_KEY", "YOUR_SECRET_API_KEY_FOR_CALLBACK")

# --- Model Initialization (Singleton Pattern) ---

class ModelLoader:
    """Handles loading the Qwen 2.5 3B model with quantization."""
    _model = None
    _tokenizer = None

    @classmethod
    def get_model_and_tokenizer(cls):
        if cls._model is None or cls._tokenizer is None:
            print(f"Loading model {MODEL_ID}...")
            # 4-bit quantization for memory efficiency on small GPUs/CPUs
            bnb_config = BitsAndBytesConfig(
                load_in_4bit=True,
                bnb_4bit_quant_type="nf4",
                bnb_4bit_compute_dtype=torch.bfloat16
            )

            cls._tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
            cls._model = AutoModelForCausalLM.from_pretrained(
                MODEL_ID,
                quantization_config=bnb_config,
                device_map="auto"  # Place model parts across available devices
            )
            print("Model loaded successfully.")
        return cls._model, cls._tokenizer

# --- LangGraph Nodes (Functions) ---
def _invoke_llm(messages: List[Dict[str, str]], system_prompt: str, json_schema: Optional[Dict[str, Any]] = None) -> str:
    """Helper function to invoke the Qwen model."""
    model, tokenizer = ModelLoader.get_model_and_tokenizer()

    # Construct the full conversation history including the system prompt
    full_messages = [{"role": "system", "content": system_prompt}] + messages

    # Add an instruction for JSON output if a schema is provided
    if json_schema:
        full_messages.append({"role": "user", "content": f"Please output the result as a JSON object that strictly conforms to the following schema: {json.dumps(json_schema)}"})

    # Apply the chat template and tokenize
    input_ids = tokenizer.apply_chat_template(
        full_messages,
        return_tensors="pt",
        add_generation_prompt=True
    ).to(model.device)

    # Generate the response
    with torch.no_grad():
        output_ids = model.generate(
            input_ids,
            max_new_tokens=512,
            do_sample=True,
            temperature=0.7,
            pad_token_id=tokenizer.eos_token_id
        )

    # Decode and clean up the response
    response = tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True)

    # Simple JSON extraction (often required for open-source models)
    if json_schema:
        start = response.find('{')
        end = response.rfind('}') + 1
        if start != -1 and end > start:
            # Return the outermost JSON block
            return response[start:end]
        # No JSON block found: return the raw response and let the caller handle it
        return response

    return response

def detect_scam(state: AgentState) -> AgentState:
    """Node 1: Detects scam intent from the latest message."""
    latest_message = state["conversationHistory"][-1]

    system_prompt = (
        "You are an expert scam detection system. Analyze the user's message and determine "
        "if it contains clear scam or fraudulent intent (e.g., bank fraud, phishing, urgent account block). "
        "Your output MUST be a JSON object conforming to the ScamClassification schema."
    )

    messages = [{"role": "user", "content": latest_message.text}]

    try:
        json_str = _invoke_llm(messages, system_prompt, ScamClassification.model_json_schema())
        classification = ScamClassification.model_validate_json(json_str)

        state["scamDetected"] = classification.is_scam
        state["agentNotes"] += f"Initial Detection: {classification.reason}. "
        print(f"Scam Detected: {classification.is_scam}")

    except (json.JSONDecodeError, ValidationError) as e:
        print(f"Detection failed, defaulting to False: {e}")
        state["scamDetected"] = False
        state["agentNotes"] += "Detection failed due to parsing error. "

    return state

def agent_persona_response(state: AgentState) -> AgentState:
    """Node 2: Generates a human-like response to engage the scammer."""
    if not state["scamDetected"]:
        # Should not happen in the main flow, but kept as a safeguard
        state["agent_response_text"] = "Conversation ended by system."
        state["should_continue_engagement"] = False
        return state

    # Construct the full conversation history for the model
    history_for_llm = [m.model_dump() for m in state["conversationHistory"]]

    system_prompt = (
        "You are a human-like honeypot agent. Your goal is to engage the 'scammer' "
        "in a believable, multi-turn conversation to extract intelligence (e.g., links, IDs, numbers) "
        "without revealing that you are an AI or that you suspect a scam. "
        "Maintain a slightly confused, cautious, but engaged persona. "
        "Your response must be ONLY the text of the message to send back to the scammer."
    )

    # The last message in history_for_llm is the scammer's latest message
    messages = history_for_llm

    response_text = _invoke_llm(messages, system_prompt)

    # Update state with the agent's response
    agent_message = Message(
        sender="user",  # The honeypot agent is acting as the 'user'
        text=response_text,
        timestamp=state["conversationHistory"][-1].timestamp  # Placeholder; should be the current time
    )
    state["conversationHistory"].append(agent_message)
    state["agent_response_text"] = response_text
    state["totalMessagesExchanged"] += 1

    # Simple heuristic to decide if engagement should continue.
    # In a real system, this would be a separate node or a more complex heuristic.
    state["should_continue_engagement"] = True

    return state

def extract_intelligence(state: AgentState) -> AgentState:
    """Node 3: Extracts structured intelligence from the full conversation history."""

    # Combine all messages into a single text block for the model to analyze
    full_transcript = "\n".join([f"{m.sender}: {m.text}" for m in state["conversationHistory"]])

    system_prompt = (
        "You are an intelligence extraction specialist. Analyze the following conversation transcript "
        "between a 'scammer' and a 'user' (honeypot agent). "
        "Extract all relevant intelligence (bank accounts, UPI IDs, links, phone numbers, keywords) "
        "mentioned by the 'scammer'. Your output MUST be a JSON object conforming to the ExtractedIntelligence schema. "
        "If no item is found for a field, use an empty list."
    )

    messages = [{"role": "user", "content": f"Transcript:\n{full_transcript}"}]

    try:
        json_str = _invoke_llm(messages, system_prompt, ExtractedIntelligence.model_json_schema())
        extracted_data = ExtractedIntelligence.model_validate_json(json_str)

        # Merge new intelligence with existing intelligence (if any)
        current_data = state["extractedIntelligence"].model_dump()
        new_data = extracted_data.model_dump()

        for key in current_data:
            current_data[key] = list(set(current_data[key] + new_data[key]))

        state["extractedIntelligence"] = ExtractedIntelligence.model_validate(current_data)
        state["agentNotes"] += "Intelligence updated. "

    except (json.JSONDecodeError, ValidationError) as e:
        print(f"Intelligence extraction failed: {e}")
        state["agentNotes"] += "Intelligence extraction failed due to parsing error. "

    return state

def final_callback(state: AgentState) -> AgentState:
    """Node 4: Sends the mandatory final result callback to the evaluation endpoint."""

    if not state["scamDetected"]:
        print("Callback skipped: Scam not detected.")
        return state

    payload = {
        "sessionId": state["sessionId"],
        "scamDetected": state["scamDetected"],
        "totalMessagesExchanged": state["totalMessagesExchanged"],
        "extractedIntelligence": state["extractedIntelligence"].model_dump(),
        "agentNotes": state["agentNotes"]
    }

    headers = {
        "Content-Type": "application/json",
        "x-api-key": HONEYPOT_API_KEY  # Use the honeypot's own API key for the callback
    }

    try:
        response = requests.post(CALLBACK_URL, json=payload, headers=headers, timeout=10)
        response.raise_for_status()
        print(f"Final callback successful. Status: {response.status_code}")
        state["agentNotes"] += "Final callback sent successfully. "
    except requests.exceptions.RequestException as e:
        print(f"Final callback failed: {e}")
        state["agentNotes"] += f"Final callback failed: {e}. "

    return state

# --- Graph Definition ---

def create_honeypot_graph(checkpoint_saver: BaseCheckpointSaver):
    """Defines and compiles the LangGraph state machine."""

    workflow = StateGraph(AgentState)

    # Add nodes
    workflow.add_node("detect_scam", detect_scam)
    workflow.add_node("agent_persona_response", agent_persona_response)
    workflow.add_node("extract_intelligence", extract_intelligence)
    workflow.add_node("final_callback", final_callback)

    # Define the entry point
    workflow.add_edge(START, "detect_scam")

    # Conditional edge after scam detection
    def should_continue(state: AgentState) -> str:
        if state["scamDetected"]:
            return "extract_intelligence"
        else:
            return END

    workflow.add_conditional_edges("detect_scam", should_continue)

    # Main loop: Extract -> Respond -> (wait for the next message).
    # The loop is driven externally: each incoming API call resumes the graph.
    # For a single API call, we just extract and respond.
    workflow.add_edge("extract_intelligence", "agent_persona_response")

    # After the agent responds, the run ends; the next API call
    # restarts the graph from the checkpoint.
    workflow.add_edge("agent_persona_response", END)

    # The final callback is triggered separately: either by a dedicated
    # endpoint or once a condition in the state (e.g., intelligence
    # extracted) signals the end of the engagement.

    # Compile the graph
    app = workflow.compile(checkpointer=checkpoint_saver)
    return app

# Initialize the graph with a memory saver for local testing.
# In a real deployment, a database checkpointer (e.g., SQLite, Postgres) would be used.
memory_saver = MemorySaver()
honeypot_app = create_honeypot_graph(memory_saver)

# Optional: run a test flow locally
if __name__ == "__main__":
    # Initialize the state for a new conversation
    initial_state = AgentState(
        sessionId="test-session-123",
        conversationHistory=[
            Message(
                sender="scammer",
                text="Your bank account will be blocked today. Verify immediately by clicking this link: http://malicious-link.example",
                timestamp="2026-01-28T10:00:00Z"
            )
        ],
        scamDetected=False,
        extractedIntelligence=ExtractedIntelligence(),
        agentNotes="",
        totalMessagesExchanged=1,
        should_continue_engagement=False,
        agent_response_text=""
    )

    # Run the first turn
    print("--- Running Turn 1 (Detection, Extraction, Response) ---")
    final_state = honeypot_app.invoke(initial_state, config={"configurable": {"thread_id": "test-session-123"}})

    print("\n--- Final State After Turn 1 ---")
    print(f"Scam Detected: {final_state['scamDetected']}")
    print(f"Agent Response: {final_state['agent_response_text']}")
    print(f"Intelligence: {final_state['extractedIntelligence'].model_dump()}")

    # Simulate the next incoming message from the scammer
    next_scammer_message = Message(
        sender="scammer",
        text="Why are you asking so many questions? Just give me your UPI ID now or I will block your account permanently.",
        timestamp="2026-01-28T10:05:00Z"
    )

    # Load the previous state and add the new message
    final_state["conversationHistory"].append(next_scammer_message)
    final_state["totalMessagesExchanged"] += 1

    # Run the second turn (LangGraph loads the checkpoint and continues)
    print("\n--- Running Turn 2 (Extraction, Response) ---")
    final_state_2 = honeypot_app.invoke(final_state, config={"configurable": {"thread_id": "test-session-123"}})

    print("\n--- Final State After Turn 2 ---")
    print(f"Agent Response: {final_state_2['agent_response_text']}")
    print(f"Intelligence: {final_state_2['extractedIntelligence'].model_dump()}")

    # Manually trigger the final callback (simulating the end of engagement).
    # In the deployed API this is triggered by a condition or a separate
    # endpoint; for this local example, we call the function directly.
    print("\n--- Triggering Final Callback ---")
    final_callback(final_state_2)
app.py ADDED
@@ -0,0 +1,155 @@
import os
import time
from fastapi import FastAPI, HTTPException, Depends, status
from fastapi.security import APIKeyHeader
from typing import Dict, Any

# LangGraph and model imports
from langgraph.checkpoint.memory import MemorySaver
from langgraph.checkpoint.base import BaseCheckpointSaver
from agent import create_honeypot_graph, final_callback
from models import HoneypotRequest, HoneypotResponse, AgentState, ExtractedIntelligence, Message

# --- Configuration ---
API_KEY_NAME = "x-api-key"
API_KEY = os.environ.get("HONEYPOT_API_KEY", "sk_test_123456789")  # Default for local testing
api_key_header = APIKeyHeader(name=API_KEY_NAME, auto_error=False)

# --- Initialization ---
app = FastAPI(
    title="Agentic Honey-Pot API",
    description="REST API for Scam Detection and Intelligence Extraction using LangGraph and Qwen 2.5 3B.",
    version="1.0.0"
)

# Initialize the LangGraph checkpointer. MemorySaver is used for simplicity and
# resets on every Space restart; for production, replace it with a persistent
# volume or database-backed checkpointer.
checkpointer: BaseCheckpointSaver = MemorySaver()
honeypot_app = create_honeypot_graph(checkpointer)

# --- Dependency for API Key Validation ---

async def get_api_key(api_key_header: str = Depends(api_key_header)):
    if api_key_header is None or api_key_header != API_KEY:
        raise HTTPException(
            status_code=status.HTTP_401_UNAUTHORIZED,
            detail="Invalid API Key or missing 'x-api-key' header.",
        )
    return api_key_header

# --- API Endpoints ---

@app.post("/api/honeypot-detection", response_model=HoneypotResponse)
async def honeypot_detection(
    request_data: HoneypotRequest,
    api_key: str = Depends(get_api_key)
) -> Dict[str, Any]:
    """
    Accepts an incoming message event, runs the LangGraph agent, and returns the response.
    """
    session_id = request_data.sessionId

    # 1. Load or initialize state.
    # LangGraph uses the thread_id for checkpointing.
    config = {"configurable": {"thread_id": session_id}}

    # Check whether a checkpoint exists for this session
    # (the compiled graph exposes the last checkpoint as a StateSnapshot)
    snapshot = honeypot_app.get_state(config)

    if snapshot and snapshot.values:
        # Load the existing state
        current_state = AgentState(**snapshot.values)

        # Append the new message from the scammer
        current_state["conversationHistory"].append(request_data.message)
        current_state["totalMessagesExchanged"] += 1

        # LangGraph will continue from the last node
        input_state = current_state

    else:
        # New conversation: initialize the state
        initial_history = request_data.conversationHistory + [request_data.message]

        input_state = AgentState(
            sessionId=session_id,
            conversationHistory=initial_history,
            scamDetected=False,
            extractedIntelligence=ExtractedIntelligence(),
            agentNotes="New session started. ",
            totalMessagesExchanged=len(initial_history),
            should_continue_engagement=False,
            agent_response_text=""
        )

    start_time = time.time()

    # 2. Invoke LangGraph
    try:
        # Invoke the graph with the updated state
        final_state_dict = honeypot_app.invoke(input_state, config=config)
        final_state = AgentState(**final_state_dict)

        engagement_duration = time.time() - start_time

        # 3. Prepare the API response
        response_data = {
            "status": "success",
            "scamDetected": final_state["scamDetected"],
            "engagementMetrics": {
                "engagementDurationSeconds": round(engagement_duration, 2),
                "totalMessagesExchanged": final_state["totalMessagesExchanged"]
            },
            "extractedIntelligence": final_state["extractedIntelligence"].model_dump(),
            "agentNotes": final_state["agentNotes"]
        }

        # 4. Check the final-callback condition (example heuristic).
        # In a real system, the agent would decide when to end the engagement.
        # Here, once a UPI ID has been extracted, the engagement ends and the
        # callback fires. For a true asynchronous callback, run this in a
        # background task.
        if final_state["scamDetected"] and final_state["extractedIntelligence"].upiIds:
            # Trigger the final callback (synchronously for simplicity)
            final_callback(final_state)

        return response_data

    except Exception as e:
        print(f"An error occurred during LangGraph invocation: {e}")
        raise HTTPException(
            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
            detail=f"Internal server error during agent processing: {str(e)}",
        )

@app.post("/api/trigger-final-callback")
async def trigger_callback(session_id: str, api_key: str = Depends(get_api_key)):
    """
    Manually triggers the final result callback for a specific session.
    Useful for testing, or for an external system to signal the end of engagement.
    """
    config = {"configurable": {"thread_id": session_id}}
    snapshot = honeypot_app.get_state(config)

    if not snapshot or not snapshot.values:
        raise HTTPException(status_code=404, detail=f"Session ID {session_id} not found.")

    current_state = AgentState(**snapshot.values)

    # Trigger the final callback
    final_callback(current_state)

    return {"status": "success", "message": f"Final callback triggered for session {session_id}."}

@app.get("/")
async def root():
    return {"message": "Agentic Honey-Pot API is running. Use the /api/honeypot-detection endpoint."}
models.py ADDED
@@ -0,0 +1,62 @@
from typing import TypedDict, List, Optional
from pydantic import BaseModel, Field

# --- 1. API Input/Output Models ---

class Message(BaseModel):
    """Represents a single message in the conversation."""
    sender: str = Field(..., description="The sender of the message: 'scammer' or 'user'.")
    text: str = Field(..., description="The content of the message.")
    timestamp: str = Field(..., description="ISO-8601 format timestamp.")

class Metadata(BaseModel):
    """Optional metadata about the conversation channel."""
    channel: Optional[str] = Field(None, description="e.g., SMS, WhatsApp, Email, Chat")
    language: Optional[str] = Field(None, description="e.g., English, Hindi")
    locale: Optional[str] = Field(None, description="e.g., IN")

class HoneypotRequest(BaseModel):
    """The incoming request body for the honeypot API."""
    sessionId: str = Field(..., description="Unique session ID.")
    message: Message = Field(..., description="The latest incoming message.")
    conversationHistory: List[Message] = Field(..., description="All previous messages in the same conversation.")
    metadata: Optional[Metadata] = None

class HoneypotResponse(BaseModel):
    """The outgoing response body from the honeypot API."""
    status: str = Field(..., description="Status of the request: 'success' or 'error'.")
    scamDetected: bool = Field(..., description="Whether scam intent was confirmed.")
    engagementMetrics: dict = Field(..., description="Metrics such as duration and message count.")
    extractedIntelligence: dict = Field(..., description="All intelligence gathered by the agent.")
    agentNotes: str = Field(..., description="Summary of scammer behavior.")

# --- 2. Structured Intelligence Model (for LLM output) ---

class ExtractedIntelligence(BaseModel):
    """Structured data to be extracted from the conversation."""
    bankAccounts: List[str] = Field(default_factory=list, description="List of bank account numbers mentioned.")
    upiIds: List[str] = Field(default_factory=list, description="List of UPI IDs mentioned.")
    phishingLinks: List[str] = Field(default_factory=list, description="List of suspicious links mentioned.")
    phoneNumbers: List[str] = Field(default_factory=list, description="List of phone numbers mentioned.")
    suspiciousKeywords: List[str] = Field(default_factory=list, description="List of suspicious keywords used by the scammer.")

# --- 3. LangGraph State Model ---

class AgentState(TypedDict):
    """The state object for the LangGraph state machine."""
    sessionId: str
    conversationHistory: List[Message]
    scamDetected: bool
    extractedIntelligence: ExtractedIntelligence
    agentNotes: str
    totalMessagesExchanged: int
    # Control-flow fields
    should_continue_engagement: bool
    agent_response_text: str

# --- 4. LLM Classification Output Model ---

class ScamClassification(BaseModel):
    """Model for the initial scam detection output."""
    is_scam: bool = Field(..., description="True if scam intent is detected, False otherwise.")
    reason: str = Field(..., description="Brief reason for the classification.")
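The per-field set-union merge that `agent.py` applies to `ExtractedIntelligence` across turns can be illustrated standalone. A sketch with plain dicts standing in for the Pydantic model (all values are dummies); unlike `list(set(...))` in `agent.py`, the result here is sorted so it is deterministic:

```python
from typing import Dict, List

def merge_intelligence(current: Dict[str, List[str]],
                       new: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Union each intelligence field, dropping duplicates across turns."""
    return {key: sorted(set(current[key]) | set(new.get(key, [])))
            for key in current}

turn1 = {"upiIds": ["scammer@upi"], "phoneNumbers": []}
turn2 = {"upiIds": ["scammer@upi", "backup@upi"], "phoneNumbers": ["+911234567890"]}
merged = merge_intelligence(turn1, turn2)
# merged keeps one copy of "scammer@upi" and gains the new UPI ID and phone number
```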
requirements.txt ADDED
@@ -0,0 +1,16 @@
# Core agentic framework
langchain
langgraph
# Model handling (Hugging Face)
torch
transformers
accelerate
bitsandbytes
# API and structured output
fastapi
uvicorn
pydantic
# HTTP client for the final callback
requests
# For tool-calling/JSON output
pydantic-extra-types