Testing Guide for Duplicate Detection API

Quick Test Methods

1. Using the Test Script (Recommended)

Test the Hugging Face Space deployment:

python3 test.py --base-url https://LogicGoInfotechSpaces-duplicate-transaction-detection.hf.space

Test locally (if running on your machine):

python3 test.py --base-url http://127.0.0.1:8000

2. Using cURL Commands

Health Check:

curl https://LogicGoInfotechSpaces-duplicate-transaction-detection.hf.space/health

Get Suggestions:

curl "https://LogicGoInfotechSpaces-duplicate-transaction-detection.hf.space/suggestions?limit=10"

With pretty JSON output:

curl -s https://LogicGoInfotechSpaces-duplicate-transaction-detection.hf.space/suggestions | python3 -m json.tool

3. Using a Browser

Open these URLs in your browser:

  • Health Check:

    https://LogicGoInfotechSpaces-duplicate-transaction-detection.hf.space/health
    
  • Get Suggestions:

    https://LogicGoInfotechSpaces-duplicate-transaction-detection.hf.space/suggestions?limit=5
    

4. Using Python Requests

Create a simple test script:

import json

import requests

BASE_URL = "https://LogicGoInfotechSpaces-duplicate-transaction-detection.hf.space"

# Test health endpoint
response = requests.get(f"{BASE_URL}/health", timeout=30)
print("Health Status:", response.status_code)
print("Response:", response.json())

# Test suggestions endpoint
response = requests.get(f"{BASE_URL}/suggestions", params={"limit": 5}, timeout=30)
print("\nSuggestions Status:", response.status_code)
response.raise_for_status()  # stop on a 4xx/5xx instead of failing inside .json()
suggestions = response.json()  # parse the body once and reuse it
print("Suggestions Count:", len(suggestions))
print("\nFirst Suggestion:")
print(json.dumps(suggestions[0] if suggestions else {}, indent=2))

5. Using Postman or Insomnia

Health Endpoint:

  • Method: GET
  • URL: https://LogicGoInfotechSpaces-duplicate-transaction-detection.hf.space/health

Suggestions Endpoint:

  • Method: GET
  • URL: https://LogicGoInfotechSpaces-duplicate-transaction-detection.hf.space/suggestions
  • Query Parameters:
    • limit: 10 (optional, default: 50, max: 500)

Expected Responses

Health Endpoint Response:

{
  "status": "ok"
}

Suggestions Endpoint Response:

[
  {
    "_id": "6923ec37a48c1900950d7e7a",
    "candidate_ids": ["6923ebe7dfb90e344a8a8289", "6923ebf3dfb90e344a8a82cf"],
    "message": "These seem similar. Would you like to merge them?",
    "details": {
      "amount_delta_pct": 0.16,
      "time_delta_minutes": 0.0,
      "merchant_match_rule": "exact"
    },
    "audit": {
      "generated_by": "duplicate-detector",
      "generated_at": "2025-11-24T05:25:11.966000",
      "rule_version": "v1.0"
    },
    "status": "pending"
  }
]
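
To check a live response against the shape shown above, a small validator along these lines may help. It is a sketch based only on the sample response: check_suggestion and REQUIRED_FIELDS are hypothetical names, and the rule that candidate_ids holds at least two entries is an assumption drawn from the example, not a documented guarantee.

```python
# Top-level fields and their JSON types, taken from the sample response above.
REQUIRED_FIELDS = {
    "_id": str,
    "candidate_ids": list,
    "message": str,
    "details": dict,
    "audit": dict,
    "status": str,
}


def check_suggestion(suggestion):
    """Return a list of problems; an empty list means the shape matches."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in suggestion:
            problems.append(f"missing field: {field}")
        elif not isinstance(suggestion[field], expected_type):
            problems.append(f"{field} should be {expected_type.__name__}")
    # Assumption: a duplicate suggestion refers to at least two transactions.
    ids = suggestion.get("candidate_ids")
    if isinstance(ids, list) and len(ids) < 2:
        problems.append("candidate_ids should list at least two transactions")
    return problems


if __name__ == "__main__":
    sample = {
        "_id": "6923ec37a48c1900950d7e7a",
        "candidate_ids": ["6923ebe7dfb90e344a8a8289", "6923ebf3dfb90e344a8a82cf"],
        "message": "These seem similar. Would you like to merge them?",
        "details": {"amount_delta_pct": 0.16, "time_delta_minutes": 0.0,
                    "merchant_match_rule": "exact"},
        "audit": {"generated_by": "duplicate-detector",
                  "generated_at": "2025-11-24T05:25:11.966000",
                  "rule_version": "v1.0"},
        "status": "pending",
    }
    print(check_suggestion(sample))
```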

Testing the Scheduler

The scheduler runs automatically in the background. To verify it's working:

  1. Check the logs: the scheduler logs each duplicate detection run
  2. Monitor suggestions: new suggestions should appear periodically (default interval: 60 seconds)
  3. Check timestamps: the audit.generated_at field records when each suggestion was created
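
The timestamp check in step 3 can be scripted. The sketch below finds the newest audit.generated_at value in a suggestions response and reports how long ago it was. The helper names are hypothetical, and treating timestamps without a timezone as UTC is an assumption; the guide does not say which timezone the server uses.

```python
from datetime import datetime, timezone


def newest_generated_at(suggestions):
    """Return the most recent audit.generated_at as a datetime, or None if empty."""
    stamps = [datetime.fromisoformat(s["audit"]["generated_at"]) for s in suggestions]
    return max(stamps) if stamps else None


def seconds_since(ts, now=None):
    """Seconds elapsed between ts and now (defaults to the current UTC time)."""
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=timezone.utc)  # assumption: naive timestamps are UTC
    now = now or datetime.now(timezone.utc)
    return (now - ts).total_seconds()


if __name__ == "__main__":
    sample = [{"audit": {"generated_at": "2025-11-24T05:25:11.966000"}}]
    newest = newest_generated_at(sample)
    print("Newest suggestion:", newest)
    print("Age in seconds:", seconds_since(newest))
```

An age well beyond the 60-second default does not by itself prove the scheduler has stopped, since a run that finds no new duplicates may create no new suggestions; the logs remain the more direct check.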

Local Testing

To test locally, first start the server:

python3 -m uvicorn src.api:app --host 127.0.0.1 --port 8000 --reload

Then in another terminal:

python3 test.py --base-url http://127.0.0.1:8000

Troubleshooting

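  • 404 Error: Make sure the Space is running on Hugging Face and that the URL path is correct (/health or /suggestions)
  • Timeout: The suggestions endpoint can be slow when there is a lot of data; try a smaller limit (for example, limit=10)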
  • Empty suggestions: The scheduler might not have found duplicates yet, or the lookback window needs adjustment
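
The three cases above can be folded into a small helper that maps a request outcome to the matching hint. This is only an illustrative sketch: diagnose is a hypothetical function, and its messages paraphrase the list above rather than anything the API returns.

```python
def diagnose(status_code=None, body=None, timed_out=False):
    """Map the outcome of a /suggestions request to a troubleshooting hint."""
    if timed_out:
        return "Timeout: the endpoint may be slow with a lot of data; try a smaller limit."
    if status_code == 404:
        return "404: make sure the Space is running on Hugging Face."
    if status_code == 200 and body == []:
        return ("Empty: the scheduler may not have found duplicates yet, "
                "or the lookback window needs adjustment.")
    if status_code == 200:
        return "OK"
    return f"Unexpected status: {status_code}"


if __name__ == "__main__":
    print(diagnose(status_code=200, body=[]))
```

With the requests library, timed_out would be set when the call raises requests.exceptions.Timeout, and status_code and body would come from response.status_code and response.json().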