The current version of this application features:

1. Dual-mode operation: embedding mode and LLM mode
2. Data preprocessing, with records retrieved from CSV data
3. Updated pincode logic

Objective:
This repository contains the implementation of a **GenAI-based Entity Matching** system. It supports a dual‑mode architecture with a FastAPI backend, a Streamlit frontend, and a collection of services for data processing and model interaction.

Features:

- **Flexible matching service** implemented in `backend/matching_service.py`.
- **Modular data models** defined in `backend/models.py`.
- **Streamlit frontend** for quick experimentation (`frontend/app_streamlit.py`).
- **Configurable rules and LLM model integration** under `services/`.
- **Extensive test suite** located in `tests/`.
- **Configuration files** and property management in `backend/config` and `services/config.py`.

Active endpoints:

- `POST /backend/v1/match`: match a single pair of records
- `POST /backend/v1/match/batch`: match multiple pairs (multithreaded implementation)
- `GET /backend/v1/health`: full health check (CSV data, models, LLM)
- `GET /backend/v1/health/llm`: LLM server health check only
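
As a rough illustration of how a client might call the matching endpoints, the sketch below builds the HTTP requests with Python's standard library without sending them. The base URL assumes uvicorn's default host and port, and the payload keys (`record_a`, `record_b`, `pairs`) are hypothetical placeholders; check `backend/models.py` for the actual request schema.

```python
import json
from urllib.request import Request

# Assumption: uvicorn's default bind address; adjust if the server runs elsewhere.
BASE_URL = "http://127.0.0.1:8000"


def build_match_request(record_a: dict, record_b: dict, base_url: str = BASE_URL) -> Request:
    """Build (but do not send) a POST request for /backend/v1/match.

    The payload keys "record_a" and "record_b" are hypothetical.
    """
    body = json.dumps({"record_a": record_a, "record_b": record_b}).encode("utf-8")
    return Request(
        url=f"{base_url}/backend/v1/match",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_batch_request(pairs: list, base_url: str = BASE_URL) -> Request:
    """Build a POST request for /backend/v1/match/batch.

    The payload key "pairs" is hypothetical.
    """
    body = json.dumps({"pairs": pairs}).encode("utf-8")
    return Request(
        url=f"{base_url}/backend/v1/match/batch",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Once the backend is running, either request object can be passed to `urllib.request.urlopen` to send it.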

To run the application:

For embedding mode:
The models are loaded when the server starts.

For LLM mode:
Paste the URL of the running LLM server into `common.properties`, under the `base-url:` key.
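
The exact layout of `common.properties` is not shown here, so the following is only a minimal sketch of how a `base-url` entry could be read, assuming a plain `key: value` or `key=value` line format. The `parse_properties` helper and the sample URL are hypothetical and are not taken from this repository.

```python
def parse_properties(text: str) -> dict[str, str]:
    """Parse simple `key: value` or `key=value` lines, skipping blanks and comments."""
    props: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        # Split on whichever separator (':' or '=') appears first in the line.
        positions = [i for i in (line.find(":"), line.find("=")) if i != -1]
        if not positions:
            continue
        cut = min(positions)
        props[line[:cut].strip()] = line[cut + 1:].strip()
    return props


# Hypothetical file content; the URL is a placeholder, not a real server.
sample = """
# LLM server settings
base-url: http://localhost:8001/v1
"""

llm_base_url = parse_properties(sample)["base-url"]
print(llm_base_url)
```

Splitting on the first separator only keeps the colons inside the URL value intact.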

For the frontend:

python -m streamlit run frontend/app_streamlit.py

For the backend:

python -m uvicorn backend.server:app
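
After starting the backend, it can be useful to wait until the health endpoint responds before sending match requests. The sketch below polls `GET /backend/v1/health`; it assumes uvicorn's default address (`http://127.0.0.1:8000`) and treats HTTP 200 as "healthy", which is an assumption about this service rather than documented behaviour. The `fetch` parameter is a hypothetical hook that allows the function to be exercised without a running server.

```python
import time
from typing import Callable
from urllib.request import urlopen


def fetch_status(url: str) -> int:
    """Return the HTTP status code of a GET request to `url`."""
    with urlopen(url, timeout=5) as response:
        return response.status


def wait_until_healthy(
    base_url: str = "http://127.0.0.1:8000",  # assumption: uvicorn defaults
    attempts: int = 10,
    delay_seconds: float = 1.0,
    fetch: Callable[[str], int] = fetch_status,
) -> bool:
    """Poll GET /backend/v1/health until it returns 200 or attempts run out."""
    url = f"{base_url}/backend/v1/health"
    for _ in range(attempts):
        try:
            if fetch(url) == 200:
                return True
        except OSError:
            # Connection refused, timeout, or HTTP error: the server may still be starting.
            pass
        time.sleep(delay_seconds)
    return False
```

Calling `wait_until_healthy()` returns `True` as soon as the server answers with 200, and `False` if it never does within the given number of attempts.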