Current version of this application features:
1. Dual mode: embedding mode and LLM mode
2. Data preprocessing and retrieval from CSV data
3. Updated pincode logic
Objective:
This repository contains the implementation of a **GenAI-based Entity Matching** system. It supports a dual-mode architecture with a FastAPI backend, a Streamlit frontend, and a collection of services for data processing and model interaction.
Features:
- **Flexible matching service** implemented in `backend/matching_service.py`.
- **Modular data models** defined in `backend/models.py`.
- **Streamlit frontend** for quick experimentation (`frontend/app_streamlit.py`).
- **Configurable rules and LLM model integration** under `services/`.
- **Extensive test suite** located in `tests/`.
- **Configuration files** and property management in `backend/config` and `services/config.py`.
Active endpoints:
- `POST /backend/v1/match` – Match a single pair of records
- `POST /backend/v1/match/batch` – Match multiple pairs (multithreaded implementation)
- `GET /backend/v1/health` – Full health check (CSV data, models, LLM)
- `GET /backend/v1/health/llm` – LLM server health check only
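As a rough illustration of calling the single-pair endpoint, the sketch below builds a `POST /backend/v1/match` request with the standard library. The payload shape (`record_a`, `record_b`) and the record fields (`name`, `pincode`) are hypothetical placeholders, not the project's actual schema; the real request model should be taken from `backend/models.py`. The base URL assumes uvicorn's default host and port.

```python
import json
import urllib.request

# Assumption: the backend runs on uvicorn's default host and port.
BASE_URL = "http://127.0.0.1:8000"


def build_match_request(record_a: dict, record_b: dict,
                        base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request for the single-pair endpoint."""
    # Hypothetical payload shape; check backend/models.py for the real one.
    payload = {"record_a": record_a, "record_b": record_b}
    return urllib.request.Request(
        url=f"{base_url}/backend/v1/match",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical records, used only to show how a request is assembled.
example_request = build_match_request(
    {"name": "Acme Traders", "pincode": "560001"},
    {"name": "ACME Traders Pvt Ltd", "pincode": "560001"},
)
```

With the backend running, `send(example_request)` would return the decoded JSON response; nothing is sent until `send` is called.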
To run the application:
For embedding mode:
The models are loaded when the server starts.
For LLM mode:
Set the `base-url` property in `common.properties` to the URL of the running LLM server.
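To make the `base-url` setting concrete, here is a minimal sketch of reading a `key: value` or `key=value` properties file in Python. It is not the project's own loader (that lives under `backend/config` and `services/config.py`); `parse_properties` is a hypothetical helper, and the URL in the sample text is a placeholder rather than a real server address.

```python
def parse_properties(text: str) -> dict[str, str]:
    """Parse simple .properties-style text into a dict of strings."""
    props: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", "!")):
            continue
        # Split on the first '=' or ':' so that URLs in the value stay intact.
        for index, char in enumerate(line):
            if char in "=:":
                key, value = line[:index], line[index + 1:]
                break
        else:
            key, value = line, ""
        props[key.strip()] = value.strip()
    return props


# Placeholder content; the real common.properties may hold other keys too.
sample = """
# LLM server settings
base-url: http://localhost:9000
"""

settings = parse_properties(sample)
llm_base_url = settings["base-url"]
```

Splitting on the first separator only is what keeps the `://` and port colon inside the URL from being treated as further delimiters.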
For the frontend:
python -m streamlit run frontend/app_streamlit.py
For the backend:
python -m uvicorn backend.server:app
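Once the backend has been started, one way to confirm it is ready is to poll the health endpoint. The sketch below assumes uvicorn's default address (`http://127.0.0.1:8000`) and assumes that a healthy service answers with HTTP 200; both `health_url` and `wait_until_healthy` are hypothetical helpers, not part of the repository.

```python
import time
import urllib.error
import urllib.request


def health_url(base_url: str = "http://127.0.0.1:8000",
               llm_only: bool = False) -> str:
    """Return the full or LLM-only health check URL."""
    path = "/backend/v1/health/llm" if llm_only else "/backend/v1/health"
    return base_url.rstrip("/") + path


def wait_until_healthy(url: str, attempts: int = 10, delay: float = 1.0) -> bool:
    """Poll the URL until it answers with HTTP 200 or attempts run out."""
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                # Assumption: a healthy service responds with status 200.
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Server not reachable yet, or it returned an error status.
            pass
        time.sleep(delay)
    return False
```

For example, `wait_until_healthy(health_url())` checks the full service, while `wait_until_healthy(health_url(llm_only=True))` checks only the LLM server.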