	\section{Experimental Results}
	\label{sec:experimental_results}

	In this section, we present the empirical evaluation of the IRIS system, focusing on two key dimensions: computational efficiency (latency and throughput) and retrieval accuracy.

	\subsection{Computational Efficiency}
	The efficiency of the entity extraction and embedding pipeline was evaluated on a dataset of 50 candidate profiles. For each profile, the pipeline extracts four entities (Headline, Summary, Skills, and Experience) and generates an embedding for each using the BGE-M3 model.

	Table~\ref{tab:latency_results} summarizes the mean latency and standard deviation for each entity type.

	\begin{table}[h]
	\centering
	\caption{Mean Latency and Standard Deviation per Entity Type ($N=50$)}
	\label{tab:latency_results}
	\begin{tabular}{lrr}
	\hline
	\textbf{Entity Type} & \textbf{Mean Latency (ms)} & \textbf{Std. Dev. (ms)} \\ \hline
	Headline & 965.78 & 2969.16 \\
	Summary & 785.70 & 141.60 \\
	Skills (List) & 780.01 & 160.76 \\
	Experience (List) & 1005.30 & 185.11 \\ \hline
	\textbf{Total per Profile} & \textbf{3536.80} & -- \\ \hline
	\end{tabular}
	\end{table}

	The mean total processing time per profile is approximately 3.54 seconds, which corresponds to a throughput of \textbf{0.283 profiles per second}. Headline extraction shows very high variance: its standard deviation (2969.16 ms) is roughly three times its mean (965.78 ms), possibly due to network latency or cold-start issues in the embedding service. The other three entity types have standard deviations below 21\% of their means, so apart from the Headline stage the pipeline performs consistently enough for near-real-time recruitment tasks.
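
	As a worked check of the throughput figure, and assuming profiles are processed one at a time (the source does not state whether any stages run concurrently), the value is the reciprocal of the mean total latency in Table~\ref{tab:latency_results}. The symbol $\bar{t}_{\mathrm{total}}$ is notation introduced here for illustration.
	\[
	\mathrm{throughput} = \frac{1}{\bar{t}_{\mathrm{total}}} = \frac{1}{3.5368\,\mathrm{s}} \approx 0.283\,\mathrm{profiles/s}
	\]
	Under the same sequential assumption, processing the 50-profile evaluation set would take about $50 \times 3.5368\,\mathrm{s} \approx 177\,\mathrm{s}$.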

	\subsection{Retrieval Performance}
	We compared the proposed IRIS matching methods against standard baselines using Mean Reciprocal Rank (MRR) and Recall@$k$ ($R@k$). The evaluation included:
	\begin{itemize}
	\item \textbf{Jaccard Baseline}: A keyword-based overlap method.
	\item \textbf{BERT Flattened}: Dense retrieval using BERT embeddings on concatenated profile text.
	\item \textbf{BGE Flattened}: Dense retrieval using BGE-M3 embeddings on concatenated profile text.
	\item \textbf{BGE Granular Weighted}: Our proposed method using weighted cosine similarity across specific entities.
	\end{itemize}
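
	For reference, the two metrics are conventionally defined as follows. This sketch assumes that each query has exactly one relevant candidate; the query set $Q$ and the rank $\mathrm{rank}_q$ of the relevant candidate for query $q$ are notation introduced here, not taken from the IRIS implementation.
	\[
	\mathrm{MRR} = \frac{1}{|Q|} \sum_{q \in Q} \frac{1}{\mathrm{rank}_q},
	\qquad
	R@k = \frac{1}{|Q|} \sum_{q \in Q} \mathbf{1}\left[\mathrm{rank}_q \le k\right]
	\]
	Under these definitions, $R@1 \le \mathrm{MRR}$ and $R@1 \le R@3$ must hold for every method, which gives a quick sanity check on any reported row.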

	Table~\ref{tab:retrieval_results} presents the results of this comparison.

	\begin{table}[h]
	\centering
	\caption{Comparison of Retrieval Accuracy Metrics}
	\label{tab:retrieval_results}
	\begin{tabular}{lccc}
	\hline
	\textbf{Method} & \textbf{MRR} & \textbf{R@1} & \textbf{R@3} \\ \hline
	Jaccard Baseline & 0.0755 & 0.016 & 0.048 \\
	BERT Flattened & 0.1708 & \textbf{0.048} & \textbf{0.144} \\
	BGE Flattened & \textbf{0.1729} & \textbf{0.048} & \textbf{0.144} \\
	BGE Granular Weighted & 0.0749 & 0.016 & 0.040 \\ \hline
	\end{tabular}
	\end{table}

	The results indicate that \textbf{BGE Flattened} achieves the highest MRR (0.1729) and ties with BERT Flattened on Recall@1 (0.048) and Recall@3 (0.144). Notably, the granular weighted approach currently underperforms the flattened embedding methods and scores slightly below even the Jaccard baseline on MRR (0.0749 vs.\ 0.0755) and Recall@3 (0.040 vs.\ 0.048), suggesting that the aggregation logic or weight distribution for specific entities requires further optimization.
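
	One possible form of the granular weighted score, given here only as a sketch under stated assumptions, is a convex combination of per-entity cosine similarities. The entity set $E$, the weights $w_e$, and the embeddings $\mathbf{q}_e$ (query side) and $\mathbf{c}_e$ (candidate side) are hypothetical notation; the source does not specify the actual aggregation rule or weight values.
	\[
	s(q, c) = \sum_{e \in E} w_e \, \frac{\mathbf{q}_e \cdot \mathbf{c}_e}{\|\mathbf{q}_e\| \, \|\mathbf{c}_e\|},
	\qquad w_e \ge 0, \quad \sum_{e \in E} w_e = 1
	\]
	Here $E = \{\mathrm{Headline}, \mathrm{Summary}, \mathrm{Skills}, \mathrm{Experience}\}$. In this form, optimizing the method amounts to choosing the weights $w_e$, for example by a search over a held-out set of queries, and checking whether any single entity dominates or degrades the combined score.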