# AppAI ConFiT: Bidirectional Alignment
A custom bidirectional cross-modal alignment model for bridging the embedding spaces of SBERT-encoded job descriptions and LayoutLMv3-encoded resumes in the AppAI recruitment matching pipeline.
ConFiT (Contrastive Fine-tuned Transformation) learns to project resume embeddings into JD space and JD embeddings into resume space, enabling direct cosine similarity comparison across the two modalities.
## Model Details
| Property | Value |
|---|---|
| Architecture | BidirectionalAlignmentModel (custom) |
| Base model | None (trained from scratch) |
| Input dimension | 768 |
| Output dimension | 768 |
| Features | full, education, experience, leadership |
| Directions | to_jd (resume → JD space), to_resume (JD → resume space) |
## Architecture

`BidirectionalAlignmentModel` applies a learnable transformation to each embedding span independently:

```
Input embedding (768-dim)
  → LayerNorm (per feature)
  → Learned feature scale (abs + 0.1, ensures positive scale)
  → Shared transform (nn.Linear 768→768, bias=False, orthogonal init)
  → Per-feature transform (nn.Linear 768→768, bias=False, orthogonal init)
  → Learned blend: output = (1 - α) * shared + α * per_feature
  → 768-dim aligned embedding
```

Where α = sigmoid(feature_blend_weight) ∈ (0, 1) is learned independently per feature. Two independent sets of transforms are maintained: one for resume → JD, one for JD → resume.
Parameters per direction per feature:

- `feature_norm[feat]` – LayerNorm(768)
- `feature_scale[feat]` – learned scalar (positive)
- `transform_*_shared` – shared Linear(768, 768), orthogonal init
- `transform_*[feat]` – per-feature Linear(768, 768), orthogonal init
- `feature_blend_weights[feat]` – learned scalar blend weight
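The transform described above can be sketched as a minimal PyTorch module. This is a hypothetical reconstruction of one direction (e.g. resume → JD) from the parameter list; the class name, module layout, and initial parameter values are assumptions, not the shipped implementation:

```python
import torch
import torch.nn as nn

FEATURES = ["full", "education", "experience", "leadership"]
DIM = 768

class AlignmentDirection(nn.Module):
    """Sketch of one alignment direction (e.g. resume -> JD space)."""

    def __init__(self, features=FEATURES, dim=DIM):
        super().__init__()
        self.feature_norm = nn.ModuleDict({f: nn.LayerNorm(dim) for f in features})
        # Learned positive scale per feature: abs(w) + 0.1 keeps it > 0
        self.feature_scale = nn.ParameterDict(
            {f: nn.Parameter(torch.ones(1)) for f in features})
        # Shared transform across features, orthogonal init
        self.transform_shared = nn.Linear(dim, dim, bias=False)
        nn.init.orthogonal_(self.transform_shared.weight)
        # Per-feature transform, orthogonal init
        self.transform = nn.ModuleDict(
            {f: nn.Linear(dim, dim, bias=False) for f in features})
        for f in features:
            nn.init.orthogonal_(self.transform[f].weight)
        # Scalar blend weight per feature; alpha = sigmoid(w) lies in (0, 1)
        self.feature_blend_weights = nn.ParameterDict(
            {f: nn.Parameter(torch.zeros(1)) for f in features})

    def forward(self, spans):
        out = {}
        for f, x in spans.items():
            h = self.feature_norm[f](x) * (self.feature_scale[f].abs() + 0.1)
            shared = self.transform_shared(h)
            per_feat = self.transform[f](h)
            alpha = torch.sigmoid(self.feature_blend_weights[f])
            out[f] = (1 - alpha) * shared + alpha * per_feat
        return out
```

A full `BidirectionalAlignmentModel` would hold two such directions, one for `to_jd` and one for `to_resume`.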
## Intended Use
This model is the third stage of the AppAI recruitment intelligence pipeline:
1. `Smutypi3/applai-sbert` – encodes JD text spans (SBERT)
2. `Smutypi3/applai-layoutlmv3` – encodes resume PDF spans (LayoutLMv3)
3. This model – aligns both embedding spaces (ConFiT)
After alignment, resume and JD embeddings can be directly compared with cosine similarity to produce a structured match score per feature.
## Usage

### Installation

```shell
pip install torch
```
### Aligning Embeddings

```python
from ai_models.services.confit_service import align_spans

# Resume embeddings (from LayoutLMv3) → project into JD space
resume_embeddings = {
    "full": [...],        # 768-dim list
    "education": [...],
    "experience": [...],
    "leadership": [...],
}
resume_aligned = align_spans(resume_embeddings, direction="to_jd")

# JD embeddings (from SBERT) → project into resume space
jd_embeddings = {"full": [...], "education": [...], ...}
jd_aligned = align_spans(jd_embeddings, direction="to_resume")
```
### Computing Match Scores

```python
import torch
import torch.nn.functional as F

for feature in ["full", "education", "experience", "leadership"]:
    r = torch.tensor(resume_aligned[feature])
    j = torch.tensor(jd_aligned[feature])
    score = F.cosine_similarity(r.unsqueeze(0), j.unsqueeze(0)).item()
    print(f"{feature}: {score:.4f}")
```
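The per-feature scores can then be aggregated into a single match score. A weighted average is one simple option; the weights and the helper below are illustrative, not part of the pipeline:

```python
import torch
import torch.nn.functional as F

# Illustrative weights; the actual pipeline's aggregation may differ.
WEIGHTS = {"full": 0.4, "education": 0.2, "experience": 0.3, "leadership": 0.1}

def overall_match(resume_aligned, jd_aligned, weights=WEIGHTS):
    """Weighted average of per-feature cosine similarities."""
    total = 0.0
    for feature, w in weights.items():
        r = torch.tensor(resume_aligned[feature]).unsqueeze(0)
        j = torch.tensor(jd_aligned[feature]).unsqueeze(0)
        total += w * F.cosine_similarity(r, j).item()
    return total / sum(weights.values())
```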
## Training Details

### Objective
Contrastive loss over (resume span, JD span) pairs. Both alignment directions are jointly optimised so that projected resume embeddings are close to their matching JD embeddings in JD space, and vice versa.
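A symmetric InfoNCE-style loss is one common way to realise this objective. The sketch below is a hedged illustration over a batch of matched (resume, JD) pairs; the temperature value and exact formulation used in training are assumptions:

```python
import torch
import torch.nn.functional as F

def symmetric_contrastive_loss(resume_in_jd, jd_emb, temperature=0.07):
    """InfoNCE over a batch where resume_in_jd[i] matches jd_emb[i].

    resume_in_jd: (B, 768) resume embeddings projected into JD space.
    jd_emb:       (B, 768) JD embeddings.
    """
    r = F.normalize(resume_in_jd, dim=-1)
    j = F.normalize(jd_emb, dim=-1)
    logits = r @ j.t() / temperature       # (B, B) cosine similarity matrix
    targets = torch.arange(r.size(0))      # diagonal entries are positives
    # Symmetric: resume->JD retrieval and JD->resume retrieval
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))
```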
### Checkpoint Format

The weights file supports two formats, both handled automatically at load time:

```python
# Raw state dict
torch.load("confit_best_model_weights.pt")

# Checkpoint dict
# {"model_state_dict": {...}, "epoch": ..., "loss": ...}
```
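A loader handling both formats might look like the following sketch (the function name is illustrative; how `torch.load` restricts unpickling depends on your PyTorch version):

```python
import torch

def load_confit_state_dict(path):
    """Return a state dict whether the file is raw or a checkpoint dict."""
    obj = torch.load(path, map_location="cpu")
    if isinstance(obj, dict) and "model_state_dict" in obj:
        return obj["model_state_dict"]  # checkpoint-dict format
    return obj                          # raw state dict
```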
## Limitations

- Designed to align embeddings produced specifically by `Smutypi3/applai-sbert` and `Smutypi3/applai-layoutlmv3`; using embeddings from other models is not supported
- Both directions are independent within `forward()`; do not mix resume and JD embeddings in the same call
- Not a general-purpose embedding alignment model
## Citation

```bibtex
@software{lucero2025applai_confit,
  author    = {Lucero, Jaime Emmanuel},
  title     = {{AppAI ConFiT}: Bidirectional Cross-Modal Embedding Alignment for Recruitment Matching},
  year      = {2025},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/Smutypi3/applai-confit},
  note      = {Part of the AppAI recruitment intelligence pipeline}
}
```
## License
MIT