AppAI: ConFiT Bidirectional Alignment

A custom bidirectional cross-modal alignment model for bridging the embedding spaces of SBERT-encoded job descriptions and LayoutLMv3-encoded resumes in the AppAI recruitment matching pipeline.

ConFiT (Contrastive Fine-tuned Transformation) learns to project resume embeddings into JD space and JD embeddings into resume space, enabling direct cosine similarity comparison across the two modalities.


Model Details

Property          Value
Architecture      BidirectionalAlignmentModel (custom)
Base model        None (trained from scratch)
Input dimension   768
Output dimension  768
Features          full, education, experience, leadership
Directions        to_jd (resume → JD space), to_resume (JD → resume space)

Architecture

BidirectionalAlignmentModel applies a learnable transformation to each embedding span independently:

Input embedding (768-dim)
  → LayerNorm (per feature)
  → Learned feature scale (abs + 0.1, ensures positive scale)
  → Shared transform   (nn.Linear 768→768, bias=False, orthogonal init)
  → Per-feature transform (nn.Linear 768→768, bias=False, orthogonal init)
  → Learned blend: output = (1 - α) * shared + α * per_feature
  → 768-dim aligned embedding

Here, α = sigmoid(feature_blend_weights[feat]) ∈ (0, 1) is learned independently per feature. Two independent sets of transforms are maintained: one for resume → JD, one for JD → resume.
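
The forward path listed above can be traced by hand on a small vector. The following is a minimal sketch in plain Python, not the model's actual code. It assumes the shared and per-feature transforms are both applied to the same scaled input before blending, omits LayerNorm's learnable affine terms, and uses a toy 4-dim input in place of 768. The function names (layer_norm, matvec, align_one) are hypothetical.

```python
import math

def layer_norm(x, eps=1e-5):
    """Normalise a vector to zero mean and unit variance (no affine terms)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def matvec(matrix, x):
    """Multiply a square matrix (given as a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in matrix]

def align_one(x, raw_scale, shared, per_feature, blend_weight):
    """One direction, one feature: norm -> scale -> two transforms -> blend."""
    scale = abs(raw_scale) + 0.1                    # always positive
    h = [scale * v for v in layer_norm(x)]
    alpha = 1.0 / (1.0 + math.exp(-blend_weight))   # sigmoid, in (0, 1)
    s = matvec(shared, h)                           # shared branch
    p = matvec(per_feature, h)                      # per-feature branch (assumed parallel)
    return [(1 - alpha) * a + alpha * b for a, b in zip(s, p)]

# Toy 4-dim input instead of 768; both matrices are orthogonal.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
swap = [[0.0, 1.0, 0.0, 0.0],
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0]]  # permutation matrix swapping the first two entries

out = align_one([1.0, 2.0, 3.0, 4.0], raw_scale=0.9,
                shared=identity, per_feature=swap, blend_weight=0.0)
print(out)
```

With blend_weight set to 0.0, α is 0.5, so the first two output entries are the average of the identity and swapped branches and come out equal.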

Parameters per direction per feature:

  • feature_norm[feat]: LayerNorm(768)
  • feature_scale[feat]: learned scalar (positive)
  • transform_*_shared: shared Linear(768, 768), orthogonal init
  • transform_*[feat]: per-feature Linear(768, 768), orthogonal init
  • feature_blend_weights[feat]: learned scalar blend weight
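
As a rough size check, the parameter counts implied by the list above can be worked out directly. This is a back-of-the-envelope sketch, assuming the LayerNorm keeps PyTorch's default learnable weight and bias vectors. The variable names are illustrative only.

```python
DIM = 768

linear_params = DIM * DIM      # Linear(768, 768, bias=False): weight matrix only
layer_norm_params = 2 * DIM    # assumes affine LayerNorm: one weight and one bias vector
scalar_params = 2              # feature_scale and feature_blend_weights entries

# Everything owned by a single feature, excluding the shared transform.
per_feature_total = linear_params + layer_norm_params + scalar_params

print(f"one Linear(768, 768):  {linear_params:,}")      # 589,824
print(f"one LayerNorm(768):    {layer_norm_params:,}")  # 1,536
print(f"per-feature total:     {per_feature_total:,}")  # 591,362
```

The shared transform adds a further 589,824 weights per direction, so the two 768×768 matrices dominate the count.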

Intended Use

This model is the third stage of the AppAI recruitment intelligence pipeline:

  1. Smutypi3/applai-sbert: encodes JD text spans (SBERT)
  2. Smutypi3/applai-layoutlmv3: encodes resume PDF spans (LayoutLMv3)
  3. This model: aligns both embedding spaces (ConFiT)

After alignment, resume and JD embeddings can be directly compared with cosine similarity to produce a structured match score per feature.


Usage

Installation

pip install torch

Aligning Embeddings

from ai_models.services.confit_service import align_spans

# Resume embeddings (from LayoutLMv3) → project into JD space
resume_embeddings = {
    "full":       [...],  # 768-dim list
    "education":  [...],
    "experience": [...],
    "leadership": [...],
}
resume_aligned = align_spans(resume_embeddings, direction="to_jd")

# JD embeddings (from SBERT) → project into resume space
jd_embeddings = {"full": [...], "education": [...], ...}
jd_aligned = align_spans(jd_embeddings, direction="to_resume")
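
Because align_spans takes a dict keyed by feature name with 768-dim vectors, a small pre-flight check can catch malformed inputs early. The helper below is a hypothetical sketch, not part of the published service. validate_spans, FEATURES and EMBED_DIM are names introduced here for illustration, and it only checks keys and lengths as described in this card.

```python
FEATURES = ("full", "education", "experience", "leadership")
EMBED_DIM = 768

def validate_spans(embeddings):
    """Raise ValueError on an unknown feature name or a wrong vector length."""
    for feature, vector in embeddings.items():
        if feature not in FEATURES:
            raise ValueError(f"unknown feature: {feature!r}")
        if len(vector) != EMBED_DIM:
            raise ValueError(
                f"{feature!r} has {len(vector)} values, expected {EMBED_DIM}"
            )
    return embeddings

# Example: a well-formed single-feature input passes through unchanged.
checked = validate_spans({"education": [0.0] * EMBED_DIM})
```

Whether align_spans requires all four features in one call is not stated here, so the sketch deliberately accepts a subset.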

Computing Match Scores

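import torch
import torch.nn.functional as F

# resume_aligned and jd_aligned are the align_spans outputs for
# direction="to_jd" and direction="to_resume" respectively.
for feature in ["full", "education", "experience", "leadership"]:
    r = torch.tensor(resume_aligned[feature])
    j = torch.tensor(jd_aligned[feature])
    # Add a batch dimension so the default dim=1 reduces over the 768 values.
    score = F.cosine_similarity(r.unsqueeze(0), j.unsqueeze(0)).item()
    print(f"{feature}: {score:.4f}")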

Training Details

Objective

Contrastive loss over (resume span, JD span) pairs. Both alignment directions are jointly optimised so that projected resume embeddings are close to their matching JD embeddings in JD space, and vice versa.
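
The exact loss is not specified here. As one common way to realise a joint contrastive objective, the sketch below uses a symmetric InfoNCE-style loss in plain Python. The InfoNCE form, the temperature of 0.07 and all function names are assumptions for illustration, not the training code used for this model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def info_nce(queries, keys, temperature=0.07):
    """Mean cross-entropy in which queries[i] should match keys[i]."""
    total = 0.0
    for i, q in enumerate(queries):
        logits = [cosine(q, k) / temperature for k in keys]
        m = max(logits)  # subtract the max for numerical stability
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - logits[i]
    return total / len(queries)

def bidirectional_loss(resume_in_jd, jd, jd_in_resume, resume, temperature=0.07):
    """Sum of both directional losses so the two projections train jointly."""
    return (info_nce(resume_in_jd, jd, temperature)
            + info_nce(jd_in_resume, resume, temperature))

# Toy 2-dim batch: matched pairs give a near-zero loss, swapped pairs a large one.
pairs = [[1.0, 0.0], [0.0, 1.0]]
swapped = [[0.0, 1.0], [1.0, 0.0]]
matched_loss = bidirectional_loss(pairs, pairs, pairs, pairs)
mismatched_loss = bidirectional_loss(pairs, swapped, pairs, swapped)
print(matched_loss < mismatched_loss)
```

Each projected embedding is scored against every target in the batch, so the other pairs in the batch act as negatives.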

Checkpoint Format

The weights file supports two formats, both handled automatically at load time:

# Raw state dict
torch.load("confit_best_model_weights.pt")

# Checkpoint dict
{"model_state_dict": {...}, "epoch": ..., "loss": ...}
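
The automatic handling of both layouts can be pictured as a single key check on the loaded object. This is a hedged sketch of that logic, not the loader shipped with the model. extract_state_dict is a hypothetical name, and plain dicts stand in for real tensors.

```python
def extract_state_dict(loaded):
    """Return the state dict from either supported checkpoint layout."""
    if isinstance(loaded, dict) and "model_state_dict" in loaded:
        return loaded["model_state_dict"]  # checkpoint dict with metadata
    return loaded                          # already a raw state dict

# Both layouts resolve to the same weights mapping.
raw = {"transform.weight": [0.1, 0.2]}
checkpoint = {"model_state_dict": raw, "epoch": 3, "loss": 0.42}
print(extract_state_dict(raw) is extract_state_dict(checkpoint))
```

In practice the argument would be the object returned by torch.load("confit_best_model_weights.pt", map_location="cpu"), and the result would be passed to the model's load_state_dict.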

Limitations

  • Designed to align embeddings produced specifically by Smutypi3/applai-sbert and Smutypi3/applai-layoutlmv3; embeddings from other models are not supported
  • Both directions are independent within forward(); do not mix resume and JD embeddings in the same call
  • Not a general-purpose embedding alignment model

Citation

@software{lucero2025applai_confit,
  author    = {Lucero, Jaime Emmanuel},
  title     = {{AppAI ConFiT}: Bidirectional Cross-Modal Embedding Alignment for Recruitment Matching},
  year      = {2025},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/Smutypi3/applai-confit},
  note      = {Part of the AppAI recruitment intelligence pipeline}
}

License

MIT
