Smutypi3 committed on
Commit 67e322a · verified · 1 Parent(s): dbde91e

Upload folder using huggingface_hub

Files changed (3):
  1. README.md +159 -0
  2. confit_best_model_weights.pt +3 -0
  3. confit_service.py +202 -0
README.md ADDED
---
language:
- en
license: mit
tags:
- embedding-alignment
- cross-modal
- feature-extraction
- dense
- recruitment
- contrastive-learning
- applai
pipeline_tag: feature-extraction
---

# AppAI — ConFiT Bidirectional Alignment

A custom **bidirectional cross-modal alignment model** for bridging the embedding spaces of SBERT-encoded job descriptions and LayoutLMv3-encoded resumes in the [AppAI](https://github.com/jaimeemanuellucero/applai) recruitment matching pipeline.

ConFiT (**Con**trastive **Fi**ne-tuned **T**ransformation) learns to project resume embeddings into JD space and JD embeddings into resume space, enabling direct cosine similarity comparison across the two modalities.

---

## Model Details

| Property | Value |
|---|---|
| Architecture | `BidirectionalAlignmentModel` (custom) |
| Base model | None — trained from scratch |
| Input dimension | 768 |
| Output dimension | 768 |
| Features | `full`, `education`, `experience`, `leadership` |
| Directions | `to_jd` (resume → JD space), `to_resume` (JD → resume space) |

### Architecture

`BidirectionalAlignmentModel` applies a learnable transformation to each embedding span independently:

```
Input embedding (768-dim)
  → LayerNorm (per feature)
  → Learned feature scale (abs + 0.1, ensures positive scale)
  → Shared transform (nn.Linear 768→768, bias=False, orthogonal init)
  → Per-feature transform (nn.Linear 768→768, bias=False, orthogonal init)
  → Learned blend: output = (1 - α) * shared + α * per_feature
  → 768-dim aligned embedding
```

Where `α = sigmoid(feature_blend_weight)` ∈ (0, 1) is learned independently per feature. Two independent sets of transforms are maintained — one for resume → JD, one for JD → resume.
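
The blend step alone can be sketched in plain Python (illustrative only; the actual model applies it per feature to 768-dim torch tensors):

```python
import math

def blend_transforms(shared: list[float], specific: list[float],
                     blend_weight: float) -> list[float]:
    """Mix shared and per-feature transform outputs.

    alpha = sigmoid(blend_weight) lies in (0, 1), so the output is
    always a convex combination of the two transforms.
    """
    alpha = 1.0 / (1.0 + math.exp(-blend_weight))
    return [(1.0 - alpha) * s + alpha * p for s, p in zip(shared, specific)]

# blend_weight = 0 → alpha = 0.5 → an even mix of both transforms
print(blend_transforms([1.0, 0.0], [0.0, 1.0], 0.0))  # [0.5, 0.5]
```

Because α never reaches 0 or 1, neither the shared nor the per-feature transform can be fully switched off during training.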

**Parameters per direction per feature:**
- `feature_norm[feat]` — LayerNorm(768)
- `feature_scale[feat]` — learned scalar (positive)
- `transform_*_shared` — shared Linear(768, 768), orthogonal init
- `transform_*[feat]` — per-feature Linear(768, 768), orthogonal init
- `feature_blend_weights[feat]` — learned scalar blend weight

---

## Intended Use

This model is the **third stage** of the AppAI recruitment intelligence pipeline:

1. [`Smutypi3/applai-sbert`](https://huggingface.co/Smutypi3/applai-sbert) — encodes JD text spans (SBERT)
2. [`Smutypi3/applai-layoutlmv3`](https://huggingface.co/Smutypi3/applai-layoutlmv3) — encodes resume PDF spans (LayoutLMv3)
3. **This model** — aligns both embedding spaces (ConFiT)

After alignment, resume and JD embeddings can be directly compared with cosine similarity to produce a structured match score per feature.

---

## Usage

### Installation

```bash
pip install torch
```

### Aligning Embeddings

```python
from ai_models.services.confit_service import align_spans

# Resume embeddings (from LayoutLMv3) → project into JD space
resume_embeddings = {
    "full": [...],        # 768-dim list
    "education": [...],
    "experience": [...],
    "leadership": [...],
}
resume_aligned = align_spans(resume_embeddings, direction="to_jd")

# JD embeddings (from SBERT) → project into resume space
jd_embeddings = {"full": [...], "education": [...], ...}
jd_aligned = align_spans(jd_embeddings, direction="to_resume")
```

### Computing Match Scores

```python
import torch
import torch.nn.functional as F

for feature in ["full", "education", "experience", "leadership"]:
    r = torch.tensor(resume_aligned[feature])
    j = torch.tensor(jd_aligned[feature])
    score = F.cosine_similarity(r.unsqueeze(0), j.unsqueeze(0)).item()
    print(f"{feature}: {score:.4f}")
```

---

## Training Details

### Objective

Contrastive loss over (resume span, JD span) pairs. Both alignment directions are jointly optimised so that projected resume embeddings are close to their matching JD embeddings in JD space, and vice versa.
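
The training code is not part of this upload, so the exact formulation is not shown here. A standard symmetric InfoNCE-style objective consistent with the description above would be, for a batch of $N$ matched pairs with temperature $\tau$ (both assumptions):

$$
\mathcal{L} = -\frac{1}{2N} \sum_{i=1}^{N} \left[ \log \frac{\exp(\mathrm{sim}(r_i^{\rightarrow jd}, j_i)/\tau)}{\sum_{k=1}^{N} \exp(\mathrm{sim}(r_i^{\rightarrow jd}, j_k)/\tau)} + \log \frac{\exp(\mathrm{sim}(j_i^{\rightarrow r}, r_i)/\tau)}{\sum_{k=1}^{N} \exp(\mathrm{sim}(j_i^{\rightarrow r}, r_k)/\tau)} \right]
$$

where $\mathrm{sim}$ is cosine similarity, $r_i^{\rightarrow jd}$ is resume $i$ projected into JD space, and $j_i^{\rightarrow r}$ is JD $i$ projected into resume space; the two log terms train the two directions jointly.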

### Checkpoint Format

The weights file supports two formats, both handled automatically at load time:

```python
# Raw state dict
torch.load("confit_best_model_weights.pt")

# Checkpoint dict
{"model_state_dict": {...}, "epoch": ..., "loss": ...}
```
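
The dispatch between the two formats, mirroring what `confit_service.py` does at load time, can be factored into a small helper (the function name here is illustrative):

```python
def unwrap_state_dict(checkpoint):
    """Return the raw parameter state dict from either supported format.

    `checkpoint` is whatever torch.load() returned: either the state
    dict itself, or a wrapper dict holding it under 'model_state_dict'.
    """
    if isinstance(checkpoint, dict) and "model_state_dict" in checkpoint:
        return checkpoint["model_state_dict"]
    return checkpoint

# Wrapper checkpoints are unwrapped; raw state dicts pass through.
ckpt = {"model_state_dict": {"w": [0.1]}, "epoch": 7, "loss": 0.03}
print(unwrap_state_dict(ckpt))  # {'w': [0.1]}
```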

---

## Limitations

- Designed to align embeddings produced specifically by `Smutypi3/applai-sbert` and `Smutypi3/applai-layoutlmv3` — using embeddings from other models is not supported
- Both directions are independent within `forward()`; do not mix resume and JD embeddings in the same call
- Not a general-purpose embedding alignment model

---

## Citation

```bibtex
@software{lucero2025applai_confit,
  author    = {Lucero, Jaime Emmanuel},
  title     = {{AppAI ConFiT}: Bidirectional Cross-Modal Embedding Alignment for Recruitment Matching},
  year      = {2025},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/Smutypi3/applai-confit},
  note      = {Part of the AppAI recruitment intelligence pipeline}
}
```

---

## License

MIT
confit_best_model_weights.pt ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:0307b39f5368f8d559ea7aa78e3892bc5d83d2ae48a98f21e783da48ac7f602e
size 23627989
confit_service.py ADDED
"""ConFiT service — bidirectional cross-modal embedding alignment.

Architecture: BidirectionalAlignmentModel (from confit hyperparameter tuning notebook).
- Shared transforms: transform_resume_to_jd_shared / transform_jd_to_resume_shared
- Per-feature transforms: transform_resume_to_jd[feat] / transform_jd_to_resume[feat]
- Per-feature LayerNorm, learned blend weights, and feature scale parameters
- Features: full, education, experience, leadership (768-dim each)

Forward:
    model(resume_features_dict, jd_features_dict)
    → (resume_to_jd_dict, jd_to_resume_dict)

Directions are independent — only the relevant side needs to be populated:
    direction='to_jd'     → pass resume spans as resume_features, get resume_to_jd
    direction='to_resume' → pass jd spans as jd_features, get jd_to_resume

Weights: backend/ai_models/confit_best_model_weights.pt
Supported formats:
- Raw state dict
- Checkpoint dict with key 'model_state_dict'
"""

from __future__ import annotations

from pathlib import Path

import torch
import torch.nn as nn
import torch.nn.functional as F

_WEIGHTS_PATH = Path(__file__).parent.parent / "confit" / "confit_best_model_weights.pt"
_DIM = 768
_FEATURE_NAMES = ["full", "education", "experience", "leadership"]

_model: BidirectionalAlignmentModel | None = None
_device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


# ---------------------------------------------------------------------------
# Model definition — exact copy from confit hyperparameter tuning notebook
# ---------------------------------------------------------------------------

class BidirectionalAlignmentModel(nn.Module):
    """Bidirectional alignment with shared + per-feature transformations.

    Each feature gets:
    - A shared transform (resume→JD and JD→resume)
    - A feature-specific transform (resume→JD and JD→resume)
    - A learned blend weight that mixes shared vs feature-specific output
    - A LayerNorm applied before transformation
    - A learned feature scale applied after LayerNorm
    """

    def __init__(self, dim: int, feature_names: list[str]) -> None:
        super().__init__()
        self.dim = dim
        self.feature_names = feature_names

        # Shared transformations
        self.transform_resume_to_jd_shared = nn.Linear(dim, dim, bias=False)
        self.transform_jd_to_resume_shared = nn.Linear(dim, dim, bias=False)
        nn.init.orthogonal_(self.transform_resume_to_jd_shared.weight)
        nn.init.orthogonal_(self.transform_jd_to_resume_shared.weight)

        # Per-feature transformations
        self.transform_resume_to_jd = nn.ModuleDict({
            feat: nn.Linear(dim, dim, bias=False) for feat in feature_names
        })
        self.transform_jd_to_resume = nn.ModuleDict({
            feat: nn.Linear(dim, dim, bias=False) for feat in feature_names
        })

        # Layer normalisation per feature
        self.feature_norm = nn.ModuleDict({
            feat: nn.LayerNorm(dim) for feat in feature_names
        })

        # Learned blend weights (sigmoid → (0, 1))
        self.feature_blend_weights = nn.ParameterDict({
            feat: nn.Parameter(torch.tensor(0.5)) for feat in feature_names
        })

        # Feature scaling (abs + 0.1 in forward() → always positive)
        self.feature_scale = nn.ParameterDict({
            feat: nn.Parameter(torch.ones(1)) for feat in feature_names
        })

        for feat in feature_names:
            nn.init.orthogonal_(self.transform_resume_to_jd[feat].weight)
            nn.init.orthogonal_(self.transform_jd_to_resume[feat].weight)

    def forward(
        self,
        resume_features: dict[str, torch.Tensor],
        jd_features: dict[str, torch.Tensor],
    ) -> tuple[dict[str, torch.Tensor], dict[str, torch.Tensor]]:
        resume_to_jd: dict[str, torch.Tensor] = {}
        jd_to_resume: dict[str, torch.Tensor] = {}

        for feat in self.feature_names:
            if feat in resume_features:
                normed = self.feature_norm[feat](resume_features[feat])
                scaled = normed * (self.feature_scale[feat].abs() + 0.1)
                blend = torch.sigmoid(self.feature_blend_weights[feat])
                shared = self.transform_resume_to_jd_shared(scaled)
                specific = self.transform_resume_to_jd[feat](scaled)
                resume_to_jd[feat] = (1 - blend) * shared + blend * specific

            if feat in jd_features:
                normed = self.feature_norm[feat](jd_features[feat])
                scaled = normed * (self.feature_scale[feat].abs() + 0.1)
                blend = torch.sigmoid(self.feature_blend_weights[feat])
                shared = self.transform_jd_to_resume_shared(scaled)
                specific = self.transform_jd_to_resume[feat](scaled)
                jd_to_resume[feat] = (1 - blend) * shared + blend * specific

        return resume_to_jd, jd_to_resume

# ---------------------------------------------------------------------------
# Lazy model loader
# ---------------------------------------------------------------------------

def _get_model() -> BidirectionalAlignmentModel:
    global _model
    if _model is None:
        if not _WEIGHTS_PATH.exists():
            raise FileNotFoundError(
                f"ConFiT weights not found: {_WEIGHTS_PATH}\n"
                "Upload the file to that path (or to HuggingFace Hub and load via hf_hub_download)."
            )
        _model = BidirectionalAlignmentModel(_DIM, _FEATURE_NAMES).to(_device)
        checkpoint = torch.load(str(_WEIGHTS_PATH), map_location=_device, weights_only=True)
        # Handle both raw state dict and checkpoint dict
        state = checkpoint.get("model_state_dict", checkpoint) if isinstance(checkpoint, dict) else checkpoint
        _model.load_state_dict(state)
        _model.eval()
    return _model


# ---------------------------------------------------------------------------
# Internal helpers
# ---------------------------------------------------------------------------

def _to_tensor_dict(spans: dict[str, list[float]]) -> dict[str, torch.Tensor]:
    """Convert float-list span dict to batched tensor dict (batch_size=1)."""
    return {
        key: torch.tensor(emb, dtype=torch.float32).unsqueeze(0).to(_device)
        for key, emb in spans.items()
    }


def _from_tensor_dict(tensor_dict: dict[str, torch.Tensor]) -> dict[str, list[float]]:
    """Convert batched tensor dict back to float-list dict."""
    return {key: t[0].cpu().tolist() for key, t in tensor_dict.items()}


# ---------------------------------------------------------------------------
# Public API
# ---------------------------------------------------------------------------

@torch.no_grad()
def align_embedding(embedding: list[float], feature: str, direction: str) -> list[float]:
    """Align a single 768-dim embedding for a named feature.

    Args:
        embedding: 768-dim float list from SBERT or LayoutLM.
        feature: One of 'full', 'education', 'experience', 'leadership'.
        direction: 'to_jd' — resume embedding → JD space
                   'to_resume' — JD embedding → resume space

    Returns:
        Aligned 768-dim float list.
    """
    result = align_spans({feature: embedding}, direction)
    return result[feature]


def align_spans(spans: dict[str, list[float]], direction: str) -> dict[str, list[float]]:
    """Align all embedding spans using BidirectionalAlignmentModel.

    Args:
        spans: Dict of feature_name → 768-dim float list.
            Keys must be a subset of ['full', 'education', 'experience', 'leadership'].
        direction: 'to_jd' — resume embeddings → JD space
                   'to_resume' — JD embeddings → resume space

    Returns:
        Same structure with aligned embeddings.
    """
    if direction not in ("to_jd", "to_resume"):
        raise ValueError(f"Unknown direction: {direction!r} (expected 'to_jd' or 'to_resume')")

    model = _get_model()
    features = _to_tensor_dict(spans)

    with torch.no_grad():
        if direction == "to_jd":
            # Resume → JD: only populate resume_features side
            resume_to_jd, _ = model(features, {})
            return _from_tensor_dict(resume_to_jd)
        else:
            # JD → Resume: only populate jd_features side
            _, jd_to_resume = model({}, features)
            return _from_tensor_dict(jd_to_resume)