Smutypi3 committed on
Commit 67e322a · verified · 1 Parent(s): dbde91e

Upload folder using huggingface_hub

Files changed (3):
  1. README.md +159 -0
  2. confit_best_model_weights.pt +3 -0
  3. confit_service.py +202 -0
README.md ADDED
---
language:
- en
license: mit
tags:
- embedding-alignment
- cross-modal
- feature-extraction
- dense
- recruitment
- contrastive-learning
- applai
pipeline_tag: feature-extraction
---

# AppAI — ConFiT Bidirectional Alignment

A custom **bidirectional cross-modal alignment model** for bridging the embedding spaces of SBERT-encoded job descriptions and LayoutLMv3-encoded resumes in the [AppAI](https://github.com/jaimeemanuellucero/applai) recruitment matching pipeline.

ConFiT (**Con**trastive **Fi**ne-tuned **T**ransformation) learns to project resume embeddings into JD space and JD embeddings into resume space, enabling direct cosine similarity comparison across the two modalities.

---

## Model Details

| Property | Value |
|---|---|
| Architecture | `BidirectionalAlignmentModel` (custom) |
| Base model | None — trained from scratch |
| Input dimension | 768 |
| Output dimension | 768 |
| Features | `full`, `education`, `experience`, `leadership` |
| Directions | `to_jd` (resume → JD space), `to_resume` (JD → resume space) |

### Architecture

`BidirectionalAlignmentModel` applies a learnable transformation to each embedding span independently:

```
Input embedding (768-dim)
  → LayerNorm (per feature)
  → Learned feature scale (abs + 0.1, ensures positive scale)
  → Shared transform (nn.Linear 768→768, bias=False, orthogonal init)
  → Per-feature transform (nn.Linear 768→768, bias=False, orthogonal init)
  → Learned blend: output = (1 - α) * shared + α * per_feature
  → 768-dim aligned embedding
```

Where `α = sigmoid(feature_blend_weight)` ∈ (0, 1) is learned independently per feature. Two independent sets of transforms are maintained — one for resume → JD, one for JD → resume.
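
The blend step alone can be sketched in plain Python (illustrative only; the actual model applies it per feature to 768-dim torch tensors):

```python
import math

def blend_transforms(shared: list[float], specific: list[float],
                     blend_weight: float) -> list[float]:
    """Mix shared and per-feature transform outputs.

    alpha = sigmoid(blend_weight) lies in (0, 1), so the output is
    always a convex combination of the two transforms.
    """
    alpha = 1.0 / (1.0 + math.exp(-blend_weight))
    return [(1.0 - alpha) * s + alpha * p for s, p in zip(shared, specific)]

# blend_weight = 0 → alpha = 0.5 → an even mix of both transforms
print(blend_transforms([1.0, 0.0], [0.0, 1.0], 0.0))  # [0.5, 0.5]
```

Because α never reaches 0 or 1, neither the shared nor the per-feature transform can be fully switched off during training.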

**Parameters per direction per feature:**
- `feature_norm[feat]` — LayerNorm(768)
- `feature_scale[feat]` — learned scalar (positive)
- `transform_*_shared` — shared Linear(768, 768), orthogonal init
- `transform_*[feat]` — per-feature Linear(768, 768), orthogonal init
- `feature_blend_weights[feat]` — learned scalar blend weight

---

## Intended Use

This model is the **third stage** of the AppAI recruitment intelligence pipeline:

1. [`Smutypi3/applai-sbert`](https://huggingface.co/Smutypi3/applai-sbert) — encodes JD text spans (SBERT)
2. [`Smutypi3/applai-layoutlmv3`](https://huggingface.co/Smutypi3/applai-layoutlmv3) — encodes resume PDF spans (LayoutLMv3)
3. **This model** — aligns both embedding spaces (ConFiT)

After alignment, resume and JD embeddings can be directly compared with cosine similarity to produce a structured match score per feature.

---

## Usage

### Installation

```bash
pip install torch
```

### Aligning Embeddings

```python
from ai_models.services.confit_service import align_spans

# Resume embeddings (from LayoutLMv3) → project into JD space
resume_embeddings = {
    "full": [...],        # 768-dim list
    "education": [...],
    "experience": [...],
    "leadership": [...],
}
resume_aligned = align_spans(resume_embeddings, direction="to_jd")

# JD embeddings (from SBERT) → project into resume space
jd_embeddings = {"full": [...], "education": [...], ...}
jd_aligned = align_spans(jd_embeddings, direction="to_resume")
```

### Computing Match Scores

```python
import torch
import torch.nn.functional as F

for feature in ["full", "education", "experience", "leadership"]:
    r = torch.tensor(resume_aligned[feature])
    j = torch.tensor(jd_aligned[feature])
    score = F.cosine_similarity(r.unsqueeze(0), j.unsqueeze(0)).item()
    print(f"{feature}: {score:.4f}")
```

---

## Training Details

### Objective

Contrastive loss over (resume span, JD span) pairs. Both alignment directions are jointly optimised so that projected resume embeddings are close to their matching JD embeddings in JD space, and vice versa.
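
The training code is not part of this upload, so the exact formulation is not shown here. A standard symmetric InfoNCE-style objective consistent with the description above would be, for a batch of $N$ matched pairs with temperature $\tau$ (both assumptions):

$$
\mathcal{L} = -\frac{1}{2N} \sum_{i=1}^{N} \left[ \log \frac{\exp(\mathrm{sim}(r_i^{\rightarrow jd}, j_i)/\tau)}{\sum_{k=1}^{N} \exp(\mathrm{sim}(r_i^{\rightarrow jd}, j_k)/\tau)} + \log \frac{\exp(\mathrm{sim}(j_i^{\rightarrow r}, r_i)/\tau)}{\sum_{k=1}^{N} \exp(\mathrm{sim}(j_i^{\rightarrow r}, r_k)/\tau)} \right]
$$

where $\mathrm{sim}$ is cosine similarity, $r_i^{\rightarrow jd}$ is resume $i$ projected into JD space, and $j_i^{\rightarrow r}$ is JD $i$ projected into resume space; the two log terms train the two directions jointly.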

### Checkpoint Format

The weights file supports two formats, both handled automatically at load time:

```python
# Raw state dict
torch.load("confit_best_model_weights.pt")

# Checkpoint dict
{"model_state_dict": {...}, "epoch": ..., "loss": ...}
```
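
The dispatch between the two formats, mirroring what `confit_service.py` does at load time, can be factored into a small helper (the function name here is illustrative):

```python
def unwrap_state_dict(checkpoint):
    """Return the raw parameter state dict from either supported format.

    `checkpoint` is whatever torch.load() returned: either the state
    dict itself, or a wrapper dict holding it under 'model_state_dict'.
    """
    if isinstance(checkpoint, dict) and "model_state_dict" in checkpoint:
        return checkpoint["model_state_dict"]
    return checkpoint

# Wrapper checkpoints are unwrapped; raw state dicts pass through.
ckpt = {"model_state_dict": {"w": [0.1]}, "epoch": 7, "loss": 0.03}
print(unwrap_state_dict(ckpt))  # {'w': [0.1]}
```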

---

## Limitations

- Designed to align embeddings produced specifically by `Smutypi3/applai-sbert` and `Smutypi3/applai-layoutlmv3` — using embeddings from other models is not supported
- Both directions are independent within `forward()`; do not mix resume and JD embeddings in the same call
- Not a general-purpose embedding alignment model

---

## Citation

```bibtex
@software{lucero2025applai_confit,
  author    = {Lucero, Jaime Emmanuel},
  title     = {{AppAI ConFiT}: Bidirectional Cross-Modal Embedding Alignment for Recruitment Matching},
  year      = {2025},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/Smutypi3/applai-confit},
  note      = {Part of the AppAI recruitment intelligence pipeline}
}
```

---

## License

MIT
confit_best_model_weights.pt ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:0307b39f5368f8d559ea7aa78e3892bc5d83d2ae48a98f21e783da48ac7f602e
size 23627989
confit_service.py ADDED
"""ConFiT service — bidirectional cross-modal embedding alignment.

Architecture: BidirectionalAlignmentModel (from confit hyperparameter tuning notebook).
- Shared transforms: transform_resume_to_jd_shared / transform_jd_to_resume_shared
- Per-feature transforms: transform_resume_to_jd[feat] / transform_jd_to_resume[feat]
- Per-feature LayerNorm, learned blend weights, and feature scale parameters
- Features: full, education, experience, leadership (768-dim each)

Forward:
    model(resume_features_dict, jd_features_dict)
    → (resume_to_jd_dict, jd_to_resume_dict)

Directions are independent — only the relevant side needs to be populated:
    direction='to_jd'     → pass resume spans as resume_features, get resume_to_jd
    direction='to_resume' → pass jd spans as jd_features, get jd_to_resume

Weights: backend/ai_models/confit_best_model_weights.pt
Supported formats:
- Raw state dict
- Checkpoint dict with key 'model_state_dict'
"""

from __future__ import annotations

from pathlib import Path

import torch
import torch.nn as nn
import torch.nn.functional as F

_WEIGHTS_PATH = Path(__file__).parent.parent / "confit" / "confit_best_model_weights.pt"
_DIM = 768
_FEATURE_NAMES = ["full", "education", "experience", "leadership"]

_model: BidirectionalAlignmentModel | None = None
_device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


# ---------------------------------------------------------------------------
# Model definition — exact copy from confit hyperparameter tuning notebook
# ---------------------------------------------------------------------------

class BidirectionalAlignmentModel(nn.Module):
    """Bidirectional alignment with shared + per-feature transformations.

    Each feature gets:
    - A shared transform (resume→JD and JD→resume)
    - A feature-specific transform (resume→JD and JD→resume)
    - A learned blend weight that mixes shared vs feature-specific output
    - A LayerNorm applied before transformation
    - A learned feature scale applied after LayerNorm
    """

    def __init__(self, dim: int, feature_names: list[str]) -> None:
        super().__init__()
        self.dim = dim
        self.feature_names = feature_names

        # Shared transformations
        self.transform_resume_to_jd_shared = nn.Linear(dim, dim, bias=False)
        self.transform_jd_to_resume_shared = nn.Linear(dim, dim, bias=False)
        nn.init.orthogonal_(self.transform_resume_to_jd_shared.weight)
        nn.init.orthogonal_(self.transform_jd_to_resume_shared.weight)

        # Per-feature transformations
        self.transform_resume_to_jd = nn.ModuleDict({
            feat: nn.Linear(dim, dim, bias=False) for feat in feature_names
        })
        self.transform_jd_to_resume = nn.ModuleDict({
            feat: nn.Linear(dim, dim, bias=False) for feat in feature_names
        })

        # Layer normalisation per feature
        self.feature_norm = nn.ModuleDict({
            feat: nn.LayerNorm(dim) for feat in feature_names
        })

        # Learned blend weights (sigmoid → (0, 1))
        self.feature_blend_weights = nn.ParameterDict({
            feat: nn.Parameter(torch.tensor(0.5)) for feat in feature_names
        })

        # Feature scaling (abs + 0.1 in forward() → always positive)
        self.feature_scale = nn.ParameterDict({
            feat: nn.Parameter(torch.ones(1)) for feat in feature_names
        })

        for feat in feature_names:
            nn.init.orthogonal_(self.transform_resume_to_jd[feat].weight)
            nn.init.orthogonal_(self.transform_jd_to_resume[feat].weight)

    def forward(
        self,
        resume_features: dict[str, torch.Tensor],
        jd_features: dict[str, torch.Tensor],
    ) -> tuple[dict[str, torch.Tensor], dict[str, torch.Tensor]]:
        resume_to_jd: dict[str, torch.Tensor] = {}
        jd_to_resume: dict[str, torch.Tensor] = {}

        for feat in self.feature_names:
            if feat in resume_features:
                normed = self.feature_norm[feat](resume_features[feat])
                scaled = normed * (self.feature_scale[feat].abs() + 0.1)
                blend = torch.sigmoid(self.feature_blend_weights[feat])
                shared = self.transform_resume_to_jd_shared(scaled)
                specific = self.transform_resume_to_jd[feat](scaled)
                resume_to_jd[feat] = (1 - blend) * shared + blend * specific

            if feat in jd_features:
                normed = self.feature_norm[feat](jd_features[feat])
                scaled = normed * (self.feature_scale[feat].abs() + 0.1)
                blend = torch.sigmoid(self.feature_blend_weights[feat])
                shared = self.transform_jd_to_resume_shared(scaled)
                specific = self.transform_jd_to_resume[feat](scaled)
                jd_to_resume[feat] = (1 - blend) * shared + blend * specific

        return resume_to_jd, jd_to_resume

# ---------------------------------------------------------------------------
# Lazy model loader
# ---------------------------------------------------------------------------

def _get_model() -> BidirectionalAlignmentModel:
    global _model
    if _model is None:
        if not _WEIGHTS_PATH.exists():
            raise FileNotFoundError(
                f"ConFiT weights not found: {_WEIGHTS_PATH}\n"
                "Upload the file to that path (or to HuggingFace Hub and load via hf_hub_download)."
            )
        _model = BidirectionalAlignmentModel(_DIM, _FEATURE_NAMES).to(_device)
        checkpoint = torch.load(str(_WEIGHTS_PATH), map_location=_device, weights_only=True)
        # Handle both raw state dict and checkpoint dict
        state = checkpoint.get("model_state_dict", checkpoint) if isinstance(checkpoint, dict) else checkpoint
        _model.load_state_dict(state)
        _model.eval()
    return _model


# ---------------------------------------------------------------------------
# Internal helpers
# ---------------------------------------------------------------------------

def _to_tensor_dict(spans: dict[str, list[float]]) -> dict[str, torch.Tensor]:
    """Convert float-list span dict to batched tensor dict (batch_size=1)."""
    return {
        key: torch.tensor(emb, dtype=torch.float32).unsqueeze(0).to(_device)
        for key, emb in spans.items()
    }


def _from_tensor_dict(tensor_dict: dict[str, torch.Tensor]) -> dict[str, list[float]]:
    """Convert batched tensor dict back to float-list dict."""
    return {key: t[0].cpu().tolist() for key, t in tensor_dict.items()}


# ---------------------------------------------------------------------------
# Public API
# ---------------------------------------------------------------------------

@torch.no_grad()
def align_embedding(embedding: list[float], feature: str, direction: str) -> list[float]:
    """Align a single 768-dim embedding for a named feature.

    Args:
        embedding: 768-dim float list from SBERT or LayoutLM.
        feature: One of 'full', 'education', 'experience', 'leadership'.
        direction: 'to_jd' — resume embedding → JD space
                   'to_resume' — JD embedding → resume space

    Returns:
        Aligned 768-dim float list.
    """
    result = align_spans({feature: embedding}, direction)
    return result[feature]


def align_spans(spans: dict[str, list[float]], direction: str) -> dict[str, list[float]]:
    """Align all embedding spans using BidirectionalAlignmentModel.

    Args:
        spans: Dict of feature_name → 768-dim float list.
            Keys must be a subset of ['full', 'education', 'experience', 'leadership'].
        direction: 'to_jd' — resume embeddings → JD space
                   'to_resume' — JD embeddings → resume space

    Returns:
        Same structure with aligned embeddings.
    """
    if direction not in ("to_jd", "to_resume"):
        raise ValueError(f"Unknown direction: {direction!r} (expected 'to_jd' or 'to_resume')")

    model = _get_model()
    features = _to_tensor_dict(spans)

    with torch.no_grad():
        if direction == "to_jd":
            # Resume → JD: only populate resume_features side
            resume_to_jd, _ = model(features, {})
            return _from_tensor_dict(resume_to_jd)
        else:
            # JD → Resume: only populate jd_features side
            _, jd_to_resume = model({}, features)
            return _from_tensor_dict(jd_to_resume)