---
library_name: transformers
base_model: huawei-noah/TinyBERT_General_4L_312D
language:
- en
license: mit
pipeline_tag: text-classification
task_ids:
- fact-checking
tags:
- edge-rag
- semantic-filtering
- hallucination-reduction
- cross-encoder
metrics:
- accuracy
- precision
- recall
- roc_auc
model-index:
- name: LF_BERT_v1
  results:
  - task:
      type: fact-checking
      name: Semantic Evidence Filtering
    dataset:
      name: Project Sentinel (HotpotQA-derived)
      type: hotpotqa/hotpot_qa
    metrics:
    - type: accuracy
      value: 0.8167
    - type: precision
      value: 0.5907
    - type: recall
      value: 0.8674
    - type: roc_auc
      value: 0.9064
---

# LF_BERT_v1

**LF_BERT_v1** is a lightweight **TinyBERT-based cross-encoder** fine-tuned for **semantic evidence filtering** in **Retrieval-Augmented Generation (RAG)** pipelines.

The model acts as a *semantic gatekeeper*, scoring `(query, candidate_sentence)` pairs to determine whether the sentence is **factually useful evidence** or a **semantic distractor**. It is designed for **CPU-only, edge, and offline deployments**, with millisecond-level inference latency.

This model is the core filtering component of **Project Sentinel**.

---

## Model Description

- **Architecture:** TinyBERT (4 layers, 312 hidden size)
- **Type:** Cross-encoder (joint encoding of query and sentence)
- **Task:** Binary fact-checking / evidence verification
- **Base model:** `huawei-noah/TinyBERT_General_4L_312D`
- **Inference latency:** ~5.3 ms (CPU)

### Input Format

```
[CLS] query [SEP] candidate_sentence [SEP]
```

- Maximum sequence length: 512 tokens

### Output

- Probability score ∈ [0, 1] representing **factual utility**
- Typical deployment threshold: **0.85** (Strict Guard configuration)

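A minimal inference sketch using the 🤗 Transformers sequence-classification API. The repository id is a placeholder (this card does not state the hub id), and logit index 1 is assumed to be the positive "supporting fact" class:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

STRICT_GUARD_THRESHOLD = 0.85  # typical deployment threshold (Strict Guard)

def score_pair(model, tokenizer, query: str, sentence: str) -> float:
    """Return P(supporting fact) for one (query, candidate_sentence) pair."""
    # Joint cross-encoder encoding: [CLS] query [SEP] candidate_sentence [SEP]
    inputs = tokenizer(query, sentence, truncation=True,
                       max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    # Assumption: index 1 is the positive ("supporting fact") class
    return torch.softmax(logits, dim=-1)[0, 1].item()

def passes_strict_guard(prob: float,
                        threshold: float = STRICT_GUARD_THRESHOLD) -> bool:
    """Keep a sentence only if its factual-utility score clears the threshold."""
    return prob >= threshold

# Usage (repository id is a placeholder -- substitute the actual hub id):
# tokenizer = AutoTokenizer.from_pretrained("LF_BERT_v1")
# model = AutoModelForSequenceClassification.from_pretrained("LF_BERT_v1").eval()
# prob = score_pair(model, tokenizer,
#                   "Who founded the Ford Motor Company?",
#                   "Henry Ford founded the Ford Motor Company in 1903.")
# keep = passes_strict_guard(prob)
```

Because the model is a cross-encoder, query and sentence must be encoded together; scores from separately embedded bi-encoder representations are not comparable to this model's output.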
---

## Intended Use

- ✔ Semantic filtering for RAG pipelines
- ✔ Hallucination reduction
- ✔ Early-exit decision systems
- ✔ Edge / offline LLM deployments

This model is especially suited for:

- Local document QA systems
- Privacy-sensitive environments
- Resource-constrained hardware (≤ 8 GB RAM)

---

## Limitations

- Trained on Wikipedia-based QA (HotpotQA)
- English-only
- Sentence-level relevance (not passage-level reasoning)
- Not a factual verifier for open-world claims

Performance may degrade on highly domain-specific or non-factual corpora.

## Training Data

The model was trained on a **binary dataset derived from HotpotQA (Distractor setting)**.

### Labels

- **1 – Supporting fact:** ground-truth evidence sentences
- **0 – Distractor:** topically similar but factually insufficient sentences

### Dataset Statistics

| Split      | Samples |
|------------|---------|
| Train      | 69,101  |
| Validation | 7,006   |

The dataset is intentionally **imbalanced**, reflecting real retrieval scenarios.

---

## Training Procedure

### Hyperparameters

- Learning rate: `1e-5`
- Batch size: `16`
- Epochs: `2`
- Optimizer: AdamW
- Scheduler: linear
- Seed: `42`
- Loss: weighted cross-entropy

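The weighted cross-entropy loss compensates for the class imbalance noted above. A minimal sketch in PyTorch; the exact class weights are not published on this card, so the values below are placeholders:

```python
import torch
import torch.nn.functional as F

# Placeholder weights: the card does not publish the exact values.
# Up-weighting class 1 penalizes missed supporting facts (the minority class)
# more heavily than misclassified distractors.
CLASS_WEIGHTS = torch.tensor([1.0, 3.0])  # [distractor, supporting-fact]

def weighted_cross_entropy(logits: torch.Tensor,
                           labels: torch.Tensor) -> torch.Tensor:
    """Weighted cross-entropy over [distractor, supporting-fact] logits."""
    return F.cross_entropy(logits, labels, weight=CLASS_WEIGHTS)
```

With 🤗 Transformers, the same effect is typically obtained by overriding `compute_loss` in a `Trainer` subclass so the weighted loss replaces the model's default unweighted one.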
### Training Results

| Epoch | Validation Loss | F1     | Accuracy | Precision | Recall | ROC-AUC |
|-------|-----------------|--------|----------|-----------|--------|---------|
| 1     | 0.4003          | 0.7119 | 0.8290   | 0.6146    | 0.8457 | 0.9038  |
| 2     | 0.4042          | 0.7028 | 0.8167   | 0.5907    | 0.8674 | 0.9064  |

---

## Thresholded Performance (Strict Guard)

- **Decision threshold:** 0.85
- **Hallucination rate:** 5.92%
- **Fact retention:** 60.34%
- **Average latency:** 5.30 ms (CPU)

This configuration prioritizes **trustworthiness over recall**.

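In a pipeline, the Strict Guard is a plain threshold filter between the retriever and the generator. A minimal sketch; `score_fn` stands in for the cross-encoder scorer (any callable mapping a `(query, sentence)` pair to a probability):

```python
from typing import Callable, List, Tuple

def strict_guard_filter(query: str,
                        candidates: List[str],
                        score_fn: Callable[[str, str], float],
                        threshold: float = 0.85) -> List[Tuple[str, float]]:
    """Drop every candidate scoring below the threshold; return survivors,
    highest-confidence first.

    Favors trustworthiness over recall: borderline evidence is discarded
    rather than passed to the generator.
    """
    scored = [(sentence, score_fn(query, sentence)) for sentence in candidates]
    kept = [(sentence, prob) for sentence, prob in scored if prob >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```

Note that an empty result is a valid (and intended) outcome: when no candidate clears the threshold, the downstream generator can abstain instead of answering from weak evidence.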
---

## Citation

If you use this model, please cite:

```bibtex
@article{salih2026sentinel,
  title={Project Sentinel: Lightweight Semantic Filtering for Edge RAG},
  author={Salih, El Mehdi and Ait El Mouden, Khaoula and Akchouch, Abdelhakim},
  year={2026}
}
```

---

## Contact

**El Mehdi Salih**
Mohammed V University – Rabat
Email: elmehdi_salih@um5.ac.ma