Hybrid NLI Model: DeBERTa + Classical Features
Table of Contents:
Project Overview
Features & Contributions
Installation
Data Preparation
Modeling Pipeline
Hybrid Model Training
Evaluation & Statistical Testing
Stress Testing
Impossibility Testing
Causal Inference
A/B Testing
Deployment
License
Project Overview
This project implements a hybrid Natural Language Inference (NLI) model that combines DeBERTa embeddings with handcrafted classical features for improved performance. The hybrid model is evaluated rigorously using statistical testing, stress testing, and causal inference to ensure robustness before deployment as a Model-as-a-Service (MaaS).
Features & Contributions
Hybrid Feature Engineering:
Transformer embeddings (DeBERTa v3 base)
Classical NLP features: token length, lexical overlap, Jaccard similarity, negation counts
Statistical Testing:
Wilcoxon signed-rank test
Bootstrap confidence intervals
Cohen’s d, Cliff’s delta, Conditional Error Reduction
Stress Testing:
Noise injection
Feature ablation
Adversarial perturbation
Counterfactual consistency
Identity invariance
Causal Inference & A/B Testing:
Treatment effect estimation (ATE, doubly robust)
Frequentist two-proportion Z-test
Bayesian Beta-Bernoulli A/B testing
Deployment Ready:
FastAPI + Docker deployment as a MaaS
Preprocessing pipelines compatible with test datasets
Installation
Clone the repository:
```bash
git clone https://github.com/username/hybrid-nli.git
cd hybrid-nli
```
Create a virtual environment (optional but recommended):
```bash
python -m venv venv
source venv/bin/activate  # Linux / macOS
venv\Scripts\activate     # Windows
```
Install dependencies:
```bash
pip install -r requirements.txt
```
Key dependencies: torch, transformers, scikit-learn, pandas, numpy, scipy, statsmodels, shap, lime, mlflow
Data Preparation
The dataset should include:
id : Unique identifier
premise : Premise text
hypothesis : Hypothesis text
lang_abv : Language abbreviation
language : Language name
Preprocessing and feature extraction are handled via the add_nli_features() function.
```python
from nli_features import add_nli_features

df = add_nli_features(df)
```
Modeling Pipeline
- DeBERTa Embeddings

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v3-base")
model = AutoModelForSequenceClassification.from_pretrained("microsoft/deberta-v3-base")
```
Embeddings are extracted with the extract_embeddings(texts) function.
Output: np.ndarray of shape (N, 768).
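A minimal sketch of how extract_embeddings(texts) could be implemented (the batching and mean pooling over non-padding tokens are assumptions; the project's actual pooling strategy is not shown in this README). AutoModel is used here because raw hidden states, not classification logits, are needed:

```python
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer

def extract_embeddings(texts, batch_size=32, device="cpu"):
    """Return mean-pooled DeBERTa embeddings, shape (N, 768)."""
    tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v3-base")
    model = AutoModel.from_pretrained("microsoft/deberta-v3-base").to(device).eval()
    chunks = []
    with torch.no_grad():
        for i in range(0, len(texts), batch_size):
            batch = tokenizer(texts[i:i + batch_size], padding=True,
                              truncation=True, return_tensors="pt").to(device)
            out = model(**batch).last_hidden_state          # (B, T, 768)
            mask = batch["attention_mask"].unsqueeze(-1)    # (B, T, 1)
            # Average only over real (non-padding) token positions
            pooled = (out * mask).sum(1) / mask.sum(1)
            chunks.append(pooled.cpu().numpy())
    return np.vstack(chunks)
```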
- Classical Features
Features: prem_len, hyp_len, len_diff, token_overlap, jaccard, prem_neg, hyp_neg, neg_xor
Normalization/encoding handled internally in the pipeline.
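The classical features listed above can be sketched as follows. Whitespace tokenization and this particular negation-word list are simplifying assumptions; the project's add_nli_features() may differ:

```python
NEGATIONS = {"not", "no", "never", "n't", "none", "nothing", "neither", "nor"}

def classical_features(premise: str, hypothesis: str) -> dict:
    p, h = premise.lower().split(), hypothesis.lower().split()
    p_set, h_set = set(p), set(h)
    overlap = len(p_set & h_set)
    union = len(p_set | h_set)
    prem_neg = sum(tok in NEGATIONS for tok in p)
    hyp_neg = sum(tok in NEGATIONS for tok in h)
    return {
        "prem_len": len(p),
        "hyp_len": len(h),
        "len_diff": len(p) - len(h),
        "token_overlap": overlap,
        "jaccard": overlap / union if union else 0.0,
        "prem_neg": prem_neg,
        "hyp_neg": hyp_neg,
        # 1 when exactly one side contains a negation token
        "neg_xor": int((prem_neg > 0) != (hyp_neg > 0)),
    }
```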
Hybrid Model Training
Classical classifier: LogisticRegression with class_weight='balanced'
Hybrid feature matrix: [DeBERTa embeddings] + [Classical features]
```python
from sklearn.linear_model import LogisticRegression

clf = LogisticRegression(C=1.0, solver="lbfgs", max_iter=1000,
                         n_jobs=-1, class_weight="balanced")
clf.fit(X_train, y_train)
```
Evaluation & Statistical Testing
Metrics: Accuracy, F1 Macro
Statistical Tests:
Wilcoxon signed-rank
Cohen’s d
Cliff’s delta
Conditional Error Reduction
Bootstrap Confidence Intervals
Example:
```python
from sklearn.metrics import accuracy_score, f1_score

acc = accuracy_score(y_test, pred_hybrid)
f1 = f1_score(y_test, pred_hybrid, average="macro")
```
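Two of the comparisons above can be sketched in pure Python: a paired bootstrap confidence interval over per-example correctness vectors, and Cliff's delta. The function names and defaults here are illustrative, not the project's exact utilities:

```python
import random

def bootstrap_ci(correct_a, correct_b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the paired accuracy difference (A - B)."""
    rng = random.Random(seed)
    n = len(correct_a)
    diffs = []
    for _ in range(n_boot):
        # Resample example indices jointly so the comparison stays paired
        idx = [rng.randrange(n) for _ in range(n)]
        acc_a = sum(correct_a[i] for i in idx) / n
        acc_b = sum(correct_b[i] for i in idx) / n
        diffs.append(acc_a - acc_b)
    diffs.sort()
    lo = diffs[int((alpha / 2) * n_boot)]
    hi = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

def cliffs_delta(xs, ys):
    """Cliff's delta: P(x > y) - P(x < y) over all pairs (O(n*m))."""
    gt = sum(x > y for x in xs for y in ys)
    lt = sum(x < y for x in xs for y in ys)
    return (gt - lt) / (len(xs) * len(ys))
```

If the interval excludes zero, the hybrid model's accuracy gain is unlikely to be a resampling artifact.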
Stress Testing
Noise Injection: Random character swaps
Feature Ablation: Remove classical features
Adversarial Perturbation: Swap adjacent tokens
Counterfactual Consistency: Partial text truncation
Identity Invariance: Replace named entities
Example:
```python
import scipy.sparse as sp

X_noisy = sp.hstack([X_text_noisy_sparse, X_other_test_sparse])
acc_noisy, f1_noisy = evaluate_model(clf, X_noisy, y_test)
```
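The noise-injection perturbation (random character swaps) can be sketched like this; the swap rate and seeding are illustrative choices, not the project's exact settings:

```python
import random

def inject_noise(text: str, swap_rate: float = 0.05, seed: int = 42) -> str:
    """Randomly swap adjacent characters to simulate typos."""
    rng = random.Random(seed)
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        if rng.random() < swap_rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip ahead so a character is swapped at most once
        else:
            i += 1
    return "".join(chars)
```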
Causal Inference
ATE (Average Treatment Effect) via logistic regression
Doubly Robust Estimation
Placebo Testing
```python
import statsmodels.api as sm

# Include an intercept alongside the treatment indicator and confounders
X = sm.add_constant(df[["treatment"] + confounders])
model = sm.Logit(df["correct"], X).fit()
```
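Given fitted propensity scores and outcome-model predictions, one common doubly robust estimator (AIPW) can be written in a few lines. The README does not show the project's exact estimator, so treat this as a sketch whose inputs come from whatever nuisance models the pipeline fits:

```python
def doubly_robust_ate(y, t, e, mu1, mu0):
    """AIPW (doubly robust) estimate of the ATE.

    y   : observed outcomes (e.g. 0/1 correctness)
    t   : treatment indicators (0/1)
    e   : estimated propensity scores P(T=1 | X)
    mu1 : outcome-model predictions under treatment
    mu0 : outcome-model predictions under control
    """
    n = len(y)
    total = 0.0
    for yi, ti, ei, m1, m0 in zip(y, t, e, mu1, mu0):
        # Outcome-model prediction plus inverse-propensity-weighted residual
        term1 = m1 + ti * (yi - m1) / ei
        term0 = m0 + (1 - ti) * (yi - m0) / (1 - ei)
        total += term1 - term0
    return total / n
```

The estimator is consistent if either the propensity model or the outcome model is correctly specified, which is what makes it "doubly robust".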
A/B Testing
Frequentist: Two-proportion Z-test
Bayesian: Beta-Bernoulli posterior sampling
```python
from statsmodels.stats.proportion import proportions_ztest

stat, pval = proportions_ztest(
    [successes_hybrid, successes_deberta],
    [n_hybrid, n_deberta],
)
```
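The Bayesian side can be sketched with Beta-Bernoulli posterior sampling. A uniform Beta(1, 1) prior and the sample count are assumptions here:

```python
import random

def prob_b_beats_a(successes_a, n_a, successes_b, n_b,
                   n_samples=20000, seed=0):
    """Monte Carlo estimate of P(rate_B > rate_A) under Beta(1, 1) priors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_samples):
        # Posterior for a Bernoulli rate is Beta(successes + 1, failures + 1)
        pa = rng.betavariate(successes_a + 1, n_a - successes_a + 1)
        pb = rng.betavariate(successes_b + 1, n_b - successes_b + 1)
        wins += pb > pa
    return wins / n_samples
```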
Deployment
FastAPI backend serving the hybrid model
Docker containerization
Supports MaaS (Model-as-a-Service) architecture
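A minimal FastAPI scaffold for serving the hybrid model might look like the following. The endpoint path, the predict_hybrid helper, and the label mapping are illustrative assumptions, not the project's actual service code:

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="Hybrid NLI MaaS")

class NLIRequest(BaseModel):
    premise: str
    hypothesis: str

LABELS = {0: "entailment", 1: "neutral", 2: "contradiction"}

@app.post("/predict")
def predict(req: NLIRequest):
    # predict_hybrid (illustrative) would combine DeBERTa embeddings with
    # the classical features before calling the trained classifier
    label_id = predict_hybrid(req.premise, req.hypothesis)
    return {"label": LABELS[label_id]}
```

Packaged in a Docker image, this app can be run with uvicorn behind any standard container orchestrator.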
License
MIT License