Hybrid NLI Model: DeBERTa + Classical Features
Table of Contents:
Project Overview
Features & Contributions
Installation
Data Preparation
Modeling Pipeline
Hybrid Model Training
Evaluation & Statistical Testing
Stress Testing
Impossibility Testing
Causal Inference
A/B Testing
Deployment
License
Project Overview
This project implements a hybrid Natural Language Inference (NLI) model that combines DeBERTa embeddings with handcrafted classical features for improved performance. The hybrid model is evaluated rigorously using statistical testing, stress testing, and causal inference to ensure robustness before deployment as a Model-as-a-Service (MaaS).
Features & Contributions
Hybrid Feature Engineering:
Transformer embeddings (DeBERTa v3 base)
Classical NLP features: token length, lexical overlap, Jaccard similarity, negation counts
Statistical Testing:
Wilcoxon signed-rank test
Bootstrap confidence intervals
Cohen’s d, Cliff’s delta, Conditional Error Reduction
Stress Testing:
Noise injection
Feature ablation
Adversarial perturbation
Counterfactual consistency
Identity invariance
Causal Inference & A/B Testing:
Treatment effect estimation (ATE, doubly robust)
Frequentist two-proportion Z-test
Bayesian Beta-Bernoulli A/B testing
Deployment Ready:
FastAPI + Docker deployment as a MaaS
Preprocessing pipelines compatible with test datasets
Installation
Clone the repository:
```bash
git clone https://github.com/username/hybrid-nli.git
cd hybrid-nli
```
Create a virtual environment (optional but recommended):
```bash
python -m venv venv
source venv/bin/activate  # Linux / macOS
venv\Scripts\activate     # Windows
```
Install dependencies:
```bash
pip install -r requirements.txt
```
Key dependencies: torch, transformers, scikit-learn, pandas, numpy, scipy, statsmodels, shap, lime, mlflow
Data Preparation
The dataset should include:
id : Unique identifier
premise : Premise text
hypothesis : Hypothesis text
lang_abv : Language abbreviation
language : Language name
Preprocessing and feature extraction are handled via the add_nli_features() function.
```python
from nli_features import add_nli_features

df = add_nli_features(df)
```
Modeling Pipeline
- DeBERTa Embeddings

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v3-base")
model = AutoModelForSequenceClassification.from_pretrained("microsoft/deberta-v3-base")
```
Embeddings are extracted with the extract_embeddings(texts) function.
Output: np.ndarray of shape (N, 768).
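A minimal sketch of how extract_embeddings(texts) could be implemented (the batching and mean pooling over non-padding tokens are assumptions; the project's actual pooling strategy is not shown in this README). AutoModel is used here because raw hidden states, not classification logits, are needed:

```python
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer

def extract_embeddings(texts, batch_size=32, device="cpu"):
    """Return mean-pooled DeBERTa embeddings, shape (N, 768)."""
    tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v3-base")
    model = AutoModel.from_pretrained("microsoft/deberta-v3-base").to(device).eval()
    chunks = []
    with torch.no_grad():
        for i in range(0, len(texts), batch_size):
            batch = tokenizer(texts[i:i + batch_size], padding=True,
                              truncation=True, return_tensors="pt").to(device)
            out = model(**batch).last_hidden_state          # (B, T, 768)
            mask = batch["attention_mask"].unsqueeze(-1)    # (B, T, 1)
            # Average only over real (non-padding) token positions
            pooled = (out * mask).sum(1) / mask.sum(1)
            chunks.append(pooled.cpu().numpy())
    return np.vstack(chunks)
```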
- Classical Features
Features: prem_len, hyp_len, len_diff, token_overlap, jaccard, prem_neg, hyp_neg, neg_xor
Normalization/encoding handled internally in the pipeline.
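The classical features listed above can be sketched as follows. Whitespace tokenization and this particular negation-word list are simplifying assumptions; the project's add_nli_features() may differ:

```python
NEGATIONS = {"not", "no", "never", "n't", "none", "nothing", "neither", "nor"}

def classical_features(premise: str, hypothesis: str) -> dict:
    p, h = premise.lower().split(), hypothesis.lower().split()
    p_set, h_set = set(p), set(h)
    overlap = len(p_set & h_set)
    union = len(p_set | h_set)
    prem_neg = sum(tok in NEGATIONS for tok in p)
    hyp_neg = sum(tok in NEGATIONS for tok in h)
    return {
        "prem_len": len(p),
        "hyp_len": len(h),
        "len_diff": len(p) - len(h),
        "token_overlap": overlap,
        "jaccard": overlap / union if union else 0.0,
        "prem_neg": prem_neg,
        "hyp_neg": hyp_neg,
        # 1 when exactly one side contains a negation token
        "neg_xor": int((prem_neg > 0) != (hyp_neg > 0)),
    }
```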
Hybrid Model Training
Classical classifier: LogisticRegression with class_weight='balanced'
Hybrid feature matrix: [DeBERTa embeddings] + [Classical features]
```python
from sklearn.linear_model import LogisticRegression

clf = LogisticRegression(C=1.0, solver="lbfgs", max_iter=1000,
                         n_jobs=-1, class_weight="balanced")
clf.fit(X_train, y_train)
```
Evaluation & Statistical Testing
Metrics: Accuracy, F1 Macro
Statistical Tests:
Wilcoxon signed-rank
Cohen’s d
Cliff’s delta
Conditional Error Reduction
Bootstrap Confidence Intervals
Example:
```python
from sklearn.metrics import accuracy_score, f1_score

acc = accuracy_score(y_test, pred_hybrid)
f1 = f1_score(y_test, pred_hybrid, average="macro")
```
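Two of the comparisons above can be sketched in pure Python: a paired bootstrap confidence interval over per-example correctness vectors, and Cliff's delta. The function names and defaults here are illustrative, not the project's exact utilities:

```python
import random

def bootstrap_ci(correct_a, correct_b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the paired accuracy difference (A - B)."""
    rng = random.Random(seed)
    n = len(correct_a)
    diffs = []
    for _ in range(n_boot):
        # Resample example indices jointly so the comparison stays paired
        idx = [rng.randrange(n) for _ in range(n)]
        acc_a = sum(correct_a[i] for i in idx) / n
        acc_b = sum(correct_b[i] for i in idx) / n
        diffs.append(acc_a - acc_b)
    diffs.sort()
    lo = diffs[int((alpha / 2) * n_boot)]
    hi = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

def cliffs_delta(xs, ys):
    """Cliff's delta: P(x > y) - P(x < y) over all pairs (O(n*m))."""
    gt = sum(x > y for x in xs for y in ys)
    lt = sum(x < y for x in xs for y in ys)
    return (gt - lt) / (len(xs) * len(ys))
```

If the interval excludes zero, the hybrid model's accuracy gain is unlikely to be a resampling artifact.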
Stress Testing
Noise Injection: Random character swaps
Feature Ablation: Remove classical features
Adversarial Perturbation: Swap adjacent tokens
Counterfactual Consistency: Partial text truncation
Identity Invariance: Replace named entities
Example:
```python
import scipy.sparse as sp

X_noisy = sp.hstack([X_text_noisy_sparse, X_other_test_sparse])
acc_noisy, f1_noisy = evaluate_model(clf, X_noisy, y_test)
```
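The noise-injection perturbation (random character swaps) can be sketched like this; the swap rate and seeding are illustrative choices, not the project's exact settings:

```python
import random

def inject_noise(text: str, swap_rate: float = 0.05, seed: int = 42) -> str:
    """Randomly swap adjacent characters to simulate typos."""
    rng = random.Random(seed)
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        if rng.random() < swap_rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip ahead so a character is swapped at most once
        else:
            i += 1
    return "".join(chars)
```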
Causal Inference
ATE (Average Treatment Effect) via logistic regression
Doubly Robust Estimation
Placebo Testing
```python
import statsmodels.api as sm

# Include an intercept alongside the treatment indicator and confounders
X = sm.add_constant(df[["treatment"] + confounders])
model = sm.Logit(df["correct"], X).fit()
```
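Given fitted propensity scores and outcome-model predictions, one common doubly robust estimator (AIPW) can be written in a few lines. The README does not show the project's exact estimator, so treat this as a sketch whose inputs come from whatever nuisance models the pipeline fits:

```python
def doubly_robust_ate(y, t, e, mu1, mu0):
    """AIPW (doubly robust) estimate of the ATE.

    y   : observed outcomes (e.g. 0/1 correctness)
    t   : treatment indicators (0/1)
    e   : estimated propensity scores P(T=1 | X)
    mu1 : outcome-model predictions under treatment
    mu0 : outcome-model predictions under control
    """
    n = len(y)
    total = 0.0
    for yi, ti, ei, m1, m0 in zip(y, t, e, mu1, mu0):
        # Outcome-model prediction plus inverse-propensity-weighted residual
        term1 = m1 + ti * (yi - m1) / ei
        term0 = m0 + (1 - ti) * (yi - m0) / (1 - ei)
        total += term1 - term0
    return total / n
```

The estimator is consistent if either the propensity model or the outcome model is correctly specified, which is what makes it "doubly robust".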
A/B Testing
Frequentist: Two-proportion Z-test
Bayesian: Beta-Bernoulli posterior sampling
```python
from statsmodels.stats.proportion import proportions_ztest

stat, pval = proportions_ztest(
    [successes_hybrid, successes_deberta],
    [n_hybrid, n_deberta],
)
```
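The Bayesian side can be sketched with Beta-Bernoulli posterior sampling. A uniform Beta(1, 1) prior and the sample count are assumptions here:

```python
import random

def prob_b_beats_a(successes_a, n_a, successes_b, n_b,
                   n_samples=20000, seed=0):
    """Monte Carlo estimate of P(rate_B > rate_A) under Beta(1, 1) priors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_samples):
        # Posterior for a Bernoulli rate is Beta(successes + 1, failures + 1)
        pa = rng.betavariate(successes_a + 1, n_a - successes_a + 1)
        pb = rng.betavariate(successes_b + 1, n_b - successes_b + 1)
        wins += pb > pa
    return wins / n_samples
```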
Deployment
FastAPI backend serving the hybrid model
Docker containerization
Supports MaaS (Model-as-a-Service) architecture
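A minimal FastAPI scaffold for serving the hybrid model might look like the following. The endpoint path, the predict_hybrid helper, and the label mapping are illustrative assumptions, not the project's actual service code:

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="Hybrid NLI MaaS")

class NLIRequest(BaseModel):
    premise: str
    hypothesis: str

LABELS = {0: "entailment", 1: "neutral", 2: "contradiction"}

@app.post("/predict")
def predict(req: NLIRequest):
    # predict_hybrid (illustrative) would combine DeBERTa embeddings with
    # the classical features before calling the trained classifier
    label_id = predict_hybrid(req.premise, req.hypothesis)
    return {"label": LABELS[label_id]}
```

Packaged in a Docker image, this app can be run with uvicorn behind any standard container orchestrator.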
License
MIT License