# Nepali Hate Content Detection — API Reference > **Base URL:** `http://localhost:8000` > **Interactive docs:** `http://localhost:8000/docs` (Swagger UI) > **Start server:** `uvicorn backend.app.main:app --reload --host 0.0.0.0 --port 8000` from `major_project/` root --- ## Table of Contents 1. [Health Check](#1-health-check) 2. [Status & Capabilities](#2-status--capabilities) 3. [Predict — Single Text](#3-predict--single-text) 4. [Analyze — Preprocessing Info](#4-analyze--preprocessing-info) 5. [Explain — LIME](#5-explain--lime) 6. [Explain — SHAP](#6-explain--shap) 7. [Explain — Captum IG](#7-explain--captum-ig) 8. [Batch Predict (Streaming)](#8-batch-predict-streaming) 9. [History — Fetch](#9-history--fetch) 10. [History — Stats](#10-history--stats) 11. [History — Clear](#11-history--clear) 12. [Error Reference](#12-error-reference) 13. [TypeScript Types](#13-typescript-types) 14. [Frontend Integration Guide](#14-frontend-integration-guide) --- ## 1. Health Check ``` GET /health ``` Returns whether the server is up and the model has finished loading. Call this on app mount to gate the UI. **Response `200`** ```json { "status": "ok", "model_loaded": true, "device": "cpu" } ``` | Field | Type | Notes | |---|---|---| | `status` | `string` | Always `"ok"` if server is running | | `model_loaded` | `boolean` | `false` while model is still downloading/loading at startup | | `device` | `string` | `"cpu"` or `"cuda"` | **Frontend use:** Poll this every 2 seconds on mount until `model_loaded === true`, then unlock the main UI. --- ## 2. Status & Capabilities ``` GET /api/status ``` Returns which optional XAI packages are installed. Call once on load to decide which Explain buttons to show or hide. **Response `200`** ```json { "model_loaded": true, "device": "cpu", "preprocessor": true, "lime": true, "shap": true, "captum": false } ``` | Field | Type | Notes | |---|---|---| | `model_loaded` | `boolean` | Same as `/health` | | `device` | `string` | `"cpu"` or `"cuda"` | | `preprocessor` | `boolean` | If `false`, raw text is passed to model without script conversion | | `lime` | `boolean` | Whether `lime` package is installed | | `shap` | `boolean` | Whether `shap` package is installed | | `captum` | `boolean` | Whether `captum` package is installed | **Frontend use:** If `captum === false`, disable the Captum tab. Same for LIME/SHAP. --- ## 3. Predict — Single Text ``` POST /api/predict Content-Type: application/json ``` Core endpoint. Preprocesses input → runs XLM-RoBERTa-large → returns label + probabilities + preprocessing details. **Request body** ```json { "text": "महिलाले घरमा बस्नु पर्छ", "save_to_history": true } ``` | Field | Type | Required | Notes | |---|---|---|---| | `text` | `string` | ✅ | 1–5000 chars. Devanagari, Romanized Nepali, English, or mixed. Must not be whitespace only | | `save_to_history` | `boolean` | ❌ | Default `true`. Saves result to `data/prediction_history.jsonl` as background task | **Response `200`** ```json { "prediction": "OS", "confidence": 0.9909, "probabilities": { "NO": 0.0034, "OO": 0.0041, "OR": 0.0016, "OS": 0.9909 }, "original_text": "महिलाले घरमा बस्नु पर्छ", "preprocessed_text": "महिलाले घरमा बस्नु पर्छ", "emoji_features": { "has_hate_emoji": 0, "has_mockery_emoji": 0, "has_positive_emoji": 0, "has_sadness_emoji": 0, "has_fear_emoji": 0, "has_disgust_emoji": 0, "hate_emoji_count": 0, "mockery_emoji_count": 0, "positive_emoji_count": 0, "sadness_emoji_count": 0, "fear_emoji_count": 0, "disgust_emoji_count": 0, "total_emoji_count": 0, "hate_to_positive_ratio": 0.0, "has_mixed_sentiment": 0, "unknown_emoji_count": 0, "has_unknown_emoji": 0, "known_emoji_ratio": 1.0 }, "script_info": { "script_type": "devanagari", "confidence": 0.98 }, "error": null } ``` **Prediction labels** | Label | Meaning | Display color | |---|---|---| | `NO` | Non-offensive | Green `#28a745` | | `OO` | Other-offensive (general) | Yellow `#ffc107` | | `OR` | Offensive-Racist (race/ethnicity/religion hate) | Red `#dc3545` | | `OS` | Offensive-Sexist (gender/sexuality hate) | Purple `#6f42c1` | **`emoji_features` fields** 18 fields total. All are `int` except `hate_to_positive_ratio` and `known_emoji_ratio` which are `float`. | Field | Description | |---|---| | `has_hate_emoji` | Binary flag: 1 if text contains anger/weapon emojis | | `hate_emoji_count` | Count of hate-related emojis | | `has_positive_emoji` | Binary flag | | `positive_emoji_count` | Count | | `total_emoji_count` | Total emoji count | | `hate_to_positive_ratio` | `hate_count / max(positive_count, 1)` | | `has_mixed_sentiment` | 1 if both hate and positive emojis present | | `unknown_emoji_count` | Emojis not in the mapping dictionary | | `known_emoji_ratio` | Fraction of emojis that have Nepali translations | **`script_info` fields** | Field | Description | |---|---| | `script_type` | One of: `devanagari`, `romanized_nepali`, `english`, `mixed`, `other` | | `confidence` | Float 0–1 | **Error cases** | Status | Condition | |---|---| | `422` | Empty or whitespace-only text | | `503` | Model not yet loaded | | `503` | Out of memory during inference | | `500` | Unexpected server error | --- ## 4. Analyze — Preprocessing Info ``` POST /api/analyze Content-Type: application/json ``` Lightweight endpoint — runs only script detection and emoji analysis, does **not** run the model. Use for the preprocessing details panel without triggering a full prediction. **Request body** ```json { "text": "timi murkha chau 😡" } ``` | Field | Type | Required | Notes | |---|---|---|---| | `text` | `string` | ✅ | 1–5000 chars | **Response `200`** ```json { "script_info": { "script_type": "romanized_nepali", "confidence": 0.80 }, "emoji_info": { "emojis_found": ["😡"], "total_count": 1, "known_emojis": ["😡"], "known_count": 1, "unknown_emojis": [], "unknown_count": 0, "coverage": 1.0 } } ``` **`emoji_info` fields** | Field | Type | Description | |---|---|---| | `emojis_found` | `string[]` | All emoji characters found in text | | `total_count` | `number` | Total emoji count | | `known_emojis` | `string[]` | Emojis that have a Nepali translation mapping | | `known_count` | `number` | | | `unknown_emojis` | `string[]` | Emojis not in the mapping dictionary | | `unknown_count` | `number` | | | `coverage` | `number` | `known_count / total_count`, or `1.0` if no emojis | --- ## 5. Explain — LIME ``` POST /api/explain/lime Content-Type: application/json ``` Generates word-level importance scores using LIME (Local Interpretable Model-agnostic Explanations). LIME perturbs the **preprocessed** text, so token labels always align with what the model saw. **Request body** ```json { "text": "महिलाले घरमा बस्नु पर्छ", "num_samples": 200, "n_steps": 50 } ``` | Field | Type | Required | Default | Notes | |---|---|---|---|---| | `text` | `string` | ✅ | — | 1–2000 chars (shorter limit than predict — LIME runs many model calls) | | `num_samples` | `integer` | ❌ | `200` | Range 50–1000. Higher = more reliable scores, higher latency | | `n_steps` | `integer` | ❌ | `50` | Only used by Captum, ignored here | **Response `200`** ```json { "method": "LIME", "prediction": "OS", "confidence": 0.9909, "word_scores": [ { "word": "घरमा", "score": 0.182 }, { "word": "महिलाले", "score": 0.143 }, { "word": "बस्नु", "score": 0.091 }, { "word": "पर्छ", "score": -0.034 } ], "preprocessed_text": "महिलाले घरमा बस्नु पर्छ", "convergence_delta": null, "error": null } ``` **`word_scores` interpretation** | Score | Meaning | |---|---| | Positive | Word pushes prediction **toward** the predicted class | | Negative | Word pushes prediction **away** from the predicted class | | High absolute value | Strong influence | Words are returned in LIME's natural order (by score magnitude). Sort by `abs(score)` descending for a ranked importance bar chart. **Frontend rendering:** Horizontal bar chart. Positive bars green, negative bars red. Display `word` on the y-axis. --- ## 6. Explain — SHAP ``` POST /api/explain/shap Content-Type: application/json ``` Generates attributions using SHAP. Falls back to leave-one-out occlusion if the primary SHAP text masker fails. **Request body** — same shape as LIME. `num_samples` is ignored; `n_steps` is ignored. ```json { "text": "महिलाले घरमा बस्नु पर्छ" } ``` **Response `200`** ```json { "method": "SHAP", "prediction": "OS", "confidence": 0.9909, "word_scores": [ { "word": "घरमा", "score": 0.211 }, { "word": "महिलाले", "score": 0.178 }, { "word": "बस्नु", "score": 0.095 }, { "word": "पर्छ", "score": -0.021 } ], "preprocessed_text": "महिलाले घरमा बस्नु पर्छ", "convergence_delta": null, "error": null } ``` Word scores are **sorted by descending absolute value** — most influential words first. If the fallback was used, `error` will be `"Used gradient_fallback"` (not a failure — result is still valid). --- ## 7. Explain — Captum IG ``` POST /api/explain/captum Content-Type: application/json ``` Generates subword token attributions using Layer Integrated Gradients (Captum). Works at the subword tokenizer level, so words may appear as `▁महिलाले` (SentencePiece prefix). **Request body** ```json { "text": "महिलाले घरमा बस्नु पर्छ", "n_steps": 50 } ``` | Field | Type | Required | Default | Notes | |---|---|---|---|---| | `text` | `string` | ✅ | — | 1–2000 chars | | `n_steps` | `integer` | ❌ | `50` | Range 10–200. Increase to 100+ if `convergence_delta > 0.05` | | `num_samples` | `integer` | ❌ | `200` | Only used by LIME, ignored here | **Response `200`** ```json { "method": "Captum-IG", "prediction": "OS", "confidence": 0.9909, "word_scores": [ { "word": "महिलाले", "score": 0.842 }, { "word": "घरमा", "score": 0.631 }, { "word": "बस्नु", "score": 0.417 }, { "word": "पर्छ", "score": 0.203 } ], "preprocessed_text": "महिलाले घरमा बस्नु पर्छ", "convergence_delta": 0.0031, "error": null } ``` | Field | Notes | |---|---| | `word_scores[].score` | Signed attribution (sum of subword attributions). Positive = contributes to prediction | | `convergence_delta` | Quality indicator. Values below `0.05` = reliable. Increase `n_steps` if high | **⚠️ Memory warning:** Captum is the most memory-intensive method. It may return `422` on low-RAM cloud deployments. Use LIME or SHAP as fallback — the frontend should check `captum` in `/api/status` before showing this option. --- ## 8. Batch Predict (Streaming) ``` POST /api/batch Content-Type: application/json ``` Classifies multiple texts and **streams results back as NDJSON** (Newline-Delimited JSON). Each text is processed independently — an error on one does not abort the batch. **Request body** ```json { "texts": [ "यो राम्रो छ", "तिमी मुर्ख हौ", "timi murkha chau" ] } ``` | Field | Type | Required | Notes | |---|---|---|---| | `texts` | `string[]` | ✅ | 1–200 items. Empty strings are stripped silently | **Response — NDJSON stream** `Content-Type: application/x-ndjson` Each line is a complete JSON object. Two types of lines: **Progress line** (one per text): ```json { "index": 0, "total": 3, "result": { "text": "यो राम्रो छ", "full_text": "यो राम्रो छ", "prediction": "NO", "confidence": 0.9721, "preprocessed_text": "यो राम्रो छ" } } ``` **Final sentinel line** (last line always): ```json { "done": true, "total": 3 } ``` **Error result** (when one text fails): ```json { "index": 1, "total": 3, "result": { "text": "...", "full_text": "...", "prediction": "Error", "confidence": 0.0, "preprocessed_text": "", "error": "error message" } } ``` **Frontend streaming example (fetch API):** ```javascript const response = await fetch("http://localhost:8000/api/batch", { method: "POST", headers: { "Content-Type": "application/json" }, body: JSON.stringify({ texts }), }); const reader = response.body.getReader(); const decoder = new TextDecoder(); let buffer = ""; while (true) { const { done, value } = await reader.read(); if (done) break; buffer += decoder.decode(value, { stream: true }); const lines = buffer.split("\n"); buffer = lines.pop(); // keep incomplete last line for (const line of lines) { if (!line.trim()) continue; const data = JSON.parse(line); if (data.done) { // Batch complete setProgress(100); } else { // Update progress bar and results table setProgress(Math.round(((data.index + 1) / data.total) * 100)); appendResult(data.result); } } } ``` --- ## 9. History — Fetch ``` GET /api/history?limit=100&offset=0 ``` Returns saved predictions in reverse-chronological order (newest first). **Query parameters** | Param | Type | Default | Range | Description | |---|---|---|---|---| | `limit` | `integer` | `100` | 1–500 | Max records to return | | `offset` | `integer` | `0` | ≥0 | Skip this many records from the newest end | **Response `200`** ```json { "items": [ { "timestamp": "2026-04-10T10:23:41.123456", "text": "तिमी मुर्ख हौ", "prediction": "OO", "confidence": 0.8732, "probabilities": { "NO": 0.08, "OO": 0.87, "OR": 0.03, "OS": 0.02 }, "preprocessed_text": "तिमी मुर्ख हौ", "emoji_features": { "total_emoji_count": 0, "..." : "..." } } ], "total": 42, "limit": 100, "offset": 0 } ``` **Pagination example:** ``` Page 1: GET /api/history?limit=20&offset=0 Page 2: GET /api/history?limit=20&offset=20 Page 3: GET /api/history?limit=20&offset=40 ``` --- ## 10. History — Stats ``` GET /api/history/stats ``` Returns aggregated statistics without fetching every record. Use for the dashboard summary row. **Response `200` (with history)** ```json { "total": 42, "avg_confidence": 0.8741, "class_counts": { "NO": 18, "OO": 12, "OR": 5, "OS": 7 }, "most_common_class": "NO" } ``` **Response `200` (empty history)** ```json { "total": 0, "avg_confidence": null, "class_counts": {}, "most_common_class": null } ``` --- ## 11. History — Clear ``` DELETE /api/history ``` Permanently deletes the history file. No confirmation prompt — handle that in the UI. **Response `200`** ```json { "message": "History cleared. 42 record(s) deleted.", "deleted_count": 42 } ``` **Response `404`** (if already empty) ```json { "detail": "History is already empty — nothing to clear." } ``` --- ## 12. Error Reference All error responses follow FastAPI's standard shape: ```json { "detail": "Human-readable error message" } ``` | Status | Meaning | When it happens | |---|---|---| | `422` | Validation error | Empty text, batch > 200, invalid field types | | `503` | Service unavailable | Model still loading at startup, out of memory | | `404` | Not found | History already empty on DELETE | | `500` | Internal server error | Unexpected exception in inference or XAI | --- ## 13. TypeScript Types Copy these into your React/Vite project: ```typescript // ── Labels ────────────────────────────────────────────────────────────────── export type Label = "NO" | "OO" | "OR" | "OS" | "Error"; export const LABEL_META = { NO: { text: "Non-Offensive", color: "#28a745" }, OO: { text: "Other-Offensive", color: "#ffc107" }, OR: { text: "Offensive-Racist", color: "#dc3545" }, OS: { text: "Offensive-Sexist", color: "#6f42c1" }, Error: { text: "Error", color: "#6c757d" }, } as const; // ── /health ────────────────────────────────────────────────────────────────── export interface HealthResponse { status: string; model_loaded: boolean; device: string; } // ── /api/status ─────────────────────────────────────────────────────────────── export interface StatusResponse { model_loaded: boolean; device: string; preprocessor: boolean; lime: boolean; shap: boolean; captum: boolean; } // ── /api/predict ────────────────────────────────────────────────────────────── export interface PredictRequest { text: string; save_to_history?: boolean; } export interface EmojiFeatures { has_hate_emoji: number; has_mockery_emoji: number; has_positive_emoji: number; has_sadness_emoji: number; has_fear_emoji: number; has_disgust_emoji: number; hate_emoji_count: number; mockery_emoji_count: number; positive_emoji_count: number; sadness_emoji_count: number; fear_emoji_count: number; disgust_emoji_count: number; total_emoji_count: number; hate_to_positive_ratio: number; has_mixed_sentiment: number; unknown_emoji_count: number; has_unknown_emoji: number; known_emoji_ratio: number; } export interface ScriptInfo { script_type: "devanagari" | "romanized_nepali" | "english" | "mixed" | "other"; confidence: number; } export interface PredictResponse { prediction: Label; confidence: number; probabilities: Record; original_text: string; preprocessed_text: string; emoji_features: EmojiFeatures; script_info: ScriptInfo | null; error: string | null; } // ── /api/analyze ────────────────────────────────────────────────────────────── export interface AnalyzeRequest { text: string; } export interface EmojiInfo { emojis_found: string[]; total_count: number; known_emojis: string[]; known_count: number; unknown_emojis: string[]; unknown_count: number; coverage: number; } export interface AnalyzeResponse { script_info: ScriptInfo; emoji_info: EmojiInfo; } // ── /api/explain/* ──────────────────────────────────────────────────────────── export interface ExplainRequest { text: string; num_samples?: number; // LIME only, default 200 n_steps?: number; // Captum only, default 50 } export interface WordScore { word: string; score: number; } export interface ExplainResponse { method: "LIME" | "SHAP" | "Captum-IG"; prediction: Label; confidence: number; word_scores: WordScore[]; preprocessed_text: string; convergence_delta: number | null; // Captum only error: string | null; } // ── /api/batch ──────────────────────────────────────────────────────────────── export interface BatchRequest { texts: string[]; } export interface BatchResult { text: string; // truncated to 80 chars full_text: string; prediction: Label; confidence: number; preprocessed_text: string; error?: string; } export interface BatchProgressLine { index: number; total: number; result: BatchResult; } export interface BatchDoneLine { done: true; total: number; } export type BatchStreamLine = BatchProgressLine | BatchDoneLine; // ── /api/history ────────────────────────────────────────────────────────────── export interface HistoryItem { timestamp: string; // ISO 8601 text: string; prediction: Label; confidence: number; probabilities: Record; preprocessed_text: string; emoji_features: EmojiFeatures; } export interface HistoryResponse { items: HistoryItem[]; total: number; limit: number; offset: number; } export interface HistoryStatsResponse { total: number; avg_confidence: number | null; class_counts: Record; most_common_class: string | null; } ``` --- ## 14. Frontend Integration Guide ### Recommended call order on app load ``` 1. GET /health → poll until model_loaded === true 2. GET /api/status → store capabilities, show/hide XAI buttons 3. Ready to accept input ``` ### Single prediction flow ``` user submits text → POST /api/predict → show prediction badge (color from LABEL_META) → show probability bar chart (4 bars) → show preprocessing details (script_info + emoji_features) → if emoji_features.total_emoji_count > 0, show emoji breakdown panel ``` ### Explainability flow ``` user selects LIME / SHAP / Captum tab → check status.lime / status.shap / status.captum before enabling tab → POST /api/explain/lime (or /shap or /captum) → render horizontal bar chart from word_scores - sort by abs(score) descending - positive score → green bar - negative score → red bar → for Captum: show convergence_delta warning if > 0.05 ``` ### Batch flow ``` user pastes texts or uploads CSV → POST /api/batch → read response as NDJSON stream (see streaming example in §8) → update progress bar: (index + 1) / total * 100 → append each result to results table as it arrives → on { done: true }, finalize and enable download CSV ``` ### History flow ``` on History tab open: → GET /api/history/stats → show summary metrics → GET /api/history?limit=20&offset=0 → show table pagination: → GET /api/history?limit=20&offset=N clear button: → confirm in UI first → DELETE /api/history ``` ### CORS The backend allows requests from `http://localhost:5173` (Vite default) and `http://localhost:3000` (CRA default). If you deploy the frontend to a different URL, set the `FRONTEND_URL` environment variable before starting the server: ```bash FRONTEND_URL=https://yourapp.vercel.app uvicorn backend.app.main:app ... ``` ### Environment variables | Variable | Default | Description | |---|---|---| | `MODEL_PATH` | `models/saved_models/xlm_roberta_results/large_final` | Local model path. Falls back to HuggingFace if not found | | `HF_MODEL_ID` | `UDHOV/xlm-roberta-large-nepali-hate-classification` | HuggingFace model ID | | `HISTORY_FILE` | `data/prediction_history.jsonl` | History file location | | `FRONTEND_URL` | `""` | Additional CORS origin for deployed frontend |