Sahil Seemant

metadata
title: LLM Pushback
emoji: 🧠
colorFrom: blue
colorTo: indigo
sdk: streamlit
sdk_version: 1.35.0
python_version: 3.11
app_file: chat_gui.py
pinned: false

Ministral Alignment Lab: BullshitBench Evaluation

A research project that aligns small language models (Ministral-3B and Qwen-4B) to identify and push back against nonsensical premises. Training uses Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO) with MLX on Apple Silicon.

🚀 Key Results

  • Ministral-3B: Green Rate improved from 4.0% (baseline) to 74.2% (SFT V3).
  • Qwen-4B: Green Rate rose from 35.0% (baseline) to 79.0% (SFT V1), with 100% accuracy in Physics.
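
Green Rate is not defined in this README. As a minimal sketch, assuming it is the percentage of responses that the judge labels "green" (the model pushed back on the nonsensical premise), it could be computed as below. The label names, the sample data, and the `green_rate` helper are hypothetical and are not taken from post_eval.py.

```python
from collections import Counter


def green_rate(labels):
    """Return the percentage of judge labels equal to "green".

    Assumption: the judge assigns exactly one label per response, and
    "green" marks a response that rejected the nonsensical premise.
    """
    if not labels:
        raise ValueError("no labels to score")
    # Normalize case and whitespace so "Green " and "green" count together.
    counts = Counter(label.strip().lower() for label in labels)
    return 100.0 * counts["green"] / len(labels)


# Hypothetical judge output for five prompts (not real project data).
sample = ["green", "red", "green", "amber", "green"]
print(f"Green Rate: {green_rate(sample):.1f}%")
```

Under this reading, the 74.2% figure would mean that roughly three in four responses were judged to push back correctly.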

📦 Project Structure

  • chat_gui.py: Interactive Streamlit dashboard with real-time reasoning visualization.
  • finetune.py / post_eval.py: Core pipeline for LoRA fine-tuning and evaluation.
  • data/: Curated dataset of reasoning pairs used for alignment.
  • deepseek_evaluation_report.md: Full technical audit of model performance.
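
For orientation, here is a sketch of how one reasoning pair might feed the DPO objective. It shows the standard per-pair DPO loss in plain Python, not the code in finetune.py. The record fields (`prompt`, `chosen`, `rejected`), the example text, the log-probability values, and `beta=0.1` are all illustrative assumptions.

```python
import math

# Hypothetical preference pair: the "chosen" reply pushes back on the
# nonsensical premise, the "rejected" reply plays along with it.
pair = {
    "prompt": "How many kilograms does the color blue weigh?",
    "chosen": "Colors have no mass, so the question has no valid answer.",
    "rejected": "The color blue weighs about 2 kg.",
}


def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Standard DPO loss for one pair, given summed sequence log-probabilities.

    loss = -log(sigmoid(beta * ((policy_chosen - ref_chosen)
                                - (policy_rejected - ref_rejected))))
    """
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    z = beta * margin
    # Numerically stable form of -log(sigmoid(z)), i.e. softplus(-z).
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))


# Made-up log-probabilities for the pair above.
loss_before = dpo_loss(-12.0, -10.0, -12.0, -10.0)  # policy equals reference
loss_after = dpo_loss(-9.0, -14.0, -12.0, -10.0)    # policy now prefers "chosen"
print(round(loss_before, 4), round(loss_after, 4))
```

The loss falls as the fine-tuned model raises the likelihood of the pushback reply relative to the frozen reference model, which is the behavior the alignment stage is meant to reinforce.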

🛠️ Setup

  1. Requirements: pip install -r requirements.txt
  2. Run GUI: streamlit run chat_gui.py

📊 Deployment

For sharing within your network, see the Deployment Guide.


Evaluated using DeepSeek R1 14B as the judge.