---
library_name: transformers
tags: []
---

# MNLP_M3_quantized_model

This model is a quantized version of the best-performing MCQA model from our CS-552 Modern NLP project (Milestone 3). It was optimized for efficient inference while maintaining strong accuracy on STEM multiple-choice question answering tasks.

## Model Summary

- **Base model:** hssawhney/Best-Performing-Model
- **Quantization type:** Post-Training Quantization (PTQ)
- **Precision:** W8A8 (8-bit weights, 8-bit activations)
- **Method:** SmoothQuant + GPTQ via LLMCompressor
- **Excluded layers:** `lm_head` (kept in higher precision to preserve logit quality)
- **Final model size:** ~717 MB
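The SmoothQuant + GPTQ pipeline above is typically expressed as an LLMCompressor recipe. The fragment below is a hypothetical sketch of such a recipe, not the one actually used for this model: the stage name and smoothing strength are illustrative assumptions, while the W8A8 scheme and the `lm_head` exclusion come from this card.

```yaml
# Hypothetical LLMCompressor-style recipe sketch (values are assumptions,
# except the W8A8 scheme and the lm_head exclusion stated in this card).
quant_stage:
  quant_modifiers:
    SmoothQuantModifier:
      smoothing_strength: 0.8   # illustrative; tune per model
    GPTQModifier:
      scheme: W8A8              # 8-bit weights and activations
      targets: ["Linear"]
      ignore: ["lm_head"]       # keep the output head unquantized
```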

## Calibration Details

- **Calibration dataset:** 512 samples randomly selected from zay25/MNLP_M3_quantized_dataset
- The calibration set preserves the original STEM MCQA format and was selected to cover a broad distribution of question types.
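A seeded random sample like the one described above can be reproduced with a few lines of Python. The dataset name comes from this card, but the seed, the field names, and the helper itself are illustrative assumptions, not the project's actual code.

```python
import random

def sample_calibration(examples, n=512, seed=42):
    """Randomly select n calibration examples (seeded for reproducibility)."""
    if n >= len(examples):
        return list(examples)
    rng = random.Random(seed)
    return rng.sample(list(examples), n)

# Toy stand-in for entries of zay25/MNLP_M3_quantized_dataset.
pool = [{"question": f"Q{i}", "choices": ["A", "B", "C", "D"]} for i in range(2000)]
calib = sample_calibration(pool, n=512)
print(len(calib))  # 512
```

Fixing the seed makes the calibration split reproducible across runs, which matters when comparing quantization configurations.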

## Intended Use

This model is intended for:

- STEM-focused multiple-choice question answering
- Educational assistant systems
- Low-resource inference environments (e.g., CPU, edge devices)

It is not intended for freeform generation or use outside the MCQA format.
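Because the model is restricted to the MCQA format, prompts should present the question with lettered options and elicit a single-letter answer. The exact template used in the project is not specified in this card; the function below is one plausible formatting, offered as a hedged sketch.

```python
def format_mcqa_prompt(question, choices):
    """Render a multiple-choice question as a single-letter-answer prompt.

    The template is an assumption; adapt it to whatever format the model
    was trained and calibrated on.
    """
    letters = "ABCDEFGH"
    lines = [f"Question: {question}"]
    for letter, choice in zip(letters, choices):
        lines.append(f"{letter}. {choice}")
    lines.append("Answer:")
    return "\n".join(lines)

prompt = format_mcqa_prompt(
    "What is the SI unit of force?",
    ["Joule", "Newton", "Pascal", "Watt"],
)
print(prompt)
```

Ending the prompt with `Answer:` encourages the model to emit a single option letter, which is easy to score against the gold label.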

## License

This model inherits the license of its base model; see the hssawhney/Best-Performing-Model repository for license terms.