	---
	library_name: transformers
	license: apache-2.0
	language:
	- en
	base_model:
	- answerdotai/ModernBERT-large
	pipeline_tag: text-classification
	author: Shreyan C (@thethinkmachine)
	datasets:
	- BhabhaAI/DEITA-Complexity
	---


	![image/png](https://cdn-uploads.huggingface.co/production/uploads/6628d1f30058447ca0a3824a/mJHXrHHfl8AjshtWvxrfK.png)

	# Maxwell Instruction Complexity Estimator (MICE)

	[![Model Version](https://img.shields.io/badge/version-v0.2-blue)](https://huggingface.co/thethinkmachine/Maxwell-Task-Complexity-Scorer-v0.2) [![License](https://img.shields.io/badge/license-Apache%202.0-green)](LICENSE) [![Downloads](https://img.shields.io/badge/downloads-10K%2B-brightgreen)](#)

	A fast, efficient, and accurate instruction complexity scorer powered by ModernBERT-Large. MICE predicts normalized task difficulty scores (0–1) for English instructions, with an easy option to rescale to custom ranges.

	---

	## 🚀 Features

	* Lightweight & Fast: Leverages a compact backbone (ModernBERT-Large + LoRA) with only 14.4M trainable parameters.
	* Data-Driven: Trained on 66.5K English instruction–score pairs from the DEITA-Complexity dataset.
	* High Fidelity: Matches the performance of models 34× larger on standard complexity benchmarks.
	* Flexible Scoring: Outputs normalized scores (0–1) by default, with optional denormalization to any range (e.g., [1–6], [0–100]).

	---

	## 🔧 Usage

	```python
	from transformers import AutoTokenizer, AutoModelForSequenceClassification
	import torch

	model_name = "thethinkmachine/Maxwell-Task-Complexity-Scorer-v0.2"
	tokenizer = AutoTokenizer.from_pretrained(model_name)
	model = AutoModelForSequenceClassification.from_pretrained(model_name)

	# 1. Get normalized complexity (0–1)
	def get_normalized_score(text: str) -> float:
	    """Return the model's raw output for `text`, trained to lie in the 0–1 range."""
	    inputs = tokenizer(text, return_tensors="pt")
	    with torch.no_grad():  # inference only, no gradients needed
	        logits = model(**inputs).logits.squeeze()
	    return float(logits)

	# 2. Denormalize to [min_score, max_score]
	def get_denormalized_score(text: str, min_score: float = 1, max_score: float = 6) -> float:
	    """Rescale the normalized score to [min_score, max_score], rounded to 2 decimals."""
	    norm = get_normalized_score(text)
	    raw = norm * (max_score - min_score) + min_score
	    return float(round(raw, 2))

	# Example
	query = "Is learning equivalent to decreasing local entropy?"
	print("Normalized:", get_normalized_score(query))
	print("Evol-Complexity [1–6]:", get_denormalized_score(query))
	```
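
As a usage sketch, scores like these can be used to rank a batch of instructions, for example to pick the hardest prompts from a dataset. The helper `rank_by_complexity` below is hypothetical (it is not part of this repository); it accepts any callable that maps text to a score, so `get_normalized_score` from the snippet above can be passed in directly. The stub scorer only stands in for the model so the example runs offline.

```python
from typing import Callable, List, Tuple


def rank_by_complexity(
    texts: List[str],
    scorer: Callable[[str], float],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k (text, score) pairs, highest score first."""
    scored = [(text, scorer(text)) for text in texts]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Stand-in scorer for illustration only: longer text -> higher score, capped at 1.0.
# In practice, pass get_normalized_score instead.
def stub_scorer(text: str) -> float:
    return min(len(text) / 100.0, 1.0)


prompts = [
    "Say hi.",
    "Summarize the causes of the French Revolution in three bullet points.",
    "Prove that the square root of 2 is irrational.",
]
ranked = rank_by_complexity(prompts, stub_scorer, top_k=2)
for text, score in ranked:
    print(f"{score:.2f}  {text}")
```

Keeping the scorer as a parameter means the same helper works with normalized or denormalized scores, since rescaling does not change the ordering.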

	---

	## 📖 Model Details

	* Architecture: ModernBERT-Large backbone with LoRA adapters (rank 32, alpha 64, dropout 0.1).
	* Task: Sequence classification with a single output logit, read as a regression score (as in the usage example).
	* Languages: English.
	* Training Data: 66,500 instruction–score pairs from the BhabhaAI/DEITA-Complexity dataset.
	* Normalization: Scores are min–max scaled to [0, 1]; to denormalize, compute `score * (max - min) + min`.
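
The 14.4M trainable-parameter figure can be sanity-checked with a little arithmetic. A LoRA adapter of rank r on a linear layer with d_in inputs and d_out outputs adds r × (d_in + d_out) parameters. The sketch below assumes ModernBERT-Large's dimensions as best recalled (28 layers, hidden size 1024, MLP intermediate size 2624 with a gated input projection of twice that width) and assumes adapters on all four linear layers in each encoder block. Neither the dimensions nor the targeted modules are stated in this README, so treat the result as a plausibility check only.

```python
# Assumed ModernBERT-Large dimensions (not stated in this README).
HIDDEN = 1024
INTERMEDIATE = 2624
NUM_LAYERS = 28
RANK = 32  # LoRA rank from the model card


def lora_params(d_in: int, d_out: int, rank: int = RANK) -> int:
    """Parameters added by one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


# Assumed adapter targets: every linear layer in an encoder block.
per_layer = {
    "attn.Wqkv": lora_params(HIDDEN, 3 * HIDDEN),
    "attn.Wo": lora_params(HIDDEN, HIDDEN),
    "mlp.Wi": lora_params(HIDDEN, 2 * INTERMEDIATE),
    "mlp.Wo": lora_params(INTERMEDIATE, HIDDEN),
}
total = NUM_LAYERS * sum(per_layer.values())
print(f"LoRA parameters: {total:,} (~{total / 1e6:.1f}M)")
```

Under these assumptions the total comes to 14,393,344, which is consistent with the reported 14.4M; a trainable classification head, if counted, would add to it.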

	### Data Distribution

	| Original Score | Count  | %     |
	| -------------- | ------ | ----- |
	| 1              | 8,729  | 13.3% |
	| 2              | 5,399  | 8.2%  |
	| 3              | 10,937 | 16.7% |
	| 4              | 9,801  | 15.0% |
	| 5              | 24,485 | 37.4% |
	| 6              | 6,123  | 9.3%  |

	Outliers (original scores of 0 and 7–9) were pruned; they made up less than 1% of the data.
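
With the outliers removed, the original scores span 1 to 6, so min–max scaling presumably maps a score s to (s − 1) / 5. The sketch below assumes those bounds (the README does not state the exact constants used in training) and shows the normalized target for each original score, along with the inverse mapping from the Normalization note above. The function names are illustrative.

```python
MIN_SCORE, MAX_SCORE = 1, 6  # assumed bounds after pruning outliers


def normalize(score: float, lo: float = MIN_SCORE, hi: float = MAX_SCORE) -> float:
    """Min-max scale a raw score into [0, 1]."""
    return (score - lo) / (hi - lo)


def denormalize(norm: float, lo: float = MIN_SCORE, hi: float = MAX_SCORE) -> float:
    """Invert the scaling: norm * (max - min) + min."""
    return norm * (hi - lo) + lo


# Normalized training target for each original score in the table.
targets = {s: normalize(s) for s in range(MIN_SCORE, MAX_SCORE + 1)}
for raw, norm in targets.items():
    print(f"original {raw} -> normalized {norm:.1f}")

# Rescaling a normalized value to another range, e.g. [0, 100].
print(denormalize(0.5, lo=0, hi=100))
```

Because the mapping is affine with a positive slope, any choice of output range keeps the relative ordering of instructions intact.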

	---

	## ⚙️ Training Configuration

	* Optimizer: AdamW (lr=5e-5, weight decay=0.01)
	* Batch Size: 8
	* Epochs: 3
	* Max Seq. Length: 512
	* Warmup: 10% of total steps
	* Compute: 50.3M training tokens, a tokens-to-trainable-parameters (TTP) ratio of ≈3.5 (50.3M / 14.4M).
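
A rough sketch of what these settings imply for the schedule. It assumes all 66,500 examples are used for training, with no gradient accumulation and a single device; the README states none of these, so the step counts are estimates. The last lines reproduce the TTP ratio, read here as tokens per trainable parameter, from the token count and the 14.4M trainable parameters listed under Features.

```python
import math

NUM_EXAMPLES = 66_500      # from the model card; assumed to be the full training set
BATCH_SIZE = 8
EPOCHS = 3
WARMUP_FRACTION = 0.10

steps_per_epoch = math.ceil(NUM_EXAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = int(total_steps * WARMUP_FRACTION)

print(f"steps/epoch: {steps_per_epoch}, total: {total_steps}, warmup: {warmup_steps}")

# Tokens per trainable parameter, using the figures quoted in the card.
TOKENS = 50.3e6
TRAINABLE_PARAMS = 14.4e6
print(f"TTP ratio: {TOKENS / TRAINABLE_PARAMS:.2f}")
```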

	---

	## 🌱 Environmental Impact

	* Compute Used: 16h on 1× NVIDIA L4 GPU (72W TDP) in GCP asia-south1.
	* CO₂ Emissions: 0.87 kg CO₂eq (fully offset).
	* Estimator: ML CO₂ Impact Calculator.
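
The reported figures can be related with simple arithmetic. Assuming the GPU drew its full 72 W TDP for all 16 hours (an upper bound for the GPU alone, ignoring CPU, memory, and datacenter overhead), the energy use and the grid carbon intensity implied by the 0.87 kg figure work out as follows. The implied intensity is derived here for illustration and is not a value taken from the calculator.

```python
HOURS = 16
TDP_WATTS = 72            # NVIDIA L4 TDP as listed above
REPORTED_KG_CO2 = 0.87

energy_kwh = HOURS * TDP_WATTS / 1000              # watt-hours -> kWh
implied_intensity = REPORTED_KG_CO2 / energy_kwh   # kg CO2eq per kWh

print(f"Energy: {energy_kwh:.3f} kWh")
print(f"Implied carbon intensity: {implied_intensity:.2f} kg CO2eq/kWh")
```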

	---

	## 🔍 Bias & Limitations

	* Domain Bias: Trained primarily on general English; may underperform on technical/coding/math instructions.
	* Language: English-only.
	* Scaling Caution: Denormalization preserves ordering, but the absolute values depend on the chosen range, so scores rescaled to different ranges are not directly comparable.

	---

	## 📚 Citation

	If you use MICE in your research, please cite:

	> Chaubey, S. (2024). Maxwell Instruction Complexity Estimator (MICE). https://huggingface.co/thethinkmachine/MICE

	---

	## 🙋‍♂️ Author & Contact

	Shreyan C ([thethinkmachine](https://huggingface.co/thethinkmachine))
	Email: [shreyan.chaubey@gmail.com](mailto:shreyan.chaubey@gmail.com)

	This project is licensed under the Apache 2.0 License.