|
|
--- |
|
|
license: mit |
|
|
language: |
|
|
- en |
|
|
base_model: |
|
|
- unsloth/gpt-oss-20b-BF16 |
|
|
tags: |
|
|
- optimization |
|
|
- operations-research |
|
|
- milp |
|
|
- gurobi |
|
|
- sft |
|
|
- transformers |
|
|
--- |
|
|
|
|
|
# Model Overview |
|
|
|
|
|
OptiMind-SFT is a specialized 20B parameter model designed to bridge the gap between natural language and executable optimization solvers. It automates the translation of complex decision-making problems—such as supply chain planning, scheduling, and resource allocation—into correct MILP formulations. |
|
|
|
|
|
|
|
|
# Model Summary |
|
|
|
|
|
**Developer:** Microsoft Research, Machine Learning and Optimization (MLO) Group \ |
|
|
**Model Architecture:** Mixture-of-Experts (MoE) variant of the transformer architecture (gpt-oss family). \ |
|
|
**Parameters:** 20 Billion (3.6B activated) \ |
|
|
**Inputs:** Natural language optimization problem description. \ |
|
|
**Context Length:** 128,000 tokens \ |
|
|
**Outputs:** Mathematical formulation and executable Python code using GurobiPy. \ |
|
|
**GPUs:** 8x NVIDIA B200 (Training), 8x NVIDIA H100 (Inference/Evaluation) \ |
|
|
**Training Time:** ~8 hours \ |
|
|
**Public Data Summary:** Cleaned subsets of [OR-Instruct](https://huggingface.co/datasets/CardinalOperations/OR-Instruct-Data-3K) and [OptMATH-Train](https://huggingface.co/datasets/Aurora-Gem/OptMATH-Train) \ |
|
|
**Dates:** Trained in October 2025 \ |
|
|
**Status:** Static model trained on cleaned public datasets \ |
|
|
**Release Date:** November 2025 \ |
|
|
**License:** MIT \ |
|
|
**Model Dependencies:** [unsloth/gpt-oss-20b-BF16](https://huggingface.co/unsloth/gpt-oss-20b-BF16) \ |
|
|
**Additional Assets:** [GitHub Repository](https://github.com/microsoft/OptiGuide) |
|
|
|
|
|
|
|
|
# Usage |
|
|
|
|
|
## Sample Usage
|
|
|
|
|
OptiMind-SFT is best served with **SGLang**. We use SGLang's OpenAI-compatible API together with the official `openai` Python client:
|
|
|
|
|
```bash
|
|
pip install "sglang[all]" openai gurobipy |
|
|
|
|
|
# Make sure you have a valid Gurobi license and Python >= 3.12
|
|
python -m sglang.launch_server \ |
|
|
--model-path microsoft/OptiMind-SFT \ |
|
|
--host 0.0.0.0 \ |
|
|
--port 30000 \ |
|
|
--tensor-parallel-size 1 \ |
|
|
--trust-remote-code |
|
|
``` |
|
|
|
|
|
Below is sample code for querying the model:
|
|
```python
|
|
from openai import OpenAI |
|
|
|
|
|
# SGLang exposes an OpenAI-compatible endpoint |
|
|
client = OpenAI( |
|
|
base_url="http://localhost:30000/v1", |
|
|
api_key="EMPTY" # Not used by local SGLang, but required by the client |
|
|
) |
|
|
|
|
|
system_prompt = """You are an expert in optimization and mixed integer programming. You are given an |
|
|
optimization problem and you need to solve it using gurobipy. |
|
|
Reason step by step before generating the gurobipy code. |
|
|
When you respond, first think carefully. |
|
|
After thinking, output the math modeling of the problem. |
|
|
Finally output a ```python ...``` code block that solves the problem. |
|
|
The code must include: |
|
|
import gurobipy as gp |
|
|
from gurobipy import GRB |
|
|
""" |
|
|
|
|
|
user_problem = "A factory produces products A and B with capacity and demand constraints ..." |
|
|
|
|
|
response = client.chat.completions.create( |
|
|
model="microsoft/OptiMind-SFT", |
|
|
messages=[ |
|
|
{"role": "system", "content": system_prompt}, |
|
|
{"role": "user", "content": user_problem}, |
|
|
], |
|
|
temperature=0.9, # recommended default |
|
|
top_p=1.0, # recommended default |
|
|
max_tokens=4096, |
|
|
) |
|
|
|
|
|
print(response.choices[0].message.content) |
|
|
``` |
|
|
|
|
|
This will return a response that first describes the mathematical model and then includes a Python code block implementing it with GurobiPy.
|
|
|
|
|
|
|
|
## Primary Use Cases |
|
|
|
|
|
- Translating natural-language Operations Research (OR) problems into mixed-integer linear programs (MILPs) and corresponding `gurobipy` code for research and prototyping. |
|
|
- Studying and benchmarking NL-to-MILP modeling pipelines on public OR datasets such as IndustryOR, Mamo-Complex, and OptMATH.
|
|
- Educational use for teaching how to derive optimization models (variables, constraints, objectives) from informal problem descriptions. |
|
|
- Performing ablations and research on solver-in-the-loop prompting and multi-turn correction in domain-specific modeling tasks. |
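For illustration, the kind of formulation the model is expected to derive, shown here for a hypothetical two-product factory with per-unit profits, a shared capacity, and demand caps (all symbols are ours, not taken from any benchmark instance):

```latex
\begin{aligned}
\max_{x_A,\, x_B} \quad & p_A x_A + p_B x_B && \text{(total profit)} \\
\text{s.t.} \quad & a_A x_A + a_B x_B \le C && \text{(shared capacity)} \\
& 0 \le x_A \le d_A, \quad 0 \le x_B \le d_B && \text{(demand limits)} \\
& x_A,\, x_B \in \mathbb{Z}_{\ge 0} && \text{(integer production quantities)}
\end{aligned}
```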
|
|
|
|
|
## Out-of-Scope Use Cases |
|
|
|
|
|
- General-purpose chat, open-domain reasoning, or tasks unrelated to optimization modeling. |
|
|
- Safety-critical or regulated applications (e.g., healthcare, finance, legal decisions, credit scoring) without expert human review of both the model output and the resulting optimization. |
|
|
- Fully automated deployment where optimization results are used directly for real-world decisions without human oversight. |
|
|
- Automatic execution of generated code in production systems without sandboxing, logging, and appropriate security controls. |
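In line with the last point, generated code is best run in a separate, time-limited process rather than the host interpreter. A minimal standard-library sketch (the limits shown are illustrative; this is not a complete security sandbox):

```python
import subprocess
import sys
import tempfile


def run_generated_code(code: str, timeout_s: int = 30) -> subprocess.CompletedProcess:
    """Execute model-generated code in a child process with a wall-clock timeout.

    This isolates crashes and runaway loops; it is NOT a security boundary.
    Add OS-level sandboxing (containers, seccomp, etc.) for untrusted code.
    """
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    return subprocess.run(
        [sys.executable, path],
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )


result = run_generated_code("print(2 + 2)")
print(result.stdout.strip())  # "4"
```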
|
|
|
|
|
|
|
|
## Technical Requirements & Integration |
|
|
|
|
|
We recommend **≥32GB GPU VRAM** (e.g., A100/H100/B200) for comfortable inference, especially for long prompts and multi-turn interactions. |
|
|
Please check out our [GitHub page](https://github.com/microsoft/OptiGuide) for instructions on the inference pipeline.
|
|
|
|
|
# Data Overview |
|
|
## Training and Validation Data |
|
|
We fine-tune OptiMind-SFT on cleaned versions of the OR-Instruct and OptMATH training sets, and validate on a held-out validation split drawn from the same cleaned corpora. |
|
|
|
|
|
## Testing Data |
|
|
For testing, we use manually cleaned and expert-validated versions of the IndustryOR, Mamo-Complex, and OptMATH benchmarks. Please visit our [GitHub page](https://github.com/microsoft/OptiGuide) to download the cleaned benchmarks. |
|
|
|
|
|
# Known Technical Limitations |
|
|
|
|
|
- The model can still produce incorrect formulations or invalid code, or declare feasibility/optimality incorrectly. |
|
|
- It is specialized to OR benchmarks; behavior on general text or other problem domains is not guaranteed. |
|
|
- No dedicated red-teaming against unsafe content categories (e.g., hate, violence, self-harm) or jailbreak attacks has been performed; the paper focuses on technical robustness metrics. |
|
|
|
|
|
Users **must** keep a human in the loop for all consequential decisions and carefully review any generated code before execution. |
|
|
|
|
|
# Other Sources & Maintenance |
|
|
- Evaluation code and cleaned benchmarks: [GitHub page](https://github.com/microsoft/OptiGuide) |
|
|
- Paper: [Arxiv link](https://arxiv.org/abs/2509.22979) |
|
|
For questions, issues, or feature requests, please use the GitHub issue tracker or the Hugging Face “Community” tab. |
|
|
|
|
|
# Citation |
|
|
If you use OptiMind-SFT or the associated datasets/benchmarks in your work, please cite: |
|
|
|
|
|
```bibtex
|
|
@article{zhang2025optimind, |
|
|
title={OptiMind: Teaching LLMs to Think Like Optimization Experts}, |
|
|
author={Zhang, Xinzhi and Chen, Zeyi and Zope, Humishka and Barbalho, Hugo and Mellou, Konstantina and Molinaro, Marco and Kulkarni, Janardhan and Menache, Ishai and Li, Sirui}, |
|
|
journal={arXiv preprint arXiv:2509.22979}, |
|
|
year={2025} |
|
|
} |
|
|
``` |