Instructions for using Shinegupta/ShineMath with libraries and local apps.

Libraries

PEFT

How to use Shinegupta/ShineMath with PEFT:

from peft import PeftModel
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("AI-MO/NuminaMath-7B-TIR")
model = PeftModel.from_pretrained(base_model, "Shinegupta/ShineMath")

Transformers

How to use Shinegupta/ShineMath with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Shinegupta/ShineMath")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly (requires peft, because this repository contains only a LoRA adapter)
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("Shinegupta/ShineMath", dtype="auto")

Local Apps

vLLM

How to use Shinegupta/ShineMath with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
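# Start the vLLM server.
# Note: this repository holds only a LoRA adapter; if vLLM cannot load it directly,
# serve the base model with --enable-lora and --lora-modules.
vllm serve "Shinegupta/ShineMath"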
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Shinegupta/ShineMath",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
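
The same OpenAI-compatible request can be sent from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `ask` are illustrative and not part of vLLM, and the response layout is assumed to follow the usual OpenAI chat-completions shape.

```python
import json
import urllib.request


def build_chat_request(prompt: str, model: str = "Shinegupta/ShineMath") -> dict:
    """Build the JSON body for an OpenAI-compatible /v1/chat/completions call."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def ask(prompt: str, base_url: str = "http://localhost:8000") -> str:
    """Send one prompt to a running server and return the reply text."""
    body = json.dumps(build_chat_request(prompt)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        reply = json.load(response)
    # Assumption: OpenAI-style responses put the text under choices[0].message.content.
    return reply["choices"][0]["message"]["content"]


# Building the payload needs no server; calling ask() does.
payload = build_chat_request("What is 2 + 2?")
print(json.dumps(payload, indent=2))
```

Calling `ask(...)` only works while a server is running; pass `base_url="http://localhost:30000"` to reach a server started on port 30000.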

Use Docker

docker model run hf.co/Shinegupta/ShineMath

SGLang

How to use Shinegupta/ShineMath with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Shinegupta/ShineMath" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Shinegupta/ShineMath",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Shinegupta/ShineMath" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Shinegupta/ShineMath",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use Shinegupta/ShineMath with Docker Model Runner:
```
docker model run hf.co/Shinegupta/ShineMath
```

ShineMath: Mathematical Olympiad Language Model

ShineMath is a LoRA adapter for mathematical olympiad work: problem solving, step-by-step solution generation, reasoning, and proof writing.
It was fine-tuned with parameter-efficient fine-tuning (PEFT), so only the adapter weights are distributed and a base model is required to run it.

Author: Shine Gupta (@shine_gupta17)
Repository: Shinegupta/ShineMath

Model Details

Type: PEFT LoRA adapter (not a full model; load it on top of a base LLM)
Files included: adapter_model.safetensors, adapter_config.json, tokenizer files, chat_template.jinja, generation_config.json
Size: ~82.5 MB (lightweight and easy to share/load)
Intended use: Solving/generating IMO-style problems, AMC/AIME prep, mathematical reasoning, explanations
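
Because an adapter only works with the model it was trained on, it can help to read the base model name from adapter_config.json before loading anything large. The sketch below assumes the standard PEFT config fields (`base_model_name_or_path`, `peft_type`, `r`, `target_modules`); the helper names are hypothetical, and the sample JSON uses made-up values rather than the real contents of this repository.

```python
import json


def summarize_adapter_config(config_text: str) -> dict:
    """Pick out the fields of a PEFT adapter_config.json that matter for loading."""
    config = json.loads(config_text)
    return {
        "base_model": config.get("base_model_name_or_path"),
        "peft_type": config.get("peft_type"),
        "rank": config.get("r"),
        "target_modules": config.get("target_modules"),
    }


def fetch_adapter_config(repo_id: str = "Shinegupta/ShineMath") -> dict:
    """Download adapter_config.json from the Hub and summarize it."""
    # Lazy import: only needed when actually downloading from the Hub.
    from huggingface_hub import hf_hub_download

    path = hf_hub_download(repo_id=repo_id, filename="adapter_config.json")
    with open(path, encoding="utf-8") as handle:
        return summarize_adapter_config(handle.read())


# Illustrative config text with made-up values, not the real file contents.
sample = (
    '{"base_model_name_or_path": "some-org/some-base", "peft_type": "LORA", '
    '"r": 8, "target_modules": ["q_proj", "v_proj"]}'
)
summary = summarize_adapter_config(sample)
print(summary["base_model"])
```

`fetch_adapter_config()` needs network access and the `huggingface_hub` package; whatever it reports under `base_model` is the model to pass as the base when loading the adapter.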

Usage (with PEFT + Transformers)

Since this is a LoRA adapter, load it on top of the base model:

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

base_model_name = "AI-MO/NuminaMath-7B-TIR"  # base model for this adapter; confirm against adapter_config.json
adapter_name = "Shinegupta/ShineMath"

tokenizer = AutoTokenizer.from_pretrained(adapter_name)
model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    torch_dtype=torch.bfloat16,  # or "auto"
    device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_name)

# Example
prompt = "Solve: Let x² + y² = 1. Find the maximum value of x + y under the constraint x, y ≥ 0."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)  # temperature only applies when sampling
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

With PEFT installed, a pipeline can also load the adapter repository directly:

from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model=adapter_name,  # Transformers reads adapter_config.json, loads the base model, then applies the LoRA
    device_map="auto"
)

result = pipe("Prove by induction that the sum of the first n natural numbers is n(n+1)/2.")
print(result[0]["generated_text"])

Tip: The repository ships a chat_template.jinja. For chat or instruct-style prompts, format the conversation with tokenizer.apply_chat_template instead of passing raw text.
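
A minimal sketch of chat-style prompting is shown here. It assumes the tokenizer loaded from the adapter repository picks up the shipped chat template, which depends on the installed Transformers version; `build_messages` and `render_prompt` are illustrative helper names, not part of any library.

```python
def build_messages(problem: str) -> list[dict]:
    """Wrap a math problem in the role/content message format used by chat templates."""
    return [{"role": "user", "content": problem.strip()}]


def render_prompt(problem: str, repo_id: str = "Shinegupta/ShineMath") -> str:
    """Render the chat-formatted prompt string for one problem."""
    # Lazy import so build_messages stays usable without transformers installed.
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    return tokenizer.apply_chat_template(
        build_messages(problem),
        tokenize=False,              # return text rather than token ids
        add_generation_prompt=True,  # end with the assistant turn marker
    )


messages = build_messages("Prove that the sum of two even integers is even.")
print(messages)
```

The string returned by `render_prompt` can be tokenized and passed to `model.generate` in place of a raw prompt.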

Applications

Solving and generating mathematical olympiad problems (IMO, AIME, AMC, etc.)
Step-by-step solution explanations
Mathematical reasoning, theorem proving, and algebraic manipulations

License

See the LICENSE file in the repository; this card does not yet state a license.

Citation

If you use ShineMath in research or projects, please cite:

Shine Gupta. ShineMath: Mathematical Olympiad Language Model. Hugging Face. https://huggingface.co/Shinegupta/ShineMath

For questions, collaborations, or issues, open a discussion on the model page. Happy math solving!
