	---
	library_name: transformers
	tags:
	- math
	- cot
	- text-generation-inference
	- preview
	- experimental
	license: apache-2.0
	language:
	- en
	base_model:
	- Qwen/Qwen2.5-1.5B-Instruct
	pipeline_tag: text-generation
	---

	![DMC.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/WYWprTh49LUnIw-HiTcU-.png)

	# Deepmath-Competitive-1.5B-Preview

	> Deepmath-Competitive-1.5B-Preview is a chain-of-thought reasoning model fine-tuned from Qwen2.5-1.5B-Instruct, built for solving mathematical problems in English and Chinese with a focus on long-context understanding. It produces detailed, step-by-step solutions in a compact model, which suits competitive exam preparation, tutoring systems, and math-focused AI assistants.

	## Key Features

	1. Chain-of-Thought Math Reasoning
	Trained to output detailed intermediate steps for math problems, so each solution can be followed and checked line by line, which matters for both learning and validation.

	2. Bilingual Proficiency (English + Chinese)
	Proficient in understanding and solving math problems in both English and Simplified Chinese, supporting diverse educational needs.

	3. Long-Context Reasoning
	Optimized for long-form math problems and word problem comprehension, enabling reasoning over extended contexts and compound queries.

	4. Compact yet Powerful
	With 1.5B parameters, it targets arithmetic, algebra, geometry, logic, and competitive exam-style word problems at low computational cost.

	5. Structured Step-by-Step Computation
	Produces clean, stepwise outputs that mimic expert human problem-solving, helping learners follow the process and logic intuitively.

	## Quickstart with Transformers

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer

	model_name = "prithivMLmods/Deepmath-Competitive-1.5B-Preview"

	# Load the weights in their stored dtype and place them on the available device(s).
	# device_map="auto" requires the accelerate package to be installed.
	model = AutoModelForCausalLM.from_pretrained(
	    model_name,
	    torch_dtype="auto",
	    device_map="auto",
	)
	tokenizer = AutoTokenizer.from_pretrained(model_name)

	prompt = "Solve: A train travels 180 km in 3 hours. What is its average speed?"
	messages = [
	    {"role": "system", "content": "You are a helpful tutor skilled in solving math problems with step-by-step explanations."},
	    {"role": "user", "content": prompt},
	]

	# Render the conversation with the model's chat template and append the
	# marker that tells the model to start the assistant turn.
	text = tokenizer.apply_chat_template(
	    messages,
	    tokenize=False,
	    add_generation_prompt=True,
	)
	model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

	generated_ids = model.generate(
	    **model_inputs,
	    max_new_tokens=512,
	)
	# Drop the prompt tokens so that only the newly generated completion is decoded.
	generated_ids = [
	    output_ids[len(input_ids):]
	    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
	]

	response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
	print(response)
	```
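
Once `response` holds the decoded text, a tutoring or grading tool often needs only the final value. The sketch below shows one possible way to extract it. It assumes the answer is the last number in the text, which is a heuristic and not a documented output format of this model. The names `last_number` and `sample_response` are hypothetical, and the sample string is hand-written for illustration rather than real model output.

```python
import re

# Matches integers and decimals such as 60, -4, or 2.5.
NUMBER_PATTERN = re.compile(r"-?\d+(?:\.\d+)?")


def last_number(text):
    """Return the last number found in text as a float, or None if there is none."""
    matches = NUMBER_PATTERN.findall(text)
    return float(matches[-1]) if matches else None


# Hand-written stand-in for the decoded `response`; not real model output.
sample_response = (
    "Step 1: Average speed = distance / time.\n"
    "Step 2: 180 km / 3 h = 60 km/h.\n"
    "Final answer: 60 km/h"
)

expected = 180 / 3  # reference value computed directly from the prompt
answer = last_number(sample_response)
print(answer, answer == expected)
```

In practice `sample_response` would be replaced by `response` from the quickstart. A response that ends by restating a given quantity, or that writes the answer as a fraction or in words, would defeat this heuristic, so treat it as a starting point only.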

	## Intended Use

	- Math Tutoring Bots: Delivers in-depth, multi-step solutions for students preparing for competitive and school-level math.
	- Bilingual Educational Apps: Effective in English and Chinese teaching environments.
	- STEM Reasoning Tools: Supports structured reasoning across science and engineering questions.
	- Compact LLM Deployments: Suitable for low-latency environments like mobile apps, edge devices, or web integrations.
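
For the compact-deployment case, a rough estimate of weight memory helps decide whether a target device is feasible. The sketch below multiplies a parameter count by the bytes each parameter occupies at a given precision. The 1.5e9 figure is the nominal count taken from the model name, not an exact measurement, and `weight_memory_gb` is a hypothetical helper. Whether the int8 and int4 rows apply depends on using a separate quantization step, which this card does not describe.

```python
# Nominal parameter count from the model name; the exact figure may differ.
NOMINAL_PARAMS = 1.5e9

# Storage cost per parameter at common precisions, in bytes.
BYTES_PER_PARAM = {
    "float32": 4,
    "float16": 2,
    "bfloat16": 2,
    "int8": 1,
    "int4": 0.5,
}


def weight_memory_gb(num_params, dtype):
    """Estimate memory for the weights alone, in decimal gigabytes."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    print(f"{dtype:>8}: {weight_memory_gb(NOMINAL_PARAMS, dtype):.2f} GB")
```

This covers weights only. The KV cache, activations, and framework overhead add to the total, and the cache grows with prompt and generation length.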

	## Limitations

	1. Domain Focus:
	Primarily tuned for mathematics; performance may drop outside STEM or logical domains.

	2. Model Scale:
	While efficient, it may underperform on abstract or research-level problems compared to larger models.

	3. Inherited Biases:
	As a fine-tune of Qwen2.5-1.5B-Instruct, it may retain biases from the base model's pretraining. Review its outputs before relying on them in critical applications.

	4. Prompt Sensitivity:
	Performs best with clearly structured prompts and formal question phrasing.
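
One way to act on the prompt-sensitivity point is to lay out each problem in labeled fields before sending it. The sketch below is an assumption about what "clearly structured" could look like, not a format the model is documented to require. The `build_messages` helper and the "Given" and "Question" labels are hypothetical, and the system prompt is the one used in the quickstart.

```python
SYSTEM_PROMPT = (
    "You are a helpful tutor skilled in solving math problems "
    "with step-by-step explanations."
)


def build_messages(question, givens=()):
    """Return a chat message list with the problem laid out as labeled fields."""
    lines = []
    if givens:
        # List the known quantities first, one per line.
        lines.append("Given:")
        lines.extend(f"- {fact}" for fact in givens)
    lines.append(f"Question: {question.strip()}")
    lines.append("Show each step, then state the final answer on its own line.")
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "\n".join(lines)},
    ]


messages = build_messages(
    "What is the train's average speed in km/h?",
    givens=["Distance travelled: 180 km", "Travel time: 3 hours"],
)
print(messages[1]["content"])
```

The returned list has the same shape as `messages` in the quickstart, so it can be passed directly to `tokenizer.apply_chat_template`.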