---
language:
- en
tags:
- math
- education
- llama-3
- peft
- lora
base_model: meta-llama/Llama-3.2-1B-Instruct
license: apache-2.0
---

# NexusLLM-Math-1B-v1

## Model Details
NexusLLM-Math-1B-v1 is a fine-tuned version of Llama 3.2 (1B parameters), optimized specifically for solving advanced high-school mathematics problems, with a focus on JEE Main and Advanced syllabus topics.

- **Developed by:** ZentithLLM
- **Model Type:** Causal Language Model (fine-tuned with LoRA)
- **Language:** English
- **Base Model:** meta-llama/Llama-3.2-1B-Instruct
- **Precision:** FP16

## Intended Use
This model is designed to act as an educational assistant for 11th-grade mathematics. It is trained to provide step-by-step reasoning and explanations for complex topics, rather than just outputting the final answer.

**Primary Topics Covered:**
- Binomial Theorem
- Geometry (circle theorems, cyclic quadrilaterals, tangents, etc.)

## Training Data
The model was trained on a custom dataset of structured mathematics Q&A pairs. The dataset maps specific mathematical prompts to detailed completions, making heavy use of an `explanation` field to teach the model the underlying mathematical logic and derivation steps.
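A record in this structure might look like the following sketch. Only the `explanation` field is documented in this card; the other key names are illustrative assumptions:

```python
# Illustrative sketch of one training record. Only the `explanation`
# field is documented in this card; the other keys are assumptions.
record = {
    "instruction": "Find the general term in the expansion of (x + y)^n.",
    "explanation": (
        "By the Binomial Theorem, the (r+1)-th term is "
        "T(r+1) = C(n, r) * x^(n-r) * y^r."
    ),
}

# Records of this shape are rendered into the prompt template the
# model expects at inference time (see "How to Use" below).
prompt = (
    f"### Instruction:\n{record['instruction']}\n\n"
    f"### Response:\n{record['explanation']}"
)
```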

## Training Procedure
The model was fine-tuned using the standard Hugging Face `trl` and `peft` libraries on a single NVIDIA T4 GPU, with native FP16 precision throughout (no quantization) for numerical stability.

- **Training Framework:** Pure Hugging Face (no Unsloth, no quantization)
- **Method:** LoRA (Low-Rank Adaptation)
- **Rank (r):** 32
- **Alpha:** 32
- **Optimizer:** adamw_torch
- **Learning Rate:** 2e-4
- **Max Sequence Length:** 2048
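
Expressed with `peft`, the hyperparameters above correspond roughly to the `LoraConfig` below. Note that `target_modules` is an assumption (typical attention projections for Llama-style models); the card does not state which modules were adapted.

```python
from peft import LoraConfig

# LoRA hyperparameters from the list above. `target_modules` is an
# assumption -- the card does not list the adapted modules.
lora_config = LoraConfig(
    r=32,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```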

## How to Use
Because this model was trained on a specific dataset structure, you **must** wrap your prompts in the `### Instruction:` / `### Response:` format for it to produce the expected mathematical explanations.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ZentithLLM/NexusLLM-Math-1B-v1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# The model expects the same prompt template it was trained on.
question = "What is the general term in the expansion of (x+y)^n?"
formatted_prompt = f"### Instruction:\n{question}\n\n### Response:\n"

inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=250,
    temperature=0.3,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)

decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)
# The decoded string echoes the prompt; keep only the model's answer.
print(decoded.split("### Response:\n", 1)[-1].strip())
```