# JEE NUJAN Math Expert 🎯📚

**The Ultimate JEE Mathematics AI Tutor - Fine-tuned Specialist**

This is a fine-tuned version of [JEE NUJAN Mix v2 Base](https://huggingface.co/shivs28/jee_nujan_mix_v2_base) specifically trained on JEE-style mathematics problems to excel at Indian competitive exam mathematics.

## 🏆 Model Details

- **Base Model**: `shivs28/jee_nujan_mix_v2_base`
- **Fine-tuning Dataset**: 500+ JEE-relevant mathematics problems from the MATH dataset
- **Training Steps**: 150 (optimized for mathematical reasoning)
- **LoRA Configuration**: Rank 32, Alpha 64 (high-performance setup)
- **Specialization**: JEE Main & Advanced mathematics problems

## 🎯 Mathematical Capabilities

This model excels at:

### Core JEE Topics
- **Algebra**: Quadratic equations, inequalities, sequences & series
- **Calculus**: Limits, derivatives, integrals, applications
- **Coordinate Geometry**: Lines, circles, parabolas, ellipses, hyperbolas
- **Trigonometry**: Identities, equations, inverse functions
- **Probability**: Conditional probability, distributions, combinatorics
- **Number Theory**: Divisibility, modular arithmetic, prime numbers
- **Vector Algebra**: Dot product, cross product, scalar triple product

### Problem-Solving Approach
- **Step-by-step Solutions**: Clear mathematical progression
- **Multiple Methods**: Shows different approaches when applicable
- **Error Prevention**: Highlights common JEE mistakes
- **Time-Efficient**: Optimized for exam conditions

## 🚀 Usage Examples

### Basic Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "shivs28/jee_nujan_math_expert"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)

# JEE problem format
jee_prompt = '''<|problem|>
Find the number of real solutions of the equation x³ - 3x² + 2x - 1 = 0 in the interval [0, 3].

<|solution|>'''

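# Move the tokenized inputs to the same device as the model
inputs = tokenizer(jee_prompt, return_tensors="pt").to(model.device)

# Fall back to the EOS token if the tokenizer defines no pad token
pad_id = tokenizer.pad_token_id if tokenizer.pad_token_id is not None else tokenizer.eos_token_id

outputs = model.generate(
    **inputs,
    max_length=800,
    temperature=0.1,  # Low temperature for mathematical accuracy
    do_sample=True,
    pad_token_id=pad_id,
    repetition_penalty=1.05
)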

solution = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(solution)
```
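The `<|problem|>` / `<|solution|>` layout used above can be wrapped in a small helper so that every prompt is formatted the same way. This is a minimal sketch: `build_jee_prompt` is a hypothetical name, not part of the model's API, and it only reproduces the layout shown in the example.

```python
def build_jee_prompt(problem: str) -> str:
    """Wrap a problem statement in the <|problem|>/<|solution|> tags.

    Hypothetical helper: it mirrors the prompt layout from the
    Basic Usage example and nothing more.
    """
    return f"<|problem|>\n{problem.strip()}\n\n<|solution|>"


prompt = build_jee_prompt(
    "Find the number of real solutions of the equation "
    "x³ - 3x² + 2x - 1 = 0 in the interval [0, 3]."
)
print(prompt)
```

The returned string can be passed to the tokenizer in place of `jee_prompt`.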

### Advanced JEE Problem
```python
complex_problem = '''<|problem|>
In triangle ABC, if a = 7, b = 8, c = 9, find:
1. The area of triangle ABC
2. The radius of the circumscribed circle
3. The radius of the inscribed circle

<|solution|>'''

# Generate comprehensive solution
inputs = tokenizer(complex_problem, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_length=1200,
    temperature=0.05,  # Very low for multi-step problems
    top_p=0.95,
    do_sample=True,
    pad_token_id=tokenizer.pad_token_id
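)

# Decode and print the generated solution
solution = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(solution)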
```
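Because the triangle problem above has a closed-form answer, it is a convenient case for cross-checking the model's output. The sketch below uses Heron's formula together with the standard identities R = abc / (4K) and r = K / s, where K is the area and s the semi-perimeter. The function name `triangle_metrics` is illustrative.

```python
import math


def triangle_metrics(a: float, b: float, c: float) -> tuple[float, float, float]:
    """Return (area, circumradius, inradius) for side lengths a, b, c."""
    s = (a + b + c) / 2                                # semi-perimeter
    area = math.sqrt(s * (s - a) * (s - b) * (s - c))  # Heron's formula
    circumradius = a * b * c / (4 * area)              # R = abc / (4K)
    inradius = area / s                                # r = K / s
    return area, circumradius, inradius


area, R, r = triangle_metrics(7, 8, 9)
print(f"area = {area:.4f}, R = {R:.4f}, r = {r:.4f}")
```

For sides 7, 8, 9 the exact values are 12√5 for the area, 21√5/10 for the circumradius, and √5 for the inradius, so a generated solution can be compared against these.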

## ⚙️ Recommended Generation Settings

### For JEE Main Problems
```python
generation_config = {
    "max_length": 800,
    "temperature": 0.1,
    "top_p": 0.95,
    "do_sample": True,
    "repetition_penalty": 1.05,
    "pad_token_id": tokenizer.pad_token_id
}
```

### For JEE Advanced Problems
```python
advanced_config = {
    "max_length": 1200,      # Longer for complex solutions
    "temperature": 0.05,     # Very low for accuracy
    "top_p": 0.9,
    "do_sample": True,
    "repetition_penalty": 1.1,
    "pad_token_id": tokenizer.pad_token_id
}
```
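Either dictionary can be unpacked directly into `model.generate`. As a sketch, the two presets can also be kept behind a single helper. The names `generation_kwargs` and `level` are hypothetical; the values are copied from the settings above.

```python
from typing import Optional

# Presets copied from the recommended settings in this section.
_PRESETS = {
    "main": {
        "max_length": 800,
        "temperature": 0.1,
        "top_p": 0.95,
        "repetition_penalty": 1.05,
    },
    "advanced": {
        "max_length": 1200,
        "temperature": 0.05,
        "top_p": 0.9,
        "repetition_penalty": 1.1,
    },
}


def generation_kwargs(level: str, pad_token_id: Optional[int] = None) -> dict:
    """Return generate() keyword arguments for 'main' or 'advanced' problems."""
    if level not in _PRESETS:
        raise ValueError(f"unknown level: {level!r}")
    kwargs = dict(_PRESETS[level], do_sample=True)
    # Only pass pad_token_id when the tokenizer actually defines one
    if pad_token_id is not None:
        kwargs["pad_token_id"] = pad_token_id
    return kwargs


# Usage, assuming `model`, `tokenizer`, and `inputs` from Basic Usage:
# outputs = model.generate(**inputs, **generation_kwargs("advanced", tokenizer.pad_token_id))
print(generation_kwargs("main"))
```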

## 🎯 Training Details

- **Architecture**: LoRA fine-tuning on base model
- **Training Data**: Carefully curated JEE-relevant problems
- **Optimization**: Focused on mathematical reasoning patterns
- **Validation**: Tested on held-out JEE problems

### LoRA Configuration
- **Rank (r)**: 32
- **Alpha**: 64  
- **Dropout**: 0.1
- **Target Modules**: All attention and MLP layers
- **Trainable Parameters**: ~2.1% of total parameters
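
For context, in the standard LoRA formulation each adapted weight matrix W stays frozen and a low-rank update is learned, giving an effective weight W + (alpha / r) · B A, with B of shape d_out × r and A of shape r × d_in. With the values above, the scaling factor is 64 / 32 = 2. The sketch below counts the parameters added per adapted matrix. The 4096 × 4096 layer size is a hypothetical example, not a property of this model.

```python
def lora_scaling(r: int, alpha: int) -> float:
    """Scaling applied to the low-rank update B @ A."""
    return alpha / r


def lora_added_params(d_in: int, d_out: int, r: int) -> int:
    """Parameters in A (r x d_in) plus B (d_out x r)."""
    return r * (d_in + d_out)


r, alpha = 32, 64          # values from the configuration above
d_in = d_out = 4096        # hypothetical layer shape, for illustration only

added = lora_added_params(d_in, d_out, r)
full = d_in * d_out
print(f"scaling = {lora_scaling(r, alpha)}")
print(f"added = {added} ({added / full:.2%} of this one matrix)")
```

The per-matrix share computed here is not the same as the overall ~2.1% figure, which depends on the real layer shapes and on which modules are adapted.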

## 🏅 Best Practices for JEE Preparation

1. **Use specific problem format**: Always use `<|problem|>` and `<|solution|>` tags
2. **Low temperature**: Use 0.05-0.1 for mathematical accuracy
3. **Adequate length**: Set `max_length` based on problem complexity
4. **Multiple attempts**: Try different seeds for various solution approaches
5. **Verify results**: Always cross-check mathematical calculations
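
Items 4 and 5 can be combined: sample several solutions with different seeds and trust a final answer only when most runs agree. The sketch below shows only the voting step. `final_answers` stands for answers you have already extracted from the generated text, and `majority_answer` is a hypothetical helper.

```python
from collections import Counter
from typing import Optional


def majority_answer(final_answers: list[str], min_share: float = 0.5) -> Optional[str]:
    """Return the most common answer if its share of runs exceeds min_share, else None."""
    if not final_answers:
        return None
    # Light normalization so trivially different strings compare equal
    normalized = [a.strip().lower() for a in final_answers]
    answer, count = Counter(normalized).most_common(1)[0]
    return answer if count / len(normalized) > min_share else None


# Answers extracted from five hypothetical runs with different seeds
runs = ["12√5", "12√5", "26.83", "12√5", "12√5 "]
print(majority_answer(runs))
```

Agreement between runs is only a consistency signal; an independent calculation remains the stronger check.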

## 📈 Use Cases

### For Students
- **Practice Problems**: Generate solutions with explanations
- **Concept Clarification**: Understand mathematical reasoning
- **Exam Preparation**: Practice with JEE-style problems
- **Error Analysis**: Learn from common mistakes

### For Educators
- **Solution Generation**: Create detailed problem solutions
- **Teaching Aid**: Step-by-step mathematical explanations
- **Problem Variation**: Generate similar problems for practice
- **Assessment**: Evaluate student understanding

## 🔧 Technical Specifications

- **Base Architecture**: Transformer-based language model
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **Precision**: 16-bit floating point
- **Context Length**: 768 tokens (optimized for detailed solutions)
- **Vocabulary**: Extended with mathematical notation

## 📝 Citation

If you use this model in your research or educational content, please cite:

```bibtex
@misc{jee_nujan_math_expert,
  title={JEE NUJAN Math Expert: Fine-tuned Mathematics Specialist},
  author={shivs28},
  year={2025},
  url={https://huggingface.co/shivs28/jee_nujan_math_expert}
}
```

## 🤝 Contributing

Found an issue or have suggestions? Open an issue on the model repository!