shivs28/jee_nujan_math_expert

shivs28 committed on Aug 10, 2025

Commit 5ca12b8 (verified) · 1 Parent(s): 583cb50

Update README.md

README.md +171 -206

@@ -1,207 +1,172 @@
----
-base_model: shivs28/jee_nujan_mix_v2_base
-library_name: peft
-pipeline_tag: text-generation
-tags:
-- base_model:adapter:shivs28/jee_nujan_mix_v2_base
-- lora
-- transformers
----
-# Model Card for Model ID
-<!-- Provide a quick summary of what the model is/does. -->
-## Model Details
-### Model Description
-<!-- Provide a longer summary of what this model is. -->
-- **Developed by:** [More Information Needed]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
-### Model Sources [optional]
-<!-- Provide the basic links for the model. -->
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
-## Uses
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-### Direct Use
-<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-[More Information Needed]
-### Downstream Use [optional]
-<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-[More Information Needed]
-### Out-of-Scope Use
-<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-[More Information Needed]
-## Bias, Risks, and Limitations
-<!-- This section is meant to convey both technical and sociotechnical limitations. -->
-[More Information Needed]
-### Recommendations
-<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-## How to Get Started with the Model
-Use the code below to get started with the model.
-[More Information Needed]
-## Training Details
-### Training Data
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-[More Information Needed]
-### Training Procedure
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-#### Preprocessing [optional]
-[More Information Needed]
-#### Training Hyperparameters
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-#### Speeds, Sizes, Times [optional]
-<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-[More Information Needed]
-## Evaluation
-<!-- This section describes the evaluation protocols and provides the results. -->
-### Testing Data, Factors & Metrics
-#### Testing Data
-<!-- This should link to a Dataset Card if possible. -->
-[More Information Needed]
-#### Factors
-<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-[More Information Needed]
-#### Metrics
-<!-- These are the evaluation metrics being used, ideally with a description of why. -->
-[More Information Needed]
-### Results
-[More Information Needed]
-#### Summary
-## Model Examination [optional]
-<!-- Relevant interpretability work for the model goes here -->
-[More Information Needed]
-## Environmental Impact
-<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-- **Hardware Type:** [More Information Needed]
-- **Hours used:** [More Information Needed]
-- **Cloud Provider:** [More Information Needed]
-- **Compute Region:** [More Information Needed]
-- **Carbon Emitted:** [More Information Needed]
-## Technical Specifications [optional]
-### Model Architecture and Objective
-[More Information Needed]
-### Compute Infrastructure
-[More Information Needed]
-#### Hardware
-[More Information Needed]
-#### Software
-[More Information Needed]
-## Citation [optional]
-<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-**BibTeX:**
-[More Information Needed]
-**APA:**
-[More Information Needed]
-## Glossary [optional]
-<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-[More Information Needed]
-## More Information [optional]
-[More Information Needed]
-## Model Card Authors [optional]
-[More Information Needed]
-## Model Card Contact
-[More Information Needed]
-### Framework versions
-- PEFT 0.17.0

+# JEE NUJAN Math Expert 🎯📚
+**The Ultimate JEE Mathematics AI Tutor - Fine-tuned Specialist**
+This is a fine-tuned version of [JEE NUJAN Mix v2 Base](https://huggingface.co/shivs28/jee_nujan_mix_v2_base) specifically trained on JEE-style mathematics problems to excel at Indian competitive exam mathematics.
+## 🏆 Model Details
+- **Base Model**: `shivs28/jee_nujan_mix_v2_base`
+- **Fine-tuning Dataset**: 500+ JEE-relevant mathematics problems from the MATH dataset
+- **Training Steps**: 150 (optimized for mathematical reasoning)
+- **LoRA Configuration**: Rank 32, Alpha 64 (high-performance setup)
+- **Specialization**: JEE Main & Advanced mathematics problems
+## 🎯 Mathematical Capabilities
+This model excels at:
+### Core JEE Topics
+- **Algebra**: Quadratic equations, inequalities, sequences & series
+- **Calculus**: Limits, derivatives, integrals, applications
+- **Coordinate Geometry**: Lines, circles, parabolas, ellipses, hyperbolas
+- **Trigonometry**: Identities, equations, inverse functions
+- **Probability**: Conditional probability, distributions, combinatorics
+- **Number Theory**: Divisibility, modular arithmetic, prime numbers
+- **Vector Algebra**: Dot product, cross product, scalar triple product
+### Problem-Solving Approach
+- **Step-by-step Solutions**: Clear mathematical progression
+- **Multiple Methods**: Shows different approaches when applicable
+- **Error Prevention**: Highlights common JEE mistakes
+- **Time-Efficient**: Optimized for exam conditions
+## 🚀 Usage Examples
+### Basic Usage
+```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+# Load the tokenizer and model from the Hub
+model_name = "shivs28/jee_nujan_math_expert"
+tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
+model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)
+# JEE problem format: the problem goes between the <|problem|> and <|solution|> tags
+jee_prompt = '''<|problem|>
+Find the number of real solutions of the equation x³ - 3x² + 2x - 1 = 0 in the interval [0, 3].
+<|solution|>'''
+inputs = tokenizer(jee_prompt, return_tensors="pt")
+# max_length counts the prompt tokens plus the generated tokens
+outputs = model.generate(
+    **inputs,
+    max_length=800,
+    temperature=0.1,  # Low temperature for mathematical accuracy
+    do_sample=True,
+    pad_token_id=tokenizer.pad_token_id,
+    repetition_penalty=1.05
+)
+# Decode the full sequence (prompt plus generated solution)
+solution = tokenizer.decode(outputs[0], skip_special_tokens=True)
+print(solution)
+```
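The sample problem in the block above can be checked independently of the model, in line with the card's advice to cross-check calculations. The sketch below is not part of the commit, and the names `f` and `count_sign_changes` are illustrative. It splits [0, 3] at the critical points of the cubic, where f'(x) = 3x² - 6x + 2 = 0, and counts sign changes of f across each monotonic piece.

```python
import math


def f(x: float) -> float:
    """Cubic from the sample prompt: x^3 - 3x^2 + 2x - 1."""
    return x**3 - 3 * x**2 + 2 * x - 1


def count_sign_changes(lo: float, hi: float) -> int:
    """Count roots of f in [lo, hi] via sign changes on monotonic pieces.

    Assumes f is nonzero at the endpoints and at the critical points,
    which holds for this cubic on [0, 3].
    """
    # f'(x) = 3x^2 - 6x + 2 vanishes at x = 1 ± 1/sqrt(3)
    offset = 1 / math.sqrt(3)
    critical = [c for c in (1 - offset, 1 + offset) if lo < c < hi]
    points = [lo, *critical, hi]
    # f is monotonic between consecutive points, so each sign change is exactly one root
    return sum(1 for a, b in zip(points, points[1:]) if f(a) * f(b) < 0)


print(count_sign_changes(0.0, 3.0))
```

Here f is negative at 0 and at both critical points and positive at 3, so the only sign change lies between the second critical point and 3.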
+### Advanced JEE Problem
+```python
+complex_problem = '''<|problem|>
+In triangle ABC, if a = 7, b = 8, c = 9, find:
+1. The area of triangle ABC
+2. The radius of the circumscribed circle
+3. The radius of the inscribed circle
+<|solution|>'''
+# Generate a comprehensive solution (reuses tokenizer and model from Basic Usage)
+inputs = tokenizer(complex_problem, return_tensors="pt")
+outputs = model.generate(
+    **inputs,
+    max_length=1200,
+    temperature=0.05,  # Very low for multi-step problems
+    top_p=0.95,
+    do_sample=True,
+    pad_token_id=tokenizer.pad_token_id
+)
+print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+```
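For the triangle problem above, reference values are easy to compute directly, which gives something to compare the generated solution against. This is a minimal sketch that is not part of the commit; `triangle_metrics` is an illustrative name. It uses Heron's formula for the area K, then R = abc / (4K) and r = K / s.

```python
import math


def triangle_metrics(a: float, b: float, c: float) -> tuple[float, float, float]:
    """Return (area, circumradius, inradius) for side lengths a, b, c."""
    s = (a + b + c) / 2                                 # semi-perimeter
    area = math.sqrt(s * (s - a) * (s - b) * (s - c))   # Heron's formula
    circumradius = a * b * c / (4 * area)               # R = abc / (4K)
    inradius = area / s                                 # r = K / s
    return area, circumradius, inradius


area, R, r = triangle_metrics(7, 8, 9)
print(f"area={area:.4f}, R={R:.4f}, r={r:.4f}")
```

In closed form these are 12√5 for the area, 21√5/10 for the circumradius, and √5 for the inradius.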
+## ⚙️ Recommended Generation Settings
+### For JEE Main Problems
+```python
+generation_config = {
+    "max_length": 800,
+    "temperature": 0.1,
+    "top_p": 0.95,
+    "do_sample": True,
+    "repetition_penalty": 1.05,
+    "pad_token_id": tokenizer.pad_token_id
+}
+```
+### For JEE Advanced Problems
+```python
+advanced_config = {
+    "max_length": 1200,      # Longer for complex solutions
+    "temperature": 0.05,     # Very low for accuracy
+    "top_p": 0.9,
+    "do_sample": True,
+    "repetition_penalty": 1.1,
+    "pad_token_id": tokenizer.pad_token_id
+}
+```
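The two configurations above request `max_length` values of 800 and 1200, while the Technical Specifications section lists a context length of 768 tokens. In `transformers`, `max_length` bounds the prompt plus the generated tokens. The sketch below assumes the 768-token figure is a hard limit, which the card does not confirm, and uses a hypothetical helper `clamp_max_length` to keep a configuration within it. The dictionaries omit `pad_token_id` so the example runs without a tokenizer.

```python
CONTEXT_LENGTH = 768  # from the card's Technical Specifications; assumed to be a hard limit


def clamp_max_length(config: dict, context_length: int = CONTEXT_LENGTH) -> dict:
    """Return a copy of config whose max_length does not exceed context_length."""
    clamped = dict(config)
    clamped["max_length"] = min(config["max_length"], context_length)
    return clamped


main_config = {"max_length": 800, "temperature": 0.1, "top_p": 0.95}
advanced_config = {"max_length": 1200, "temperature": 0.05, "top_p": 0.9}

print(clamp_max_length(main_config)["max_length"])
print(clamp_max_length(advanced_config)["max_length"])
```

A clamped dictionary can then be unpacked into the call, for example `model.generate(**inputs, **clamp_max_length(generation_config))`.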
+## 🎯 Training Details
+- **Method**: LoRA fine-tuning of the base model
+- **Training Data**: Carefully curated JEE-relevant problems
+- **Optimization**: Focused on mathematical reasoning patterns
+- **Validation**: Tested on held-out JEE problems
+### LoRA Configuration
+- **Rank (r)**: 32
+- **Alpha**: 64
+- **Dropout**: 0.1
+- **Target Modules**: All attention and MLP layers
+- **Trainable Parameters**: ~2.1% of total parameters
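To make the rank and alpha values concrete: in standard LoRA, each adapted linear layer of shape d_out × d_in gains two small matrices, A (rank × d_in) and B (d_out × rank), and the update is scaled by alpha / rank. The sketch below is not part of the commit. The 4096 × 4096 layer size is hypothetical, since the card does not state the base model's dimensions.

```python
def lora_scaling(rank: int, alpha: int) -> float:
    """Standard LoRA scales the low-rank update by alpha / rank."""
    return alpha / rank


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """A adds rank * d_in weights and B adds d_out * rank weights."""
    return rank * (d_in + d_out)


# Values from the card
rank, alpha = 32, 64
# Hypothetical layer size; the base model's dimensions are not stated in the card
d_in = d_out = 4096

added = lora_trainable_params(d_in, d_out, rank)
fraction = added / (d_in * d_out)
print(lora_scaling(rank, alpha), added, f"{fraction:.4%}")
```

The per-layer fraction for this hypothetical size will not match the card's overall figure of ~2.1%, which also depends on which layers are adapted and on parameters that receive no adapter.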
+## 🏅 Best Practices for JEE Preparation
+1. **Use the specific problem format**: Always wrap the problem in the `<|problem|>` and `<|solution|>` tags
+2. **Low temperature**: Use 0.05-0.1 for mathematical accuracy
+3. **Adequate length**: Set `max_length` based on problem complexity; it counts prompt tokens as well as generated tokens
+4. **Multiple attempts**: Try different seeds for various solution approaches
+5. **Verify results**: Always cross-check mathematical calculations
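The tag format from the first practice is easy to get wrong by hand, so a small wrapper can help. This is a sketch, not part of the commit; `build_prompt` and `extract_solution` are illustrative names. Whether the tags survive decoding with `skip_special_tokens=True` depends on how the tokenizer registers them, which the card does not say, so `extract_solution` falls back to the whole text when the tag is missing.

```python
PROBLEM_TAG = "<|problem|>"
SOLUTION_TAG = "<|solution|>"


def build_prompt(problem: str) -> str:
    """Wrap a problem statement in the tag format used throughout the card."""
    return f"{PROBLEM_TAG}\n{problem.strip()}\n{SOLUTION_TAG}"


def extract_solution(decoded: str) -> str:
    """Return the text after the solution tag, or the whole text if the tag is absent."""
    _, found, tail = decoded.partition(SOLUTION_TAG)
    return tail.strip() if found else decoded.strip()


prompt = build_prompt("Solve x + 1 = 2.")
print(prompt)
print(extract_solution(prompt + "\nx = 1"))
```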
+## 📈 Use Cases
+### For Students
+- **Practice Problems**: Generate solutions with explanations
+- **Concept Clarification**: Understand mathematical reasoning
+- **Exam Preparation**: Practice with JEE-style problems
+- **Error Analysis**: Learn from common mistakes
+### For Educators
+- **Solution Generation**: Create detailed problem solutions
+- **Teaching Aid**: Step-by-step mathematical explanations
+- **Problem Variation**: Generate similar problems for practice
+- **Assessment**: Evaluate student understanding
+## 🔧 Technical Specifications
+- **Base Architecture**: Transformer-based language model
+- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
+- **Precision**: 16-bit floating point
+- **Context Length**: 768 tokens (optimized for detailed solutions)
+- **Vocabulary**: Extended with mathematical notation
+## 📝 Citation
+If you use this model in your research or educational content, please cite:
+```bibtex
+@misc{jee_nujan_math_expert,
+  title={JEE NUJAN Math Expert: Fine-tuned Mathematics Specialist},
+  author={shivs28},
+  year={2025},
+  url={https://huggingface.co/shivs28/jee_nujan_math_expert}
+}
+```
+## 🤝 Contributing
+Found an issue or have suggestions? Open an issue on the model repository!