+---
+license: apache-2.0
+language:
+- zh
+- en
+library_name: transformers
+pipeline_tag: text-generation
+tags:
+- causal-lm
+- math
+- reasoning
+- tool-calling
+- function-calling
+- bilingual
+- code
+- symbolic-solver
+- llm
+- pytorch
+base_model: Kirim-ai/Kirim-V1-base
+datasets: []
+metrics:
+- math
+- gsm8k
+- minerva
+model-index:
+- name: Kirim-1-Math
+  results: []
+widget:
+- text: "Solve: ∫(x² + 2x + 1)dx"
+  example_title: "Calculus Integration"
+- text: "解方程组: 2x + 3y = 12, 4x - y = 5"
+  example_title: "System of Equations (Chinese)"
+- text: "Use the calculator tool to compute 2^128"
+  example_title: "Tool Calling Example"
+---
+# Kirim-1-Math (30B)
+<div align="center">
+**The First Kirim Model with Advanced Mathematical Reasoning and Tool Calling**
+[Base Model](https://huggingface.co/Kirim-ai/Kirim-V1-base) • [Technical Paper]()
+</div>
+---
+##  Introduction
+**Kirim-1-Math** is a 30-billion-parameter mathematical reasoning model. As the **first Kirim model with tool calling capabilities**, it combines mathematical problem-solving with the ability to call external tools and run calculations.
+###  Key Features
+-  **Advanced Math Reasoning**: Trained on mathematical proofs, olympiad problems, and research papers
+-  **Tool Calling**: First in Kirim series with function calling capabilities
+-  **Symbolic Solver**: Handles algebraic manipulation, calculus, and symbolic computation
+-  **Bilingual**: Solves problems in both Chinese and English
+-  **Code Execution**: Can write and execute Python code for numerical solutions
+-  **LaTeX Output**: Generates properly formatted mathematical expressions
+-  **30B Parameters**: More powerful reasoning than 7B/13B variants
+---
+##  Model Specifications
+| Parameter | Value | Comparison |
+|-----------|-------|------------|
+| Parameters | 30B | About 2.3× the 13B base |
+| Hidden Size | 5,120 | Enhanced capacity |
+| Layers | 48 | Deep reasoning |
+| Attention Heads | 40 | Fine-grained attention |
+| KV Heads | 8 (GQA) | Memory efficient |
+| Context Length | 32,768 tokens | Extended problems |
+| Vocabulary | 102,400 | Same as base |
+| Tool Calling | ✅ Yes | **New feature!** |
+| Precision | BFloat16 | High quality |
+### Architecture Highlights
+- **Deeper Network**: 48 layers for complex multi-step reasoning
+- **Wider Hidden States**: 5,120 dimensions for richer representations
+- **Grouped Query Attention**: 5:1 ratio (40:8) for efficiency
+- **Extended Training**: Specialized on mathematical datasets
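The grouped-query ratio above mainly pays off in KV-cache memory. The following is a rough estimate built only from the specification table, assuming a head dimension of hidden size divided by attention heads (5,120 / 40 = 128, which the table does not state) and 2 bytes per BFloat16 value. The function name is illustrative.

```python
# Rough KV-cache size estimate from the specification table.
# Assumption: head_dim = hidden_size / attention_heads (not stated in the table).
HIDDEN_SIZE = 5120
NUM_LAYERS = 48
NUM_ATTENTION_HEADS = 40
NUM_KV_HEADS = 8
CONTEXT_LENGTH = 32_768
BYTES_PER_VALUE = 2  # BFloat16

HEAD_DIM = HIDDEN_SIZE // NUM_ATTENTION_HEADS  # 128


def kv_cache_bytes(num_kv_heads: int, num_tokens: int) -> int:
    """Bytes needed to cache keys and values for one sequence."""
    # Factor 2 covers the separate key and value tensors in every layer.
    per_token = 2 * NUM_LAYERS * num_kv_heads * HEAD_DIM * BYTES_PER_VALUE
    return per_token * num_tokens


gqa_gib = kv_cache_bytes(NUM_KV_HEADS, CONTEXT_LENGTH) / 2**30
mha_gib = kv_cache_bytes(NUM_ATTENTION_HEADS, CONTEXT_LENGTH) / 2**30
print(f"GQA (8 KV heads):  {gqa_gib:.1f} GiB per full-length sequence")   # 6.0 GiB
print(f"MHA (40 KV heads): {mha_gib:.1f} GiB per full-length sequence")   # 30.0 GiB
```

Under these assumptions, caching a full 32,768-token sequence takes one fifth of the memory that 40 key/value heads would need.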
+---
+## 🚀 Quick Start
+### Installation
+```bash
+pip install transformers torch accelerate sympy
+```
+### Basic Usage
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+# Load model
+model = AutoModelForCausalLM.from_pretrained(
+    "Kirim-ai/Kirim-1-Math",
+    torch_dtype="auto",
+    device_map="auto",
+    trust_remote_code=True
+)
+tokenizer = AutoTokenizer.from_pretrained(
+    "Kirim-ai/Kirim-1-Math",
+    trust_remote_code=True
+)
+# Solve a math problem
+messages = [
+    {"role": "user", "content": "Solve the quadratic equation: x² - 5x + 6 = 0"}
+]
+inputs = tokenizer.apply_chat_template(
+    messages,
+    return_tensors="pt",
+    add_generation_prompt=True
+).to(model.device)
+outputs = model.generate(
+    inputs,
+    max_new_tokens=2048,
+    do_sample=False   # Greedy decoding: deterministic, so temperature/top_p do not apply
+)
+# Decode only the newly generated tokens, not the echoed prompt
+response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
+print(response)
+```
+---
+##  Tool Calling
+Kirim-1-Math is the **first Kirim model** with built-in tool calling capabilities.
+### Available Tools
+The model can use these built-in mathematical tools:
+1. **Calculator**: Precise arithmetic operations
+2. **Symbolic Solver**: Algebraic manipulations
+3. **Code Executor**: Run Python/SymPy code
+4. **Plot Generator**: Create mathematical visualizations
+5. **Theorem Lookup**: Access mathematical theorems and formulas
+### Tool Calling Example
+```python
+messages = [
+    {
+        "role": "user",
+        "content": "Calculate 2^1024 and tell me how many digits it has"
+    }
+]
+# The model decides on its own whether to call the calculator tool
+inputs = tokenizer.apply_chat_template(
+    messages,
+    return_tensors="pt",
+    add_generation_prompt=True
+).to(model.device)
+outputs = model.generate(inputs, max_new_tokens=2048)
+# The response may include a tool call such as:
+# <tool_call>
+# {
+#   "name": "calculator",
+#   "arguments": {
+#     "expression": "2**1024"
+#   }
+# }
+# </tool_call>
+```
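The calling application still has to parse a `<tool_call>` block and run the tool itself. Here is a minimal sketch of that step, assuming the response format matches the commented example above. `extract_tool_calls` and `evaluate_arithmetic` are hypothetical helper names, not part of the model's API, and the calculator only supports basic arithmetic on numeric literals.

```python
import ast
import json
import operator
import re

TOOL_CALL_PATTERN = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)

# Arithmetic operators the hypothetical calculator accepts.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}


def extract_tool_calls(text: str) -> list[dict]:
    """Return the JSON payload of every <tool_call> block in the text."""
    return [json.loads(body) for body in TOOL_CALL_PATTERN.findall(text)]


def evaluate_arithmetic(expression: str):
    """Evaluate +, -, *, /, ** on numeric literals without calling eval()."""
    def visit(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](visit(node.left), visit(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -visit(node.operand)
        raise ValueError(f"unsupported expression: {expression!r}")

    return visit(ast.parse(expression, mode="eval").body)


sample_response = """<tool_call>
{
  "name": "calculator",
  "arguments": {
    "expression": "2**1024"
  }
}
</tool_call>"""

for call in extract_tool_calls(sample_response):
    if call["name"] == "calculator":
        value = evaluate_arithmetic(call["arguments"]["expression"])
        print(f"{len(str(value))} digits")  # 309 digits
```

Walking the syntax tree instead of calling `eval()` keeps model-generated expressions from running arbitrary code. A production handler would also want to cap exponent sizes.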
+### Custom Tool Definition
+```python
+tools = [
+    {
+        "type": "function",
+        "function": {
+            "name": "scientific_calculator",
+            "description": "Perform advanced scientific calculations",
+            "parameters": {
+                "type": "object",
+                "properties": {
+                    "expression": {
+                        "type": "string",
+                        "description": "Mathematical expression to evaluate"
+                    },
+                    "precision": {
+                        "type": "integer",
+                        "description": "Decimal precision",
+                        "default": 10
+                    }
+                },
+                "required": ["expression"]
+            }
+        }
+    }
+]
+# Include the tool schemas in the system prompt, serialized as JSON
+import json
+messages = [
+    {"role": "system", "content": f"You have access to these tools: {json.dumps(tools)}"},
+    {"role": "user", "content": "Calculate sin(π/4) with 15 decimal places"}
+]
+```
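When the model calls a custom tool, the arguments it emits should be checked against the declared schema before the tool runs. This sketch covers only the `required` list and `default` values from the definition above. `apply_schema_defaults` is a hypothetical helper, and the schema is repeated here so the snippet runs on its own.

```python
# Copy of the "parameters" object from the scientific_calculator definition.
PARAMETERS_SCHEMA = {
    "type": "object",
    "properties": {
        "expression": {"type": "string"},
        "precision": {"type": "integer", "default": 10},
    },
    "required": ["expression"],
}


def apply_schema_defaults(arguments: dict, schema: dict) -> dict:
    """Reject calls with missing required arguments and fill in declared defaults."""
    missing = [name for name in schema.get("required", []) if name not in arguments]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")
    resolved = dict(arguments)  # leave the caller's dict untouched
    for name, spec in schema.get("properties", {}).items():
        if name not in resolved and "default" in spec:
            resolved[name] = spec["default"]
    return resolved


print(apply_schema_defaults({"expression": "sin(pi/4)"}, PARAMETERS_SCHEMA))
# {'expression': 'sin(pi/4)', 'precision': 10}
```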
+---
+##  Mathematical Capabilities
+### 1. Algebraic Reasoning
+```python
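+# Example: solve a system of equations posed in Chinese
+# ("解方程组" means "Solve the system of equations")
+problem = """
+解方程组:
+2x + 3y = 12
+4x - y = 5
+"""
+messages = [{"role": "user", "content": problem}]
+inputs = tokenizer.apply_chat_template(
+    messages, return_tensors="pt", add_generation_prompt=True
+).to(model.device)
+outputs = model.generate(inputs, max_new_tokens=2048, do_sample=False)
+response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
+# The response contains a step-by-step solution with reasoning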
+```
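The model's answer to the system above can be checked independently. This sketch uses Cramer's rule with exact fractions. `solve_2x2` is an illustrative helper, not something the model exposes.

```python
from fractions import Fraction


def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("system has no unique solution")
    x = Fraction(c1 * b2 - c2 * b1, det)
    y = Fraction(a1 * c2 - a2 * c1, det)
    return x, y


# 2x + 3y = 12 and 4x - y = 5
x, y = solve_2x2(2, 3, 12, 4, -1, 5)
print(x, y)  # 27/14 19/7
```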
+### 2. Calculus
+```python
+# Integration
+integration_problem = "Calculate: ∫(x³ + 2x² - x + 1)dx"
+# Differentiation
+differentiation_problem = "Find dy/dx if y = ln(x²) + e^(3x)"
+```
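Both calculus prompts have answers that are easy to verify without the model. The sketch below integrates the polynomial term by term with the power rule, then compares the expected derivative 2/x + 3e^(3x) with a central finite difference. The helper names and the test point `x0 = 1.3` are arbitrary choices for illustration.

```python
import math
from fractions import Fraction


def antiderivative(coefficients):
    """Integrate a polynomial term by term with the power rule.

    coefficients[k] is the coefficient of x**k; the constant of
    integration is taken to be 0.
    """
    return [Fraction(0)] + [Fraction(c, k + 1) for k, c in enumerate(coefficients)]


# x³ + 2x² - x + 1, listed from the constant term upward
integral = antiderivative([1, -1, 2, 1])
print([str(c) for c in integral])  # ['0', '1', '-1/2', '2/3', '1/4']


def y(x):
    return math.log(x**2) + math.exp(3 * x)


def dy_dx(x):
    """Expected derivative: 2/x + 3e^(3x)."""
    return 2 / x + 3 * math.exp(3 * x)


def central_difference(func, x, h=1e-5):
    """Approximate func'(x) with a symmetric finite difference."""
    return (func(x + h) - func(x - h)) / (2 * h)


x0 = 1.3
print(math.isclose(central_difference(y, x0), dy_dx(x0), rel_tol=1e-6))  # True
```

The coefficient list corresponds to x⁴/4 + 2x³/3 - x²/2 + x + C.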
+### 3. Probability & Statistics
+```python
+problem = """
+A bag contains 5 red balls and 3 blue balls.
+What's the probability of drawing 2 red balls without replacement?
+"""
+```
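The expected answer to this problem can be computed exactly in two ways: as a product of conditional probabilities, and as a ratio of combinations. This is a small check using only the standard library.

```python
from fractions import Fraction
from math import comb

red, blue, draws = 5, 3, 2
total = red + blue

# Draw-by-draw: P(first red) * P(second red | first red)
p_sequential = Fraction(red, total) * Fraction(red - 1, total - 1)

# Counting: ways to pick 2 red balls / ways to pick any 2 balls
p_counting = Fraction(comb(red, draws), comb(total, draws))

print(p_sequential, p_counting)  # 5/14 5/14
```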
+### 4. Number Theory
+```python
+problem = "Prove that √2 is irrational"
+# Model provides formal mathematical proof
+```
+### 5. Geometry
+```python
+problem = """
+In triangle ABC, if AB = 5, BC = 7, and AC = 8,
+find the area using Heron's formula.
+"""
+```
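For reference, Heron's formula gives the area as √(s(s-a)(s-b)(s-c)), where s is the semi-perimeter. Here is a short sketch for checking the model's answer. `heron_area` is an illustrative helper.

```python
import math


def heron_area(a: float, b: float, c: float) -> float:
    """Area of a triangle from its three side lengths."""
    if a + b <= c or a + c <= b or b + c <= a:
        raise ValueError("side lengths violate the triangle inequality")
    s = (a + b + c) / 2  # semi-perimeter
    return math.sqrt(s * (s - a) * (s - b) * (s - c))


# AB = 5, BC = 7, AC = 8 gives s = 10 and area = sqrt(300) = 10 * sqrt(3)
area = heron_area(5, 7, 8)
print(f"{area:.4f}")  # 17.3205
```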
+---
+##  Use Cases
+### 1. Educational Tutoring
+```python
+messages = [
+    {
+        "role": "user",
+        "content": "I don't understand how to complete the square. Can you explain and show an example?"
+    }
+]
+# Provides step-by-step explanations
+```
+### 2. Research Assistance
+```python
+messages = [
+    {
+        "role": "user",
+        "content": "Help me verify this proof about convergence of infinite series"
+    }
+]
+# Analyzes mathematical proofs
+```
+### 3. Homework Help
+```python
+messages = [
+    {
+        "role": "user",
+        "content": "Solve these 10 calculus problems and show your work"
+    }
+]
+# Solves problems with detailed steps
+```
+### 4. Competition Preparation
+```python
+messages = [
+    {
+        "role": "user",
+        "content": "Give me 5 AMC-level problems to practice"
+    }
+]
+# Generates practice problems
+```
+### 5. Code-Assisted Solving
+```python
+messages = [
+    {
+        "role": "user",
+        "content": "Use numerical methods to find roots of x^5 - 3x^3 + 2x - 1 = 0"
+    }
+]
+# Writes and executes numerical solver
+```
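Below is one example of the kind of solver such a prompt might produce: plain bisection. It assumes the bracket [1, 2], which works because the polynomial changes sign there (f(1) = -1, f(2) = 11). The code the model actually writes may use a different method.

```python
def f(x: float) -> float:
    return x**5 - 3 * x**3 + 2 * x - 1


def bisect(func, lo: float, hi: float, tol: float = 1e-12, max_iter: int = 200) -> float:
    """Find a root of func in [lo, hi] by repeatedly halving the bracket."""
    f_lo, f_hi = func(lo), func(hi)
    if f_lo * f_hi > 0:
        raise ValueError("func must change sign on [lo, hi]")
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        f_mid = func(mid)
        if f_mid == 0 or (hi - lo) / 2 < tol:
            return mid
        if f_lo * f_mid < 0:
            hi = mid                  # root lies in the lower half
        else:
            lo, f_lo = mid, f_mid     # root lies in the upper half
    return (lo + hi) / 2


root = bisect(f, 1.0, 2.0)
print(f"root ≈ {root:.6f}, residual = {f(root):.2e}")
```

Bisection converges slowly but cannot leave the bracket, which makes it a safe default when no derivative information is used.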
+---
+##  Advanced Features
+### Step-by-Step Reasoning
+The model shows its work:
+```
+Problem: Solve x² - 5x + 6 = 0
+Solution:
+Step 1: Identify this as a quadratic equation in standard form ax² + bx + c = 0
+        where a=1, b=-5, c=6
+Step 2: Try factoring: We need two numbers that multiply to 6 and add to -5
+        Those numbers are -2 and -3
+Step 3: Factor: (x - 2)(x - 3) = 0
+Step 4: Apply zero product property:
+        x - 2 = 0  or  x - 3 = 0
+Step 5: Solve each equation:
+        x = 2  or  x = 3
+Answer: x = 2 or x = 3
+```
+### LaTeX Output
+```python
+# Request LaTeX formatted output
+messages = [
+    {
+        "role": "user",
+        "content": "Solve this and format the answer in LaTeX: ∫(x² + 1)/(x³ + 3x + 1)dx"
+    }
+]
+# Output includes:
+# $$\int \frac{x^2 + 1}{x^3 + 3x + 1}dx = ...$$
+```
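The integral in this prompt has a closed form that can be checked by hand, because the numerator is one third of the derivative of the denominator. This is a worked substitution for reference, not output from the model:

```latex
\begin{aligned}
u &= x^3 + 3x + 1, \qquad du = (3x^2 + 3)\,dx = 3(x^2 + 1)\,dx \\
\int \frac{x^2 + 1}{x^3 + 3x + 1}\,dx
  &= \frac{1}{3}\int \frac{du}{u}
   = \frac{1}{3}\ln\lvert x^3 + 3x + 1\rvert + C
\end{aligned}
```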
+### Symbolic Manipulation
+Uses SymPy internally for symbolic computation:
+```python
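+from sympy import symbols, expand, factor, simplify
+x = symbols("x")
+# The same transformations the model can perform, reproduced with SymPy:
+print(expand((x + 1)**3))              # Expansion: x**3 + 3*x**2 + 3*x + 1
+print(factor(x**2 - 4))                # Factoring: (x - 2)*(x + 2)
+print(simplify((x**2 - 1) / (x - 1)))  # Simplification: x + 1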
+```
+---
+##  Deployment
+### System Requirements
+**Minimum (4-bit Quantization):**
+- GPU: 20GB VRAM (RTX 4090, A5000)
+- RAM: 32GB
+- Storage: 30GB
+**Recommended (BF16):**
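+- GPU: 48GB VRAM (A40, A6000); the BF16 weights alone occupy about 60GB (30B parameters × 2 bytes), so a single 48GB card needs a second GPU or CPU offloading through `device_map="auto"`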
+- RAM: 64GB
+- Storage: 70GB
+**Optimal (Production):**
+- GPU: 80GB VRAM (A100, H100)
+- RAM: 128GB
+- Storage: 100GB SSD
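The VRAM tiers above follow from the size of the weights. This back-of-the-envelope estimate counts weights only and assumes exactly 30 billion parameters. Activations, the KV cache, and quantization metadata add to these figures.

```python
PARAMETERS = 30e9  # assumption: exactly 30B parameters

BYTES_PER_PARAMETER = {
    "bf16": 2.0,
    "int8": 1.0,
    "4-bit": 0.5,
}


def weight_memory_gb(precision: str) -> float:
    """Memory taken by the weights alone, in GB (10**9 bytes)."""
    return PARAMETERS * BYTES_PER_PARAMETER[precision] / 1e9


for precision in BYTES_PER_PARAMETER:
    print(f"{precision:>5}: {weight_memory_gb(precision):.0f} GB")
# bf16: 60 GB, int8: 30 GB, 4-bit: 15 GB
```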
+### Quantization Options
+```python
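+# Quantized loading uses bitsandbytes (pip install bitsandbytes)
+import torch
+from transformers import AutoModelForCausalLM, BitsAndBytesConfig
+# 8-bit (30GB VRAM)
+model = AutoModelForCausalLM.from_pretrained(
+    "Kirim-ai/Kirim-1-Math",
+    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
+    device_map="auto",
+    trust_remote_code=True
+)
+# 4-bit (20GB VRAM)
+model = AutoModelForCausalLM.from_pretrained(
+    "Kirim-ai/Kirim-1-Math",
+    quantization_config=BitsAndBytesConfig(
+        load_in_4bit=True,
+        bnb_4bit_compute_dtype=torch.bfloat16
+    ),
+    device_map="auto",
+    trust_remote_code=True
+)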
+```
+---
+##  Training Details
+### Training Data
+- **Mathematics Corpus**: 500B tokens
+  - Mathematical proofs and papers
+  - Olympiad problems (IMO, USAMO, AMC)
+  - Textbooks (algebra through advanced calculus)
+  - Math Stack Exchange
+  - arXiv math papers
+- **Code**: 200B tokens
+  - Mathematical Python libraries (NumPy, SymPy, SciPy)
+  - Computational notebooks
+  - Algorithm implementations
+- **General**: 800B tokens
+  - From Kirim-V1-base pre-training
+**Total**: 1.5 Trillion tokens
+### Training Process
+**Stage 1: Continued Pre-training** (from Kirim-V1-base)
+- Started from 13B base checkpoint
+- Expanded to 30B parameters
+- Trained on math-heavy corpus
+- Duration: 45 days on 512x H100 GPUs
+**Stage 2: Mathematical Instruction Tuning**
+- 200K high-quality math problems with solutions
+- Step-by-step reasoning examples
+- Duration: 5 days
+**Stage 3: Tool Calling Training**
+- 50K tool-calling examples
+- Function definition and execution
+- Error handling and recovery
+- Duration: 3 days
+**Stage 4: Reinforcement Learning**
+- Reward model based on solution correctness
+- Self-verification training
+- Duration: 7 days
+---
+##  Limitations
+- **Computation Limits**: Cannot perform extremely large calculations without tools
+- **Proof Verification**: May occasionally make logical errors in complex proofs
+- **Theorem Knowledge**: Limited to theorems in training data (pre-Oct 2024)
+- **Visual Math**: Cannot process images of equations or diagrams
+- **Real-time Data**: Cannot access current mathematical research or live data
+---
+##  Model Series Comparison
+| Model | Parameters | Purpose | Tool Calling | Best For |
+|-------|------------|---------|--------------|----------|
+| Kirim-V1-base | 13B | Foundation | ❌ | Research, fine-tuning |
+| Kirim-V1-7B-Chat | 7B | Conversation | ❌ | Production chatbots |
+| **Kirim-1-Math** | 30B | Mathematics | ✅ | **Math problems, STEM education** |
+| Kirim-V2 (coming) | 30B+ | Multimodal | ✅ | Visual reasoning |
+---
+##  Citation
+```bibtex
+@misc{kirim2024math,
+  title={Kirim-1-Math: Advanced Mathematical Reasoning with Tool Calling},
+  author={Kirim AI Research Team},
+  year={2025},
+  publisher={Kirim AI},
+  url={https://huggingface.co/Kirim-ai/Kirim-1-Math}
+}
+```
+---
+##  Contributing
+We welcome contributions!
+---
+##  License
+Apache License 2.0 - See [LICENSE](LICENSE) for details.