Kirim1 committed on
Commit
617bc7a
·
verified ·
1 Parent(s): 2d818c3

Update README.md

  1. README.md +519 -3
README.md CHANGED
@@ -1,3 +1,519 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ ---
2
+ license: apache-2.0
3
+ language:
4
+ - zh
5
+ - en
6
+ library_name: transformers
7
+ pipeline_tag: text-generation
8
+ tags:
9
+ - causal-lm
10
+ - math
11
+ - reasoning
12
+ - tool-calling
13
+ - function-calling
14
+ - bilingual
15
+ - code
16
+ - symbolic-solver
17
+ - llm
18
+ - pytorch
19
+ base_model: Kirim-ai/Kirim-V1-base
20
+ datasets: []
21
+ metrics:
22
+ - math
23
+ - gsm8k
24
+ - minerva
25
+ model-index:
26
+ - name: Kirim-1-Math
27
+ results: []
28
+ widget:
29
+ - text: "Solve: ∫(x² + 2x + 1)dx"
30
+ example_title: "Calculus Integration"
31
+ - text: "解方程组: 2x + 3y = 12, 4x - y = 5"
32
+ example_title: "System of Equations (Chinese)"
33
+ - text: "Use the calculator tool to compute 2^128"
34
+ example_title: "Tool Calling Example"
35
+ ---
36
+
37
+ # Kirim-1-Math (30B)
38
+
39
+ <div align="center">
40
+
41
+ **The First Kirim Model with Advanced Mathematical Reasoning and Tool Calling**
42
+
43
+ [Base Model](https://huggingface.co/Kirim-ai/Kirim-V1-base) • [Technical Paper]()
44
+
45
+ </div>
46
+
47
+ ---
48
+
49
+ ## Introduction
50
+
51
+ **Kirim-1-Math** is a 30-billion-parameter mathematical reasoning model in the Kirim series. As the **first Kirim model with tool calling capabilities**, it combines multi-step mathematical problem solving with the ability to call external tools and run calculations.
52
+
53
+ ### Key Features
54
+
55
+ - **Advanced Math Reasoning**: Trained on mathematical proofs, olympiad problems, and research papers
56
+ - **Tool Calling**: First in Kirim series with function calling capabilities
57
+ - **Symbolic Solver**: Handles algebraic manipulation, calculus, and symbolic computation
58
+ - **Bilingual**: Solves problems in both Chinese and English
59
+ - **Code Execution**: Writes Python code for numerical solutions and runs it through the code executor tool
60
+ - **LaTeX Output**: Generates properly formatted mathematical expressions
61
+ - **30B Parameters**: More capacity for multi-step reasoning than the 7B and 13B Kirim models
62
+
63
+ ---
64
+
65
+ ## Model Specifications
66
+
67
+ | Parameter | Value | Comparison |
68
+ |-----------|-------|------------|
69
+ | Parameters | 30B | About 2.3× the 13B base |
70
+ | Hidden Size | 5,120 | Enhanced capacity |
71
+ | Layers | 48 | Deep reasoning |
72
+ | Attention Heads | 40 | Fine-grained attention |
73
+ | KV Heads | 8 (GQA) | Memory efficient |
74
+ | Context Length | 32,768 tokens | Extended problems |
75
+ | Vocabulary | 102,400 | Same as base |
76
+ | Tool Calling | ✅ Yes | **New feature!** |
77
+ | Precision | BFloat16 | High quality |
78
+
79
+ ### Architecture Highlights
80
+
81
+ - **Deeper Network**: 48 layers for complex multi-step reasoning
82
+ - **Wider Hidden States**: 5,120 dimensions for richer representations
83
+ - **Grouped Query Attention**: 5:1 ratio (40:8) for efficiency
84
+ - **Extended Training**: Specialized on mathematical datasets
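To make the benefit of the 40:8 grouped-query layout concrete, the sketch below estimates the KV-cache size at the full 32,768-token context using the numbers from the specification table. It assumes a head dimension of `hidden_size / attention_heads` (128) and BF16 cache entries (2 bytes each); neither is stated explicitly in this card, and `kv_cache_bytes` is an illustrative helper, not part of the model's code.

```python
# Rough KV-cache estimate from the specification table.
# Assumptions (not stated in the model card): head_dim = hidden_size / heads,
# and cache entries are stored in BF16 (2 bytes each).

HIDDEN_SIZE = 5120
NUM_LAYERS = 48
NUM_HEADS = 40
NUM_KV_HEADS = 8
CONTEXT_LENGTH = 32_768
BYTES_PER_VALUE = 2  # BF16


def kv_cache_bytes(kv_heads: int, tokens: int) -> int:
    """Bytes needed to cache keys and values for `tokens` positions."""
    head_dim = HIDDEN_SIZE // NUM_HEADS  # 128 under the assumption above
    # Factor 2 accounts for storing both K and V in every layer.
    per_token = 2 * NUM_LAYERS * kv_heads * head_dim * BYTES_PER_VALUE
    return per_token * tokens


gqa = kv_cache_bytes(NUM_KV_HEADS, CONTEXT_LENGTH)
mha = kv_cache_bytes(NUM_HEADS, CONTEXT_LENGTH)

print(f"GQA (8 KV heads):  {gqa / 2**30:.1f} GiB")
print(f"MHA (40 KV heads): {mha / 2**30:.1f} GiB")
print(f"Reduction factor:  {mha // gqa}x")
```

Under these assumptions a single full-length sequence needs about 6 GiB of cache with 8 KV heads, versus about 30 GiB if every one of the 40 heads kept its own keys and values.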
85
+
86
+ ---
87
+
88
+ ## Quick Start
89
+
90
+ ### Installation
91
+
92
+ ```bash
93
+ pip install transformers torch accelerate sympy
94
+ ```
95
+
96
+ ### Basic Usage
97
+
98
+ ```python
99
+ from transformers import AutoModelForCausalLM, AutoTokenizer
100
+
101
+ # Load model
102
+ model = AutoModelForCausalLM.from_pretrained(
103
+     "Kirim-ai/Kirim-1-Math",
104
+     torch_dtype="auto",
105
+     device_map="auto",
106
+     trust_remote_code=True
107
+ )
108
+
109
+ tokenizer = AutoTokenizer.from_pretrained(
110
+     "Kirim-ai/Kirim-1-Math",
111
+     trust_remote_code=True
112
+ )
113
+
114
+ # Solve a math problem
115
+ messages = [
116
+     {"role": "user", "content": "Solve the quadratic equation: x² - 5x + 6 = 0"}
117
+ ]
118
+
119
+ inputs = tokenizer.apply_chat_template(
120
+     messages,
121
+     return_tensors="pt",
122
+     add_generation_prompt=True
123
+ ).to(model.device)
124
+
125
+ outputs = model.generate(
126
+     inputs,
127
+     max_new_tokens=2048,
128
+     # Greedy decoding keeps math answers deterministic; temperature and
129
+     # top_p only take effect when do_sample=True, so they are omitted.
130
+     do_sample=False
131
+ )
132
+
133
+ response = tokenizer.decode(outputs[0], skip_special_tokens=True)
134
+ print(response)
135
+ ```
136
+
137
+ ---
138
+
139
+ ## Tool Calling
140
+
141
+ Kirim-1-Math is the **first Kirim model** with built-in tool calling capabilities.
142
+
143
+ ### Available Tools
144
+
145
+ The model can use these built-in mathematical tools:
146
+
147
+ 1. **Calculator**: Precise arithmetic operations
148
+ 2. **Symbolic Solver**: Algebraic manipulations
149
+ 3. **Code Executor**: Run Python/SymPy code
150
+ 4. **Plot Generator**: Create mathematical visualizations
151
+ 5. **Theorem Lookup**: Access mathematical theorems and formulas
152
+
153
+ ### Tool Calling Example
154
+
155
+ ```python
156
+ messages = [
157
+     {
158
+         "role": "user",
159
+         "content": "Calculate 2^1024 and tell me how many digits it has"
160
+     }
161
+ ]
162
+
163
+ # The model decides on its own whether to call the calculator tool
164
+ inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True).to(model.device)
165
+ outputs = model.generate(inputs, max_new_tokens=2048)
166
+
167
+ # The response will include tool calls like:
168
+ # <tool_call>
169
+ # {
170
+ #   "name": "calculator",
171
+ #   "arguments": {
172
+ #     "expression": "2**1024"
173
+ #   }
174
+ # }
175
+ # </tool_call>
176
+ ```
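The model only emits the tool call; the calling application still has to parse it and run the tool. The following is a minimal sketch of that step, assuming the `<tool_call>` JSON format shown above. The sample `response_text`, `extract_tool_calls`, and `safe_eval` are hypothetical and not part of the Kirim or transformers APIs.

```python
import ast
import json
import operator
import re

# Hypothetical model output, following the <tool_call> format shown above.
response_text = """
I'll use the calculator for this.
<tool_call>
{
  "name": "calculator",
  "arguments": {
    "expression": "2**1024"
  }
}
</tool_call>
"""

_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}


def safe_eval(expression: str):
    """Evaluate +, -, *, /, ** on numeric literals without calling eval()."""
    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError("unsupported expression")
    return walk(ast.parse(expression, mode="eval").body)


def extract_tool_calls(text: str) -> list:
    """Return the JSON payload of every <tool_call>...</tool_call> block."""
    pattern = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)
    return [json.loads(payload) for payload in pattern.findall(text)]


calls = extract_tool_calls(response_text)
result = safe_eval(calls[0]["arguments"]["expression"])
print(calls[0]["name"], "->", len(str(result)), "digits")
```

For this input the calculator result has 309 digits. A production dispatcher would also need to cap exponent sizes and feed the result back to the model as a follow-up message; the message format for that step is not documented here.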
177
+
178
+ ### Custom Tool Definition
179
+
180
+ ```python
181
+ tools = [
182
+     {
183
+         "type": "function",
184
+         "function": {
185
+             "name": "scientific_calculator",
186
+             "description": "Perform advanced scientific calculations",
187
+             "parameters": {
188
+                 "type": "object",
189
+                 "properties": {
190
+                     "expression": {
191
+                         "type": "string",
192
+                         "description": "Mathematical expression to evaluate"
193
+                     },
194
+                     "precision": {
195
+                         "type": "integer",
196
+                         "description": "Decimal precision",
197
+                         "default": 10
198
+                     }
199
+                 },
200
+                 "required": ["expression"]
201
+             }
202
+         }
203
+     }
204
+ ]
205
+
206
+ # Include tools in prompt
207
+ messages = [
208
+     {"role": "system", "content": f"You have access to these tools: {tools}"},
209
+     {"role": "user", "content": "Calculate sin(π/4) with 15 decimal places"}
210
+ ]
211
+ ```
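Two practical details follow from the definition above: a Python dict formatted with an f-string is rendered with single quotes (not valid JSON), and the `default` and `required` fields only help if the caller enforces them. The sketch below shows one way to handle both. It is an assumption that the model accepts JSON-serialized tool definitions in the system prompt, and `prepare_arguments` is a hypothetical helper.

```python
import json

# Same shape as the tool definition above, trimmed to the fields used here.
tool = {
    "type": "function",
    "function": {
        "name": "scientific_calculator",
        "parameters": {
            "type": "object",
            "properties": {
                "expression": {"type": "string"},
                "precision": {"type": "integer", "default": 10},
            },
            "required": ["expression"],
        },
    },
}

# json.dumps produces valid JSON (double quotes), unlike str() on a Python dict.
system_prompt = "You have access to these tools: " + json.dumps([tool])


def prepare_arguments(tool_spec: dict, arguments: dict) -> dict:
    """Check required fields and fill in schema defaults (hypothetical helper)."""
    schema = tool_spec["function"]["parameters"]
    missing = [name for name in schema.get("required", []) if name not in arguments]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")
    prepared = dict(arguments)
    for name, prop in schema["properties"].items():
        if name not in prepared and "default" in prop:
            prepared[name] = prop["default"]
    return prepared


print(prepare_arguments(tool, {"expression": "sin(pi/4)"}))
```

Here the call omits `precision`, so the helper fills in the schema default of 10 before the tool is invoked.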
212
+
213
+ ---
214
+
215
+ ## Mathematical Capabilities
216
+
217
+ ### 1. Algebraic Reasoning
218
+
219
+ ```python
220
+ # Example: Solve system of equations
221
+ problem = """
222
+ 解方程组:
223
+ 2x + 3y = 12
224
+ 4x - y = 5
225
+ """
226
+
227
+ response = model.generate_solution(problem)
228
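+ # Output includes a step-by-step solution; generate_solution is not a standard transformers method and is assumed to come from the model's custom code (trust_remote_code=True)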
229
+ ```
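Model answers to problems like this one are easy to check independently. The sketch below solves the same system (2x + 3y = 12, 4x - y = 5) exactly with Cramer's rule; `solve_2x2` is an illustrative helper and has nothing to do with how the model itself reasons.

```python
from fractions import Fraction


def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 exactly with Cramer's rule."""
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("system has no unique solution")
    x = Fraction(c1 * b2 - c2 * b1, det)
    y = Fraction(a1 * c2 - a2 * c1, det)
    return x, y


x, y = solve_2x2(2, 3, 12, 4, -1, 5)
print(x, y)

# Substitute back into both equations as a sanity check.
assert 2 * x + 3 * y == 12
assert 4 * x - y == 5
```

The exact solution is x = 27/14 and y = 19/7, which a reference answer from the model should match.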
230
+
231
+ ### 2. Calculus
232
+
233
+ ```python
234
+ # Integration
235
+ problem = "Calculate: ∫(x³ + 2x² - x + 1)dx"
236
+
237
+ # Differentiation
238
+ problem = "Find dy/dx if y = ln(x²) + e^(3x)"
239
+ ```
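Both calculus prompts have closed-form answers that can be verified without the model. The sketch below integrates the polynomial term by term with exact fractions and checks the hand-derived derivative 2/x + 3e^(3x) against a central finite difference; the helper names are illustrative.

```python
from fractions import Fraction
import math


def integrate_poly(coeffs):
    """Antiderivative of a polynomial given as [c0, c1, c2, ...] (lowest degree first).

    The constant of integration is taken as 0.
    """
    return [Fraction(0)] + [Fraction(c, k + 1) for k, c in enumerate(coeffs)]


# x^3 + 2x^2 - x + 1, written lowest degree first.
antiderivative = integrate_poly([1, -1, 2, 1])
print(antiderivative)  # coefficients of x^0 .. x^4


def y(x):
    return math.log(x**2) + math.exp(3 * x)


def dy_dx(x):
    # Hand-derived: d/dx ln(x^2) = 2/x and d/dx e^(3x) = 3 e^(3x).
    return 2 / x + 3 * math.exp(3 * x)


h = 1e-6
x0 = 1.5
numeric = (y(x0 + h) - y(x0 - h)) / (2 * h)
print(abs(numeric - dy_dx(x0)) < 1e-4)
```

The antiderivative coefficients correspond to x + C - x²/2 + 2x³/3 + x⁴/4, and the finite-difference check agrees with the analytic derivative at the sample point.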
240
+
241
+ ### 3. Probability & Statistics
242
+
243
+ ```python
244
+ problem = """
245
+ A bag contains 5 red balls and 3 blue balls.
246
+ What's the probability of drawing 2 red balls without replacement?
247
+ """
248
+ ```
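For reference, the expected answer to this prompt can be computed two ways: as a product of sequential draw probabilities, or by counting combinations. This is a verification sketch, independent of the model.

```python
from fractions import Fraction
from math import comb

red, blue, draws = 5, 3, 2

# Sequential view: first ball red, then a second red from the remaining balls.
sequential = Fraction(red, red + blue) * Fraction(red - 1, red + blue - 1)

# Counting view: ways to choose 2 reds over ways to choose any 2 balls.
counting = Fraction(comb(red, draws), comb(red + blue, draws))

print(sequential, counting)
```

Both approaches give 5/14, roughly 0.357.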
249
+
250
+ ### 4. Number Theory
251
+
252
+ ```python
253
+ problem = "Prove that √2 is irrational"
254
+ # Model provides formal mathematical proof
255
+ ```
256
+
257
+ ### 5. Geometry
258
+
259
+ ```python
260
+ problem = """
261
+ In triangle ABC, if AB = 5, BC = 7, and AC = 8,
262
+ find the area using Heron's formula.
263
+ """
264
+ ```
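Heron's formula is simple enough to evaluate directly, which gives a reference value for the prompt above. The function below is an illustrative check, not model output.

```python
import math


def heron_area(a: float, b: float, c: float) -> float:
    """Area of a triangle from its side lengths using Heron's formula."""
    if a + b <= c or a + c <= b or b + c <= a:
        raise ValueError("side lengths do not form a triangle")
    s = (a + b + c) / 2  # semi-perimeter
    return math.sqrt(s * (s - a) * (s - b) * (s - c))


area = heron_area(5, 7, 8)
print(round(area, 4))
```

With s = 10 the product under the root is 10 · 5 · 3 · 2 = 300, so the area is 10√3, about 17.3205.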
265
+
266
+ ---
267
+
268
+
269
+ ## Use Cases
270
+
271
+ ### 1. Educational Tutoring
272
+
273
+ ```python
274
+ messages = [
275
+     {
276
+         "role": "user",
277
+         "content": "I don't understand how to complete the square. Can you explain and show an example?"
278
+     }
279
+ ]
280
+ # Provides step-by-step explanations
281
+ ```
282
+
283
+ ### 2. Research Assistance
284
+
285
+ ```python
286
+ messages = [
287
+     {
288
+         "role": "user",
289
+         "content": "Help me verify this proof about convergence of infinite series"
290
+     }
291
+ ]
292
+ # Analyzes mathematical proofs
293
+ ```
294
+
295
+ ### 3. Homework Help
296
+
297
+ ```python
298
+ messages = [
299
+     {
300
+         "role": "user",
301
+         "content": "Solve these 10 calculus problems and show your work"
302
+     }
303
+ ]
304
+ # Solves problems with detailed steps
305
+ ```
306
+
307
+ ### 4. Competition Preparation
308
+
309
+ ```python
310
+ messages = [
311
+     {
312
+         "role": "user",
313
+         "content": "Give me 5 AMC-level problems to practice"
314
+     }
315
+ ]
316
+ # Generates practice problems
317
+ ```
318
+
319
+ ### 5. Code-Assisted Solving
320
+
321
+ ```python
322
+ messages = [
323
+     {
324
+         "role": "user",
325
+         "content": "Use numerical methods to find roots of x^5 - 3x^3 + 2x - 1 = 0"
326
+     }
327
+ ]
328
+ # Writes and executes numerical solver
329
+ ```
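As an illustration of the kind of solver such a request might produce, here is a minimal bisection sketch for the polynomial in the prompt. The bracket [1, 2] was chosen by hand because f(1) = -1 and f(2) = 11 have opposite signs; the code the model actually writes may use a different method.

```python
def f(x: float) -> float:
    return x**5 - 3 * x**3 + 2 * x - 1


def bisect(func, lo: float, hi: float, tol: float = 1e-12, max_iter: int = 200) -> float:
    """Find a root of func in [lo, hi]; func(lo) and func(hi) must differ in sign."""
    f_lo, f_hi = func(lo), func(hi)
    if f_lo * f_hi > 0:
        raise ValueError("root is not bracketed")
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        f_mid = func(mid)
        if f_mid == 0 or (hi - lo) / 2 < tol:
            return mid
        if f_lo * f_mid < 0:
            hi = mid  # sign change is in the lower half
        else:
            lo, f_lo = mid, f_mid  # sign change is in the upper half
    return (lo + hi) / 2


root = bisect(f, 1.0, 2.0)
print(root, f(root))
```

Bisection is slow compared with Newton's method but cannot diverge once a sign change is bracketed, which makes it a reasonable default when the model's output is executed unattended.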
330
+
331
+ ---
332
+
333
+ ## Advanced Features
334
+
335
+ ### Step-by-Step Reasoning
336
+
337
+ The model shows its work:
338
+
339
+ ```
340
+ Problem: Solve x² - 5x + 6 = 0
341
+
342
+ Solution:
343
+ Step 1: Identify this as a quadratic equation in standard form ax² + bx + c = 0
344
+ where a=1, b=-5, c=6
345
+
346
+ Step 2: Try factoring: We need two numbers that multiply to 6 and add to -5
347
+ Those numbers are -2 and -3
348
+
349
+ Step 3: Factor: (x - 2)(x - 3) = 0
350
+
351
+ Step 4: Apply zero product property:
352
+ x - 2 = 0 or x - 3 = 0
353
+
354
+ Step 5: Solve each equation:
355
+ x = 2 or x = 3
356
+
357
+ Answer: x = 2 or x = 3
358
+ ```
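A worked solution like the one above can be cross-checked mechanically. The sketch below applies the quadratic formula to the same equation and substitutes the roots back in; `solve_quadratic` is an illustrative helper that handles real roots only.

```python
import math


def solve_quadratic(a: float, b: float, c: float) -> tuple:
    """Real roots of a*x^2 + b*x + c = 0, sorted ascending (empty if none)."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return ()
    sqrt_disc = math.sqrt(disc)
    # A set removes the duplicate when the discriminant is zero.
    return tuple(sorted({(-b - sqrt_disc) / (2 * a), (-b + sqrt_disc) / (2 * a)}))


roots = solve_quadratic(1, -5, 6)
print(roots)

# Substitute each root back into x^2 - 5x + 6.
for x in roots:
    assert x * x - 5 * x + 6 == 0
```

The discriminant is 25 - 24 = 1, so the roots are 2 and 3, matching Step 5.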
359
+
360
+ ### LaTeX Output
361
+
362
+ ```python
363
+ # Request LaTeX formatted output
364
+ messages = [
365
+     {
366
+         "role": "user",
367
+         "content": "Solve this and format the answer in LaTeX: ∫(x² + 1)/(x³ + 3x + 1)dx"
368
+     }
369
+ ]
370
+
371
+ # Output includes:
372
+ # $$\int \frac{x^2 + 1}{x^3 + 3x + 1}dx = ...$$
373
+ ```
374
+
375
+ ### Symbolic Manipulation
376
+
377
+ The code executor tool can run SymPy for symbolic computation:
378
+
379
+ ```python
380
+ from sympy import symbols, expand, factor, simplify
381
+
382
+ # Model can perform:
383
+ # - Expansion: (x+1)³ → x³ + 3x² + 3x + 1
384
+ # - Factoring: x² - 4 → (x-2)(x+2)
385
+ # - Simplification: (x²-1)/(x-1) → x+1
386
+ ```
387
+
388
+ ---
389
+
390
+ ## Deployment
391
+
392
+ ### System Requirements
393
+
394
+ **Minimum (4-bit Quantization):**
395
+ - GPU: 20GB VRAM (RTX 4090, A5000)
396
+ - RAM: 32GB
397
+ - Storage: 30GB
398
+
399
+ **Recommended (BF16):**
400
+ - GPU: 2× 48GB VRAM (A40, A6000), since BF16 weights alone take about 60GB
401
+ - RAM: 64GB
402
+ - Storage: 70GB
403
+
404
+ **Optimal (Production):**
405
+ - GPU: 80GB VRAM (A100, H100)
406
+ - RAM: 128GB
407
+ - Storage: 100GB SSD
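The VRAM tiers can be sanity-checked with a back-of-the-envelope estimate of weight memory at each precision. The sketch below counts weights only; activations, the KV cache, and quantization metadata add to these figures, and the bytes-per-parameter values are the usual nominal ones rather than measurements of this model.

```python
PARAMS = 30e9  # 30B parameters, from the specification table

# Nominal storage cost per parameter at each precision.
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_gb(precision: str) -> float:
    """Approximate weight memory in GB (10^9 bytes) for the given precision."""
    return PARAMS * BYTES_PER_PARAM[precision] / 1e9


for precision in BYTES_PER_PARAM:
    print(f"{precision}: ~{weight_gb(precision):.0f} GB of weights")
```

This gives roughly 60 GB for BF16, 30 GB for 8-bit, and 15 GB for 4-bit. The 4-bit figure fits the 20GB tier with some headroom, while BF16 weights exceed a single 48GB card and would need a second GPU or CPU offloading through `device_map="auto"`.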
408
+
409
+ ### Quantization Options
410
+
411
+ ```python
412
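+ # 8-bit (30GB VRAM); requires the bitsandbytes package
412
+ model = AutoModelForCausalLM.from_pretrained(
413
+     "Kirim-ai/Kirim-1-Math",
414
+     load_in_8bit=True,  # deprecated in recent transformers; prefer quantization_config=BitsAndBytesConfig(load_in_8bit=True)
415
+     device_map="auto"
416
+ )
417
+
418
+ # 4-bit (20GB VRAM); requires the bitsandbytes package
419
+ model = AutoModelForCausalLM.from_pretrained(
420
+     "Kirim-ai/Kirim-1-Math",
421
+     load_in_4bit=True,  # deprecated in recent transformers; prefer quantization_config=BitsAndBytesConfig(load_in_4bit=True)
422
+     device_map="auto"
423
+ )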
425
+ ```
426
+
427
+ ---
428
+
429
+ ## Training Details
430
+
431
+ ### Training Data
432
+
433
+ - **Mathematics Corpus**: 500B tokens
434
+   - Mathematical proofs and papers
435
+   - Olympiad problems (IMO, USAMO, AMC)
436
+   - Textbooks (algebra through advanced calculus)
437
+   - Math Stack Exchange
438
+   - arXiv math papers
439
+
440
+ - **Code**: 200B tokens
441
+   - Mathematical Python libraries (NumPy, SymPy, SciPy)
442
+   - Computational notebooks
443
+   - Algorithm implementations
444
+
445
+ - **General**: 800B tokens
446
+   - From Kirim-V1-base pre-training
447
+
448
+ **Total**: 1.5 trillion tokens
449
+
450
+ ### Training Process
451
+
452
+ **Stage 1: Continued Pre-training** (from Kirim-V1-base)
453
+ - Started from 13B base checkpoint
454
+ - Expanded to 30B parameters
455
+ - Trained on math-heavy corpus
456
+ - Duration: 45 days on 512x H100 GPUs
457
+
458
+ **Stage 2: Mathematical Instruction Tuning**
459
+ - 200K high-quality math problems with solutions
460
+ - Step-by-step reasoning examples
461
+ - Duration: 5 days
462
+
463
+ **Stage 3: Tool Calling Training**
464
+ - 50K tool-calling examples
465
+ - Function definition and execution
466
+ - Error handling and recovery
467
+ - Duration: 3 days
468
+
469
+ **Stage 4: Reinforcement Learning**
470
+ - Reward model based on solution correctness
471
+ - Self-verification training
472
+ - Duration: 7 days
473
+
474
+ ---
475
+
476
+ ## Limitations
477
+
478
+ - **Computation Limits**: Cannot perform extremely large calculations without tools
479
+ - **Proof Verification**: May occasionally make logical errors in complex proofs
480
+ - **Theorem Knowledge**: Limited to theorems in training data (pre-Oct 2024)
481
+ - **Visual Math**: Cannot process images of equations or diagrams
482
+ - **Real-time Data**: Cannot access current mathematical research or live data
483
+
484
+ ---
485
+
486
+ ## Model Series Comparison
487
+
488
+ | Model | Parameters | Purpose | Tool Calling | Best For |
489
+ |-------|------------|---------|--------------|----------|
490
+ | Kirim-V1-base | 13B | Foundation | ❌ | Research, fine-tuning |
491
+ | Kirim-V1-7B-Chat | 7B | Conversation | ❌ | Production chatbots |
492
+ | **Kirim-1-Math** | 30B | Mathematics | ✅ | **Math problems, STEM education** |
493
+ | Kirim-V2 (coming) | 30B+ | Multimodal | ✅ | Visual reasoning |
494
+
495
+ ---
496
+
497
+ ## Citation
498
+
499
+ ```bibtex
500
+ @misc{kirim2024math,
501
+   title={Kirim-1-Math: Advanced Mathematical Reasoning with Tool Calling},
502
+   author={Kirim AI Research Team},
503
+   year={2025},
504
+   publisher={Kirim AI},
505
+   url={https://huggingface.co/Kirim-ai/Kirim-1-Math}
506
+ }
507
+ ```
508
+
509
+ ---
510
+
511
+ ## Contributing
512
+
513
+ We welcome contributions!
514
+
515
+ ---
516
+
517
+ ## License
518
+
519
+ Apache License 2.0 - See [LICENSE](LICENSE) for details.