Kaileh57 committed
Commit b90bb79 · verified · 1 Parent(s): fc21c61

Added model card for v0.2

Files changed (1): readme.md (added, +90 lines)
# Ursa Minor v0.2

[![Apache License 2.0](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)

A reasoning-enhanced language model distilled from Google's Gemini 2.0 Flash Thinking into Qwen 1.5B. Version 0.2 is a significant improvement over v0.1, with stronger reasoning and more coherent text generation.

## Model Overview

Ursa Minor v0.2 is designed to mimic the chain-of-thought reasoning patterns of Google's Gemini 2.0 Flash Thinking model. It demonstrates step-by-step reasoning for problem-solving tasks and provides explanations with visible thought processes.

### Specifications

- **Base Model**: Qwen 1.5B, a 1.5-billion-parameter decoder-only transformer
- **Context Window**: 4,096 tokens
- **Tokenizer**: Same as Qwen 1.5B
- **Parameter Count**: 1.5B

## Model Access

The model is available on Hugging Face in two versions:

- Original: [https://huggingface.co/Kaileh57/Ursa_Minor](https://huggingface.co/Kaileh57/Ursa_Minor)
- Quantized (GGUF): [https://huggingface.co/mradermacher/Ursa_Minor-GGUF](https://huggingface.co/mradermacher/Ursa_Minor-GGUF)

## Usage Example

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer
model_path = "Kaileh57/Ursa_Minor"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

# Create a reasoning prompt
prompt = """Think through this step by step:

How would you determine if a number is a prime number? Design an algorithm and trace through it for the number 29.
"""

# Format the prompt with the model's chat template
formatted_prompt = tokenizer.apply_chat_template([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt}
], tokenize=False, add_generation_prompt=True)

# Generate a response (do_sample=True so the temperature setting takes effect)
inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1024, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)

print(response)
```

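As a reference point, the trial-division algorithm that the example prompt asks the model to derive can be written directly. This is an illustrative implementation for checking the model's answer, not model output:

```python
import math

def is_prime(n: int) -> bool:
    """Trial division: test divisors up to sqrt(n)."""
    if n < 2:
        return False
    if n < 4:
        return True          # 2 and 3 are prime
    if n % 2 == 0:
        return False
    # Only odd divisors from 3 up to sqrt(n) need checking
    for d in range(3, math.isqrt(n) + 1, 2):
        if n % d == 0:
            return False
    return True

# Tracing 29: isqrt(29) = 5, and 29 is divisible by neither 3 nor 5
print(is_prime(29))  # True
```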
## Training Methodology

The model was created using knowledge distillation, in which Qwen 1.5B (the student model) was trained to mimic the reasoning patterns of Gemini 2.0 Flash Thinking (the teacher model). This approach transfers reasoning capabilities from the larger teacher to the smaller student.

The distillation process used one of two primary methods:

- **Logit-based distillation**: the student is trained to produce output probability distributions similar to the teacher's
- **Hidden-state-based distillation**: the student's internal representations are aligned with those of the teacher

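The logit-based variant can be sketched as a temperature-softened KL divergence between the teacher's and student's token distributions. The following is a minimal pure-Python illustration of that objective, not the actual training code (the real recipe, temperature, and loss weighting are not documented here):

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw logits to probabilities, optionally softened by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)                           # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened distributions,
    scaled by T^2 as in standard knowledge distillation."""
    p = softmax(teacher_logits, temperature)  # teacher (target) distribution
    q = softmax(student_logits, temperature)  # student distribution
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

# Identical logits give zero loss; mismatched logits give a positive loss
print(distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))      # 0.0
print(distillation_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0]) > 0)  # True
```

In practice this per-token loss would be computed over the full vocabulary at every position and often combined with a standard cross-entropy term on the ground-truth tokens.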
## Intended Use

This model is designed to:

- Demonstrate step-by-step reasoning for problem-solving tasks
- Break down complex problems into manageable components
- Provide explanations with visible thought processes
- Support educational scenarios where seeing the reasoning process is beneficial

## Limitations

- **Reasoning Depth**: may not achieve the same reasoning depth as Gemini due to the large difference in parameter count
- **Scope**: reasoning capabilities are limited to the types of problems seen during training
- **Mathematical Accuracy**: may make calculation errors on complex mathematical problems
- **Hallucination**: may occasionally generate plausible-sounding but incorrect reasoning steps
- **Size Constraints**: at 1.5B parameters, has less capacity than larger models like Gemini

## Ethical Considerations

- The model may inherit biases present in both the Qwen base model and the Gemini responses
- Reasoning chains may occasionally reinforce stereotypes or contain subtle biases
- The model should not be used for critical decision-making without human oversight
- Responses should be verified for correctness, especially for domain-specific reasoning

## License

This project is licensed under the Apache License 2.0.