Added model card for v0.2
# Ursa Minor v0.2

[License: Apache 2.0](https://opensource.org/licenses/Apache-2.0)

A reasoning-enhanced language model distilled from Google's Gemini 2.0 Flash Thinking into Qwen 1.5B. Version 0.2 is a significant improvement over v0.1, with stronger reasoning and more coherent text generation.

## Model Overview

Ursa Minor v0.2 is designed to mimic the chain-of-thought reasoning patterns of Google's Gemini 2.0 Flash Thinking model. It works through problem-solving tasks step by step and provides explanations with a visible thought process.

### Specifications

- **Base Model**: Qwen 1.5B, a decoder-only transformer
- **Parameter Count**: 1.5B
- **Context Window**: 4096 tokens
- **Tokenizer**: Same as Qwen 1.5B

## Model Access

The model is available on Hugging Face in two versions:

- Original: [https://huggingface.co/Kaileh57/Ursa_Minor](https://huggingface.co/Kaileh57/Ursa_Minor)
- Quantized (GGUF): [https://huggingface.co/mradermacher/Ursa_Minor-GGUF](https://huggingface.co/mradermacher/Ursa_Minor-GGUF)

## Usage Example

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer
model_path = "Kaileh57/Ursa_Minor"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

# Create a reasoning prompt
prompt = """Think through this step by step:

How would you determine if a number is a prime number? Design an algorithm and trace through it for the number 29.
"""

# Format the prompt with the model's chat template
formatted_prompt = tokenizer.apply_chat_template(
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": prompt},
    ],
    tokenize=False,
    add_generation_prompt=True,  # append the assistant turn marker before generating
)

# Generate a response (do_sample=True so temperature takes effect)
inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1024, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(response)
```
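For reference when checking the model's answer to the prompt above, the standard algorithm it is being asked to derive is trial division up to the square root of n. A minimal sketch (illustrative only, not part of the model's API):

```python
import math

def is_prime(n: int) -> bool:
    """Trial division: test divisors from 2 up to floor(sqrt(n))."""
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False
    return True

# Trace for 29: isqrt(29) = 5, so we test d = 2, 3, 4, 5;
# none divides 29 evenly, so 29 is prime.
print(is_prime(29))  # True
```

A correct model response should arrive at the same trace: only divisors up to 5 need checking, and none divides 29.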

## Training Methodology

The model was created via knowledge distillation: Qwen 1.5B (the student) was trained to mimic the reasoning patterns of Gemini 2.0 Flash Thinking (the teacher), transferring reasoning capability from the larger teacher to the smaller student.

The distillation process used one of two primary methods:

- **Logit-based distillation**: the student is trained to produce output probability distributions similar to the teacher's
- **Hidden-states-based distillation**: the student's internal representations are aligned with those of the teacher
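As a sketch of what these two objectives look like in practice (illustrative only — the actual training code, teacher access, and hyperparameters are not documented in this card), logit-based distillation typically minimizes a KL divergence between temperature-softened teacher and student distributions, while the hidden-state variant aligns intermediate representations directly:

```python
import torch
import torch.nn.functional as F

def logit_distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between temperature-softened distributions.

    Both tensors have shape (batch, seq_len, vocab_size). The T^2 factor
    keeps gradient magnitudes comparable across temperature settings.
    """
    s = F.log_softmax(student_logits / temperature, dim=-1)
    t = F.softmax(teacher_logits / temperature, dim=-1)
    return F.kl_div(s, t, reduction="batchmean") * temperature ** 2

def hidden_state_loss(student_hidden, teacher_hidden):
    """Align internal representations, e.g. with a simple MSE.

    In practice a learned projection is often needed when the student
    and teacher hidden sizes differ.
    """
    return F.mse_loss(student_hidden, teacher_hidden)
```

Identical student and teacher outputs drive both losses to zero; during training the combined objective pulls the student toward the teacher's behavior.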

## Intended Use

This model is designed to:

- Demonstrate step-by-step reasoning for problem-solving tasks
- Break down complex problems into manageable components
- Provide explanations with visible thought processes
- Support educational scenarios where seeing the reasoning process is beneficial

## Limitations

- **Reasoning Depth**: May not achieve the same reasoning depth as Gemini due to parameter count differences
- **Scope**: Reasoning capabilities are limited to the types of problems it was exposed to during training
- **Mathematical Accuracy**: May make calculation errors on complex mathematical problems
- **Hallucination**: May occasionally generate plausible-sounding but incorrect reasoning steps
- **Size Constraints**: At 1.5B parameters, has less capacity than larger models like Gemini

## Ethical Considerations

- The model may inherit biases present in both the Qwen base model and the Gemini responses
- Reasoning chains may occasionally reinforce stereotypes or contain subtle biases
- The model should not be used for critical decision-making without human oversight
- Responses should be verified for correctness, especially for domain-specific reasoning

## License

This project is licensed under the Apache License 2.0.