---

language:
- en
license: apache-2.0
library_name: transformers
tags:
- text-generation
- granite
- math
- physics
- qlora
- ibm
base_model: ibm-granite/granite-3.3-2b-instruct
datasets:
- nvidia/Nemotron-RL-math-advanced_calculations
- camel-ai/physics
model-index:
- name: Galena-2B
  results: []
---


# Galena-2B: Granite 3.3 Math & Physics Model

![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)
![Python](https://img.shields.io/badge/python-3.10+-blue.svg)
![Transformers](https://img.shields.io/badge/🤗-transformers-orange.svg)

A specialized 2B-parameter language model fine-tuned on advanced mathematics and physics datasets. Built on IBM's [Granite 3.3-2B Instruct](https://huggingface.co/ibm-granite/granite-3.3-2b-instruct) base model with QLoRA fine-tuning on 26k instruction-response pairs covering advanced calculations and physics concepts.

## Download Model Artifacts

The HF checkpoint and GGUF exports are hosted externally (e.g., Hugging Face) and
are **not** stored inside this repository. Fetch them before running the
examples:

```bash
python scripts/download_artifacts.py --artifact all
```

- `--source huggingface` (default) pulls from `xJoepec/galena-2b-math-physics`.
- `--source mirror --hf-url ... --gguf-url ...` lets you point to release assets/CDN downloads instead.

Artifacts install under `models/math-physics/{hf,gguf}` and are ignored by Git.

## Quick Start

### Using Hugging Face Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "models/math-physics/hf",
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("models/math-physics/hf")

# Generate response
prompt = "Explain the relationship between energy and momentum in special relativity."
messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# do_sample=True is required for temperature to take effect
outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

### Using llama.cpp (GGUF)

```bash
# Requires llama.cpp build and downloaded GGUF artifact
./llama.cpp/build/bin/llama-cli \
  -m models/math-physics/gguf/granite-math-physics-f16.gguf \
  -p "Calculate the escape velocity from Earth's surface." \
  -n 256 \
  --temp 0.7
```
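
The escape-velocity prompt has a closed-form answer you can use to sanity-check the model's output. A quick reference computation (the constants are standard textbook values, not from this repo):

```python
import math

# Standard physical constants (assumed values for this sanity check)
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
M_EARTH = 5.972e24   # Earth's mass, kg
R_EARTH = 6.371e6    # Earth's mean radius, m

def escape_velocity(mass_kg: float, radius_m: float) -> float:
    """v_esc = sqrt(2GM/R), the speed needed to escape a body's gravity."""
    return math.sqrt(2 * G * mass_kg / radius_m)

v = escape_velocity(M_EARTH, R_EARTH)
print(f"Escape velocity: {v / 1000:.1f} km/s")  # ~11.2 km/s
```

If the model's answer is far from ~11.2 km/s, something is off with the prompt or sampling settings.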

## Model Details

- **Base Model**: [ibm-granite/granite-3.3-2b-instruct](https://huggingface.co/ibm-granite/granite-3.3-2b-instruct)
- **Parameters**: 2.0B
- **Architecture**: GraniteForCausalLM (40 layers, 2048 hidden size, 32 attention heads)
- **Context Length**: 131,072 tokens (128k)
- **Training Method**: QLoRA (4-bit quantization with Low-Rank Adaptation)
- **Fine-tuning Data**: 26k examples blending:
  - **nvidia/Nemotron-RL-math-advanced_calculations** - Advanced calculator tasks with tool reasoning traces
  - **camel-ai/physics** - Physics dialogue pairs with topic/subtopic metadata
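
The blending step can be pictured as interleaving records from the two sources into one JSONL training file. A minimal sketch (the field names, example rows, and shuffle-only strategy are illustrative; the repo's actual preprocessing may differ):

```python
import json
import random

def blend_datasets(sources, seed=0):
    """Merge records from several instruction datasets and shuffle them."""
    rng = random.Random(seed)
    merged = [rec for records in sources for rec in records]
    rng.shuffle(merged)
    return merged

# Illustrative records; real examples come from the HF datasets named above.
math_rows = [{"instruction": "Compute 17 * 24.", "response": "408"}]
physics_rows = [{"instruction": "State Newton's second law.", "response": "F = ma"}]

blended = blend_datasets([math_rows, physics_rows])
with open("math_physics.jsonl", "w") as f:
    for rec in blended:
        f.write(json.dumps(rec) + "\n")
```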



### Model Formats



| Format | Location (after download) | Size | Use Case |
|--------|---------------------------|------|----------|
| **Hugging Face** | `models/math-physics/hf/` | ~5.0 GB | PyTorch, Transformers, vLLM, further fine-tuning |
| **GGUF (F16)** | `models/math-physics/gguf/` | ~4.7 GB | llama.cpp, Ollama, LM Studio, on-device inference |



## Installation



### Prerequisites



- Python 3.10 or higher
- CUDA 12.1+ (for GPU acceleration)
- `huggingface_hub` (installed via `pip install -r requirements.txt`) for scripted downloads



### For Transformers Usage



```bash
# Clone repository
git clone <repository-url>
cd galena-2B

# Install dependencies
pip install -r requirements.txt

# Download artifacts (Hugging Face by default)
python scripts/download_artifacts.py --artifact hf
```



### For llama.cpp Usage



```bash
# Clone llama.cpp (if not already available)
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp

# Build with CUDA support (Linux/WSL)
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release

# Download the GGUF artifact (script lives in the galena-2B repo)
python ../galena-2B/scripts/download_artifacts.py --artifact gguf

# Run inference
./build/bin/llama-cli -m ../galena-2B/models/math-physics/gguf/granite-math-physics-f16.gguf
```



## Usage Examples



See the [`examples/`](examples/) directory for detailed usage demonstrations:



- **[basic_usage.py](examples/basic_usage.py)** - Simple model loading and inference
- **[chat_example.py](examples/chat_example.py)** - Interactive chat session
- **[llama_cpp_example.sh](examples/llama_cpp_example.sh)** - GGUF inference with llama.cpp



## Training Details



The model was fine-tuned using the following configuration:



```bash
# QLoRA fine-tuning (4-bit base via --use_4bit, LoRA adapters on top)
python scripts/train_lora.py \
  --base_model ibm-granite/granite-3.3-2b-instruct \
  --dataset_path data/math_physics.jsonl \
  --output_dir outputs/granite-math-physics-lora \
  --use_4bit --gradient_checkpointing \
  --per_device_train_batch_size 1 \
  --gradient_accumulation_steps 4 \
  --num_train_epochs 1 \
  --max_steps 500 \
  --batching_strategy padding \
  --max_seq_length 512 \
  --bf16 \
  --trust_remote_code
```
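
QLoRA keeps the 4-bit base model frozen and trains only small low-rank adapters: a rank-`r` LoRA pair on a `d_out x d_in` weight adds `r * (d_in + d_out)` trainable parameters. A rough estimate for this model (rank 16 on the four attention projections, treated as square 2048 x 2048 matrices for simplicity; the actual rank, target modules, and GQA head sizes are not stated in this README):

```python
def lora_params(rank: int, d_in: int, d_out: int) -> int:
    """Trainable parameters for one LoRA pair: A (rank x d_in) + B (d_out x rank)."""
    return rank * (d_in + d_out)

hidden = 2048  # Granite 3.3-2B hidden size (see Model Details above)
layers = 40
rank = 16      # assumed LoRA rank, for illustration only

per_layer = 4 * lora_params(rank, hidden, hidden)  # q, k, v, o projections
total = layers * per_layer
print(f"~{total / 1e6:.1f}M trainable params vs. 2.0B frozen")  # ~10.5M
```

Even under these rough assumptions, the adapters are well under 1% of the base model's parameters, which is why the fine-tune fits on a single GPU with 4-bit quantization.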



For detailed training methodology and dataset preparation, see [MODEL_CARD.md](MODEL_CARD.md).



## Performance & Limitations



**Strengths:**
- Advanced mathematical calculations and reasoning
- Physics concepts and problem-solving
- Tool-augmented reasoning for complex calculations
- Efficient 2B parameter footprint suitable for edge deployment

**Limitations:**
- Specialized for math/physics; may underperform on general tasks
- 500-step fine-tune optimized for domain knowledge, not extensive generalization
- Inherits base model biases and constraints
- Best suited for educational and research applications

## Citation

If you use this model in your research, please cite:

```bibtex
@software{galena_2b_2024,
  title = {Galena-2B: Granite 3.3 Math & Physics Model},
  author = {Your Name},
  year = {2024},
  url = {https://github.com/yourusername/galena-2B},
  note = {Fine-tuned from IBM Granite 3.3-2B Instruct}
}
```

Also cite the base Granite model:

```bibtex
@software{granite_3_3_2024,
  title = {Granite 3.3: IBM's Open Foundation Models},
  author = {IBM Research},
  year = {2024},
  url = {https://www.ibm.com/granite}
}
```

## License

This project is licensed under the Apache License 2.0 - see the [LICENSE](LICENSE) file for details.

The base Granite 3.3 model is also released under Apache 2.0 by IBM.

## Acknowledgments

- **IBM Research** for the Granite 3.3 foundation models
- **NVIDIA** for the Nemotron-RL-math dataset
- **CAMEL-AI** for the physics dialogue dataset
- **Hugging Face** for the Transformers library and model hosting infrastructure
- **llama.cpp** project for efficient GGUF inference

## Links

- [IBM Granite Models](https://www.ibm.com/granite)
- [Base Model: granite-3.3-2b-instruct](https://huggingface.co/ibm-granite/granite-3.3-2b-instruct)
- [Hugging Face Transformers](https://github.com/huggingface/transformers)
- [llama.cpp](https://github.com/ggerganov/llama.cpp)

## Support

For issues, questions, or contributions, please open an issue in this repository's issue tracker.