Update README.md
README.md CHANGED

@@ -1,72 +1,81 @@
Removed (previous auto-generated training card). Recoverable content:

- Metadata: `library_name: transformers`, `license: other`, `base_model: google/gemma-3-12b-it`, and an empty `model-index` entry (`name: model`, `results: []`).
- Final evaluation loss: 0.6036.
- Training hyperparameters:
  - distributed_type: multi-GPU
  - num_devices: 8
  - total_train_batch_size: 256
  - total_eval_batch_size: 16
  - optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08, no additional optimizer arguments
  - lr_scheduler_type: cosine
  - lr_scheduler_warmup_ratio: 0.1
  - num_epochs: 3
- Training results:

| Training Loss | Epoch  | Step | Validation Loss |
|--------------:|-------:|-----:|----------------:|
| 0.5745        | 0.3708 | 1000 | 0.6201          |
| 0.5569        | 0.7416 | 2000 | 0.5984          |
| 0.457         | 1.1123 | 3000 | 0.5947          |
| 0.4518        | 1.4831 | 4000 | 0.5845          |
| 0.4531        | 1.8539 | 5000 | 0.5761          |
| 0.3369        | 2.2247 | 6000 | 0.6050          |
| 0.3369        | 2.5955 | 7000 | 0.6043          |
| 0.3272        | 2.9663 | 8000 | 0.6036          |

- Framework versions: Datasets 4.0.0, Tokenizers 0.22.1.
Added (new model card):

---
license: gemma
datasets:
- NbAiLab/aurora-sft-2512-filtered
language:
- 'no'
- nb
- nn
base_model:
- google/gemma-3-12b-it
pipeline_tag: image-text-to-text
library_name: transformers
tags:
- conversational
- instruct
- experimental
---
# Borealis 12b Instruct (Preview)

Release: Dec 22nd, 2025.

## Model summary

**NbAiLab/borealis-12b-instruct-preview** is a **12b-parameter** instruction-tuned **preview** model intended for early testing and feedback. It is an **experiment** and should be treated as pre-release quality.

This model is based on [**google/gemma-3-12b-it**](https://huggingface.co/google/gemma-3-12b-it) and fine-tuned on textual instructions only.
| Model | Bits | Format |
|---|---:|---|
| [NbAiLab/borealis-12b-instruct-preview](https://huggingface.co/NbAiLab/borealis-12b-instruct-preview) | BF16 | Transformers (safetensors) |
| [NbAiLab/borealis-12b-instruct-preview-gguf](https://huggingface.co/NbAiLab/borealis-12b-instruct-preview-gguf) | 8 | GGUF (`q8_0`) |
| [NbAiLab/borealis-12b-instruct-preview-gguf](https://huggingface.co/NbAiLab/borealis-12b-instruct-preview-gguf) | 16 | GGUF (`f16`) |
| [NbAiLab/borealis-12b-instruct-preview-gguf](https://huggingface.co/NbAiLab/borealis-12b-instruct-preview-gguf) | BF16 | GGUF (`bf16`) |
| [NbAiLab/borealis-12b-instruct-preview-mlx](https://huggingface.co/NbAiLab/borealis-12b-instruct-preview-mlx) | 32 | MLX |
| [NbAiLab/borealis-12b-instruct-preview-mlx-8bits](https://huggingface.co/NbAiLab/borealis-12b-instruct-preview-mlx-8bits) | 8 | MLX (quantized) |
## Training data

Supervised fine-tuning (SFT) uses **NbAiLab/aurora-sft-2512** (not released yet).

## ⚠️ Safety / alignment disclaimer (important)

This is a **preview experiment** and **has not been safety-aligned yet**. The model may produce **harmful, biased, or insensitive** outputs (including content that is offensive, unsafe, or inappropriate). Do not use it for safety-critical or high-stakes applications, and add your own safety mitigations if deploying.

## Intended use

- Norwegian-centric assistant-style tasks (e.g., drafting, summarization, Q&A, light reasoning).
- Assessment of Norwegian writing style and quality.
- Early evaluation of behavior, language coverage (Norwegian / Bokmål / Nynorsk), and quality.
## Limitations

- Preview quality; outputs may be unstable and may hallucinate.
- Not aligned for safety; may follow harmful instructions or generate problematic content (see disclaimer above).

## Weights & formats

### Transformers (original)

- **NbAiLab/borealis-12b-instruct-preview** (safetensors).
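A minimal usage sketch for the safetensors weights (not from the original card; it assumes the checkpoint loads exactly like its base, google/gemma-3-12b-it, via the `image-text-to-text` pipeline, and the prompt is an example):

```python
# Hypothetical usage sketch; assumes the checkpoint behaves like its base,
# google/gemma-3-12b-it. Prompt and generation settings are examples.

def build_messages(prompt: str) -> list[dict]:
    # Gemma-3-style chat message: content is a list of typed parts.
    return [{"role": "user", "content": [{"type": "text", "text": prompt}]}]

if __name__ == "__main__":
    # Heavy: downloads ~24 GB of BF16 weights and needs a large GPU.
    from transformers import pipeline

    pipe = pipeline(
        "image-text-to-text",
        model="NbAiLab/borealis-12b-instruct-preview",
        device_map="auto",
        torch_dtype="bfloat16",
    )
    out = pipe(text=build_messages("Kva er hovudstaden i Noreg?"), max_new_tokens=64)
    print(out[0]["generated_text"][-1]["content"])
```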
### GGUF quantizations

Available in [**NbAiLab/borealis-12b-instruct-preview-gguf**](https://huggingface.co/NbAiLab/borealis-12b-instruct-preview-gguf):

- `model-q8_0.gguf`
- `model-f16.gguf`
- `model-bf16.gguf`
Use:

```bash
ollama run hf.co/NbAiLab/borealis-12b-instruct-preview-gguf:BF16
```
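Outside Ollama, a single quantization can be fetched for llama.cpp-compatible runtimes. A sketch using `huggingface_hub` (the filename is taken from the list above):

```python
# Sketch: download one GGUF quantization from the companion repo.
# Repo id and filename come from the "GGUF quantizations" list in this card.
REPO_ID = "NbAiLab/borealis-12b-instruct-preview-gguf"
FILENAME = "model-q8_0.gguf"

if __name__ == "__main__":
    # Large download (roughly 12 GB for q8_0); needs `pip install huggingface_hub`.
    from huggingface_hub import hf_hub_download

    local_path = hf_hub_download(repo_id=REPO_ID, filename=FILENAME)
    print(local_path)
```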
### MLX (Apple Silicon)

Available in [**NbAiLab/borealis-12b-instruct-preview-mlx**](https://huggingface.co/NbAiLab/borealis-12b-instruct-preview-mlx) and quantized to [8 bits](https://huggingface.co/NbAiLab/borealis-12b-instruct-preview-mlx-8bits).
Use:

```bash
# Install MLX LM
uv tool install mlx-lm

# Interactive chat REPL
mlx_lm.chat --model "NbAiLab/borealis-12b-instruct-preview-mlx"
```
## Acknowledgements

Thanks to the **Gemma** team at Google for releasing Gemma 3 and to everyone contributing feedback on this preview.