versae committed
Commit ee331c1 · verified · 1 parent: 90937fe

Update README.md

Files changed (1): README.md (+60 −51)
README.md CHANGED
@@ -1,72 +1,81 @@
  ---
  library_name: transformers
- license: other
- base_model: google/gemma-3-12b-it
  tags:
- - llama-factory
- - full
- - generated_from_trainer
- model-index:
- - name: model
- results: []
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->

- # model

- This model is a fine-tuned version of [google/gemma-3-12b-it](https://huggingface.co/google/gemma-3-12b-it) on the aurora_sft_2512_filtered_train dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.6036

- ## Model description

- More information needed

- ## Intended uses & limitations

- More information needed

- ## Training and evaluation data

- More information needed

- ## Training procedure

- ### Training hyperparameters

- The following hyperparameters were used during training:
- - learning_rate: 1e-05
- - train_batch_size: 32
- - eval_batch_size: 2
- - seed: 42
- - distributed_type: multi-GPU
- - num_devices: 8
- - total_train_batch_size: 256
- - total_eval_batch_size: 16
- - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- - lr_scheduler_type: cosine
- - lr_scheduler_warmup_ratio: 0.1
- - num_epochs: 3

- ### Training results

- | Training Loss | Epoch | Step | Validation Loss |
- |:-------------:|:------:|:----:|:---------------:|
- | 0.5745 | 0.3708 | 1000 | 0.6201 |
- | 0.5569 | 0.7416 | 2000 | 0.5984 |
- | 0.457 | 1.1123 | 3000 | 0.5947 |
- | 0.4518 | 1.4831 | 4000 | 0.5845 |
- | 0.4531 | 1.8539 | 5000 | 0.5761 |
- | 0.3369 | 2.2247 | 6000 | 0.6050 |
- | 0.3369 | 2.5955 | 7000 | 0.6043 |
- | 0.3272 | 2.9663 | 8000 | 0.6036 |

- ### Framework versions

- - Transformers 4.57.1
- - Pytorch 2.6.0+cu124
- - Datasets 4.0.0
- - Tokenizers 0.22.1

  ---
+ license: gemma
+ datasets:
+ - NbAiLab/aurora-sft-2512-filtered
+ language:
+ - 'no'
+ - nb
+ - nn
+ base_model:
+ - google/gemma-3-12b-it
+ pipeline_tag: image-text-to-text
  library_name: transformers
  tags:
+ - conversational
+ - instruct
+ - experimental
  ---

+ # Borealis 12B Instruct (Preview)

+ Release: December 22nd, 2025.

+ ## Model summary
+ **NbAiLab/borealis-12b-instruct-preview** is a **12B-parameter** instruction-tuned **preview** model intended for early testing and feedback. It is an **experiment** and should be treated as pre-release quality.

+ This model is based on [**google/gemma-3-12b-it**](https://huggingface.co/google/gemma-3-12b-it) and fine-tuned on textual instructions only.
 
+ | Model | Bits | Format |
+ |---|---:|---|
+ | [NbAiLab/borealis-12b-instruct-preview](https://huggingface.co/NbAiLab/borealis-12b-instruct-preview) | BF16 | Transformers (safetensors) |
+ | [NbAiLab/borealis-12b-instruct-preview-gguf](https://huggingface.co/NbAiLab/borealis-12b-instruct-preview-gguf) | 8 | GGUF (`q8_0`) |
+ | [NbAiLab/borealis-12b-instruct-preview-gguf](https://huggingface.co/NbAiLab/borealis-12b-instruct-preview-gguf) | 16 | GGUF (`f16`) |
+ | [NbAiLab/borealis-12b-instruct-preview-gguf](https://huggingface.co/NbAiLab/borealis-12b-instruct-preview-gguf) | BF16 | GGUF (`bf16`) |
+ | [NbAiLab/borealis-12b-instruct-preview-mlx](https://huggingface.co/NbAiLab/borealis-12b-instruct-preview-mlx) | 32 | MLX |
+ | [NbAiLab/borealis-12b-instruct-preview-mlx-8bits](https://huggingface.co/NbAiLab/borealis-12b-instruct-preview-mlx-8bits) | 8 | MLX (quantized) |
 
+ ## Training data
+ Supervised fine-tuning (SFT) uses **NbAiLab/aurora-sft-2512** (not released yet).
 
+ ## ⚠️ Safety / alignment disclaimer (important)
+ This is a **preview experiment** that **has not yet been safety-aligned**. The model may produce **harmful, biased, or insensitive** outputs (including content that is offensive, unsafe, or inappropriate). Do not use it for safety-critical or high-stakes applications, and add your own safety mitigations before deploying.
 
+ ## Intended use
+ - Norwegian-centric assistant-style tasks (e.g., drafting, summarization, Q&A, light reasoning).
+ - Assessment of Norwegian writing style and quality.
+ - Early evaluation of behavior, language coverage (Norwegian / Bokmål / Nynorsk), and quality.
 
+ ## Limitations
+ - Preview quality; outputs may be unstable and the model may hallucinate.
+ - Not aligned for safety; may follow harmful instructions or generate problematic content (see disclaimer above).
 
+ ## Weights & formats

+ ### Transformers (original)
+ - **NbAiLab/borealis-12b-instruct-preview** (safetensors).
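A minimal loading sketch, assuming the checkpoint keeps Gemma 3's chat template and `image-text-to-text` pipeline registration (the model ID comes from the table above; the message layout, dtype choice, and example prompt are illustrative assumptions, not confirmed usage):

```python
# Hedged sketch: loading the safetensors checkpoint with transformers.
# Assumes the preview checkpoint behaves like its google/gemma-3-12b-it base.
from transformers import pipeline

MODEL_ID = "NbAiLab/borealis-12b-instruct-preview"

def build_messages(prompt: str) -> list[dict]:
    # Wrap a user prompt in the Gemma 3-style chat-message layout.
    return [{"role": "user", "content": [{"type": "text", "text": prompt}]}]

def chat(prompt: str, max_new_tokens: int = 256) -> str:
    # Gemma 3 checkpoints register as image-text-to-text pipelines,
    # but they accept text-only conversations as well.
    pipe = pipeline(
        "image-text-to-text",
        model=MODEL_ID,
        torch_dtype="bfloat16",
        device_map="auto",
    )
    out = pipe(text=build_messages(prompt), max_new_tokens=max_new_tokens)
    # The pipeline returns the full conversation; the last turn is the reply.
    return out[0]["generated_text"][-1]["content"]

# Example (downloads the full BF16 weights):
# print(chat("Skriv eit kort dikt om nordlyset."))
```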

+ ### GGUF quantizations
+ Available in [**NbAiLab/borealis-12b-instruct-preview-gguf**](https://huggingface.co/NbAiLab/borealis-12b-instruct-preview-gguf):
+ - `model-q8_0.gguf`
+ - `model-f16.gguf`
+ - `model-bf16.gguf`

+ Use:
+ ```bash
+ ollama run hf.co/NbAiLab/borealis-12b-instruct-preview-gguf:BF16
+ ```

+ ### MLX (Apple Silicon)
+ Available in [**NbAiLab/borealis-12b-instruct-preview-mlx**](https://huggingface.co/NbAiLab/borealis-12b-instruct-preview-mlx) and quantized to [8 bits](https://huggingface.co/NbAiLab/borealis-12b-instruct-preview-mlx-8bits).

+ Use:
+ ```bash
+ # Install MLX LM
+ uv tool install mlx-lm
+
+ # Interactive chat REPL
+ mlx_lm.chat --model "NbAiLab/borealis-12b-instruct-preview-mlx"
+ ```

+ ## Acknowledgements
+ Thanks to the **Gemma** team at Google for releasing Gemma 3 and to everyone contributing feedback on this preview.