Update README.md

Files changed (1)

README.md +106 -3

@@ -1,3 +1,106 @@
----
-license: apache-2.0
----

+---
+{
+  "language": ["en"],
+  "license": "apache-2.0",
+  "tags": [
+    "text-generation",
+    "causal-lm",
+    "instruction-tuning",
+    "supervised-fine-tuning",
+    "synthetic-qa",
+    "lora",
+    "axolotl",
+    "deepspeed",
+    "transformers",
+    "mistral",
+    "nemo",
+    "eu-hpc"
+  ],
+  "datasets": ["axolotl_deduplicated_synthetic_qa"],
+  "metrics": ["loss"],
+  "library_name": "transformers",
+  "framework": "pytorch",
+  "base_model": "mistralai/Mistral-Nemo-Instruct-2407",
+  "model_name": "mistral-12b-sft",
+  "pipeline_tag": "text-generation",
+  "task_categories": ["text-generation", "instruction-following"],
+  "model_type": "AutoModelForCausalLM",
+  "inference": {
+    "parameters": {
+      "max_new_tokens": 512,
+      "temperature": 0.7,
+      "top_p": 0.9
+    }
+  },
+  "trained_on": ["Leonardo EuroHPC"],
+  "description": "Supervised fine-tuning (SFT) of Mistral-Nemo-Instruct-2407 (12B) on synthetic QA data using LoRA with Axolotl and DeepSpeed. Intended to improve conversational reasoning and factual accuracy."
+}
+---
+# Mistral 12B — SFT (Supervised Fine-Tuning on Synthetic QA)
+**Model type:** Causal Language Model
+**Base model:** [mistralai/Mistral-Nemo-Instruct-2407](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407)
+**License:** Apache 2.0
+**Framework:** [Axolotl](https://github.com/OpenAccess-AI-Collective/axolotl)
+---
+## Overview
+`mistral-12b-sft` is a **supervised fine-tuned** variant of Mistral-Nemo-Instruct-2407 (12B) trained on deduplicated synthetic QA data.
+This SFT phase targets instruction following, factual reasoning, and conversational ability, and keeps training memory-efficient by training LoRA adapters on top of an 8-bit quantized base model.
+Training was conducted on **Leonardo EuroHPC**.
+---
+## Training Setup
+**Objective:** Supervised fine-tuning (instruction-following QA)
+**Adapter:** LoRA + 8-bit base
+**Precision:** bfloat16
+**Hardware:** 8 × 2 × A100 64 GB
+**Framework:** Axolotl + DeepSpeed + PyTorch 2.5.1 + CUDA 12.1
+**Runtime:** ~6 h
+**Validation split:** 30 % of the dataset
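
Review note: the hardware line and the batch settings in the hyperparameter table of this README together determine how many sequences contribute to each optimizer step. The sketch below works that out under one loudly stated assumption: that "8 × 2 × A100" means 8 nodes with 2 GPUs each (16 data-parallel ranks). The card does not confirm this reading, so treat the resulting numbers as illustrative.

```python
# ASSUMPTION: "8 x 2 x A100" is read as 8 nodes with 2 GPUs per node.
# The README does not spell this out; change these two values if it is wrong.
NODES = 8
GPUS_PER_NODE = 2

# Values taken from the hyperparameter table in this README.
MICRO_BATCH_SIZE = 2   # sequences per GPU per forward/backward pass
GRAD_ACCUM_STEPS = 2   # passes accumulated before one optimizer step
SEQUENCE_LEN = 2048    # maximum tokens per sequence


def global_batch_size(nodes: int, gpus_per_node: int,
                      micro_batch: int, grad_accum: int) -> int:
    """Sequences consumed per optimizer step under plain data parallelism."""
    world_size = nodes * gpus_per_node
    return world_size * micro_batch * grad_accum


sequences_per_step = global_batch_size(
    NODES, GPUS_PER_NODE, MICRO_BATCH_SIZE, GRAD_ACCUM_STEPS
)
# Upper bound only: real batches hold fewer tokens when sequences are shorter.
max_tokens_per_step = sequences_per_step * SEQUENCE_LEN

print(f"sequences per optimizer step: {sequences_per_step}")
print(f"max tokens per optimizer step: {max_tokens_per_step}")
```

If the hardware layout is different (for example 8 GPUs in total), only `NODES` and `GPUS_PER_NODE` need to change.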
+---
+## Dataset
+| Dataset | Type | Description |
+|----------|------|-------------|
+| `axolotl_deduplicated_synthetic_qa.jsonl` | `alpaca_chat.load_qa` | Synthetic instruction–response pairs for QA and chat fine-tuning |
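
Review note: the card names the dataset type `alpaca_chat.load_qa` but does not show a record. A minimal sketch of what a compatible JSONL file might look like follows, assuming each line is a JSON object with `question` and `answer` string fields (the field names are an assumption based on Axolotl's QA loader convention, not something this README states). The two sample records are invented for illustration and are not taken from the dataset.

```python
import json

# ASSUMPTION: the QA loader reads these two fields from every record.
REQUIRED_FIELDS = ("question", "answer")


def parse_qa_jsonl(text: str) -> list:
    """Parse JSONL text and check each record has non-empty QA fields."""
    records = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        missing = [
            field for field in REQUIRED_FIELDS
            if not isinstance(record.get(field), str) or not record[field].strip()
        ]
        if missing:
            raise ValueError(f"line {line_no}: missing or empty field(s): {missing}")
        records.append(record)
    return records


# Invented sample records, for illustration only.
sample = "\n".join([
    json.dumps({"question": "What does LoRA stand for?",
                "answer": "Low-Rank Adaptation."}),
    json.dumps({"question": "Which base model does this card fine-tune?",
                "answer": "mistralai/Mistral-Nemo-Instruct-2407"}),
])

records = parse_qa_jsonl(sample)
print(f"parsed {len(records)} records")
```

Running a check like this before training catches empty or malformed rows early, which matters for synthetic data where generation failures can leave blank answers.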
+---
+## Hyperparameters
+| Parameter | Value |
+|------------|-------|
+| Sequence length | 2048 |
+| Micro batch size | 2 |
+| Gradient accumulation | 2 |
+| Epochs | 1 |
+| Learning rate | 0.0002 |
+| LR scheduler | cosine |
+| Optimizer | AdamW (8-bit) |
+| Warmup steps | 10 |
+| Weight decay | 0.0 |
+| LoRA rank (r) | 16 |
+| LoRA alpha | 32 |
+| LoRA dropout | 0.05 |
+| LoRA targets | q_proj, k_proj, v_proj, o_proj |
+| Gradient checkpointing | ✅ |
+| Flash attention | ✅ |
+| Auto-resume | ✅ |
+| Loss watchdog | threshold 5.0, patience 3 |
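
Review note: the LoRA rows above imply a scaling factor and a trainable-parameter budget that the card does not state. The sketch below derives both. The rank, alpha, and target modules come from the table; the base-model dimensions (hidden size 5120, 40 layers, 32 query heads, 8 key/value heads, head dimension 128) are assumptions about `Mistral-Nemo-Instruct-2407` and should be checked against its `config.json`. The scaling uses the standard LoRA convention `alpha / r`; a variant such as rank-stabilized LoRA would differ.

```python
# From the hyperparameter table in this README.
LORA_R = 16
LORA_ALPHA = 32

# ASSUMPTION: base-model shape, not stated in this README.
# Verify against the base model's config.json before relying on the totals.
HIDDEN_SIZE = 5120
NUM_LAYERS = 40
NUM_HEADS = 32
NUM_KV_HEADS = 8
HEAD_DIM = 128


def lora_params(in_features: int, out_features: int, rank: int) -> int:
    """Trainable parameters of one LoRA pair: A (rank x in) and B (out x rank)."""
    return rank * (in_features + out_features)


q_out = NUM_HEADS * HEAD_DIM       # width of the query projection output
kv_out = NUM_KV_HEADS * HEAD_DIM   # width of the key/value projection output

# (in_features, out_features) for each targeted attention projection.
target_shapes = {
    "q_proj": (HIDDEN_SIZE, q_out),
    "k_proj": (HIDDEN_SIZE, kv_out),
    "v_proj": (HIDDEN_SIZE, kv_out),
    "o_proj": (q_out, HIDDEN_SIZE),
}

params_per_layer = sum(
    lora_params(n_in, n_out, LORA_R) for n_in, n_out in target_shapes.values()
)
total_lora_params = params_per_layer * NUM_LAYERS
scaling = LORA_ALPHA / LORA_R  # multiplier applied to the low-rank update

print(f"LoRA scaling (alpha / r): {scaling}")
print(f"trainable LoRA parameters per layer: {params_per_layer:,}")
print(f"trainable LoRA parameters in total: {total_lora_params:,}")
```

Under these assumed dimensions the adapters add roughly 20 million trainable parameters, a small fraction of a 12B-parameter base model, which is the usual motivation for pairing LoRA with a quantized base.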
+---
+## Tokenizer
+**Tokenizer type:** `AutoTokenizer`
+**Pad token:** `<|end_of_text|>`