kosmylo1992 committed · verified
Commit c0bb125 · 1 Parent(s): 3833207

Update README.md

Files changed (1): README.md (+54 −3)
README.md CHANGED

```diff
@@ -1,3 +1,54 @@
- ---
- license: apache-2.0
- ---
```
# Command-R 35B — SFT (Supervised Fine-Tuning)

**Model type:** Causal Language Model
**Base model:** [CohereLabs/c4ai-command-r-v01](https://huggingface.co/CohereLabs/c4ai-command-r-v01)
**License:** Apache 2.0
**Framework:** [Axolotl](https://github.com/OpenAccess-AI-Collective/axolotl)

---
## Overview

`commandr-SFT` is a **supervised fine-tuned** variant of Cohere’s Command-R 35B model.
Fine-tuning was performed with LoRA adapters on a synthetic instruction-following QA dataset, improving conversational reasoning and question answering.

---
## Training Setup

**Objective:** Supervised fine-tuning (instruction following)
**Adapter type:** LoRA
**Precision:** bfloat16
**Hardware:** 8 nodes × 2 NVIDIA A100 64GB GPUs (16 GPUs total)
**Training duration:** ~6 hours
**Framework:** DeepSpeed ZeRO-1, Axolotl, PyTorch 2.5.1+cu121

---
## Dataset

**Name:** `axolotl_deduplicated_synthetic_qa.jsonl`
**Type:** Instruction-following synthetic QA dataset
**Split:** 70% train / 30% validation

Each sample follows the QA/chat format expected by Axolotl's `alpaca_chat.load_qa` prompt strategy.

---
## Hyperparameters

| Parameter | Value |
|-----------|-------|
| Sequence length | 2048 |
| Micro batch size | 2 |
| Gradient accumulation | 2 |
| Learning rate | 2e-4 |
| LR scheduler | cosine |
| Optimizer | AdamW (8-bit) |
| LoRA rank (r) | 16 |
| LoRA alpha | 32 |
| LoRA dropout | 0.05 |
| Target modules | q_proj, v_proj, k_proj, o_proj |
| Epochs | 1 |
| Warmup steps | 10 |
| Weight decay | 0.0 |
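Two quantities implied by the table and the hardware above, as a back-of-envelope check (assuming pure data parallelism across all GPUs, which is an assumption about the parallelism layout):

```python
# Effective global batch size under data parallelism.
micro_batch_size = 2
gradient_accumulation = 2
gpus = 8 * 2              # 8 nodes x 2 GPUs per node

global_batch_size = micro_batch_size * gradient_accumulation * gpus
lora_scaling = 32 / 16    # LoRA alpha / rank, the factor applied to adapter output

print(global_batch_size, lora_scaling)  # 64 2.0
```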