---
{
  "language": ["en"],
  "license": "llama2",
  "tags": [
    "text-generation",
    "causal-lm",
    "supervised-fine-tuning",
    "instruction-tuning",
    "synthetic-qa",
    "lora",
    "axolotl",
    "deepspeed",
    "transformers",
    "llava",
    "eu-hpc"
  ],
  "datasets": [
    "axolotl_deduplicated_synthetic_qa"
  ],
  "metrics": [
    "loss"
  ],
  "library_name": "transformers",
  "framework": "pytorch",
  "base_model": "llava-hf/llava-1.5-7b-hf",
  "model_name": "llava-7b-sft",
  "pipeline_tag": "text-generation",
  "task_categories": ["text-generation", "question-answering"],
  "model_type": "llava",
  "inference": {
    "parameters": {
      "max_new_tokens": 512,
      "temperature": 0.7,
      "top_p": 0.9
    }
  },
  "trained_on": [
    "Leonardo EuroHPC"
  ],
  "description": "Supervised fine-tuning (SFT) of LLaVA 1.5 7B on synthetic QA pairs using Axolotl and DeepSpeed ZeRO-1. The model improves text-based question answering and instruction following while preserving its multimodal capabilities."
}
---
# LLaVA 7B — Supervised Fine-Tuning (SFT) on Synthetic QA

**Model type:** Vision-Language Causal Model (text-finetuned LLaVA-1.5)
**Base model:** [llava-hf/llava-1.5-7b-hf](https://huggingface.co/llava-hf/llava-1.5-7b-hf)
**License:** Llama 2 Community License
**Framework:** Axolotl + DeepSpeed ZeRO-1 (PyTorch 2.5.1 + CUDA 12.1)

---

## Overview

`llava-7b-sft` is a **supervised fine-tuned** version of **LLaVA 1.5 7B**, trained on a synthetic instruction-following dataset of **question–answer pairs** to enhance text understanding and reasoning.
Although derived from a multimodal base, this SFT run fine-tunes the **language model component** using LoRA adapters, which were later **merged into the full model weights**.

This model therefore supports **text-only generation** natively (without PEFT) and retains compatibility with the **multimodal processor and vision configuration** from LLaVA.

Training was conducted on the **Leonardo EuroHPC** system using **Axolotl** and **DeepSpeed ZeRO-1**.

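The merge step itself happened offline and is not shipped in this repository. For reference, folding a LoRA adapter into base weights with PEFT typically looks like the sketch below; the adapter path is a placeholder for the Axolotl output directory.

```python
import torch
from peft import PeftModel
from transformers import LlavaForConditionalGeneration

# Load the base model, then attach the trained LoRA adapter.
# "path/to/lora-adapter" is a placeholder, not a path in this repo.
base = LlavaForConditionalGeneration.from_pretrained(
    "llava-hf/llava-1.5-7b-hf", torch_dtype=torch.bfloat16
)
model = PeftModel.from_pretrained(base, "path/to/lora-adapter")

# Fold the low-rank updates into the base weights and drop the adapter
# wrappers, producing a standalone checkpoint that loads without PEFT.
merged = model.merge_and_unload()
merged.save_pretrained("llava-7b-sft-merged")
```
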
| 63 |
+
---
|
| 64 |
+
|
| 65 |
+
## Training Setup
|
| 66 |
+
|
| Component | Specification |
|:-----------|:--------------|
| **Objective** | Supervised fine-tuning (instruction-following QA) |
| **Adapter type** | LoRA (merged into full model) |
| **Precision** | bfloat16 |
| **Hardware** | 8 nodes × 2 NVIDIA A100 64 GB GPUs (16 GPUs total) |
| **Framework** | Axolotl 0.6 + DeepSpeed ZeRO-1 (PyTorch 2.5.1 + CUDA 12.1) |
| **Runtime** | ~24 hours |
| **Checkpoints** | 2 per epoch |
| **Vision tower** | Frozen during SFT |
| **Dataset split** | 70% train / 30% validation |

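The DeepSpeed configuration file used on Leonardo is not included here. A minimal ZeRO-1 setup consistent with the table above could be expressed as the Python dict that DeepSpeed accepts; this is a sketch, not the actual config.

```python
# Hypothetical ZeRO-1 config mirroring this card; the real file used in
# training is not distributed with the model.
ds_config = {
    "bf16": {"enabled": True},          # matches the bfloat16 precision row
    "zero_optimization": {"stage": 1},  # ZeRO-1: shard optimizer states only
    "train_micro_batch_size_per_gpu": 1,
    "gradient_accumulation_steps": 4,
}

# With 16 GPUs, the effective batch size is 1 x 4 x 16 = 64 sequences per step.
```
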
---

## Dataset

**Name:** `axolotl_deduplicated_synthetic_qa.jsonl`
**Type:** Instruction-following synthetic QA dataset (Alpaca-style)

Each record contains a single-turn question and a high-quality generated answer.
This SFT data improves the model’s **reasoning**, **language coherence**, and **conversational QA** quality.

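The dataset itself is not distributed with the model. For orientation, an Alpaca-style record has the shape sketched below; the field names follow the Alpaca convention, and the example content is invented.

```python
import json

# Illustrative record in the Alpaca convention; the real contents of
# axolotl_deduplicated_synthetic_qa.jsonl are not reproduced here.
record = {
    "instruction": "Explain the principle of energy conservation.",
    "input": "",  # Alpaca's optional context field, empty for plain QA
    "output": "Energy cannot be created or destroyed; in a closed system "
              "it only changes form, so the total amount stays constant.",
}

# A .jsonl file stores one such JSON object per line.
print(json.dumps(record))
```
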
---

## Hyperparameters

| Parameter | Value |
|:-----------|:------|
| Sequence length | 2048 |
| Micro batch size | 1 |
| Gradient accumulation | 4 |
| Epochs | 1 |
| Learning rate | 0.0002 |
| LR scheduler | cosine |
| Optimizer | AdamW (8-bit) |
| Warmup steps | 10 |
| Weight decay | 0.0 |
| LoRA rank (r) | 16 |
| LoRA alpha | 32 |
| LoRA dropout | 0.05 |
| LoRA target modules | `q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj` |
| Gradient checkpointing | ✅ |
| Flash attention | ✅ |
| Validation set size | 0.3 (fraction of dataset) |
| Evals per epoch | 2 |

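In PEFT terms, the LoRA rows above correspond roughly to the configuration below. Axolotl builds an equivalent object internally from its YAML, so this is for illustration only; `bias` and `task_type` are standard defaults not stated in this card.

```python
from peft import LoraConfig

# Mirror of the table above expressed as a PEFT LoraConfig.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    bias="none",            # assumed default, not stated in the card
    task_type="CAUSAL_LM",  # assumed default, not stated in the card
)
```
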
---

## Tokenizer & Processor

| Component | Description |
|:-----------|:-------------|
| **Tokenizer type** | `AutoTokenizer` |
| **Processor type** | `AutoProcessor` (compatible with LLaVA image+text inputs) |
| **Pad token** | `<pad>` (ID 32001) |
| **Chat template** | `llava` |

The processor configuration accepts image or text inputs; however, this release focuses on text-based supervised tuning.

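A quick way to confirm the pad token after downloading (a minimal sketch; the printed values should match the table above):

```python
from transformers import AutoProcessor

# The processor bundles the tokenizer and the CLIP image processor.
processor = AutoProcessor.from_pretrained("ubitech-edg/llava-7b-sft")
tok = processor.tokenizer

# Should print "<pad>" and 32001, matching the table above.
print(tok.pad_token, tok.pad_token_id)
```
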
---

## Files Included

This repository contains the **fully merged model weights** and all required configs for direct use with `transformers`:

- `config.json`
- `model-*.safetensors`
- `tokenizer.json`
- `tokenizer_config.json`
- `tokenizer.model`
- `special_tokens_map.json`
- `processor_config.json`
- `preprocessor_config.json`
- `vision_config.json`
- `image_processor_config.json`
- `README.md`

---

## Usage Example

To run text-based generation with this model:

```python
import torch
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "ubitech-edg/llava-7b-sft"

processor = AutoProcessor.from_pretrained(model_id)
# LLaVA checkpoints load through LlavaForConditionalGeneration rather than
# AutoModelForCausalLM, even when prompting with text only.
model = LlavaForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = "USER: Explain the principle of energy conservation.\nASSISTANT:"
inputs = processor(text=prompt, return_tensors="pt").to(model.device)

with torch.inference_mode():
    # do_sample=True is required for temperature/top_p to take effect.
    outputs = model.generate(
        **inputs, max_new_tokens=200, do_sample=True, temperature=0.7, top_p=0.9
    )

print(processor.decode(outputs[0], skip_special_tokens=True))
```
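
Since the vision tower was frozen and the processor keeps its image support, multimodal prompting should still work. A sketch follows; the image URL is a placeholder, and multimodal quality was not the focus of this SFT run.

```python
import requests
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "ubitech-edg/llava-7b-sft"
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Placeholder URL; substitute any RGB image.
url = "https://example.com/image.jpg"
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")

# The <image> token marks where visual features are spliced into the prompt.
prompt = "USER: <image>\nDescribe this image.\nASSISTANT:"
inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)

with torch.inference_mode():
    outputs = model.generate(**inputs, max_new_tokens=200)

print(processor.decode(outputs[0], skip_special_tokens=True))
```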