kramster
/

evolve-mistral

@@ -15,18 +15,34 @@ datasets:
 base_model: mistralai/Mistral-7B-Instruct-v0.2
 ---
-# 🧠 Evolve Mistral: Fine-Tuned Mistral-7B-Instruct on CRUD Coding Tasks
-This model is a fine-tuned version of [`mistralai/Mistral-7B-Instruct-v0.2`](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2), adapted for reasoning about structured CRUD-based code inputs and instruction-following tasks. It was trained on a dataset in [Alpaca](https://github.com/tatsu-lab/stanford_alpaca)-style format, using supervised fine-tuning (SFT).
 ---
-## 📂 Dataset
-The model was trained on:
 **[`kramster/crud-code-tests`](https://huggingface.co/datasets/kramster/crud-code-tests)**
-A dataset of instruction-based code snippets focusing on Create, Read, Update, and Delete operations in various programming contexts. It uses the Alpaca-style JSON format with fields: `instruction`, `input`, and `output`.
 ---
@@ -35,34 +51,36 @@ A dataset of instruction-based code snippets focusing on Create, Read, Update, a
 | Detail              | Value |
 |---------------------|-------|
 | Base model          | `mistralai/Mistral-7B-Instruct-v0.2` |
 | LoRA Config         | r=32, alpha=16 |
 | Framework           | Axolotl + DeepSpeed + LoRA |
-| Training Steps      | 51 |
 | Epochs              | ~3.94 |
-| Mixed Precision     | bfloat16 |
 | GPU                 | NVIDIA H100 80GB |
-| Training Duration   | 10m 26s |
-| Final Train Loss    | 0.0909 |
-| Final Eval Loss     | 0.1012 |
-| FLOPs used          | 347.6 trillion |
 ---
 ## 🧪 Evaluation Summary
-- **Eval runtime:** 2.84s
-- **Eval samples/sec:** 2.11
-- **Eval steps/sec:** 1.05
-- **Gradient norm (final):** 0.064
-- **Final LR:** 2.93e-7
 ---
-## 🧠 Example Usage
 vllm-api-server \
   --model kramster/evolve-mistral \
   --max-model-len 64000 \
   --rope-scaling '{"rope_type":"yarn","factor":4.0,"original_max_position_embeddings":32768}' \
   --no-enable-prefix-caching

 base_model: mistralai/Mistral-7B-Instruct-v0.2
 ---
+# 🧠 Evolve Mistral: Fine-Tuned Mistral-7B-Instruct for AI CRUD & Code Generation
+This is a fine-tuned version of [`mistralai/Mistral-7B-Instruct-v0.2`](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2), adapted specifically for **code generation, schema-driven CRUD reasoning, and full-stack boilerplate automation**. It powers the AI agent layer behind the [Self-Revolve project](https://github.com/self-evolving-runtimes/revolve).
 ---
+## 🌐 Project Context: Self-Revolve
+[Evolve Mistral](https://huggingface.co/kramster/evolve-mistral) is a fine-tuned open-source model **purpose-built for powering code generation** in the [Self-Revolve project](https://github.com/self-evolving-runtimes/revolve).
+> “Instantly generate full-stack admin panels, APIs, and UIs from your database schema—powered by AI agents & LLMs.”
+**Key capabilities:**
+- 🧠 Auto-generates CRUD APIs from DB schemas
+- ✨ Generates React/MUI admin interfaces
+- 🗃️ Supports SQL & NoSQL databases
+- ⚡ Works without OpenAI keys
+- 🚀 Open-source & self-hostable
+---
+## 📂 Dataset
 **[`kramster/crud-code-tests`](https://huggingface.co/datasets/kramster/crud-code-tests)**
+A high-quality Alpaca-style dataset focused on database and backend code generation. Each example contains:
+- `instruction`
+- `input`
+- `output`
 ---
 | Detail              | Value |
 |---------------------|-------|
 | Base model          | `mistralai/Mistral-7B-Instruct-v0.2` |
+| Dataset             | `crud-code-tests` (Alpaca-style) |
 | LoRA Config         | r=32, alpha=16 |
 | Framework           | Axolotl + DeepSpeed + LoRA |
 | Epochs              | ~3.94 |
+| Steps               | 51 |
+| Precision           | bfloat16 |
 | GPU                 | NVIDIA H100 80GB |
+| Duration            | ~10m |
+| Train Loss          | 0.0909 |
+| Eval Loss           | 0.1012 |
+| FLOPs               | ~347.6 trillion |
 ---
 ## 🧪 Evaluation Summary
+- Eval runtime: 2.84s
+- Samples/sec: 2.11
+- Steps/sec: 1.05
+- Final learning rate: 2.93e-7
+- Gradient norm: 0.064
 ---
+## 💻 Example Usage (VLLM)
+```bash
 vllm-api-server \
   --model kramster/evolve-mistral \
   --max-model-len 64000 \
   --rope-scaling '{"rope_type":"yarn","factor":4.0,"original_max_position_embeddings":32768}' \
   --no-enable-prefix-caching
+  ```