kramster commited on
Commit
35287d8
·
verified ·
1 Parent(s): 3143b60

Upload README.md with huggingface_hub

Browse files
Files changed (1) hide show
  1. README.md +36 -18
README.md CHANGED
@@ -15,18 +15,34 @@ datasets:
15
  base_model: mistralai/Mistral-7B-Instruct-v0.2
16
  ---
17
 
18
- # 🧠 Evolve Mistral: Fine-Tuned Mistral-7B-Instruct on CRUD Coding Tasks
19
 
20
- This model is a fine-tuned version of [`mistralai/Mistral-7B-Instruct-v0.2`](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2), adapted for reasoning about structured CRUD-based code inputs and instruction-following tasks. It was trained on a dataset in [Alpaca](https://github.com/tatsu-lab/stanford_alpaca)-style format, using supervised fine-tuning (SFT).
21
 
22
  ---
23
 
24
- ## 📂 Dataset
 
 
 
 
 
 
 
 
 
 
 
25
 
26
- The model was trained on:
 
 
27
 
28
  **[`kramster/crud-code-tests`](https://huggingface.co/datasets/kramster/crud-code-tests)**
29
- A dataset of instruction-based code snippets focusing on Create, Read, Update, and Delete operations in various programming contexts. It uses the Alpaca-style JSON format with fields: `instruction`, `input`, and `output`.
 
 
 
30
 
31
  ---
32
 
@@ -35,34 +51,36 @@ A dataset of instruction-based code snippets focusing on Create, Read, Update, a
35
  | Detail | Value |
36
  |---------------------|-------|
37
  | Base model | `mistralai/Mistral-7B-Instruct-v0.2` |
 
38
  | LoRA Config | r=32, alpha=16 |
39
  | Framework | Axolotl + DeepSpeed + LoRA |
40
- | Training Steps | 51 |
41
  | Epochs | ~3.94 |
42
- | Mixed Precision | bfloat16 |
 
43
  | GPU | NVIDIA H100 80GB |
44
- | Training Duration | 10m 26s |
45
- | Final Train Loss | 0.0909 |
46
- | Final Eval Loss | 0.1012 |
47
- | FLOPs used | 347.6 trillion |
48
-
49
 
50
  ---
51
 
52
  ## 🧪 Evaluation Summary
53
 
54
- - **Eval runtime:** 2.84s
55
- - **Eval samples/sec:** 2.11
56
- - **Eval steps/sec:** 1.05
57
- - **Gradient norm (final):** 0.064
58
- - **Final LR:** 2.93e-7
59
 
60
  ---
61
 
62
- ## 🧠 Example Usage
63
 
 
64
  vllm-api-server \
65
  --model kramster/evolve-mistral \
66
  --max-model-len 64000 \
67
  --rope-scaling '{"rope_type":"yarn","factor":4.0,"original_max_position_embeddings":32768}' \
68
  --no-enable-prefix-caching
 
 
15
  base_model: mistralai/Mistral-7B-Instruct-v0.2
16
  ---
17
 
18
+ # 🧠 Evolve Mistral: Fine-Tuned Mistral-7B-Instruct for AI CRUD & Code Generation
19
 
20
+ This is a fine-tuned version of [`mistralai/Mistral-7B-Instruct-v0.2`](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2), adapted specifically for **code generation, schema-driven CRUD reasoning, and full-stack boilerplate automation**. It powers the AI agent layer behind the [Self-Revolve project](https://github.com/self-evolving-runtimes/revolve).
21
 
22
  ---
23
 
24
+ ## 🌐 Project Context: Self-Revolve
25
+
26
+ [Evolve Mistral](https://huggingface.co/kramster/evolve-mistral) is a fine-tuned open-source model **purpose-built for powering code generation** in the [Self-Revolve project](https://github.com/self-evolving-runtimes/revolve).
27
+
28
+ > “Instantly generate full-stack admin panels, APIs, and UIs from your database schema—powered by AI agents & LLMs.”
29
+
30
+ **Key capabilities:**
31
+ - 🧠 Auto-generates CRUD APIs from DB schemas
32
+ - ✨ Generates React/MUI admin interfaces
33
+ - 🗃️ Supports SQL & NoSQL databases
34
+ - ⚡ Works without OpenAI keys
35
+ - 🚀 Open-source & self-hostable
36
 
37
+ ---
38
+
39
+ ## 📂 Dataset
40
 
41
  **[`kramster/crud-code-tests`](https://huggingface.co/datasets/kramster/crud-code-tests)**
42
+ A high-quality Alpaca-style dataset focused on database and backend code generation. Each example contains:
43
+ - `instruction`
44
+ - `input`
45
+ - `output`
46
 
47
  ---
48
 
 
51
  | Detail | Value |
52
  |---------------------|-------|
53
  | Base model | `mistralai/Mistral-7B-Instruct-v0.2` |
54
+ | Dataset | `crud-code-tests` (Alpaca-style) |
55
  | LoRA Config | r=32, alpha=16 |
56
  | Framework | Axolotl + DeepSpeed + LoRA |
 
57
  | Epochs | ~3.94 |
58
+ | Steps | 51 |
59
+ | Precision | bfloat16 |
60
  | GPU | NVIDIA H100 80GB |
61
+ | Duration | ~10m |
62
+ | Train Loss | 0.0909 |
63
+ | Eval Loss | 0.1012 |
64
+ | FLOPs | ~347.6 trillion |
 
65
 
66
  ---
67
 
68
  ## 🧪 Evaluation Summary
69
 
70
+ - Eval runtime: 2.84s
71
+ - Samples/sec: 2.11
72
+ - Steps/sec: 1.05
73
+ - Final learning rate: 2.93e-7
74
+ - Gradient norm: 0.064
75
 
76
  ---
77
 
78
+ ## 💻 Example Usage (VLLM)
79
 
80
+ ```bash
81
  vllm-api-server \
82
  --model kramster/evolve-mistral \
83
  --max-model-len 64000 \
84
  --rope-scaling '{"rope_type":"yarn","factor":4.0,"original_max_position_embeddings":32768}' \
85
  --no-enable-prefix-caching
86
+ ```