InterstellarCG committed on
Commit ecc8338 · verified · 1 Parent(s): 5f5f901

Add model card

Files changed (1)
  1. README.md +49 -0
README.md ADDED
@@ -0,0 +1,49 @@
+ ---
+ license: mit
+ language:
+ - en
+ tags:
+ - text-to-sql
+ - spider
+ - hrm-text
+ base_model: sapientai/HRM-Text-1B
+ ---
+
+ # HRM-Text-1B-SQL-Spider
+
+ Fine-tuned version of [HRM-Text-1B](https://huggingface.co/sapientai/HRM-Text-1B) on the Spider text-to-SQL dataset.
+
+ ## Model Details
+
+ - **Base Model:** sapientai/HRM-Text-1B (1B parameters, hierarchical reasoning model)
+ - **Training Data:** Spider dataset (~20k examples)
+ - **Training:** 3 epochs, ~6 minutes on L40S GPU
+ - **Architecture:** Hierarchical Reasoning Model with H_cycles=2, L_cycles=3
+
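To illustrate what `H_cycles=2, L_cycles=3` could mean in practice, here is a minimal sketch. It assumes the nested schedule commonly described for hierarchical reasoning models, where the low-level (L) module takes `L_cycles` steps for each high-level (H) step. The function name `cycle_schedule` is hypothetical, and the actual HRM-Text-1B forward pass may differ.

```python
def cycle_schedule(h_cycles: int = 2, l_cycles: int = 3) -> list[tuple[str, int, int]]:
    """Return the order of module updates as (module, h_index, l_index) tuples.

    Assumption (not confirmed by the model card): each high-level (H) update
    follows l_cycles low-level (L) updates.
    """
    schedule = []
    for h in range(h_cycles):
        for l in range(l_cycles):
            schedule.append(("L", h, l))
        # l_index is -1 for H updates because they sit outside the inner loop.
        schedule.append(("H", h, -1))
    return schedule


if __name__ == "__main__":
    for step in cycle_schedule():
        print(step)
```

Under this assumed schedule, one forward pass performs `H_cycles * L_cycles` low-level updates and `H_cycles` high-level updates.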
+ ## Performance
+
+ | Model | Accuracy |
+ |-------|----------|
+ | Base | 8.00% |
+ | **Fine-tuned** | **70.00%** |
+
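The table does not say how accuracy was measured. Spider's official evaluation reports exact set match and execution accuracy; the sketch below is a simpler, string-based approximation meant only to show the shape of such a metric. The helper names `normalize_sql` and `exact_match_accuracy` are hypothetical and are not taken from the author's evaluation code.

```python
import re


def normalize_sql(sql: str) -> str:
    """Lowercase, drop a trailing semicolon, and collapse whitespace.

    Note: lowercasing also changes string literals, so this is only a
    rough approximation of a real SQL comparison.
    """
    sql = sql.strip().rstrip(";").lower()
    return re.sub(r"\s+", " ", sql).strip()


def exact_match_accuracy(predictions: list[str], references: list[str]) -> float:
    """Return the percentage of predictions that match their reference query."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(
        normalize_sql(pred) == normalize_sql(ref)
        for pred, ref in zip(predictions, references)
    )
    return 100.0 * hits / len(references)


if __name__ == "__main__":
    preds = ["SELECT name FROM singer;", "select count(*) from singer"]
    refs = ["select name  from singer", "SELECT count(*) FROM concert"]
    print(f"{exact_match_accuracy(preds, refs):.2f}%")
```

A string comparison like this undercounts queries that are semantically equivalent but written differently, which is why execution-based metrics are usually preferred.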
+ ## Usage
+
+
+
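The model card does not specify a prompt format here, so the following is only a sketch of one common way to prepare Spider-style, schema-aware input: serialize the tables and columns, then append the question. The template and the function name `build_prompt` are assumptions, not the format this model was trained on, and the model-loading step is deliberately left out because the card does not document it.

```python
def build_prompt(question: str, schema: dict[str, list[str]]) -> str:
    """Serialize a database schema and a question into one text prompt.

    Hypothetical template: the format actually used for fine-tuning
    is not documented in the model card.
    """
    lines = ["-- Database schema"]
    for table, columns in schema.items():
        lines.append(f"{table}({', '.join(columns)})")
    lines.append("-- Question")
    lines.append(question.strip())
    lines.append("-- SQL")
    return "\n".join(lines)


if __name__ == "__main__":
    schema = {
        "singer": ["singer_id", "name", "country"],
        "concert": ["concert_id", "singer_id", "year"],
    }
    print(build_prompt("How many singers are there?", schema))
```

The resulting string would then be tokenized and passed to the model, keeping the total length within the 4096-token limit noted under Limitations.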
+ ## Training Details
+
+ - **Framework:** PyTorch with FlashAttention 3
+ - **Loss:** Cross-entropy
+ - **Hardware:** AWS L40S GPU
+ - **Training Time:** ~6 minutes
+
+ ## Limitations
+
+ - Maximum sequence length: 4096 tokens
+ - Requires FlashAttention 3 for inference (Ada Lovelace or newer GPUs)
+ - Best performance on Spider-style schema-aware SQL generation
+
+ ## License
+
+ MIT License