---
license: mit
language:
- en
tags:
- text-to-sql
- spider
- hrm-text
base_model: sapientai/HRM-Text-1B
---
# HRM-Text-1B-SQL-Spider
A fine-tuned version of [HRM-Text-1B](https://huggingface.co/sapientai/HRM-Text-1B), trained on the Spider text-to-SQL dataset.
## Model Details
- **Base Model:** sapientai/HRM-Text-1B (1B parameters, hierarchical reasoning model)
- **Training Data:** Spider dataset (~20k examples)
- **Training:** 3 epochs, ~6 minutes on an L40S GPU
- **Architecture:** Hierarchical Reasoning Model with H_cycles=2, L_cycles=3
## Performance
| Model | Accuracy |
|-------|----------|
| Base | 8.00% |
| **Fine-tuned** | **70.00%** |
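
The table does not say which accuracy metric was used. As a rough illustration only, the sketch below computes a normalized exact-match accuracy between predicted and reference SQL strings. The helper names `normalize_sql` and `exact_match_accuracy` are hypothetical, and this is not necessarily the metric behind the numbers above (Spider evaluations often use execution accuracy or component-level matching instead).

```python
def normalize_sql(sql: str) -> str:
    """Lowercase, drop a trailing semicolon, and collapse whitespace.

    Note: lowercasing also changes string literals, so this is a crude
    comparison, not a semantic equivalence check.
    """
    return " ".join(sql.strip().rstrip(";").lower().split())


def exact_match_accuracy(predictions: list[str], references: list[str]) -> float:
    """Return the percentage of predictions equal to their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    matches = sum(
        normalize_sql(pred) == normalize_sql(ref)
        for pred, ref in zip(predictions, references)
    )
    return 100.0 * matches / len(references)


if __name__ == "__main__":
    preds = ["SELECT count(*) FROM singer;", "select name from singer"]
    refs = ["select count(*) from singer", "SELECT age FROM singer"]
    print(f"{exact_match_accuracy(preds, refs):.2f}%")
```

String matching of this kind undercounts correct queries that are written differently from the reference, so it is best read as a lower bound.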
## Usage
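
A minimal sketch of schema-aware inference, under several assumptions: the prompt layout produced by `build_prompt` is hypothetical (the card does not document the format used during fine-tuning), `model_id` is a placeholder for this repository's Hub id, and loading through `AutoModelForCausalLM` with `trust_remote_code=True` is assumed to work for this architecture. Adjust all three to match the actual training setup.

```python
def build_prompt(question: str, schema: dict[str, list[str]]) -> str:
    """Serialize a database schema and a question into one prompt.

    The "Schema / Question / SQL" layout is an assumption, not the
    documented training format.
    """
    lines = ["Schema:"]
    for table, columns in schema.items():
        lines.append(f"{table}({', '.join(columns)})")
    lines.append(f"Question: {question.strip()}")
    lines.append("SQL:")
    return "\n".join(lines)


def generate_sql(model_id: str, prompt: str, max_new_tokens: int = 128) -> str:
    """Generate SQL for a prompt. Requires a CUDA GPU, torch, and transformers."""
    # Imported lazily so build_prompt can be used without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, trust_remote_code=True, torch_dtype=torch.bfloat16
    ).to("cuda")

    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Keep only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()


if __name__ == "__main__":
    schema = {"singer": ["singer_id", "name", "age"]}
    print(build_prompt("How many singers are there?", schema))
    # To run inference, pass this repository's Hub id (placeholder here):
    # print(generate_sql("<this-repo-id>", build_prompt("How many singers are there?", schema)))
```

Keep the serialized schema plus question well under the 4096-token limit, since the generated SQL counts toward the same budget.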
## Training Details
- **Framework:** PyTorch with FlashAttention 3
- **Loss:** Cross-entropy
- **Hardware:** NVIDIA L40S GPU on AWS
- **Training Time:** ~6 minutes
## Limitations
- Maximum sequence length: 4096 tokens
- Requires FlashAttention 3 for inference (Ada Lovelace or newer GPUs)
- Best performance on Spider-style schema-aware SQL generation
## License
MIT License