metadata
license: mit
language:
  - en
tags:
  - text-to-sql
  - spider
  - hrm-text
base_model: sapientai/HRM-Text-1B

HRM-Text-1B-SQL-Spider

A fine-tuned version of sapientai/HRM-Text-1B for text-to-SQL generation, trained on the Spider dataset.

Model Details

  • Base Model: sapientai/HRM-Text-1B (1B parameters, hierarchical reasoning model)
  • Training Data: Spider dataset (~20k examples)
  • Training: 3 epochs, ~6 minutes on an NVIDIA L40S GPU
  • Architecture: Hierarchical Reasoning Model with H_cycles=2, L_cycles=3

Performance

| Model | Accuracy |
|---|---|
| Base | 8.00% |
| Fine-tuned | 70.00% |
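
The card does not say which metric these accuracy figures use. One common choice for Spider is execution accuracy, where a prediction counts as correct if it returns the same rows as the gold query. The sketch below illustrates that idea on a toy in-memory SQLite database; the helper names, the table, and the query pairs are made up for illustration and are not the author's evaluation script.

```python
import sqlite3
from collections import Counter


def execution_match(conn, predicted_sql, gold_sql):
    """Return True when both queries yield the same multiset of rows."""
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        return False  # a prediction that fails to run counts as a miss
    gold_rows = conn.execute(gold_sql).fetchall()
    return Counter(predicted_rows) == Counter(gold_rows)


def accuracy(conn, pairs):
    """Fraction of (predicted, gold) pairs whose results match."""
    if not pairs:
        return 0.0
    hits = sum(execution_match(conn, p, g) for p, g in pairs)
    return hits / len(pairs)


# Toy database, invented for this example only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    INSERT INTO singer VALUES (1, 'Ann', 30), (2, 'Bo', 41), (3, 'Cy', 25);
    """
)

pairs = [
    ("SELECT count(*) FROM singer", "SELECT COUNT(id) FROM singer"),
    ("SELECT name FROM singer WHERE age > 28", "SELECT name FROM singer WHERE age >= 30"),
    ("SELECT nam FROM singer", "SELECT name FROM singer"),  # invalid column name
    ("SELECT name FROM singer", "SELECT name FROM singer WHERE age < 30"),
]
score = accuracy(conn, pairs)
print(f"{score:.2%}")
```

Comparing row multisets ignores row order, which is lenient for queries with ORDER BY; a stricter check would compare ordered lists in that case.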

Usage
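
The card does not document the loading code or the prompt format used during fine-tuning. As a minimal sketch of what schema-aware input might look like, the hypothetical `build_prompt` helper below serializes a schema and a question into one string. The layout is an assumption; for real use, match whatever format the training data used.

```python
def build_prompt(question, schema):
    """Render a schema-aware prompt (hypothetical layout, not the trained format).

    `schema` maps each table name to a list of its column names.
    """
    lines = ["-- Database schema"]
    for table, columns in schema.items():
        lines.append(f"CREATE TABLE {table} ({', '.join(columns)});")
    lines.append(f"-- Question: {question.strip()}")
    lines.append("-- SQL:")
    return "\n".join(lines)


# Example schema, invented for illustration.
schema = {
    "singer": ["id", "name", "age"],
    "concert": ["id", "singer_id", "year"],
}
prompt = build_prompt("How many singers are older than 30?", schema)
print(prompt)
```

The resulting string would then be tokenized and passed to the model's generation routine, which the card does not specify.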

Training Details

  • Framework: PyTorch with FlashAttention 3
  • Loss: Cross-entropy
  • Hardware: NVIDIA L40S GPU on AWS
  • Training Time: ~6 minutes
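
For readers unfamiliar with the loss listed above, the following is a plain-Python sketch of token-level cross-entropy averaged over a sequence. It is illustrative only: the actual training code presumably uses PyTorch's built-in loss, and whether prompt tokens are masked out of the average is not stated in the card.

```python
import math


def token_cross_entropy(logits, target):
    """Negative log-probability of `target` under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def sequence_loss(logits_per_step, targets):
    """Mean cross-entropy over all positions of one sequence."""
    losses = [token_cross_entropy(l, t) for l, t in zip(logits_per_step, targets)]
    return sum(losses) / len(losses)


# Toy vocabulary of 4 tokens, 2 positions (values invented for illustration).
logits_per_step = [[0.0, 0.0, 0.0, 0.0], [5.0, 0.0, 0.0, 0.0]]
targets = [2, 0]
print(sequence_loss(logits_per_step, targets))
```

With uniform logits the per-token loss equals the log of the vocabulary size, which is a handy sanity check for a freshly initialized output layer.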

Limitations

  • Maximum sequence length: 4096 tokens
  • Requires FlashAttention 3 for inference (Ada Lovelace or newer GPUs)
  • Performs best on Spider-style, schema-aware SQL generation
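
The first two limits above can be checked before a request reaches the model. The sketch below takes the card's stated requirements at face value and assumes two things the card does not confirm: that "Ada Lovelace or newer" corresponds to CUDA compute capability 8.9 or higher, and that the 4096-token limit covers prompt plus generated tokens. The `preflight` helper is hypothetical.

```python
ADA_LOVELACE = (8, 9)       # CUDA compute capability of Ada Lovelace GPUs such as the L40S
MAX_SEQUENCE_LENGTH = 4096  # limit stated in the card


def preflight(capability, prompt_tokens, max_new_tokens):
    """Return a list of problems; an empty list means the request looks fine."""
    problems = []
    # Assumption: "Ada Lovelace or newer" means capability >= 8.9.
    if tuple(capability) < ADA_LOVELACE:
        problems.append(
            f"compute capability {capability[0]}.{capability[1]} is older than 8.9"
        )
    # Assumption: the limit applies to prompt and generated tokens combined.
    total = prompt_tokens + max_new_tokens
    if total > MAX_SEQUENCE_LENGTH:
        problems.append(f"{total} tokens exceeds the {MAX_SEQUENCE_LENGTH}-token limit")
    return problems


print(preflight((8, 9), prompt_tokens=900, max_new_tokens=256))
print(preflight((8, 0), prompt_tokens=4000, max_new_tokens=256))
```

In a PyTorch environment, the capability tuple can be obtained from `torch.cuda.get_device_capability()`, and the token count from whichever tokenizer ships with the model.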

License

MIT License