HRM-Text-1B-SQL-Spider

HRM-Text-1B fine-tuned on the Spider text-to-SQL dataset.

Model Details

  • Base Model: sapientai/HRM-Text-1B (1B parameters, hierarchical reasoning model)
  • Training Data: Spider dataset (~20k examples)
  • Training: 3 epochs, ~6 minutes on an L40S GPU
  • Architecture: Hierarchical Reasoning Model with H_cycles=2, L_cycles=3
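As a rough illustration of what the two cycle counts mean, the sketch below assumes the usual HRM nesting, in which the low-level module takes L_cycles steps for every high-level update and the high-level module updates H_cycles times. The function name run_hrm_cycles and the recorded "updates" are hypothetical placeholders, not the model's actual layers.

```python
def run_hrm_cycles(h_cycles=2, l_cycles=3):
    """Toy trace of nested HRM-style updates.

    Assumption: the low-level module steps l_cycles times before each
    high-level update. Real modules are replaced by trace entries.
    """
    trace = []
    for h in range(h_cycles):
        for l in range(l_cycles):
            # Placeholder for a low-level update conditioned on the high-level state
            trace.append(("L", h, l))
        # Placeholder for a high-level update that reads the low-level state
        trace.append(("H", h))
    return trace


trace = run_hrm_cycles()
low_steps = sum(1 for step in trace if step[0] == "L")
high_steps = sum(1 for step in trace if step[0] == "H")
print(low_steps, high_steps)
```

Under this assumption, H_cycles=2 and L_cycles=3 give six low-level steps and two high-level updates per forward pass.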

Performance

  Model        Accuracy
  Base         8.00%
  Fine-tuned   70.00%
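The accuracy metric is not specified above. One common choice for Spider is execution accuracy, where a prediction counts as correct if it returns the same rows as the reference query. The sketch below shows that idea with Python's built-in sqlite3 module; the toy table, the query pairs, and the helper names are made up for illustration and are not the evaluation used to produce the numbers above.

```python
import sqlite3


def execution_match(conn, predicted_sql, gold_sql):
    """Return True if the prediction runs and yields the same rows as the gold query."""
    try:
        pred_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        # Invalid SQL from the model counts as a miss
        return False
    gold_rows = conn.execute(gold_sql).fetchall()
    # Compare order-insensitively; repr gives a stable sort key for mixed types
    return sorted(pred_rows, key=repr) == sorted(gold_rows, key=repr)


def execution_accuracy(conn, pairs):
    """Fraction of (predicted, gold) pairs whose results match."""
    if not pairs:
        return 0.0
    hits = sum(execution_match(conn, pred, gold) for pred, gold in pairs)
    return hits / len(pairs)


# Toy database (hypothetical, for illustration only)
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    INSERT INTO singer VALUES (1, 'Ann', 30), (2, 'Bo', 45);
    """
)

pairs = [
    # Different SQL text, same result rows
    ("SELECT name FROM singer WHERE age > 40",
     "SELECT name FROM singer WHERE age >= 41"),
    # Misspelled column, so the predicted query fails to execute
    ("SELECT nam FROM singer", "SELECT name FROM singer"),
]
accuracy = execution_accuracy(conn, pairs)
print(f"{accuracy:.2%}")
```

Comparing result rows rather than SQL strings avoids penalizing predictions that are worded differently but are equivalent on the evaluation database.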

Usage
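A minimal sketch of preparing a schema-aware input for the model. The prompt template and the helper name build_prompt are assumptions for illustration; the exact format used during fine-tuning is not stated here, and loading the model itself is left out.

```python
def build_prompt(question, schema):
    """Format a question and a table schema into a single prompt string.

    schema maps each table name to a list of its column names.
    The layout below is an assumed template, not the training format.
    """
    lines = ["Schema:"]
    for table, columns in schema.items():
        lines.append(f"{table}({', '.join(columns)})")
    lines.append(f"Question: {question.strip()}")
    lines.append("SQL:")
    return "\n".join(lines)


prompt = build_prompt(
    "How many singers are older than 40?",
    {"singer": ["id", "name", "age"]},
)
print(prompt)
```

The resulting string would then be tokenized and passed to the model, keeping the total length within the 4096-token limit.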

Training Details

  • Framework: PyTorch with FlashAttention 3
  • Loss: Cross-entropy
  • Hardware: NVIDIA L40S GPU on AWS
  • Training Time: ~6 minutes

Limitations

  • Maximum sequence length: 4096 tokens
  • Requires FlashAttention 3 for inference (Ada Lovelace or newer GPUs)
  • Best performance on Spider-style schema-aware SQL generation

License

MIT License
