InterstellarCG / HRM-Text-1B-SQL-Spider


	---
	license: mit
	language:
	- en
	tags:
	- text-to-sql
	- spider
	- hrm-text
	base_model: sapientai/HRM-Text-1B
	---

	# HRM-Text-1B-SQL-Spider

	A fine-tuned version of [HRM-Text-1B](https://huggingface.co/sapientai/HRM-Text-1B) for text-to-SQL generation, trained on the Spider dataset.

	## Model Details

	- Base Model: sapientai/HRM-Text-1B (1B parameters, hierarchical reasoning model)
	- Training Data: Spider dataset (~20k examples)
	- Training: 3 epochs, ~6 minutes on an L40S GPU
	- Architecture: Hierarchical Reasoning Model with H_cycles=2, L_cycles=3
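To make the `H_cycles=2, L_cycles=3` setting concrete, here is a schematic of the update order it implies. This is a sketch under an assumption: that the model follows the nested schedule usually described for hierarchical reasoning models, where the low-level module takes `L_cycles` steps before each high-level update. The function name `cycle_schedule` and the step labels are hypothetical and are not taken from the model's code.

```python
def cycle_schedule(h_cycles: int, l_cycles: int) -> list[str]:
    """Return the assumed order of module updates for one forward pass."""
    steps = []
    for h in range(h_cycles):
        # The low-level module refines its state l_cycles times...
        for l in range(l_cycles):
            steps.append(f"L{h}.{l}")
        # ...then the high-level module updates once.
        steps.append(f"H{h}")
    return steps


# Values from the Model Details list above.
schedule = cycle_schedule(h_cycles=2, l_cycles=3)
low_level_updates = sum(1 for step in schedule if step.startswith("L"))
print(schedule)
print(low_level_updates)
```

Under this assumption, one forward pass performs 6 low-level updates and 2 high-level updates.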

	## Performance

	| Model | Accuracy |
	|-------|----------|
	| Base | 8.00% |
	| Fine-tuned | 70.00% |
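
The card does not state which accuracy metric or evaluation split these numbers refer to. As an illustration only, the sketch below shows one common way to score text-to-SQL output: execution match, where a prediction counts as correct if it returns the same rows as the reference query. The toy `singer` table, the example queries, and the helper names are all invented for this sketch and are not the evaluation used for the table above.

```python
import sqlite3


def run_query(conn, sql):
    """Return the result rows in a canonical order, or None if the query fails."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    # Sort so that row order does not affect the comparison.
    return sorted(rows, key=repr)


def execution_accuracy(conn, pairs):
    """Fraction of (predicted, gold) pairs whose result sets match."""
    hits = 0
    for predicted, gold in pairs:
        expected = run_query(conn, gold)
        actual = run_query(conn, predicted)
        if expected is not None and actual == expected:
            hits += 1
    return hits / len(pairs)


# Hypothetical toy database standing in for a Spider schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE singer (singer_id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
INSERT INTO singer VALUES (1, 'Ana', 31), (2, 'Bo', 45), (3, 'Cy', 27);
""")

pairs = [
    # Same result: counted as correct.
    ("SELECT count(*) FROM singer", "SELECT COUNT(*) FROM singer"),
    # Different filter, different rows: counted as wrong.
    ("SELECT name FROM singer WHERE age > 30", "SELECT name FROM singer WHERE age > 40"),
    # Misspelled column, query fails: counted as wrong.
    ("SELECT nam FROM singer", "SELECT name FROM singer"),
    # Equivalent ordering clause: counted as correct.
    ("SELECT name FROM singer ORDER BY age", "SELECT name FROM singer ORDER BY age ASC"),
]

accuracy = execution_accuracy(conn, pairs)
print(f"{accuracy:.2%}")
```

Sorting rows before comparison ignores `ORDER BY` differences; a stricter evaluator would compare ordered results when the reference query specifies an order.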

	## Usage
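
A minimal sketch of preparing a schema-aware prompt and cleaning up a completion. The prompt layout and the helper names `build_prompt` and `extract_sql` are assumptions for illustration and may differ from the format used during fine-tuning. Loading the model and calling generation depend on the HRM-Text release and are not shown.

```python
def build_prompt(question: str, schema: dict[str, list[str]]) -> str:
    """Serialize the schema as one `table(col, ...)` line per table, then the question."""
    lines = ["-- Database schema"]
    for table, columns in schema.items():
        lines.append(f"{table}({', '.join(columns)})")
    lines.append("-- Question")
    lines.append(question.strip())
    lines.append("-- SQL")
    return "\n".join(lines)


def extract_sql(completion: str) -> str:
    """Keep the first statement of a completion and normalize its whitespace.

    Splitting on ';' is a simplification: it would cut a statement that
    contains a semicolon inside a string literal.
    """
    first_statement = completion.strip().split(";", 1)[0]
    return " ".join(first_statement.split()) + ";"


# Hypothetical schema and question in the style of Spider.
schema = {
    "singer": ["singer_id", "name", "age"],
    "concert": ["concert_id", "singer_id", "year"],
}
prompt = build_prompt("How many singers are older than 30?", schema)
print(prompt)

# Stand-in for text returned by the model.
raw_completion = "SELECT count(*)\nFROM singer WHERE age > 30;\n-- trailing text"
sql = extract_sql(raw_completion)
print(sql)
```

Keeping the schema in the prompt matters because the model is tuned for schema-aware generation; a question without table and column names gives it nothing to ground the query on.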



	## Training Details

	- Framework: PyTorch with FlashAttention 3
	- Loss: Cross-entropy
	- Hardware: NVIDIA L40S GPU on AWS
	- Training Time: ~6 minutes

	## Limitations

	- Maximum sequence length: 4096 tokens
	- Requires FlashAttention 3 for inference (Ada Lovelace or newer GPUs)
	- Best performance on Spider-style schema-aware SQL generation
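
The first two limits can be checked before running inference. The sketch below is a hedged illustration: the helper names are hypothetical, the GPU check relies on Ada Lovelace GPUs such as the L40S reporting CUDA compute capability 8.9, and it assumes the 4096-token limit covers the prompt and the generated tokens together, which the card does not state.

```python
MAX_SEQ_LEN = 4096      # maximum sequence length from the list above
ADA_LOVELACE = (8, 9)   # CUDA compute capability of Ada Lovelace GPUs (e.g. L40S)


def gpu_is_supported(capability: tuple[int, int]) -> bool:
    """True if the (major, minor) compute capability is Ada Lovelace or newer."""
    return capability >= ADA_LOVELACE


def fits_context(prompt_tokens: int, max_new_tokens: int) -> bool:
    """True if the prompt plus the generation budget stays within the limit.

    Assumes the limit applies to prompt and generated tokens combined.
    """
    return prompt_tokens + max_new_tokens <= MAX_SEQ_LEN


# With PyTorch installed, the capability can be read with
# torch.cuda.get_device_capability(); fixed values are used here instead.
for name, capability in [("L40S", (8, 9)), ("A100", (8, 0)), ("H100", (9, 0))]:
    print(name, gpu_is_supported(capability))

print(fits_context(prompt_tokens=3900, max_new_tokens=256))
```

When a large schema pushes the prompt over the limit, a practical option is to include only the tables relevant to the question rather than truncating the prompt mid-schema.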

	## License

	MIT License