# matsuo-llm-advanced-phase-e6a
Fine-tuned from Qwen/Qwen2.5-7B-Instruct for agent tasks.
## Training Configuration
- LoRA: r=12, alpha=24 (r raised from 8 to 12; the only change from Phase D)
- lr: 1e-5, epochs: 0.3, batch: 4×4=16
- Data: identical to Phase D (Spider/BIRD 70% + DBBench 20% + ALFWorld 10%, 3,500 samples)
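The 70/20/10 data mixture over 3,500 samples can be sanity-checked with a small helper. This is an illustrative sketch, not part of the training pipeline; the function name and ratio keys are assumptions:

```python
# Sanity-check the Phase D data mixture: 70% Spider/BIRD, 20% DBBench, 10% ALFWorld.
def mixture_counts(total: int, ratios: dict[str, float]) -> dict[str, int]:
    """Return per-source sample counts for a given total and ratio map."""
    assert abs(sum(ratios.values()) - 1.0) < 1e-9, "ratios must sum to 1"
    return {name: round(total * r) for name, r in ratios.items()}

counts = mixture_counts(3500, {"spider_bird": 0.70, "dbbench": 0.20, "alfworld": 0.10})
print(counts)  # {'spider_bird': 2450, 'dbbench': 700, 'alfworld': 350}
```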
## Datasets
- `u-10bei/dbbench_sft_dataset_react_v4` – Listed in the organizer-shared Phase B dataset list; used as provided (no modification). Third-party synthetic SFT data for DBBench format alignment; all tables, data, and queries are independently generated (per the dataset description: "to avoid test data leakage").
- `xlangai/spider` – CC BY-SA 4.0 (Yale/Columbia Spider project)
- `birdsql/bird_mini_dev` – CC BY-SA 4.0 (HKU)
- Official Phase B ALFWorld v5 dataset – organizer-provided, used as provided.
## Compliance
- Evaluation data not used in training: No analysis of evaluation test data was conducted.
- LLM was not used for data quality filtering or selection.
- Inference code not modified.
## Usage
Compatible with vLLM v0.13.0+.
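A minimal offline-inference sketch with the vLLM Python API. This assumes the published weights are available locally or on the Hub; the model id and prompt below are illustrative placeholders, and running it requires a GPU:

```python
from vllm import LLM, SamplingParams

# Load the fine-tuned checkpoint (repo id is illustrative; substitute the actual path).
llm = LLM(model="matsuo-llm-advanced-phase-e6a")
params = SamplingParams(temperature=0.0, max_tokens=512)

# Generate a completion for a single agent-task prompt.
outputs = llm.generate(["List the top 5 customers by total revenue."], params)
print(outputs[0].outputs[0].text)
```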