laqwenta_sft_model / README.md

jeanprbt

d6c3e4b verified 8 months ago

metadata

library_name: transformers
datasets:
  - meta-math/MetaMathQA
base_model:
  - Qwen/Qwen3-0.6B-Base
pipeline_tag: text-generation

This model is a version of Qwen3-0.6B-Base fine-tuned with supervised fine-tuning (SFT), using the trl library, on a subsample of 6k pairs from the MetaMathQA dataset. It is the first step of LaQwenTa, a lightweight STEM question-answering model for educational purposes.
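
As a rough illustration of the setup described above, the sketch below shows one way a 6k subsample of MetaMathQA could be prepared and passed to trl's `SFTTrainer`. It is not the training script used for this model: the seed, the output directory `laqwenta_sft`, the prompt/completion formatting, the helper names, and all default hyperparameters are assumptions, as is the use of the dataset's `query` and `response` columns.

```python
import random


def subsample_pairs(rows, k=6000, seed=0):
    """Draw k rows without replacement, reproducibly (seed is an assumption)."""
    rows = list(rows)
    if k >= len(rows):
        return rows
    return random.Random(seed).sample(rows, k)


def to_prompt_completion(row):
    """Map a MetaMathQA-style row to a prompt/completion record.

    Assumes the dataset exposes `query` and `response` columns.
    """
    return {"prompt": row["query"], "completion": row["response"]}


def train_sketch(num_pairs=6000, seed=0, output_dir="laqwenta_sft"):
    """Hypothetical training entry point; needs `datasets`, `trl`, and network access."""
    from datasets import load_dataset
    from trl import SFTConfig, SFTTrainer

    dataset = load_dataset("meta-math/MetaMathQA", split="train")
    # Shuffle, then keep the first `num_pairs` rows as the subsample.
    dataset = dataset.shuffle(seed=seed).select(range(num_pairs))
    dataset = dataset.map(to_prompt_completion, remove_columns=dataset.column_names)

    trainer = SFTTrainer(
        model="Qwen/Qwen3-0.6B-Base",  # base model named in the metadata
        train_dataset=dataset,
        args=SFTConfig(output_dir=output_dir),  # default hyperparameters, assumed
    )
    trainer.train()
    return trainer
```

The two helpers are plain Python and can be checked without downloading anything; `train_sketch` is only defined here, not called, since it downloads the dataset and the base model.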