---
library_name: transformers
tags:
- small-lm
- code
- reasoning
- slm
license: apache-2.0
datasets:
- theblackcat102/evol-codealpaca-v1
base_model:
- Qwen/Qwen3-1.7B
---

# Qwen3-1.7B-Code

This model was obtained by fine-tuning [Qwen/Qwen3-1.7B](https://huggingface.co/Qwen/Qwen3-1.7B) on the train split of [evol-codealpaca-v1](https://huggingface.co/datasets/theblackcat102/evol-codealpaca-v1).
The model is used in the experiments described in https://bknyaz.github.io/blog/2026/meta-merge/.
A single A100 was used for fine-tuning and evaluation.

The following versions were used for training/evaluation:

- python >= 3.10
- torch: 2.9.0+cu128
- lm_eval: 0.4.9.1
- vllm: 0.11.1
- transformers: 4.57.6
- datasets: 3.2.0
- numpy: 2.2.6

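To reproduce the experiments, it can be worth confirming the local environment matches the list above. A minimal sketch (the helper `installed_versions` is hypothetical, not part of any of the listed packages; `lm_eval` is the lm-evaluation-harness distribution):

```python
# Check installed package versions against the list above.
import importlib.metadata as md

def installed_versions(pkgs):
    """Return {package: version string, or None if not installed}."""
    vers = {}
    for pkg in pkgs:
        try:
            vers[pkg] = md.version(pkg)
        except md.PackageNotFoundError:
            vers[pkg] = None
    return vers

print(installed_versions(
    ["torch", "lm_eval", "vllm", "transformers", "datasets", "numpy"]
))
```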
## Training

The [TRL](https://github.com/huggingface/trl) library was used for full-rank SFT:

```bash
python trl/scripts/sft.py --model_name_or_path Qwen/Qwen3-1.7B --dataset_name theblackcat102/evol-codealpaca-v1 --learning_rate 2e-5 \
  --num_train_epochs 1 --per_device_train_batch_size 2 --gradient_accumulation_steps 8 --gradient_checkpointing --eos_token '<|im_end|>' --eval_strategy no \
  --completion_only_loss True --report_to wandb --output_dir /path/to/the/finetuned/model
```
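The `--eos_token '<|im_end|>'` flag reflects the ChatML-style template that Qwen models use for chat turns. A minimal sketch of that format (the `to_chatml` helper is illustrative only; in practice the tokenizer's built-in chat template does this formatting):

```python
# Illustrative ChatML-style formatting: each turn is wrapped in
# <|im_start|>{role} ... <|im_end|> markers, which is why '<|im_end|>'
# is passed as the EOS token in the training command above.
def to_chatml(messages):
    return "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    )

print(to_chatml([{"role": "user", "content": "Write a hello-world in Python."}]))
```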

This is far from the most compute- or performance-efficient fine-tuning setup, but it can serve as a reasonable baseline.

The dataset was preprocessed into the conversational format expected by TRL:

```python
# trl/scripts/sft.py
from datasets import load_dataset

dataset = load_dataset(...)  # the evol-codealpaca-v1 train split

def preprocess_function(example):
    # Map the instruction/output columns to prompt/completion message lists.
    return {
        "prompt": [{"role": "user", "content": example["instruction"]}],
        "completion": [
            {"role": "assistant", "content": example["output"]}
        ],
    }

dataset = dataset.map(preprocess_function)
```
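The mapping can be sanity-checked on a toy row without downloading the dataset (the example row below is made up, not taken from evol-codealpaca-v1):

```python
# Same mapping as in the script above, applied to a made-up row.
def preprocess_function(example):
    return {
        "prompt": [{"role": "user", "content": example["instruction"]}],
        "completion": [{"role": "assistant", "content": example["output"]}],
    }

row = {
    "instruction": "Write a function that adds two numbers.",
    "output": "def add(a, b):\n    return a + b",
}
converted = preprocess_function(row)
print(converted["prompt"][0]["role"])      # user
print(converted["completion"][0]["role"])  # assistant
```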

## Evaluation

Evaluation was done with lm_eval on the HumanEval (instruct) benchmark:

```bash
python -m lm_eval --model vllm --model_args pretrained=${model},tensor_parallel_size=1,dtype=auto,gpu_memory_utilization=0.9,data_parallel_size=1 \
  --tasks humaneval_instruct --batch_size 1 --apply_chat_template=True --confirm_run_unsafe_code --trust_remote_code
```

### Results

| Model           | humaneval_instruct |
|-----------------|--------------------|
| Qwen3-1.7B      | 67.1               |
| Qwen3-1.7B-Code | 69.5               |

## License

Please refer to the licenses of the original model [Qwen/Qwen3-1.7B](https://huggingface.co/Qwen/Qwen3-1.7B) and the dataset [evol-codealpaca-v1](https://huggingface.co/datasets/theblackcat102/evol-codealpaca-v1).