---
license: apache-2.0
base_model: Qwen/Qwen3-0.6B-Base
tags:
- dpo
- fdpo
- math
- code
- qwen3
- reasoning
datasets:
- albertfares/MNLP_M3_dpo_dataset
language:
- en
pipeline_tag: text-generation
---
# MNLP M3 fDPO Model (69k samples)
This model is a fine-tuned version of [Qwen/Qwen3-0.6B-Base](https://huggingface.co/Qwen/Qwen3-0.6B-Base) using **filtered Direct Preference Optimization (fDPO)** on the [MNLP M3 DPO dataset](https://huggingface.co/datasets/albertfares/MNLP_M3_dpo_dataset).
## Model Details
- **Base Model**: Qwen/Qwen3-0.6B-Base
- **Training Method**: fDPO (filtered Direct Preference Optimization)
- **Dataset**: MNLP M3 mixed dataset (~69k samples)
- **Format**: SafeTensors (secure format)
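For readers unfamiliar with the objective, the standard per-pair DPO loss can be sketched as below. This is only an illustration of plain DPO: the card does not describe the filtering step that distinguishes fDPO, nor the hyperparameters used, so the function name `dpo_loss` and the default `beta=0.1` are assumptions rather than the training configuration of this model.

```python
import math


def dpo_loss(policy_chosen_logp: float,
             policy_rejected_logp: float,
             ref_chosen_logp: float,
             ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one preference pair.

    Inputs are summed log-probabilities of the chosen and rejected responses
    under the policy being trained and under the frozen reference model.
    `beta` is a hypothetical default, not a value taken from this model card.
    """
    # How much more the policy prefers "chosen" over "rejected" than the reference does.
    margin = (policy_chosen_logp - ref_chosen_logp) - (policy_rejected_logp - ref_rejected_logp)
    x = beta * margin
    # -log(sigmoid(x)), written in a numerically stable form.
    if x >= 0:
        return math.log1p(math.exp(-x))
    return -x + math.log1p(math.exp(x))
```

The loss equals `log(2)` when the policy and reference agree, and falls as the policy assigns relatively more probability to the chosen response.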
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned weights and the matching tokenizer from the Hugging Face Hub.
model = AutoModelForCausalLM.from_pretrained("albertfares/MNLP_M3_dpo_model_69k")
tokenizer = AutoTokenizer.from_pretrained("albertfares/MNLP_M3_dpo_model_69k")
```
The weights are stored in the SafeTensors format, which does not execute pickled code on load and typically loads faster than pickle-based checkpoints.
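To go from loading the model to producing text, a minimal generation sketch might look like the following. The helper name `generate`, the plain-text prompt style, greedy decoding, and the `max_new_tokens` default are all assumptions for illustration; the card does not specify a prompt format or recommended decoding settings.

```python
MODEL_ID = "albertfares/MNLP_M3_dpo_model_69k"  # repo id from the Usage section


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Load the model and return its continuation of `prompt` (hypothetical helper)."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        # Greedy decoding is an assumption; sampling settings are not given in the card.
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)

    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

For example, `generate("Question: What is 12 * 7?\nAnswer:")` downloads the weights on first use and returns the model's continuation as a string.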