# Mistral-7B-Instruct-v0.3-ORPO
This model is a fine-tuned version of mistralai/Mistral-7B-Instruct-v0.3 on the got_qa_pairs dataset. Special tokens for characters and places were added to the Mistral tokenizer so those names tokenize more cleanly.
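As a hedged illustration of that tokenizer change, the sketch below adds special tokens and resizes the embedding matrix. The token strings are hypothetical, since the card does not list the actual ones.

```python
# Minimal sketch of extending the Mistral tokenizer with special tokens.
# The token strings below are hypothetical; the card does not list the real ones.
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "mistralai/Mistral-7B-Instruct-v0.3"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# One token per character/place name so each is tokenized as a single unit.
new_tokens = ["<Jon_Snow>", "<Daenerys>", "<Winterfell>", "<Kings_Landing>"]
num_added = tokenizer.add_tokens(new_tokens)

# Grow the embedding matrix to cover the new vocabulary entries.
model.resize_token_embeddings(len(tokenizer))
print(f"added {num_added} tokens, vocab size is now {len(tokenizer)}")
```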
## Model description
More information needed
## Intended uses & limitations
Intended for question answering. The inference script is best run on a GPU; it includes retrievers that supply context for more grounded answers.
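Below is a minimal GPU inference sketch, assuming the checkpoint is a PEFT adapter on the resized base model (PEFT appears under framework versions) and that the adapter repo ships the extended tokenizer. The retrieved context is a placeholder; the retriever itself is not shown.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "mistralai/Mistral-7B-Instruct-v0.3"
adapter = "hash-map/got_fine_tuned"

# Assumed: the adapter repo includes the tokenizer with the added tokens.
tokenizer = AutoTokenizer.from_pretrained(adapter)
model = AutoModelForCausalLM.from_pretrained(
    base, torch_dtype=torch.bfloat16, device_map="cuda"
)
model.resize_token_embeddings(len(tokenizer))  # match the extended vocabulary
model = PeftModel.from_pretrained(model, adapter)

context = "..."  # passage supplied by your retriever
question = "Who holds Winterfell?"
messages = [{"role": "user", "content": f"Context: {context}\n\nQuestion: {question}"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```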
## Training and evaluation data
The dataset is uploaded at hash-map/got_qa_pairs. It was randomly shuffled and split into 95% train and 5% validation.
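A sketch of reproducing that split with the `datasets` library; reusing the training seed (42) for the split is an assumption.

```python
from datasets import load_dataset

ds = load_dataset("hash-map/got_qa_pairs", split="train")
split = ds.train_test_split(test_size=0.05, shuffle=True, seed=42)  # 95% / 5%
train_ds, val_ds = split["train"], split["test"]
print(len(train_ds), len(val_ds))
```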
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a training-setup sketch follows the list):
- learning_rate: 5e-06
- train_batch_size: 3
- eval_batch_size: 3
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 12
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 0.1
- num_epochs: 1.0
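A hedged reconstruction of this setup with TRL's `ORPOTrainer`, reusing `model`, `tokenizer`, `train_ds`, and `val_ds` from the sketches above. The LoRA values are assumptions, ORPO expects prompt/chosen/rejected columns in the dataset, and the card's "lr_scheduler_warmup_steps: 0.1" is interpreted here as `warmup_ratio=0.1`.

```python
from peft import LoraConfig
from trl import ORPOConfig, ORPOTrainer

# `model`, `tokenizer`, `train_ds`, `val_ds` come from the sketches above.
args = ORPOConfig(
    output_dir="mistral-7b-instruct-v0.3-orpo",
    learning_rate=5e-6,
    per_device_train_batch_size=3,   # x 4 accumulation steps = total batch 12
    per_device_eval_batch_size=3,
    gradient_accumulation_steps=4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,                # the card's "warmup_steps: 0.1" read as a ratio
    num_train_epochs=1.0,
    seed=42,                         # Adam betas/epsilon match the defaults listed above
)
peft_config = LoraConfig(task_type="CAUSAL_LM", r=16, lora_alpha=32)  # assumed values

trainer = ORPOTrainer(
    model=model,
    args=args,
    train_dataset=train_ds,  # ORPO needs prompt/chosen/rejected columns
    eval_dataset=val_ds,
    tokenizer=tokenizer,
    peft_config=peft_config,
)
trainer.train()
```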
### Framework versions
- PEFT 0.13.0
- Transformers 4.45.1
## Metrics
- BERTScore: 0.4
- BLEU: 0.59999
- ROUGE: 0.43
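A sketch of how these scores might be computed with the `evaluate` library. The example strings are placeholders, and which ROUGE variant the card reports is not specified.

```python
import evaluate

predictions = ["Jon Snow holds Winterfell."]      # model answers (placeholders)
references = ["Winterfell is held by Jon Snow."]  # gold answers (placeholders)

bleu = evaluate.load("bleu").compute(predictions=predictions, references=references)
rouge = evaluate.load("rouge").compute(predictions=predictions, references=references)
bert = evaluate.load("bertscore").compute(
    predictions=predictions, references=references, lang="en"
)
print(bleu["bleu"], rouge["rougeL"], sum(bert["f1"]) / len(bert["f1"]))
```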