---
license: apache-2.0
datasets:
  - TIGER-Lab/MATH-plus
language:
  - en
tags:
  - torchtune
  - minerva-math
library_name: transformers
pipeline_tag: text-generation
---

# jrc/phi3-mini-math

A Phi-3 Mini 4K Instruct model finetuned on math datasets.

## Uses

### How to Get Started with the Model

Use the code below to get started with the model.

```python
# Load the model and tokenizer directly from the Hub
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("jrc/phi3-mini-math", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("jrc/phi3-mini-math", trust_remote_code=True)
```
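To prompt the model, a chat-style prompt works best. As a minimal sketch, assuming the standard Phi-3 Mini special tokens (`<|user|>`, `<|assistant|>`, `<|end|>`) — in practice, prefer `tokenizer.apply_chat_template`, which applies the model's own template:

```python
# Hypothetical helper illustrating Phi-3-style chat formatting.
# The <|user|>/<|assistant|>/<|end|> markers are an assumption based on
# the base Phi-3 Mini chat template; tokenizer.apply_chat_template is
# the authoritative way to build prompts for this checkpoint.
def format_prompt(question: str) -> str:
    return f"<|user|>\n{question}<|end|>\n<|assistant|>\n"

prompt = format_prompt("Solve for x: 2x + 3 = 11.")
print(prompt)
```

The resulting string can then be tokenized and passed to `model.generate`.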

## Training Details

Phi-3 was finetuned using torchtune; the training script and config file are located in this repository.

CMD:

```bash
tune run lora_finetune_distributed.py --config mini_lora.yaml
```
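For orientation, a torchtune LoRA config for Phi-3 Mini typically looks like the sketch below. The keys shown are assumptions based on torchtune's stock recipe configs, not the actual contents of `mini_lora.yaml` — the file in this repository is authoritative:

```yaml
# Hypothetical sketch of a torchtune LoRA config for Phi-3 Mini;
# see mini_lora.yaml in this repo for the real settings.
model:
  _component_: torchtune.models.phi3.lora_phi3_mini
  lora_attn_modules: ['q_proj', 'v_proj']
  lora_rank: 8
  lora_alpha: 16

tokenizer:
  _component_: torchtune.models.phi3.phi3_mini_tokenizer

optimizer:
  _component_: torch.optim.AdamW
  lr: 3e-4

batch_size: 2
epochs: 1
```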

### Training Data

[More Information Needed]

### Training Procedure

## Evaluation

The finetuned model is evaluated on Minerva-MATH using the EleutherAI Eval Harness through torchtune.

CMD:

```bash
tune run eleuther_eval --config eleuther_evaluation \
          checkpoint.checkpoint_dir=./lora-phi3-math \
          tasks=["minerva_math"] \
          batch_size=32
```

RESULTS:

|               Tasks                |Version|Filter|n-shot|  Metric   |Value |   |Stderr|
|------------------------------------|-------|------|-----:|-----------|-----:|---|-----:|
|minerva_math                        |N/A    |none  |     4|exact_match|0.1670|±  |0.0051|
| - minerva_math_algebra             |      1|none  |     4|exact_match|0.2502|±  |0.0126|
| - minerva_math_counting_and_prob   |      1|none  |     4|exact_match|0.1329|±  |0.0156|
| - minerva_math_geometry            |      1|none  |     4|exact_match|0.1232|±  |0.0150|
| - minerva_math_intermediate_algebra|      1|none  |     4|exact_match|0.0576|±  |0.0078|
| - minerva_math_num_theory          |      1|none  |     4|exact_match|0.1148|±  |0.0137|
| - minerva_math_prealgebra          |      1|none  |     4|exact_match|0.3077|±  |0.0156|
| - minerva_math_precalc             |      1|none  |     4|exact_match|0.0623|±  |0.0104|
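The aggregate `minerva_math` score is the micro-average of the per-subject scores. Assuming the standard MATH test-split sizes (5,000 problems total — an assumption; the harness evaluates the full test split), it can be reproduced from the table:

```python
# Reproduce the aggregate exact_match as a size-weighted (micro) average
# of the per-subject scores from the results table. Subtask sizes are
# the standard MATH test-split counts (an assumption).
sizes = {
    "algebra": 1187,
    "counting_and_prob": 474,
    "geometry": 479,
    "intermediate_algebra": 903,
    "num_theory": 540,
    "prealgebra": 871,
    "precalc": 546,
}
scores = {
    "algebra": 0.2502,
    "counting_and_prob": 0.1329,
    "geometry": 0.1232,
    "intermediate_algebra": 0.0576,
    "num_theory": 0.1148,
    "prealgebra": 0.3077,
    "precalc": 0.0623,
}
total = sum(sizes.values())  # 5000 problems
overall = sum(scores[k] * sizes[k] for k in sizes) / total
print(f"{overall:.4f}")  # close to the reported 0.1670
```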

## Technical Specifications [optional]

### Hardware

- 4 x NVIDIA A100 GPUs
- Max VRAM used per GPU: 29 GB

## Citation [optional]

BibTeX:

[More Information Needed]

APA:

[More Information Needed]

## Glossary [optional]

[More Information Needed]

## More Information [optional]

[More Information Needed]

## Model Card Contact

[More Information Needed]