fc91/phi3-mini-instruct-full_ethics-lora


fc91 committed on Jun 28, 2024

Commit 9f76d75 (verified)

1 Parent(s): 99dc78a

Update README.md


Files changed (1)

README.md +38 -17


@@ -1,19 +1,22 @@
 ---
 library_name: transformers
-tags: []
 ---
 # Model Card for Model ID
 <!-- Provide a quick summary of what the model is/does. -->
 ## Model Details
 ### Model Description
-<!-- Provide a longer summary of what this model is. -->
 This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
@@ -27,7 +30,7 @@ This is the model card of a 🤗 transformers model that has been pushed on the
 ### Model Sources [optional]
-<!-- Provide the basic links for the model. -->
 - **Repository:** [More Information Needed]
 - **Paper [optional]:** [More Information Needed]
@@ -35,43 +38,46 @@ This is the model card of a 🤗 transformers model that has been pushed on the
 ## Uses
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
 ### Direct Use
-<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
 [More Information Needed]
 ### Downstream Use [optional]
-<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
 [More Information Needed]
 ### Out-of-Scope Use
-<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
 [More Information Needed]
 ## Bias, Risks, and Limitations
-<!-- This section is meant to convey both technical and sociotechnical limitations. -->
 [More Information Needed]
 ### Recommendations
-<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
 ## How to Get Started with the Model
 Use the code below to get started with the model.
-[More Information Needed]
 ## Training Details
@@ -79,7 +85,7 @@ Use the code below to get started with the model.
 <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-[More Information Needed]
 ### Training Procedure
@@ -92,25 +98,40 @@ Use the code below to get started with the model.
 #### Training Hyperparameters
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
 #### Speeds, Sizes, Times [optional]
 <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-[More Information Needed]
 ## Evaluation
 <!-- This section describes the evaluation protocols and provides the results. -->
 ### Testing Data, Factors & Metrics
 #### Testing Data
 <!-- This should link to a Dataset Card if possible. -->
-[More Information Needed]
 #### Factors
@@ -162,7 +183,7 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]
 #### Hardware
-[More Information Needed]
 #### Software

 ---
 library_name: transformers
+license: cc-by-4.0
+datasets:
+- hendrycks/ethics
 ---
 # Model Card for Model ID
 <!-- Provide a quick summary of what the model is/does. -->
+Fine-tuned version of Phi-3-mini-4k-instruct on a subset of the hendrycks/ethics dataset.
+<!--
 ## Model Details
 ### Model Description
+<!-- Provide a longer summary of what this model is.
 This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
 ### Model Sources [optional]
+<!-- Provide the basic links for the model.
 - **Repository:** [More Information Needed]
 - **Paper [optional]:** [More Information Needed]
 ## Uses
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model.
 ### Direct Use
+<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app.
 [More Information Needed]
 ### Downstream Use [optional]
+<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app
 [More Information Needed]
 ### Out-of-Scope Use
+<!-- This section addresses misuse, malicious use, and uses that the model will not work well for.
 [More Information Needed]
 ## Bias, Risks, and Limitations
+<!-- This section is meant to convey both technical and sociotechnical limitations.
 [More Information Needed]
 ### Recommendations
+<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations.
+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations. -->
 ## How to Get Started with the Model
 Use the code below to get started with the model.
+```python
+from transformers import AutoModel
+model = AutoModel.from_pretrained("fc91/phi3-mini-instruct-full_ethics-lora")
+```
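The snippet above only instantiates the model. A slightly fuller sketch, assuming the checkpoint loads through the same `AutoModel` call and that inference on a single device is intended, could wrap the call in a helper that also switches to evaluation mode. The helper name `load_model` and the `device` parameter are illustrative and not part of the model card; if the repository holds only LoRA adapter weights, loading may additionally require the `peft` package.

```python
# Only the repository id comes from the model card; everything else is illustrative.
MODEL_ID = "fc91/phi3-mini-instruct-full_ethics-lora"


def load_model(model_id: str = MODEL_ID, device: str = "cpu"):
    """Load the checkpoint with the card's AutoModel call and put it in eval mode."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModel

    model = AutoModel.from_pretrained(model_id)
    model.to(device)  # move the weights to the requested device
    model.eval()      # disable dropout for inference
    return model


if __name__ == "__main__":
    # Downloads the weights from the Hub on first use.
    model = load_model()
    print(type(model).__name__)
```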
 ## Training Details
 <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+["hendrycks/ethics"](https://huggingface.co/datasets/hendrycks/ethics)
 ### Training Procedure
 #### Training Hyperparameters
+```python
+per_device_train_batch_size=16
+per_device_eval_batch_size=32
+gradient_accumulation_steps=2
+gradient_checkpointing=True
+warmup_steps=100
+num_train_epochs=1
+learning_rate=0.00005
+weight_decay=0.01
+optim="adamw_hf"
+fp16=True
+```
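The listed values look like keyword arguments of `transformers.TrainingArguments`, although the card does not say so explicitly. As a minimal sketch under that assumption, they can be collected in a dictionary and used to work out how many samples feed each optimizer step. The names `TRAINING_KWARGS` and `effective_batch_size` are hypothetical, and the single-device default is an assumption, since the card names the GPU model but not the number of GPUs.

```python
# Values copied from the model card; the dict and helper names are hypothetical.
TRAINING_KWARGS = {
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 32,
    "gradient_accumulation_steps": 2,
    "gradient_checkpointing": True,
    "warmup_steps": 100,
    "num_train_epochs": 1,
    "learning_rate": 5e-5,
    "weight_decay": 0.01,
    "optim": "adamw_hf",
    "fp16": True,
}


def effective_batch_size(kwargs: dict, num_devices: int = 1) -> int:
    """Number of samples that contribute to one optimizer step."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices  # assumption: the card does not state the GPU count
    )


if __name__ == "__main__":
    print(effective_batch_size(TRAINING_KWARGS))
    # Assumed usage, not confirmed by the card ("out" is a placeholder path):
    # from transformers import TrainingArguments
    # args = TrainingArguments(output_dir="out", **TRAINING_KWARGS)
```

With a per-device batch of 16 and 2 accumulation steps, each optimizer step on one device sees 16 × 2 = 32 samples.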
 #### Speeds, Sizes, Times [optional]
 <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+The overall training took 1 hour 23 minutes.
 ## Evaluation
 <!-- This section describes the evaluation protocols and provides the results. -->
+Training Loss = 0.181700
+Validation Loss = 0.119734
 ### Testing Data, Factors & Metrics
 #### Testing Data
 <!-- This should link to a Dataset Card if possible. -->
+["hendrycks/ethics"](https://huggingface.co/datasets/hendrycks/ethics)
 #### Factors
 #### Hardware
+NVIDIA A100-SXM4-40GB
 #### Software