---
library_name: peft
---
# Model Card for Mistral-sci-phi
This model is a parameter-efficient fine-tune of Mistral-7B, trained with the PEFT library under INT4 (4-bit) quantization to keep memory and compute requirements low.
## Model Details
### Model Description
Mistral-sci-phi is fine-tuned from the Mistral-7B base model using parameter-efficient fine-tuning (PEFT) with INT4 quantization, which keeps the adapter small and the memory footprint low. It was trained on the "emrgnt-cmplxty/sciphi-textbooks-are-all-you-need" dataset from the Hugging Face Hub, a textbook-style corpus that orients the model toward scientific and educational text generation.
- **Developed by:** Arturo de Pablo
- **Trained by:** IZX, Hyper88
- **Model type:** Causal Language Model
- **Language(s) (NLP):** English
- **License:** [More Information Needed]
- **Finetuned from model:** mistralai/Mistral-7B-v0.1
### Model Sources
- **Repository:** [hyper88/mistral-sci-phi-7B](https://huggingface.co/hyper88/mistral-sci-phi-7B)
## Uses
### Direct Use
The model can be used directly for text generation and related natural-language tasks.
### Downstream Use
It can also be integrated as the generation component of larger systems, such as retrieval-augmented or assistant pipelines.
### Out-of-Scope Use
The model should not be used for tasks outside its training scope, such as languages other than English, and should not be relied on for factual accuracy or safety-critical decisions without human review.
## Bias, Risks, and Limitations
The model inherits the biases and limitations of the base Mistral-7B model. Users should be cautious of these when using the model.
### Recommendations
Users should evaluate the model's performance and biases in their specific use case and make adjustments as necessary.
## How to Get Started with the Model
The model can be loaded for inference with the Hugging Face Transformers and PEFT libraries; a minimal example follows.
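The sketch below assumes the published weights are a PEFT adapter hosted at hyper88/mistral-sci-phi-7B; if the repository instead contains fully merged weights, loading it directly with `AutoModelForCausalLM.from_pretrained` is enough.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# Load the base model in 4-bit, mirroring the INT4 setup described above.
base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
    device_map="auto",
)

# Apply the fine-tuned PEFT adapter on top of the frozen base weights.
model = PeftModel.from_pretrained(base, "hyper88/mistral-sci-phi-7B")
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")

prompt = "Explain the second law of thermodynamics in simple terms."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```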
## Training Details
### Training Data
The model was trained on the "emrgnt-cmplxty/sciphi-textbooks-are-all-you-need" dataset available on the Hugging Face Hub.
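The corpus can be inspected with the `datasets` library; note that the `train` split name below is an assumption based on the usual Hub convention, so check the dataset card if it differs.

```python
from datasets import load_dataset

# Stream the corpus to avoid downloading it in full; the "train" split
# name is an assumption, not confirmed by this card.
dataset = load_dataset(
    "emrgnt-cmplxty/sciphi-textbooks-are-all-you-need",
    split="train",
    streaming=True,
)
print(next(iter(dataset)))
```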
### Training Procedure
The model was fine-tuned with PEFT adapters while the base weights were loaded in INT4 (4-bit) quantization, which reduces memory use during training; a sketch of this setup follows the hyperparameter list below.
#### Training Hyperparameters
- Learning rate: 2e-4
- Batch size: 12
- Epochs: 3
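The full training script was not released; the following is a minimal sketch of a PEFT fine-tune under 4-bit quantization using the hyperparameters above. The LoRA rank, alpha, and target modules are illustrative assumptions, not the published configuration.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Load the base model in 4-bit (NF4 shown here; the exact quantization
# settings used for this model are an assumption).
base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
    device_map="auto",
)
base = prepare_model_for_kbit_training(base)

# Attach LoRA adapters; rank, alpha, and targets are illustrative defaults
# for Mistral-style attention projections, not the published values.
model = get_peft_model(base, LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
))

# Hyperparameters reported in this card.
training_args = TrainingArguments(
    output_dir="mistral-sci-phi-7B",
    learning_rate=2e-4,
    per_device_train_batch_size=12,
    num_train_epochs=3,
)
```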
## Evaluation
### Testing Data, Factors & Metrics
[More Information Needed]
### Results
[More Information Needed]
## Environmental Impact
No carbon-emission estimate is available. Parameter-efficient fine-tuning under 4-bit quantization requires substantially less compute than full-parameter, full-precision fine-tuning, which limits the training footprint.
## Technical Specifications
### Model Architecture and Objective
The model uses the Mistral-7B decoder-only transformer architecture with a standard causal language-modeling objective; fine-tuning adds PEFT adapter weights on top of the 4-bit-quantized base.
### Compute Infrastructure
Training compute was sponsored by izx.ai.
#### Software
- PEFT 0.6.0.dev0
## More Information
For more details, visit the [model repository](https://huggingface.co/hyper88/mistral-sci-phi-7B).
## Model Card Authors
- [Arturo de Pablo](https://www.linkedin.com/in/arde88/)
## Model Card Contact
For questions, join the [Discord server](https://discord.gg/KGCeKP4ng9).