---
library_name: peft
---

# Model Card for Mistral-sci-phi
|
|
This model is a fine-tuned version of Mistral-7B, trained with the PEFT library using INT4 quantization to keep memory use and model size low.
|
|
## Model Details


### Model Description


Mistral-sci-phi is fine-tuned from the Mistral-7B base model on the "emrgnt-cmplxty/sciphi-textbooks-are-all-you-need" dataset from the Hugging Face Hub. Parameter-efficient fine-tuning on quantized base weights keeps the memory footprint low, making the model practical to run for a range of NLP tasks.
|
|
- **Developed by:** Arturo de Pablo
- **Trained by:** IZX, Hyper88
- **Model type:** Causal Language Model
- **Language(s) (NLP):** English
- **License:** [More Information Needed]
- **Finetuned from model:** mistralai/Mistral-7B-v0.1
|
|
### Model Sources


- **Repository:** [hyper88/mistral-sci-phi-7B](https://huggingface.co/hyper88/mistral-sci-phi-7B)
|
|
|
|
## Uses


### Direct Use


The model can be used directly for text generation and related NLP tasks, for example as sketched below.
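
A minimal sketch of direct use. It assumes a recent `transformers` release with `peft` installed, which can resolve a PEFT adapter repository directly and pull in the base model it was trained on:

```python
from transformers import pipeline

# Assumes hyper88/mistral-sci-phi-7B hosts PEFT adapter weights; recent
# transformers versions (with peft installed) fetch the base model automatically.
generator = pipeline(
    "text-generation",
    model="hyper88/mistral-sci-phi-7B",
    device_map="auto",
)
print(generator("The water cycle begins when", max_new_tokens=60)[0]["generated_text"])
```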
|
|
### Downstream Use


The model can also be integrated as a component of larger systems, for example as the generation backend of a chat assistant or a retrieval-augmented pipeline.
|
|
### Out-of-Scope Use


The model was trained on English, textbook-style scientific text; it should not be relied on for other languages, for domains far from its training data, or for high-stakes applications without additional validation.
|
|
## Bias, Risks, and Limitations


The model inherits the biases and limitations of the base Mistral-7B model. Users should take these into account when deploying it.
|
|
### Recommendations


Users should evaluate the model's performance and biases in their specific use case and make adjustments as necessary.
|
|
## How to Get Started with the Model


The model can be loaded for inference with the Hugging Face Transformers and PEFT libraries, as in the sketch below.
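
A minimal loading sketch, assuming the adapter weights at hyper88/mistral-sci-phi-7B were trained on top of mistralai/Mistral-7B-v0.1:

```python
# Sketch: load the base model, then attach the fine-tuned PEFT adapter.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "mistralai/Mistral-7B-v0.1"
adapter_id = "hyper88/mistral-sci-phi-7B"  # assumed adapter location

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, adapter_id)

inputs = tokenizer("Photosynthesis converts", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```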
|
|
## Training Details


### Training Data


The model was trained on the "emrgnt-cmplxty/sciphi-textbooks-are-all-you-need" dataset available on the Hugging Face Hub.
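
The dataset can be pulled from the Hub with the `datasets` library (the `train` split name below is an assumption):

```python
# Sketch: load the training corpus from the Hugging Face Hub.
from datasets import load_dataset

dataset = load_dataset("emrgnt-cmplxty/sciphi-textbooks-are-all-you-need", split="train")
print(dataset[0])  # inspect one record
```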
|
|
### Training Procedure


The model was fine-tuned with the base weights quantized to INT4, which reduces GPU memory use during training while the PEFT adapter weights are updated at higher precision.
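
For illustration, a QLoRA-style setup consistent with this description; the specific quantization and LoRA settings below are assumptions, not the documented configuration:

```python
# Illustrative QLoRA-style setup: 4-bit base model plus LoRA adapters.
# All specific values here are assumptions, not the documented configuration.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    quantization_config=bnb_config,
    device_map="auto",
)
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```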
|
|
#### Training Hyperparameters


- Learning rate: 2e-4
- Batch size: 12
- Epochs: 3
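
Expressed as a `transformers` `TrainingArguments` sketch, with the documented values filled in and everything else left as an assumption:

```python
# Trainer configuration matching the documented hyperparameters; everything
# except the learning rate, batch size, and epoch count is an assumption.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mistral-sci-phi",
    learning_rate=2e-4,
    per_device_train_batch_size=12,
    num_train_epochs=3,
)
```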
|
|
## Evaluation


### Testing Data, Factors & Metrics


[More Information Needed]


### Results


[More Information Needed]
|
|
## Environmental Impact


Parameter-efficient fine-tuning of a quantized base model requires far less compute than full fine-tuning, which keeps the training footprint small. Detailed emissions figures are not available.
|
|
## Technical Specifications


### Model Architecture and Objective


The model uses the Mistral-7B decoder-only transformer architecture and is fine-tuned with a causal language modeling objective.
|
|
### Compute Infrastructure


Sponsored by izx.ai


#### Software


- PEFT 0.6.0.dev0
|
|
## More Information


For more details, visit the [model repository](https://huggingface.co/hyper88/mistral-sci-phi-7B).
|
|
## Model Card Authors


- [Arturo de Pablo](https://www.linkedin.com/in/arde88/)
|
|
## Model Card Contact


[Discord](https://discord.gg/KGCeKP4ng9)
|
|