---
license: cc-by-nc-4.0
tags:
- moe
- frankenmoe
- merge
- mergekit
- lazymergekit
base_model:
- mlabonne/NeuralDaredevil-7B
- BioMistral/BioMistral-7B
- mistralai/Mathstral-7B-v0.1
- FPHam/Writing_Partner_Mistral_7B
library_name: transformers
pipeline_tag: text-generation
---

# EduMixtral-4x7B

<img src="https://cdn-uploads.huggingface.co/production/uploads/65ba68a15d2ef0a4b2c892b4/1hvgYltQRmbkzHMSXvGYh.jpeg" width=400>

EduMixtral-4x7B is an experimental model that combines several education-focused language models, intended for downstream research on human/AI student/teacher applications. Its experts are meant to cover general knowledge, medicine, math, and writing assistance.

## 🤏 Models Merged

EduMixtral-4x7B is a Mixture of Experts (MoE) model built from the following models using [Mergekit](https://github.com/arcee-ai/mergekit):
* [mlabonne/NeuralDaredevil-7B](https://huggingface.co/mlabonne/NeuralDaredevil-7B) <- Base Model
* [BioMistral/BioMistral-7B](https://huggingface.co/BioMistral/BioMistral-7B)
* [mistralai/Mathstral-7B-v0.1](https://huggingface.co/mistralai/Mathstral-7B-v0.1)
* [FPHam/Writing_Partner_Mistral_7B](https://huggingface.co/FPHam/Writing_Partner_Mistral_7B)

## 🧩 Configuration

```yaml
base_model: mlabonne/NeuralDaredevil-7B
gate_mode: hidden
experts:
  - source_model: mlabonne/NeuralDaredevil-7B
    positive_prompts:
      - "hello"
      - "help"
      - "question"
      - "explain"
      - "information"
  - source_model: BioMistral/BioMistral-7B
    positive_prompts:
      - "medical"
      - "health"
      - "biomedical"
      - "clinical"
      - "anatomy"
  - source_model: mistralai/Mathstral-7B-v0.1
    positive_prompts:
      - "math"
      - "calculation"
      - "equation"
      - "geometry"
      - "algebra"
  - source_model: FPHam/Writing_Partner_Mistral_7B
    positive_prompts:
      - "writing"
      - "creative process"
      - "story structure"
      - "character development"
      - "plot"
```

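With mergekit installed, a configuration like this is typically built into a merged checkpoint with mergekit's MoE tooling; `gate_mode: hidden` initializes each expert's router weights from hidden-state representations of its positive prompts. A minimal build sketch, assuming the YAML above is saved as `config.yaml` and using `EduMixtral-4x7B` as an illustrative output directory:

```python
# Build sketch (assumptions: `pip install mergekit`, the YAML above saved as config.yaml,
# and "EduMixtral-4x7B" used as an illustrative output directory)
import subprocess

subprocess.run(
    ["mergekit-moe", "config.yaml", "EduMixtral-4x7B"],  # mergekit's MoE merge CLI
    check=True,
)
```
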
## 💻 Usage

It is recommended to load this model in 8-bit or 4-bit quantization; the example below uses 8-bit, and a 4-bit variant is sketched after the sample output.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

# Load the tokenizer and the model in 8-bit
tokenizer = AutoTokenizer.from_pretrained("AdamLucek/EduMixtral-4x7B")
model = AutoModelForCausalLM.from_pretrained(
    "AdamLucek/EduMixtral-4x7B",
    device_map="cuda",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True)
)

# Prepare the input text
input_text = "Math problem: Xiaoli reads a 240-page story book. She reads (1/8) of the whole book on the first day and (1/5) of the whole book on the second day. How many pages did she read in total in two days?"
inputs = tokenizer(input_text, return_tensors="pt").to("cuda")

# Generate the output with specified parameters
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    num_return_sequences=1
)

# Decode and print the generated text
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

**Output:**

> Solution:
> To find the total number of pages Xiaoli read in two days, we need to add the number of pages she read on the first day and the second day.
> On the first day, Xiaoli read 1/8 of the book. Since the book has 240 pages, the number of pages she read on the first day is:
> \[ \frac{1}{8} \times 240 = 30 \text{ pages} \]
> On the second day, Xiaoli read 1/5 of the book. The number of pages she read on the second day is:
> \[ \frac{1}{5} \times 240 = 48 \text{ pages} \]
> To find the total number of pages she read in two days, we add the pages she read on the first day and the second day:
> \[ 30 \text{ pages} + 48 \text{ pages} = 78 \text{ pages} \]
> Therefore, Xiaoli read a total of 78 pages in two days.
> Final answer: Xiaoli read 78 pages in total

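
For tighter memory budgets, the same model can instead be loaded in 4-bit. A minimal sketch; the NF4 settings below are common bitsandbytes defaults rather than values tuned for EduMixtral-4x7B:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 quantization with bfloat16 compute (illustrative settings)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained("AdamLucek/EduMixtral-4x7B")
model = AutoModelForCausalLM.from_pretrained(
    "AdamLucek/EduMixtral-4x7B",
    device_map="auto",
    quantization_config=bnb_config,
)
```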