---
license: apache-2.0
tags:
- minimind
- science
- chemistry
- biology
- kanna
- sceletium-tortuosum
---

# MiniMind-Science

This repository contains **MiniMind** models (Small and MoE versions) trained on a curated mix of scientific datasets.

## Models
*   **`full_sft_science_512.pth`**: MiniMind-Small (26M params, dim=512). **Recommended**. 
    *   Pretrained on: Biology, Botany, and Kanna (Sceletium tortuosum) texts.
    *   Fine-tuned on: Chemistry QA and PubMed Summarization.
*   **`full_sft_science_moe_640_moe.pth`**: MiniMind-MoE (145M params, dim=640, 8 layers). Mixture-of-Experts version.
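A model's parameter count follows directly from the shapes of the tensors in its checkpoint. A minimal, framework-free sketch of that arithmetic (the shapes below are hypothetical placeholders, not the actual MiniMind layout):

```python
from math import prod

def count_params(shapes):
    """Total number of scalar parameters across a mapping of tensor shapes."""
    return sum(prod(shape) for shape in shapes.values())

# Hypothetical shapes for illustration only; the real keys and shapes
# come from the MiniMind architecture's state_dict.
shapes = {
    "tok_embedding.weight": (6400, 512),      # vocab_size x dim
    "layers.0.attn.wq.weight": (512, 512),    # dim x dim
}
print(count_params(shapes))  # 3538944
```

Applying the same sum over a real checkpoint's `state_dict` (with tensor `.numel()` in place of `prod(shape)`) is a quick sanity check against the 26M / 145M figures above.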

## Training Data
*   **Sceletium tortuosum (Kanna)**: Custom dataset (`SAINTHALF/kanna_chunks_v2`).
*   **Biology/Botany**: Text corpus from `rag-datasets/rag-mini-bioasq`.
*   **Chemistry**: Conversational QA from `camel-ai/chemistry`.
*   **Medical**: Summarization data from `ccdv/pubmed-summarization`.

## Usage
These models are native PyTorch weights compatible with the [MiniMind](https://github.com/jingyaogong/minimind) architecture.

```python
# Example loading (requires the MiniMind model code; `model` must be an
# instantiated MiniMind-Small matching this checkpoint's config, dim=512)
import torch

state_dict = torch.load('full_sft_science_512.pth', map_location='cpu')
model.load_state_dict(state_dict)
model.eval()
```