---
license: apache-2.0
tags:
- minimind
- science
- chemistry
- biology
- kanna
- sceletium-tortuosum
---

# MiniMind-Science

This repository contains **MiniMind** models (Small and MoE versions) trained on a curated mix of scientific datasets.

## Models

* **`full_sft_science_512.pth`**: MiniMind-Small (26M params, dim=512). **Recommended**.
  * Pretrained on: Biology, Botany, and Kanna (*Sceletium tortuosum*) texts.
  * Fine-tuned on: Chemistry QA and PubMed Summarization.
* **`full_sft_science_moe_640_moe.pth`**: MiniMind-MoE (145M params, dim=640, 8 layers). Mixture-of-Experts version.

## Training Data

* **Sceletium tortuosum (Kanna)**: Custom dataset (`SAINTHALF/kanna_chunks_v2`).
* **Biology/Botany**: Text corpus from `rag-datasets/rag-mini-bioasq`.
* **Chemistry**: Conversational QA from `camel-ai/chemistry`.
* **Medical**: Summarization data from `ccdv/pubmed-summarization`.

## Usage

These files are native PyTorch weights (`.pth`). They are not standalone: load them into a model built with the [MiniMind](https://github.com/jingyaogong/minimind) code.

```python
# Example loading (requires the MiniMind code to construct `model` first)
import torch

model.load_state_dict(torch.load('full_sft_science_512.pth', map_location='cpu'))
```
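
Before wiring a checkpoint into the MiniMind code, it can help to inspect it on its own. The following is a minimal sketch, assuming the `.pth` file holds a flat state dict of tensors (as the `load_state_dict` call above implies). The helper names `load_checkpoint` and `summarize_state_dict` are illustrative and not part of MiniMind, and `weights_only=True` assumes a reasonably recent PyTorch release.

```python
from collections import OrderedDict


def load_checkpoint(path):
    """Load a .pth checkpoint onto the CPU and return its state dict."""
    import torch  # imported here so the summary helper below has no hard dependency

    # weights_only=True limits unpickling to tensors and plain containers.
    return torch.load(path, map_location="cpu", weights_only=True)


def summarize_state_dict(state_dict):
    """Return (total_params, per_prefix_counts) for a name -> tensor mapping.

    Parameters are grouped by the first dotted component of their name,
    whatever those names happen to be in the checkpoint.
    """
    total = 0
    by_prefix = OrderedDict()
    for name, tensor in state_dict.items():
        n = tensor.numel()
        total += n
        prefix = name.split(".", 1)[0]
        by_prefix[prefix] = by_prefix.get(prefix, 0) + n
    return total, by_prefix


# Usage (assumes the checkpoint file is in the working directory):
#   total, by_prefix = summarize_state_dict(load_checkpoint("full_sft_science_512.pth"))
#   print(f"{total / 1e6:.1f}M parameters")
```

Comparing the printed total with the parameter counts listed under Models is a quick sanity check that the expected file was downloaded. The exact figure may differ slightly depending on how shared or buffer tensors are stored in the checkpoint.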