---
license: apache-2.0
language:
- en
metrics:
- accuracy
base_model:
- khazarai/BioGenesis-ToT
pipeline_tag: text-generation
tags:
- biology
- medical
- science
- unsloth
- sft
---
# Model Card for BioGenesis-ToT

- **Overall Success Rate**:
  - khazarai/BioGenesis-ToT: **51.45**
  - Qwen/Qwen3-1.7B: **46.82**
- **Benchmark**: [emre/TARA_Turkish_LLM_Benchmark](https://huggingface.co/datasets/emre/TARA_Turkish_LLM_Benchmark)

This repository provides the GGUF version of [khazarai/BioGenesis-ToT](https://huggingface.co/khazarai/BioGenesis-ToT).
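Since this repository hosts the GGUF weights, local inference can be sketched with llama-cpp-python. This is a minimal sketch, not the model's documented usage: the `.gguf` filename glob, the context size, and the prompt wording are all assumptions, so pick a specific quantization file in practice.

```python
# Minimal GGUF inference sketch via llama-cpp-python (filename glob is assumed).

def build_prompt(question: str) -> str:
    # Frame the question to elicit the step-by-step mechanistic
    # reasoning style the model was fine-tuned for.
    return f"Question: {question}\nExplain the underlying mechanism step by step."

if __name__ == "__main__":
    # Lazy import so the prompt helper stays usable without the native wheel.
    from llama_cpp import Llama

    # from_pretrained downloads the matching .gguf file from the Hub.
    llm = Llama.from_pretrained(
        repo_id="khazarai/BioGenesis-ToT",  # repo id from this card
        filename="*.gguf",                  # assumed glob; choose one quant in practice
        n_ctx=4096,
    )
    out = llm(
        build_prompt("Why does ATP synthase depend on a proton gradient?"),
        max_tokens=256,
    )
    print(out["choices"][0]["text"])
```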
BioGenesis-ToT is a fine-tuned version of Qwen3-1.7B, optimized for mechanistic reasoning and explanatory understanding in biology.
This model has been trained on the [moremilk/ToT-Biology](https://huggingface.co/datasets/moremilk/ToT-Biology) dataset, a reasoning-rich collection of biology questions that emphasizes *why* and *how* processes occur rather than simply *what* happens.
The model demonstrates strong capabilities in:
- Structured biological explanation generation
- Logical and causal reasoning
- Tree-of-thought (ToT) reasoning in scientific contexts
- Interdisciplinary biological analysis (e.g., bioengineering, medicine, ecology)
## Uses
### Intended Use
- Educational and scientific explanation generation
- Biological reasoning and tutoring applications
- Model interpretability research
- Training datasets for reasoning-focused LLMs
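For the original (non-GGUF) checkpoint, the uses above can be exercised with the `transformers` text-generation pipeline. A minimal sketch, assuming the Hub id from this card; the prompt wording and decoding settings are illustrative:

```python
# Minimal text-generation sketch for the base checkpoint (settings illustrative).

def format_question(question: str) -> str:
    # Ask explicitly for the "why/how" style the ToT-Biology data emphasizes.
    return f"{question}\nAnswer with a clear, step-by-step mechanistic explanation."

if __name__ == "__main__":
    # Heavy import kept out of module scope so the helper is cheap to reuse.
    from transformers import pipeline

    generator = pipeline("text-generation", model="khazarai/BioGenesis-ToT")
    prompt = format_question("How does CRISPR-Cas9 locate its target sequence?")
    print(generator(prompt, max_new_tokens=256)[0]["generated_text"])
```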
### Limitations
- Not a replacement for expert biological judgment
- May occasionally over-generalize or simplify complex phenomena
- Limited to reasoning quality within biological contexts (not trained for creative writing or coding)
## Dataset: moremilk/ToT-Biology
The ToT-Biology dataset emphasizes mechanistic understanding and explanatory reasoning within biology.
It's designed to help AI models develop interpretable, step-by-step reasoning abilities for complex biological systems.
It spans a wide range of biological subdomains:
- Foundational biology: Cell biology, genetics, evolution, and ecology
- Advanced topics: Systems biology, synthetic biology, computational biophysics
- Applied domains: Medicine, agriculture, bioengineering, and environmental science
Dataset features include:
- **Logical reasoning styles**: deductive, inductive, abductive, causal, and analogical
- **Problem-solving techniques**: decomposition, elimination, systems thinking, trade-off analysis
- **Real-world problem contexts**: experiment design, pathway mapping, and data interpretation
- **Practical relevance**: bridging theoretical reasoning and applied biological insight
- **Educational focus**: suitable for both AI training and human learning in scientific reasoning
## Objective
This fine-tuning project aims to build an interpretable reasoning model capable of:
- Explaining biological mechanisms clearly and coherently
- Demonstrating transparent, step-by-step thought processes
- Applying logical reasoning techniques to biological and interdisciplinary problems
- Supporting educational and research use cases where reasoning transparency matters
## Citation
**BibTeX:**
```bibtex
@misc{shiriyev2025biogenesistot,
  title        = {BioGenesis-ToT: A Fine-Tuned Model for Explanatory Biological Reasoning},
  author       = {Rustam Shiriyev},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/khazarai/BioGenesis-ToT}},
  note         = {Base model: Qwen3-1.7B; trained on moremilk/ToT-Biology}
}
```