BioGenesis-ToT-GGUF / README.md
Rustamshry's picture
Update README.md
e666033 verified
metadata
license: apache-2.0
language:
  - en
metrics:
  - accuracy
base_model:
  - khazarai/BioGenesis-ToT
pipeline_tag: text-generation
tags:
  - biology
  - medical
  - science
  - unsloth
  - sft

Model Card for BioGenesis-ToT

alt="General Benchmark Comparison Chart"

GGUF version of https://huggingface.co/khazarai/BioGenesis-ToT

BioGenesis-ToT is a fine-tuned version of Qwen3-1.7B, optimized for mechanistic reasoning and explanatory understanding in biology. This model has been trained on the moremilk/ToT-Biology dataset β€” a reasoning-rich collection of biology questions emphasizing why and how processes occur, rather than simply what happens.

The model demonstrates strong capabilities in:

  • Structured biological explanation generation
  • Logical and causal reasoning
  • Chain-of-thought (ToT) reasoning in scientific contexts
  • Interdisciplinary biological analysis (e.g., bioengineering, medicine, ecology)

Uses

πŸš€ Intended Use

  • Educational and scientific explanation generation
  • Biological reasoning and tutoring applications
  • Model interpretability research
  • Training datasets for reasoning-focused LLMs

⚠️ Limitations

  • Not a replacement for expert biological judgment
  • May occasionally over-generalize or simplify complex phenomena
  • Limited to reasoning quality within biological contexts (not trained for creative writing or coding)

πŸ§ͺ Dataset: moremilk/ToT-Biology

The ToT-Biology dataset emphasizes mechanistic understanding and explanatory reasoning within biology. It’s designed to help AI models develop interpretable, step-by-step reasoning abilities for complex biological systems.

It spans a wide range of biological subdomains:

  • Foundational biology: Cell biology, genetics, evolution, and ecology
  • Advanced topics: Systems biology, synthetic biology, computational biophysics
  • Applied domains: Medicine, agriculture, bioengineering, and environmental science

Dataset features include:

  • 🧩 Logical reasoning styles β€” deductive, inductive, abductive, causal, and analogical
  • 🧠 Problem-solving techniques β€” decomposition, elimination, systems thinking, trade-off analysis
  • πŸ”¬ Real-world problem contexts β€” experiment design, pathway mapping, and data interpretation
  • 🌍 Practical relevance β€” bridging theoretical reasoning and applied biological insight
  • πŸŽ“ Educational focus β€” for both AI training and human learning in scientific reasoning

🧭 Objective

This fine-tuning project aims to build an interpretable reasoning model capable of:

  • Explaining biological mechanisms clearly and coherently
  • Demonstrating transparent, step-by-step thought processes
  • Applying logical reasoning techniques to biological and interdisciplinary problems
  • Supporting educational and research use cases where reasoning transparency matters

Citation

BibTeX:

@model{khazarai/BioGenesis-ToT,
  title     = {BioGenesis-ToT: A Fine-Tuned Model for Explanatory Biological Reasoning},
  author    = {Rustam Shiriyev},
  year      = {2025},
  publisher = {Hugging Face},
  base_model = {Qwen3-1.7B},
  dataset   = {moremilk/ToT-Biology},
  license   = {MIT}
}