ecocoder-cot-v1: Ecological Chain-of-Thought Dataset

Ten chain-of-thought (CoT) traces for fine-tuning Nemotron on ecological reasoning and code generation.

Format

Each trace has 3 sections:

[CONTEXT] {paper abstract + method description}
[REASONING] {step-by-step ecological reasoning}
[CODE] {Python/R implementation}
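A minimal sketch of splitting one trace into its three sections. It assumes each trace is a single string in which the literal markers [CONTEXT], [REASONING] and [CODE] each appear once; the function name parse_trace and the lowercase dictionary keys are illustrative and not part of the dataset.

```python
import re

# Matches any of the three section markers described above.
SECTION_PATTERN = re.compile(r"\[(CONTEXT|REASONING|CODE)\]")
REQUIRED = {"context", "reasoning", "code"}


def parse_trace(text):
    """Split a trace string into a dict keyed by section name (hypothetical helper)."""
    # re.split with a capturing group yields:
    # [preamble, name1, body1, name2, body2, ...]
    parts = SECTION_PATTERN.split(text)
    sections = {}
    for name, body in zip(parts[1::2], parts[2::2]):
        sections[name.lower()] = body.strip()
    missing = REQUIRED - sections.keys()
    if missing:
        raise ValueError(f"trace is missing sections: {sorted(missing)}")
    return sections
```

A marker that also occurs inside the code body (for example as a string literal) would be treated as a new section, so a stricter parser may be needed for real traces.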

Splits

Split   Traces   Size
train   8        ~40 KB
test    2        ~10 KB
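The card does not say how the traces were assigned to splits. As one possible approach, a seeded shuffle gives a reproducible 8/2 partition; the function name split_traces, the seed value, and the placeholder trace IDs below are assumptions for illustration only.

```python
import random


def split_traces(traces, n_test=2, seed=0):
    """Return (train, test) lists from a reproducible shuffle (hypothetical helper)."""
    if not 0 < n_test < len(traces):
        raise ValueError("n_test must be between 1 and len(traces) - 1")
    shuffled = list(traces)  # copy so the caller's list is not modified
    random.Random(seed).shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder IDs standing in for the 10 traces.
train, test = split_traces([f"trace_{i}" for i in range(1, 11)])
```

With 10 input traces and n_test=2 this yields 8 training and 2 test items, matching the counts in the table above.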

Papers Covered

#    Paper                       Method                Code
1    GLOSSA (2505.05862)         BART Bayesian SDM     R
2    MaskSDM (2503.13057)        DL + Shapley values   PyTorch
3    GeoThinneR (2505.07867)     kd-tree thinning      R
4    HeteroGNN (2503.11900)      Graph Neural Net      PyTorch Geometric
5    CISO (2508.06704)           Conditional SDM       PyTorch
6    BioAnalyst (2507.09080)     Foundation Model      PyTorch
7    MultiScale (2411.04016)     Multi-scale SDM       PyTorch
8    LD-SDM (2312.08334)         LLM + Taxonomy        PyTorch + HF
9    PointProcess (2311.06755)   Poisson Process       R/INLA
10   EntropyBias (2508.02272)    Shannon Entropy       Python + R

Intended Use

Fine-tune nemotron-3-nano-30b-a3b (32.5B) with Unsloth 4-bit QLoRA on an A100 80GB GPU.

Training config

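from unsloth import FastLanguageModel

# Load the base model with 4-bit quantized weights for QLoRA fine-tuning.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="nvidia/Nemotron-3-Nano-30B-A3B-ablated",
    max_seq_length=4096,  # maximum sequence length in tokens
    load_in_4bit=True,    # 4-bit quantization
)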

Generation Pipeline

Papers (arXiv) → DeepSeek v4 Pro CoT → JSONL → HuggingFace Dataset → Unsloth QLoRA → ecocoder-nemotron
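The JSONL stage of the pipeline can be sketched with the standard library alone: one JSON object per line, one line per trace. The field names "paper" and "text" and the helper names to_jsonl and from_jsonl are assumptions; the card does not document the actual schema.

```python
import json


def to_jsonl(records):
    """Serialize a list of dicts to JSONL text, one JSON object per line."""
    # json.dumps escapes newlines inside strings, so each record stays on one line.
    return "".join(json.dumps(record) + "\n" for record in records)


def from_jsonl(text):
    """Parse JSONL text back into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.split("\n") if line.strip()]


# Hypothetical record layout: the full trace stored under a single "text" field.
records = [
    {"paper": "GLOSSA (2505.05862)",
     "text": "[CONTEXT] ...\n[REASONING] ...\n[CODE] ..."},
]
jsonl_text = to_jsonl(records)
```

A file written in this shape can typically be loaded with the Hugging Face datasets library's JSON loader, for example load_dataset("json", data_files=...).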

Next: v2 (100 traces)

Scale to 100 papers across 6 SDM categories: Bayesian methods, deep learning, spatial methods, taxonomic integration, data integration, bias correction.


Built with DeepSeek v4 Pro · ecoseek-litdump · alrobles
