# Hypernet Scaling Law Data
Data assets for scaling-law and preservation (catastrophic forgetting) experiments.
## Contents
- **OOD splits:** `train_ood_scaling_law.pq`, `valid_ood_scaling_law.pq`, `eval_ood_scaling_law.pq` – train/valid/eval split by domain (eval = held-out domains).
- **Scaling law:** `train_scaling_law.pq`, `valid_scaling_law.pq` – 1-hop/2-hop/3-hop QA.
- **With facts:** `train_scaling_law_with_facts.pq`, `valid_scaling_law_with_facts.pq` – same, plus a `facts` column generated from relation templates.
- **Preservation:** `preservation_train.pq`, `preservation_eval.pq` (and `preserve_data/`, `preserve_data_2hop/`, `preserve_data_combined/`) – entities not in train, for the preservation loss and eval.
- **Relation templates:** `relation_template_mapping.csv` – maps a relation label to a question template and a `noun_template` for fact generation.
- **EDA:** `domain_counts_eda.csv`, `figures/` – domain and `n_hop` stats/plots.
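As a rough illustration of how a `noun_template` from `relation_template_mapping.csv` might be applied to a triplet to produce a fact string, here is a minimal sketch. The template string and the `{subject}`/`{object}` placeholder convention are hypothetical examples, not rows from the actual CSV:

```python
# Hypothetical sketch: expand a triplet into a natural-language fact via a
# noun template. The template format below is assumed, not taken from the CSV.

def fill_template(noun_template: str, subject: str, obj: str) -> str:
    """Substitute a triplet's subject and object into a noun template."""
    return noun_template.format(subject=subject, object=obj)

fact = fill_template("{subject} is the capital of {object}.", "Paris", "France")
print(fact)  # Paris is the capital of France.
```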
## Schema (parquet)
Canonical columns: `triplet_subject`, `triplet_relation`, `triplet_object`, `question_prompt`, `answer`. Some files add `n_hop`, `facts` (list of strings), or `domain`.
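A quick way to sanity-check a downloaded split against the canonical schema is to compare its columns to the expected set. The inline DataFrame below stands in for a real split; its row values are made up for illustration:

```python
import pandas as pd

# Canonical columns from the schema above.
CANONICAL = {"triplet_subject", "triplet_relation", "triplet_object",
             "question_prompt", "answer"}

# Illustrative stand-in for a real split (values are invented).
df = pd.DataFrame({
    "triplet_subject": ["Paris"],
    "triplet_relation": ["capital_of"],
    "triplet_object": ["France"],
    "question_prompt": ["What country is Paris the capital of?"],
    "answer": ["France"],
})

missing = CANONICAL - set(df.columns)
print(sorted(missing))  # [] when all canonical columns are present
```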
## Usage
```python
import pandas as pd
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="nace-ai/hypernet-scaling-law-data",
    filename="train_ood_scaling_law.pq",
)
df = pd.read_parquet(path)
```
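Once a split is loaded, files that carry the optional `n_hop` column (see Schema) can be sliced by hop count. This sketch uses a small inline frame in place of a real download:

```python
import pandas as pd

# Illustrative stand-in for a scaling-law split; the `n_hop` column is only
# present in some files, per the schema above.
df = pd.DataFrame({
    "question_prompt": ["q1", "q2", "q3"],
    "answer": ["a1", "a2", "a3"],
    "n_hop": [1, 2, 2],
})

two_hop = df[df["n_hop"] == 2]
print(len(two_hop))  # 2
```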