QVAC Genesis II Pretrained Model - Failure Analysis Prompt

Key Highlights

  • Pretrained on QVAC Genesis II
    This model has been pretrained on Tether’s QVAC Genesis II dataset.

    Dataset card: https://huggingface.co/datasets/qvac/GenesisII

    The checkpoint uses the Failure Analysis prompt format and was pretrained on approximately 53B tokens in BF16 mixed precision with a 4,096-token context window, using a Qwen3-family 1.7B-parameter decoder-only transformer architecture.

  • Checkpoints in Hugging Face format
    Checkpoints are provided in standard Hugging Face format for inference, continual pretraining, and fine-tuning.

  • Educational coverage
    QVAC Genesis II includes the following domains:

    • Machine learning
    • High school statistics
    • High school chemistry
    • Econometrics
    • College chemistry
    • College physics
    • Geography
    • Astronomy
    • College computer science
    • Electrical engineering
    • High school computer science

Intended Uses

  • Continual pretraining or fine-tuning for educational applications (STEM tutoring, QA systems, curriculum support)
  • Benchmarking reasoning and subject-specific QA performance
  • Research into synthetic dataset-driven LLM pretraining

Model Details

Model Description

  • Developed by: QVAC by Tether
  • Model type: Decoder-only Transformer (causal LM)
  • Language(s) (NLP): Primarily English
  • License: Apache-2.0
  • Finetuned from model: None (randomly initialized)
  • Intended stage: Base pretrained model (no SFT / RLHF)

Dataset Details

For dataset composition and the per-domain breakdown, see the dataset card: https://huggingface.co/datasets/qvac/GenesisII

Uses

Direct Use

  • General language modeling: next-token prediction, continuation, summarization, drafting.
  • Research baseline for scaling, data ablations, or tokenizer studies.

Downstream Use (recommended)

  • Continued Pre-Training (CPT) on more tokens.
  • SFT for assistants, domain experts, or task-specific models.
  • Preference optimization / RLHF for safer, more helpful behavior.
  • Adapters/LoRA for efficient domain specialization (see the sketch after this list).
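
A minimal LoRA sketch using the peft library; the hyperparameters and target_modules names are assumptions (Qwen3-style attention projections), not values taken from this card.

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model
import torch

model = AutoModelForCausalLM.from_pretrained(
    "qvac/genesis-ii-model-failure-analysis",
    torch_dtype=torch.bfloat16,
)

# Illustrative hyperparameters; target_modules assumes Qwen3-style
# projection names -- verify against model.named_modules().
lora_cfg = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # only adapter weights are trainable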

Out-of-Scope Use

  • High-stakes decision-making (medical/financial/legal)
  • Safety-critical or autonomous control systems
  • Unfiltered end-user deployment without alignment / safety layers
  • Any use that violates applicable laws or platform policies.

Bias, Risks, and Limitations

  • Bias & toxicity: May reflect or amplify biases present in web text.
  • Hallucinations: Can produce confident but incorrect statements or citations.
  • Security / privacy: may occasionally emit long runs of random-looking strings.
  • Context limit: 4,096 tokens; longer inputs require chunking (see the sketch after this list).
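
A minimal sketch of token-level chunking for inputs longer than the 4,096-token window; the overlap value is illustrative, not prescribed by this card.

from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("qvac/genesis-ii-model-failure-analysis")

def chunk_ids(text, max_len=4096, overlap=256):
    """Split token ids into overlapping windows that fit the context."""
    ids = tok(text, add_special_tokens=False)["input_ids"]
    step = max_len - overlap
    return [ids[i:i + max_len] for i in range(0, len(ids), step)]

chunks = chunk_ids("some very long document " * 2000)
print(len(chunks), "chunks of at most 4096 tokens")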

Recommendations

  • Disclose limitations to downstream users.
  • Research model: not intended for production use cases.

How to Get Started

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "qvac/genesis-ii-model-failure-analysis"

# Load the tokenizer and the BF16 checkpoint, sharding across available GPUs.
tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Base (non-instruct) model: plain next-token continuation, no chat template.
prompt = "Explain Newton's laws of motion."
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256, do_sample=True, top_p=0.9, temperature=0.7)
print(tok.decode(out[0], skip_special_tokens=True))

Training Details

Training Data

  • Dataset: qvac/GenesisII (see the streaming sketch after this list)
  • Prompt format: Failure Analysis prompt
  • Total pretraining tokens: ~53B tokens
  • Tokenizer: Qwen3 tokenizer
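
A minimal sketch for streaming a few records with the datasets library; the split name and the "text" field are assumptions, so check the dataset card for the actual schema.

from datasets import load_dataset
from transformers import AutoTokenizer

# Stream to avoid downloading the full corpus up front.
ds = load_dataset("qvac/GenesisII", split="train", streaming=True)  # split name assumed
tok = AutoTokenizer.from_pretrained("qvac/genesis-ii-model-failure-analysis")

for i, row in enumerate(ds):
    if i >= 3:
        break
    text = row.get("text", str(row))  # "text" field is an assumption
    print(len(tok(text)["input_ids"]), "tokens")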

Training Procedure

  • Framework: Megatron-LM (Megatron-Core)
  • Sequence length: 4,096
  • Precision: BF16
  • Optimizer: AdamW (β₁=0.9, β₂=0.95), weight decay 0.01
  • Learning rate: 2e-4 → 2e-5 (schedule sketched after this list)
  • Warmup: 607 steps
  • Training steps: 6,079
  • Gradient clipping: 1.0
  • Seed: 42
  • Logging: every 100 steps
  • Checkpointing: every 500 steps
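
A small sketch of the learning-rate schedule implied by the values above; the card only states 2e-4 → 2e-5 with 607 warmup steps, so the cosine decay shape is an assumption.

import math

TOTAL_STEPS, WARMUP = 6_079, 607
LR_MAX, LR_MIN = 2e-4, 2e-5

def lr_at(step):
    """Linear warmup, then decay to the floor.
    The cosine shape is assumed; the card does not state the decay type."""
    if step < WARMUP:
        return LR_MAX * step / WARMUP
    t = (step - WARMUP) / (TOTAL_STEPS - WARMUP)
    return LR_MIN + 0.5 * (LR_MAX - LR_MIN) * (1 + math.cos(math.pi * t))

for s in (0, WARMUP, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(s, f"{lr_at(s):.2e}")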

Multi-Node GPU Setup

  • Cluster: 8 nodes, each with 8 NVIDIA H100 80GB (64 GPUs total)
  • Scheduler: Slurm
  • Launch: srun + torchrun
  • Framework: Megatron-LM

Technical Specifications

Model Architecture and Objective

  • Architecture: Qwen3-style decoder-only Transformer
  • Parameters: ~1.7B
  • Context length: 4,096 tokens
  • Objective: Causal LM (next-token prediction; illustrated below)
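
To make the objective concrete: with labels=input_ids, transformers shifts the targets internally and returns the average next-token cross-entropy. The example sentence is illustrative.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "qvac/genesis-ii-model-failure-analysis"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Causal LM loss: predict token t+1 from tokens 1..t.
batch = tok("Ohm's law relates voltage, current, and resistance.", return_tensors="pt")
with torch.no_grad():
    out = model(**batch, labels=batch["input_ids"])
print("per-token loss:", out.loss.item())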

Conversion & Inference

This checkpoint is provided in Hugging Face format for inference with transformers.


Changelog

  • v0.1 (2025-12-16): Initial release - Failure Analysis prompt variant