QVAC Genesis II Pretrained Model - Failure Analysis Prompt
Key Highlights
Pretrained on QVAC Genesis II
This model has been pretrained on Tether’s QVAC Genesis II dataset.
Dataset card: https://huggingface.co/datasets/qvac/GenesisII
The checkpoint uses the Failure Analysis prompt format. The model is a Qwen3-family, 1.7B-parameter decoder-only transformer, pretrained on approximately 53B tokens with BF16 mixed precision and a 4,096-token context window.
Checkpoints in Hugging Face format
Checkpoints are provided in standard Hugging Face format for inference, continual pretraining, and fine-tuning.
Educational coverage
QVAC Genesis II includes the following domains:
- Machine learning
- High school statistics
- High school chemistry
- Econometrics
- College chemistry
- College physics
- Geography
- Astronomy
- College computer science
- Electrical engineering
- High school computer science
Intended Uses
- Continual pretraining or fine-tuning for educational applications (STEM tutoring, QA systems, curriculum support)
- Benchmarking reasoning and subject-specific QA performance
- Research into synthetic dataset-driven LLM pretraining
Model Details
Model Description
- Developed by: QVAC by Tether
- Model type: Decoder-only Transformer (causal LM)
- Language(s) (NLP): Primarily English
- License: Apache-2.0
- Finetuned from model: None (randomly initialized)
- Intended stage: Base pretrained model (no SFT / RLHF)
Dataset Details
For dataset composition and statistics, see the QVAC Genesis II dataset card: https://huggingface.co/datasets/qvac/GenesisII
Uses
Direct Use
- General language modeling: next-token prediction, continuation, summarization, drafting.
- Research baseline for scaling, data ablations, or tokenizer studies.
Downstream Use (recommended)
- Continued Pre-Training (CPT) on more tokens.
- SFT for assistants, domain experts, or task-specific models.
- Preference optimization / RLHF for safer, more helpful behavior.
- Adapters/LoRA for efficient domain specialization (see the sketch after this list).
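As one hedged illustration of the adapter route, here is a minimal LoRA sketch using the PEFT library. The target module names (q_proj, k_proj, v_proj, o_proj) are assumptions based on typical Qwen3-style attention layers and are not confirmed by this card.

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained(
    "qvac/genesis-ii-model-failure-analysis", trust_remote_code=True
)

# Rank-16 adapters on the attention projections; module names are assumed
# from typical Qwen3-style layers, not confirmed by this card.
lora_cfg = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # only the adapter weights are trainable

The rank and alpha above are common defaults, not recommendations from this card; tune them for the target domain.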
Out-of-Scope Use
- High-stakes decision-making (medical/financial/legal).
- Safety-critical or autonomous control systems.
- Unfiltered end-user deployment without alignment / safety layers.
- Any use that violates applicable laws or platform policies.
Bias, Risks, and Limitations
- Bias & toxicity: May reflect or amplify biases present in web text.
- Hallucinations: Can produce confident but incorrect statements or citations.
- Security / privacy: May emit long, contiguous runs of random-looking strings.
- Context limit: 4,096 tokens; longer inputs require chunking (see the sketch below).
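A minimal chunking sketch, assuming plain fixed-size token windows (overlapping windows are a common alternative); the long_text placeholder stands for any input beyond the context window:

from transformers import AutoTokenizer

model_id = "qvac/genesis-ii-model-failure-analysis"
tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

long_text = "..."  # placeholder: any document longer than the context window
ids = tok(long_text, add_special_tokens=False)["input_ids"]

# Split into non-overlapping windows that fit the 4,096-token context.
window = 4096
chunks = [ids[i : i + window] for i in range(0, len(ids), window)]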
Recommendations
- Disclose limitations to downstream users.
- Research model: not intended for production use cases.
How to Get Started
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "qvac/genesis-ii-model-failure-analysis"

# Load the tokenizer and the BF16 checkpoint, sharding across available devices.
tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# This is a base model (no SFT/RLHF): prompt with plain text, not a chat template.
prompt = "Explain Newton's laws of motion."
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256, do_sample=True, top_p=0.9, temperature=0.7)
print(tok.decode(out[0], skip_special_tokens=True))
Training Details
Training Data
- Dataset: qvac/GenesisII
- Prompt format: Failure Analysis prompt
- Total pretraining tokens: ~53B tokens
- Tokenizer: Qwen3 tokenizer
Training Procedure
- Framework: Megatron-LM (Megatron-Core)
- Sequence length: 4,096
- Precision: BF16
- Optimizer: AdamW (β₁=0.9, β₂=0.95), weight decay 0.01
- Learning rate: 2e-4 → 2e-5 (schedule sketched after this list)
- Warmup: 607 steps
- Training steps: 6,079
- Gradient clipping: 1.0
- Seed: 42
- Logging: every 100 steps
- Checkpointing: every 500 steps
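For illustration, a sketch of the resulting per-step learning rate, assuming linear warmup followed by cosine decay; the card gives the endpoints, warmup length, and step count but does not state the decay shape:

import math

def lr_at(step, max_lr=2e-4, min_lr=2e-5, warmup=607, total=6079):
    """Per-step learning rate: linear warmup, then cosine decay (assumed shape)."""
    if step < warmup:
        return max_lr * step / warmup
    progress = (step - warmup) / (total - warmup)
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

print(lr_at(0), lr_at(607), lr_at(6079))  # 0.0, 2e-4, 2e-5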
Multi-Node GPU Setup
- Cluster: 8 nodes, each with 8 NVIDIA H100 80GB (64 GPUs total)
- Scheduler: Slurm
- Launch: srun + torchrun
- Framework: Megatron-LM
Technical Specifications
Model Architecture and Objective
- Architecture: Qwen3-style decoder-only Transformer
- Parameters: ~1.7B
- Context length: 4,096 tokens
- Objective: Causal LM (next-token prediction; see the loss sketch below)
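To make the objective concrete, a short sketch of the next-token loss as exposed by transformers; the sample sentence is an arbitrary stand-in:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "qvac/genesis-ii-model-failure-analysis"
tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# labels == input_ids: the model shifts labels internally, so this computes
# the standard next-token cross-entropy loss.
batch = tok("Gravity causes objects to fall.", return_tensors="pt")
loss = model(**batch, labels=batch["input_ids"]).loss
print(loss.item())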
Conversion & Inference
This checkpoint is provided in Hugging Face format for inference with transformers.
Changelog
- v0.1 (2025-12-16): Initial release - Failure Analysis prompt variant