Safety Training Dataset

A dataset of 2.4M occupational safety documents used to train the SafetyBERT and SafetyALBERT models.

Dataset Overview

  • Size: 120 MB compressed (all-data-combined.7z)
  • Documents: 2.4M safety reports and narratives
  • Sources: MSHA, OSHA, NTSB, FRA, IOGP, iChem, Safety Abstracts
  • Format: CSV files with narrative/abstract columns

Usage

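# Requires: pip install huggingface_hub py7zr
from huggingface_hub import hf_hub_download
import py7zr

# Download the archive from the Hub (cached locally after the first call)
data_file = hf_hub_download("adanish91/safety-training-data", "all-data-combined.7z")

# Extract the CSV files into ./data/
with py7zr.SevenZipFile(data_file, mode="r") as archive:
    archive.extractall(path="./data/")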
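
Once the archive is extracted, the text still has to be pulled out of the CSV files. The sketch below is one possible way to do that using only the standard library. It assumes that the text columns can be recognized by "narrative" or "abstract" appearing in the header name, and that the files are UTF-8; the exact column names and encodings are not documented here, so check them against the extracted files. The function name iter_texts and the constant TEXT_COLUMN_HINTS are hypothetical helpers, not part of this repository.

```python
import csv
from pathlib import Path
from typing import Iterator

# Assumption: text columns have "narrative" or "abstract" in their header name.
TEXT_COLUMN_HINTS = ("narrative", "abstract")


def iter_texts(data_dir: str) -> Iterator[str]:
    """Yield non-empty text cells from every CSV file found under data_dir."""
    for csv_path in sorted(Path(data_dir).rglob("*.csv")):
        # errors="replace" keeps going if a file is not clean UTF-8 (assumed encoding).
        with open(csv_path, newline="", encoding="utf-8", errors="replace") as f:
            reader = csv.DictReader(f)
            # Select the columns whose header matches one of the hints.
            text_cols = [
                name
                for name in (reader.fieldnames or [])
                if name and any(hint in name.lower() for hint in TEXT_COLUMN_HINTS)
            ]
            for row in reader:
                for col in text_cols:
                    value = (row.get(col) or "").strip()
                    if value:
                        yield value
```

For example, `for text in iter_texts("./data/"): ...` would stream the documents one at a time instead of loading all 2.4M into memory. If a file contains very long cells, the csv module's default field size limit may need to be raised with csv.field_size_limit.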