Safety Training Dataset
Comprehensive dataset of 2.4M occupational safety documents used to train SafetyBERT and SafetyALBERT models.
Dataset Overview
- Size: 120 MB compressed (all-data-combined.7z)
- Documents: 2.4M safety reports and narratives
- Sources: MSHA, OSHA, NTSB, FRA, IOGP, iChem, Safety Abstracts
- Format: CSV files with narrative/abstract columns
Usage
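# Requires: pip install huggingface_hub py7zr
from huggingface_hub import hf_hub_download
import py7zr
# Download the archive from the Hub (cached locally after the first call)
data_file = hf_hub_download("adanish91/safety-training-data", "all-data-combined.7z")
# Extract the CSV files into ./data/
with py7zr.SevenZipFile(data_file, 'r') as archive:
    archive.extractall("./data/")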
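Once the archive is extracted, the text can be read back from the CSV files. The following is a minimal sketch rather than the authors' loading code: it assumes the text columns are literally named "narrative" or "abstract" (matched case-insensitively) and that the files are UTF-8 encoded, neither of which this card confirms. The helper name iter_texts is hypothetical.

```python
import csv
from pathlib import Path
from typing import Iterable, Iterator

# Assumed column names: the card only says "narrative/abstract columns".
TEXT_COLUMNS = ("narrative", "abstract")


def iter_texts(data_dir: str, text_columns: Iterable[str] = TEXT_COLUMNS) -> Iterator[str]:
    """Yield every non-empty text cell from the CSV files under data_dir."""
    wanted = {name.lower() for name in text_columns}
    # Walk the extracted tree in a stable order so results are reproducible.
    for csv_path in sorted(Path(data_dir).rglob("*.csv")):
        # errors="replace" keeps going if a file is not valid UTF-8 (assumption).
        with open(csv_path, newline="", encoding="utf-8", errors="replace") as handle:
            reader = csv.DictReader(handle)
            # Keep only the header names that match the assumed text columns.
            columns = [
                name for name in (reader.fieldnames or [])
                if name and name.strip().lower() in wanted
            ]
            for row in reader:
                for column in columns:
                    text = (row.get(column) or "").strip()
                    if text:
                        yield text
```

Calling iter_texts("./data/") streams one document at a time, which avoids holding all 2.4M narratives in memory; adjust TEXT_COLUMNS if the actual headers differ.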