---
language:
- en
- fr
pipeline_tag: video-classification
tags:
- human-activity-recognition
- multimodal
- sensor-fusion
- edge-ai
- privacy-preserving
- pytorch
- sparse-transformer
---

# SAM-MM-HAR

**SAM-MM-HAR** is a lightweight multimodal Human Activity Recognition model built by **AMEFORGE Lab** (Amega Mike) on a proprietary sparse Transformer architecture. It classifies 40 daily activities from privacy-preserving non-RGB sensors: Depth, Skeleton, IMU, mmWave Radar, IR and Thermal.

Developed for the **CUHK-X Multimodal Human Activity Challenge** (co-located with UbiComp 2026).

## Key specs

| Property | Value |
|---|---|
| Architecture | Sparse Transformer (proprietary, AMEFORGE) |
| Parameters | {n_params:,} (~{n_params/1e6:.1f}M) |
| Size on disk | {size:.1f} MB |
| Classes | 40 daily activities |
| Modalities | Depth · Skeleton · IMU · mmWave · IR · Thermal |
| Val accuracy | {val_acc:.1f}% (cross-subject) |
| Edge ready | ✅ CPU inference < 100 MB |

## Modalities

The model handles missing modalities gracefully: any subset works at inference.

| Modality | Encoder type |
|---|---|
| Depth | Patch Conv2D + sparse attention |
| IR / Thermal | Patch Conv2D + sparse attention |
| Skeleton | Joint linear + sparse attention |
| IMU (6-axis) | Conv1D temporal |
| mmWave Radar | Patch Conv2D + sparse attention |

A **MotionCore** temporal world-model (GRU over per-frame embeddings) models human movement dynamics across frames, which is the key advantage over standard frame-by-frame classifiers.

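
The mechanism that lets the model run on any subset of modalities is not disclosed. As a generic illustration only, one common way to get this behaviour is to average the embeddings of whichever modalities are present. The sketch below assumes each modality encoder yields a fixed-size vector; `fuse_available` and the toy values are hypothetical and not part of the SAM-MM-HAR repo.

```python
# Hypothetical sketch: fuse_available and its inputs are illustrative names,
# not part of the SAM-MM-HAR codebase.
from typing import Optional, Sequence


def fuse_available(embeddings: Sequence[Optional[Sequence[float]]]) -> list:
    # Keep only the modalities that were actually captured (not None).
    present = [e for e in embeddings if e is not None]
    if not present:
        raise ValueError("at least one modality is required")
    dim = len(present[0])
    if any(len(e) != dim for e in present):
        raise ValueError("all embeddings must share one dimension")
    # Element-wise mean over the available modalities only, so a missing
    # sensor neither contributes zeros nor changes the output scale.
    return [sum(e[i] for e in present) / len(present) for i in range(dim)]


depth = [1.0, 3.0]
imu = [3.0, 5.0]
radar = None  # sensor unavailable for this clip

fused = fuse_available([depth, imu, radar])
print(fused)  # [2.0, 4.0]
```

Averaging over the present entries, rather than zero-filling the missing ones, keeps the fused vector on the same scale whether one sensor or all of them are available.
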
## Classes (40)

Wash_face · Brush_teeth · Wash_hands · Comb_hair · Put/Take_off_glasses · Put/Take_off_clothes · Put/Take_off_shoes · Drink_water · Eat · Read_book · Write · Use_phone · Use_laptop · Sit_down · Stand_up · Lie_down · Get_up · Walk · Run · Jump · Clap · Wave · Point · Throw · Kick · Pick_up · Put_down · Open/Close_door · Turn_on/off_light · Sweep_floor · Vacuum · Fall_down · Check_time · Take_body_temperature

## Inference

```python
import torch
from huggingface_hub import hf_hub_download

# Download the checkpoint into the local Hugging Face cache; returns its path.
ckpt = hf_hub_download(repo_id="AMFORGE/sam-mm-har", filename="best.pt")

# Load with inference.py from the repo:
#   python inference.py --checkpoint best.pt --clip /path/to/clip_folder
```

## Citation

If you use SAM-MM-HAR, please cite:

```bibtex
@misc{{sam_mm_har,
  title  = {{SAM-MM-HAR: Multimodal Human Activity Recognition on Privacy-Preserving Sensors}},
  author = {{AM}},
  year   = {{2026}},
  note   = {{AMEFORGE Lab. Built on a proprietary sparse Transformer architecture.}},
}}
```

---

*Architecture internals are proprietary and not disclosed. © AMEFORGE Lab 2026*
"""