---
license: mit
language: en
tags:
- text-classification
- toxicity
- moderation
- chat
- bert
- pytorch
- onnx
datasets:
- dormlab/chat-corpus
metrics:
- accuracy
- f1
- precision
- recall
pipeline_tag: text-classification
---

# Toxic Chat Moderation

Binary classifier for real-time chat moderation. Flags toxic, hateful, harassing,
sexually explicit, and otherwise inappropriate messages in gaming and social chat.

Fine-tuned from bert-base-uncased on 300K labeled chat messages.

## Quick use

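A minimal usage sketch. The repo id below is a placeholder (substitute the actual model id), and the `is_toxic` helper is illustrative, not part of this repo:

```python
# Loading sketch (repo id is a placeholder -- substitute the real model id):
#   from transformers import pipeline
#   clf = pipeline("text-classification", model="dormlab/toxic-chat-moderation")
#   result = clf("you are trash, uninstall the game")[0]

# Illustrative post-processing: decide whether to flag a message from one
# pipeline output dict of the form {"label": ..., "score": ...}.
def is_toxic(result: dict, threshold: float = 0.5) -> bool:
    """Flag when the predicted label is toxic (label id 1) with enough confidence."""
    return result["label"] in ("toxic", "LABEL_1") and result["score"] >= threshold

print(is_toxic({"label": "LABEL_1", "score": 0.98}))  # True
print(is_toxic({"label": "LABEL_0", "score": 0.99}))  # False
```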
## Performance

| Metric | Score |
|--------|-------|
| Accuracy | 0.9768 |
| F1 | 0.9768 |
| Precision | 0.9643 |
| Recall | 0.9897 |

ONNX INT8 latency: ~1-3 ms on Apple Silicon (CoreML/MPS).

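A per-inference latency number like the one above can be reproduced with a simple timing harness (a sketch; the lambda below is a dummy stand-in for whatever callable wraps the ONNX session):

```python
import time

def measure_latency_ms(fn, warmup: int = 5, iters: int = 100) -> float:
    """Average wall-clock milliseconds per call, after a short warmup."""
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) * 1000.0 / iters

# Dummy workload standing in for a real session.run(...) call:
latency = measure_latency_ms(lambda: sum(range(1000)))
print(f"{latency:.3f} ms/call")
```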
## Training

- **Architecture**: bert-base-uncased (110M params), 2 labels (clean/toxic)
- **Hardware**: Apple Silicon Mac Mini (MPS), single-node
- **Data**: 153K messages (122,688 train / 15,336 val / 15,336 test)
- **Framework**: PyTorch, HuggingFace Trainer
- **Export**: ONNX dynamic INT8 quantization (105 MB)

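The split above works out to exactly 80/10/10:

```python
# Verify the train/val/test proportions stated in the Data bullet.
train, val, test = 122_688, 15_336, 15_336
total = train + val + test
print(total)                       # 153360
print(train / total)               # 0.8
print(val / total, test / total)   # 0.1 0.1
```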

## Variants

This repo provides two model formats:

- Full PyTorch weights, for use with `transformers`
- ONNX INT8 quantized weights, for fast inference on CPU/CoreML
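The ONNX export produces raw logits over the two labels; a minimal post-processing sketch (softmax then argmax is the standard recipe, not repo-specific code):

```python
import numpy as np

def postprocess(logits: np.ndarray) -> tuple[int, float]:
    """Softmax over the two label logits; return (label_id, confidence)."""
    shifted = np.exp(logits - logits.max())  # numerically stable softmax
    probs = shifted / shifted.sum()
    label_id = int(probs.argmax())
    return label_id, float(probs[label_id])

# Logits leaning toward label 1 (toxic):
label_id, score = postprocess(np.array([-2.0, 3.1]))
print(label_id, round(score, 3))  # 1 0.994
```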

## Label mapping

| Label | Meaning |
|-------|---------|
| 0 | Clean — allow |
| 1 | Toxic — block/flag |