metadata
title: OFPBadWord
emoji: 🔥
colorFrom: indigo
colorTo: purple
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
license: apache-2.0
short_description: Bad word checker sentinel for open floor protocol

🔥 OFP Bad Word Sentinel

A lightweight sentinel agent that monitors Open Floor Protocol (OFP) conversations for profanity and alerts conveners when violations occur.

Features

  • Silent Monitoring: Listens to conversations without disrupting flow
  • Keyword Detection: Uses simple, fast keyword matching with leetspeak support
  • Private Alerts: Sends violation alerts only to the convener, never to the public floor
  • Real-time Dashboard: Monitor status, violations, and activity logs
  • Configurable: Custom word lists and whitelists

How It Works

  1. Sentinel joins OFP conversation as passive observer
  2. Monitors all utterance events for profanity using keyword matching
  3. Detects violations including leetspeak variants (sh1t, b*tch, etc.)
  4. Sends private alert to convener with severity and recommended action
  5. Convener decides enforcement (warn, revoke floor, or remove user)
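
Steps 2 and 3 above can be sketched with the standard library alone. This is a minimal illustration of keyword matching with leetspeak normalization, not the algorithm better-profanity uses internally. `LEET_MAP`, `BAD_WORDS`, `normalize`, and `find_violations` are hypothetical names, and the two-word list is a mild stand-in for a real one. Masked variants such as `b*tch` are not handled here.

```python
import re
from typing import Iterable, List

# Hypothetical substitution table; a real detector would cover more variants.
LEET_MAP = str.maketrans(
    {"1": "i", "3": "e", "4": "a", "0": "o", "5": "s", "7": "t", "@": "a", "$": "s"}
)

# Mild stand-in word list used only for this demo.
BAD_WORDS = {"darn", "heck"}


def normalize(token: str) -> str:
    """Lowercase a token and undo common leetspeak substitutions."""
    return token.lower().translate(LEET_MAP)


def find_violations(text: str, bad_words: Iterable[str] = BAD_WORDS) -> List[str]:
    """Return the original tokens whose normalized form is on the word list."""
    wanted = set(bad_words)
    tokens = re.findall(r"[\w@$]+", text)
    return [tok for tok in tokens if normalize(tok) in wanted]


print(find_violations("Well h3ck, that is a D4RN shame"))  # ['h3ck', 'D4RN']
```

Returning the original tokens rather than the normalized ones keeps the evidence the convener sees faithful to what was actually said.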

Technology Stack

  • Profanity Detection: better-profanity (keyword-based with leetspeak)
  • Word Lists: Merged comprehensive dataset from multiple sources
    • LDNOOBW (List of Dirty, Naughty, Obscene, and Otherwise Bad Words) - CC-BY 4.0
    • Wash Your Mouth Out With Soap - MIT License
    • 58 languages supported, with 6,936 additional words from Wash Your Mouth Out With Soap
  • OFP Protocol: Custom Python implementation following v1.0.0 specs
  • Web Interface: Gradio 5.49.1
  • Background Service: APScheduler

Architecture

┌─────────────┐
│   User      │ sends utterance
└──────┬──────┘
       │
       ▼
┌─────────────────────┐
│   Convener          │ broadcasts to floor
└──────┬──────────────┘
       │
       ├────────────────┬──────────────►
       ▼                ▼
┌─────────────┐  ┌─────────────┐
│  Assistant  │  │  Sentinel   │ monitors silently
└─────────────┘  └──────┬──────┘
                        │ detects profanity
                        │ sends PRIVATE alert
                        └──────────────────►
                 ┌─────────────────────┐
                 │   Convener          │ takes action
                 └─────────────────────┘

Project Structure

ofp-badword-sentinel/
├── README.md                 # This file
├── app.py                    # Gradio dashboard entry point
├── requirements.txt          # Python dependencies
├── src/
│   ├── __init__.py
│   ├── models.py             # OFP data structures
│   ├── ofp_client.py         # OFP envelope handling
│   ├── profanity_detector.py # Bad word detection logic
│   └── sentinel.py           # Core sentinel monitoring
├── config/
│   ├── config.yaml           # Sentinel configuration
│   └── wordlist.txt          # Custom bad words (optional)
└── tests/
    ├── test_profanity.py
    ├── test_ofp_client.py
    └── test_sentinel.py

Configuration

Edit config/config.yaml to customize:

sentinel:
  speaker_uri: 'tag:your-domain.com,2025:sentinel-01'
  service_url: 'https://your-sentinel-endpoint.com/ofp'
  convener_uri: 'tag:convener-domain.com,2025:convener'
  convener_url: 'https://convener-endpoint.com/ofp'

profanity:
  use_default: true
  custom_wordlist: 'config/wordlist.txt'
  whitelist:
    - scunthorpe
    - arsenal

monitoring:
  check_interval: 30
  auto_start: true
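
As a rough sketch of how these settings might be validated once the YAML has been parsed into a dictionary (for example with a YAML loader), the snippet below maps the three sections onto one dataclass. `SentinelSettings` and `settings_from_dict` are hypothetical names, and the defaults simply mirror the example values above; the project's own loader may behave differently.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

REQUIRED = ("speaker_uri", "service_url", "convener_uri", "convener_url")


@dataclass
class SentinelSettings:
    speaker_uri: str
    service_url: str
    convener_uri: str
    convener_url: str
    use_default: bool = True
    custom_wordlist: str = "config/wordlist.txt"
    whitelist: List[str] = field(default_factory=list)
    check_interval: int = 30  # seconds
    auto_start: bool = True


def settings_from_dict(raw: Dict[str, Any]) -> SentinelSettings:
    """Validate an already-parsed config mapping and flatten it."""
    sentinel = raw.get("sentinel", {})
    missing = [key for key in REQUIRED if not sentinel.get(key)]
    if missing:
        raise ValueError("missing sentinel settings: " + ", ".join(missing))
    profanity = raw.get("profanity", {})
    monitoring = raw.get("monitoring", {})
    interval = int(monitoring.get("check_interval", 30))
    if interval <= 0:
        raise ValueError("monitoring.check_interval must be positive")
    return SentinelSettings(
        **{key: sentinel[key] for key in REQUIRED},
        use_default=bool(profanity.get("use_default", True)),
        custom_wordlist=profanity.get("custom_wordlist", "config/wordlist.txt"),
        whitelist=[word.lower() for word in profanity.get("whitelist", [])],
        check_interval=interval,
        auto_start=bool(monitoring.get("auto_start", True)),
    )


example = {
    "sentinel": {
        "speaker_uri": "tag:your-domain.com,2025:sentinel-01",
        "service_url": "https://your-sentinel-endpoint.com/ofp",
        "convener_uri": "tag:convener-domain.com,2025:convener",
        "convener_url": "https://convener-endpoint.com/ofp",
    },
    "profanity": {"whitelist": ["Scunthorpe", "arsenal"]},
    "monitoring": {"check_interval": 30},
}
print(settings_from_dict(example))
```

Failing early on a missing convener URL is deliberate in this sketch: without it, alerts would have nowhere to go.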

Local Setup

Prerequisites

  • Python 3.10 or higher (required by Gradio 5.x)
  • pip package manager

Installation

# Clone repository
git clone https://github.com/your-username/OFPBadWord.git
cd OFPBadWord

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Running Locally

# Standard launch
python app.py

# Development mode (auto-reload)
gradio app.py

# With public URL (temporary sharing)
gradio app.py --share

Access the dashboard at: http://localhost:7860

Running Tests

# Run all tests
python -m pytest tests/

# Run specific test file
python -m pytest tests/test_profanity.py

# Run with coverage
python -m pytest --cov=src tests/

Deployment to Hugging Face Spaces

Method 1: Web Interface

  1. Go to https://huggingface.co/new-space
  2. Name your Space: OFPBadWord
  3. Select SDK: Gradio
  4. Select License: apache-2.0
  5. Create Space
  6. Clone repository:
    git clone https://huggingface.co/spaces/YOUR_USERNAME/OFPBadWord
    cd OFPBadWord
    
  7. Copy all project files into the cloned directory
  8. Commit and push:
    git add .
    git commit -m "Initial deployment"
    git push
    
  9. Wait for automatic build (check Logs tab)
  10. Access your Space at: https://huggingface.co/spaces/YOUR_USERNAME/OFPBadWord

Method 2: Gradio CLI (Faster)

# From project directory
gradio deploy

# Follow prompts:
# - Log in to Hugging Face
# - Confirm Space name
# - Choose public/private

Usage

Dashboard Features

The dashboard displays:

  • Connection Status: Current monitoring state
  • Violations Detected: Total count of profanity detections
  • Alerts Sent: Number of alerts sent to convener
  • Messages Processed: Total messages analyzed
  • Activity Log: Real-time event log

Test Panel

Use the test panel to verify profanity detection:

  1. Enter text in the "Test Message" field
  2. Click "Detect" button
  3. View detection results including:
    • Whether profanity was detected
    • Severity level (low/medium/high)
    • List of violating words
    • Censored text
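
The four result fields listed above could be modeled roughly as follows. `DetectionResult`, `severity_for`, and `detect` are hypothetical names. The count-based severity thresholds and the `"none"` value for clean text are assumptions made for this sketch only, since this README does not state how severity is computed.

```python
import re
from dataclasses import dataclass
from typing import List, Set


@dataclass
class DetectionResult:
    detected: bool
    severity: str  # "none", "low", "medium", or "high" in this sketch
    words: List[str]
    censored_text: str


def severity_for(count: int) -> str:
    """Illustrative thresholds only; the real rule is not documented here."""
    if count >= 3:
        return "high"
    if count == 2:
        return "medium"
    return "low"


def detect(text: str, bad_words: Set[str]) -> DetectionResult:
    """Collect violating words and mask each one with asterisks."""
    words: List[str] = []

    def mask(match: "re.Match[str]") -> str:
        token = match.group(0)
        if token.lower() in bad_words:
            words.append(token)
            return "*" * len(token)
        return token

    censored = re.sub(r"\w+", mask, text)
    severity = severity_for(len(words)) if words else "none"
    return DetectionResult(bool(words), severity, words, censored)


print(detect("That darn printer", {"darn", "heck"}))
```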

Simulating Violations

Click "Simulate Test Violation" to:

  • Generate a mock OFP envelope with profanity
  • Process it through the sentinel
  • Generate an alert to convener
  • Update dashboard statistics

OFP Implementation

This sentinel follows Open Floor Protocol specifications:

  • Dialog Event Object v1.0.2: Structure for text utterances
  • Inter-agent Message v1.0.0: Envelope format for communication
  • Assistant Manifest v1.0.0: Sentinel identification
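
To make the envelope handling concrete, here is a hedged sketch that pulls the text out of utterance events. The field names (`openFloor`, `events`, `eventType`, `parameters.dialogEvent`, `features.text.tokens[].value`) reflect one reading of the published OFP examples and should be checked against the specification versions listed above. `iter_utterances` is a hypothetical helper, not a function from this project.

```python
from typing import Any, Dict, Iterator, Tuple


def iter_utterances(envelope: Dict[str, Any]) -> Iterator[Tuple[str, str, str]]:
    """Yield (dialog_event_id, speaker_uri, text) for each utterance event."""
    floor = envelope.get("openFloor", {})
    for event in floor.get("events", []):
        if event.get("eventType") != "utterance":
            continue  # ignore invites, floor grants, and other event types
        dialog = event.get("parameters", {}).get("dialogEvent", {})
        tokens = dialog.get("features", {}).get("text", {}).get("tokens", [])
        text = " ".join(str(tok.get("value", "")) for tok in tokens).strip()
        if text:
            yield dialog.get("id", ""), dialog.get("speakerUri", ""), text


# Assumed envelope shape, for illustration only.
sample = {
    "openFloor": {
        "schema": {"version": "1.0.0"},
        "conversation": {"id": "conv:xyz789"},
        "sender": {"speakerUri": "tag:user,2025:john"},
        "events": [
            {"eventType": "invite"},
            {
                "eventType": "utterance",
                "parameters": {
                    "dialogEvent": {
                        "id": "de:abc123",
                        "speakerUri": "tag:user,2025:john",
                        "features": {
                            "text": {
                                "mimeType": "text/plain",
                                "tokens": [{"value": "hello there"}],
                            }
                        },
                    }
                },
            },
        ],
    }
}
print(list(iter_utterances(sample)))
```

Using `.get` at every level keeps a passive observer from crashing on envelopes that carry no text feature.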

Alert Structure

When profanity is detected, the sentinel sends a private alert to the convener:

{
  "alertType": "content_violation",
  "severity": "medium",
  "violatingMessage": {
    "messageId": "de:abc123",
    "speakerUri": "tag:user,2025:john",
    "timestamp": "2025-01-01T12:00:00Z",
    "excerpt": "[censored text]"
  },
  "detectedPatterns": ["word1", "word2"],
  "violationCount": 2,
  "recommendedAction": "revoke_floor_temporary",
  "context": {
    "conversationId": "conv:xyz789",
    "totalViolations": 5,
    "detectionTime": "2025-01-01T12:00:01Z",
    "sentinelUri": "tag:sentinel,2025:monitor"
  }
}
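
A payload of this shape could be assembled along the following lines. The keys are taken from the example above; `build_alert` and `utc_stamp` are hypothetical helpers, and how the sentinel actually wraps and delivers the payload is not covered here.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


def utc_stamp(moment: Optional[datetime] = None) -> str:
    """Format a UTC timestamp such as 2025-01-01T12:00:01Z."""
    moment = moment or datetime.now(timezone.utc)
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_alert(
    *,
    message_id: str,
    speaker_uri: str,
    message_time: str,
    censored_excerpt: str,
    patterns: List[str],
    severity: str,
    recommended_action: str,
    conversation_id: str,
    total_violations: int,
    sentinel_uri: str,
    detected_at: Optional[datetime] = None,
) -> Dict[str, Any]:
    """Assemble the alert body; only censored text goes into the excerpt."""
    return {
        "alertType": "content_violation",
        "severity": severity,
        "violatingMessage": {
            "messageId": message_id,
            "speakerUri": speaker_uri,
            "timestamp": message_time,
            "excerpt": censored_excerpt,
        },
        "detectedPatterns": list(patterns),
        "violationCount": len(patterns),
        "recommendedAction": recommended_action,
        "context": {
            "conversationId": conversation_id,
            "totalViolations": total_violations,
            "detectionTime": utc_stamp(detected_at),
            "sentinelUri": sentinel_uri,
        },
    }


alert = build_alert(
    message_id="de:abc123",
    speaker_uri="tag:user,2025:john",
    message_time="2025-01-01T12:00:00Z",
    censored_excerpt="[censored text]",
    patterns=["word1", "word2"],
    severity="medium",
    recommended_action="revoke_floor_temporary",
    conversation_id="conv:xyz789",
    total_violations=5,
    sentinel_uri="tag:sentinel,2025:monitor",
    detected_at=datetime(2025, 1, 1, 12, 0, 1, tzinfo=timezone.utc),
)
print(json.dumps(alert, indent=2))
```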

Recommended Actions by Severity

  • Low: warn_user - Send warning message
  • Medium: revoke_floor_temporary - Remove speaking privileges temporarily
  • High: uninvite_user - Remove from conversation
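
The mapping above is small enough to express as a lookup table. The action strings come from the list; `RECOMMENDED_ACTIONS` and `recommended_action` are hypothetical names used for illustration.

```python
RECOMMENDED_ACTIONS = {
    "low": "warn_user",
    "medium": "revoke_floor_temporary",
    "high": "uninvite_user",
}


def recommended_action(severity: str) -> str:
    """Translate a severity level into the action suggested to the convener."""
    try:
        return RECOMMENDED_ACTIONS[severity.lower()]
    except KeyError:
        raise ValueError(f"unknown severity: {severity!r}") from None


print(recommended_action("Medium"))  # revoke_floor_temporary
```

The sentinel only recommends; as noted earlier, the convener decides whether to enforce.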

Customization

Adding Custom Bad Words

Edit config/wordlist.txt:

# Custom Bad Word List
spam
phishing
scam
inappropriate_term
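
A file in this format can be read with a few lines of Python. The sketch assumes one entry per line, that lines starting with `#` are comments (as the first line of the example suggests), and that matching is case-insensitive. `parse_wordlist` and `load_wordlist` are hypothetical helpers.

```python
from pathlib import Path
from typing import Set


def parse_wordlist(text: str) -> Set[str]:
    """Return lowercase entries, skipping blank lines and # comments."""
    words: Set[str] = set()
    for line in text.splitlines():
        entry = line.strip().lower()
        if entry and not entry.startswith("#"):
            words.add(entry)
    return words


def load_wordlist(path: str) -> Set[str]:
    """Read a word list file such as config/wordlist.txt."""
    return parse_wordlist(Path(path).read_text(encoding="utf-8"))


sample = "# Custom Bad Word List\nspam\nphishing\n\nScam\n"
print(sorted(parse_wordlist(sample)))  # ['phishing', 'scam', 'spam']
```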

Whitelisting False Positives

In config/config.yaml:

profanity:
  whitelist:
    - scunthorpe
    - arsenal
    - classic
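
The need for a whitelist is easiest to see with a detector that matches substrings. The sketch below is an illustration of that failure mode under stated assumptions, not a description of how better-profanity matches words. `substring_hits` is a hypothetical function and the one-word list is a mild stand-in.

```python
from typing import Iterable, List


def substring_hits(
    text: str, bad_words: Iterable[str], whitelist: Iterable[str] = ()
) -> List[str]:
    """Flag tokens containing a listed word, unless the token is whitelisted."""
    allowed = {word.lower() for word in whitelist}
    listed = [word.lower() for word in bad_words]
    hits: List[str] = []
    for raw in text.lower().split():
        token = raw.strip(".,!?;:\"'()")
        if token in allowed:
            continue  # known false positive, skip it
        if any(bad in token for bad in listed):
            hits.append(token)
    return hits


print(substring_hits("Arsenal won again.", ["arse"]))               # ['arsenal']
print(substring_hits("Arsenal won again.", ["arse"], ["arsenal"]))  # []
```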

Adjusting Monitoring Interval

In config/config.yaml:

monitoring:
  check_interval: 30  # seconds
  auto_start: true

Troubleshooting

Issue: Profanity not detected

Solution:

  • Verify word is in profanity list using test panel
  • Add to custom word list if needed
  • Check whitelist isn't excluding it

Issue: False positives

Solution:

  • Add words to whitelist in config.yaml
  • Common false positives: scunthorpe, arsenal, pussycat

Issue: Dashboard not updating

Solution:

  • Check background scheduler is running
  • Verify monitoring status is "Active"
  • Try manual refresh button

Issue: Alerts not sending

Solution:

  • Verify convener URL in config.yaml
  • Check network connectivity
  • Review logs for error messages

Production Deployment

Important: This is a demonstration interface. For production use:

  1. Connect to Real OFP Streams: Replace simulated monitoring with actual OFP websocket or HTTP endpoint listeners
  2. Secure Endpoints: Use HTTPS and authentication
  3. Database Storage: Store violation history for analytics
  4. Rate Limiting: Prevent alert spam
  5. Email Notifications: Alert admins of critical violations
  6. Horizontal Scaling: Deploy multiple sentinels for high-traffic conversations
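
For item 4, one simple way to keep a burst of violations from flooding the convener is a per-speaker cooldown. This is only a sketch of that idea: `AlertCooldown` is a hypothetical class, the 60-second default is arbitrary, and a production deployment would likely need shared state across sentinels.

```python
import time
from typing import Callable, Dict


class AlertCooldown:
    """Allow at most one alert per speaker within a cooldown window."""

    def __init__(
        self,
        cooldown_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.cooldown = cooldown_seconds
        self.clock = clock  # injectable so tests can control time
        self._last_sent: Dict[str, float] = {}

    def allow(self, speaker_uri: str) -> bool:
        """Return True and record the time if an alert may be sent now."""
        now = self.clock()
        last = self._last_sent.get(speaker_uri)
        if last is not None and now - last < self.cooldown:
            return False
        self._last_sent[speaker_uri] = now
        return True


limiter = AlertCooldown(cooldown_seconds=60)
print(limiter.allow("tag:user,2025:john"))  # True
print(limiter.allow("tag:user,2025:john"))  # False
```

Suppressed alerts could still be counted in `totalViolations` so the convener sees the full picture when the next alert goes out.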

Contributing

Contributions welcome! Areas for improvement:

  • Multi-language profanity detection
  • Context-aware detection to reduce false positives
  • ML-based detection as alternative to keyword matching
  • Dashboard analytics and trends
  • Integration with popular chat platforms

License

Apache 2.0 - See LICENSE file for details

Support

For issues and feature requests, please open an issue on GitHub.


Note: This sentinel is designed as a passive monitoring layer that respects user privacy by sending alerts only to conveners who have enforcement authority. It never publicly announces violations or disrupts conversation flow.