# Simplified Data Sanitization Documentation

## Overview

The simplified data sanitization module provides focused input validation and sanitization for the Recipe Recommendation Bot API. It is designed specifically for a recipe chatbot context and covers essential security protection only.

## Features

### 🛡️ **Essential Security Protection**

- **XSS Prevention**: HTML encoding and basic script removal
- **Input Validation**: Length limits and content validation
- **Whitespace Normalization**: Clean formatting

### 🔧 **Simple Configuration**

- **Maximum Message Length**: 1000 characters
- **Minimum Message Length**: 1 character
- **Single Method**: One sanitization method for all inputs

## Usage

### Basic Sanitization

```python
from utils.sanitization import sanitize_user_input

# Sanitize any user input (chat messages, demo prompts)
clean_input = sanitize_user_input("What are some chicken recipes?")
```

### Advanced Usage

```python
from utils.sanitization import DataSanitizer

# Direct class usage
sanitizer = DataSanitizer()
clean_text = sanitizer.sanitize_input("User input")
```

## Security Patterns Handled

### Basic XSS Protection

- `…`

### Valid Inputs (Accepted):

```python
"Tell me about pasta" → "Tell me about pasta"
"  How to cook rice?  " → "How to cook rice?"
"What about desserts & sweets?" → "What about desserts & sweets?"
```

### Invalid Inputs (Rejected):

```python
"" → ValueError: Input cannot be empty
"a" * 1001 → ValueError: Input too long (maximum 1000 characters)
```

## Best Practices

1. **Keep It Simple**: Focus on the actual threats a recipe chatbot faces
2. **Context Appropriate**: Don't over-engineer for non-existent threats
3. **User Friendly**: Allow normal recipe-related punctuation
4. **Clear Errors**: Provide helpful error messages
5. **Test Regularly**: Verify with real recipe queries

This simplified approach provides adequate protection while maintaining usability in a recipe recommendation chatbot context.
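
To make the documented behavior concrete, here is a minimal sketch of how a single sanitization function could combine script removal, whitespace normalization, length limits, and HTML encoding. It is not the code in `utils.sanitization`: the name `sketch_sanitize`, the regular expressions, and the order of the steps are all assumptions made for illustration. It also uses the standard library's `html.escape`, which rewrites `&` as `&amp;`; the real module may treat ampersands differently.

```python
import html
import re

# Limits taken from the "Simple Configuration" section.
MAX_MESSAGE_LENGTH = 1000
MIN_MESSAGE_LENGTH = 1

# Hypothetical patterns, chosen for this sketch only.
_SCRIPT_RE = re.compile(r"<script\b[^>]*>.*?</script\s*>", re.IGNORECASE | re.DOTALL)
_WHITESPACE_RE = re.compile(r"\s+")


def sketch_sanitize(text: str) -> str:
    """Illustrative stand-in for sanitize_user_input (assumed behavior)."""
    if not isinstance(text, str):
        raise ValueError("Input must be a string")

    # Basic script removal: drop complete <script>...</script> blocks.
    cleaned = _SCRIPT_RE.sub("", text)

    # Whitespace normalization: collapse runs and trim both ends.
    cleaned = _WHITESPACE_RE.sub(" ", cleaned).strip()

    # Length limits, with the error messages shown in the examples.
    if len(cleaned) < MIN_MESSAGE_LENGTH:
        raise ValueError("Input cannot be empty")
    if len(cleaned) > MAX_MESSAGE_LENGTH:
        raise ValueError(
            f"Input too long (maximum {MAX_MESSAGE_LENGTH} characters)"
        )

    # HTML encoding: remaining markup is rendered inert, not executed.
    # quote=False leaves ' and " alone so normal punctuation stays readable.
    return html.escape(cleaned, quote=False)


if __name__ == "__main__":
    for sample in ["  How to cook rice?  ", "<b>bold</b> pasta", ""]:
        try:
            print(repr(sketch_sanitize(sample)))
        except ValueError as exc:
            print(f"rejected: {exc}")
```

Checking the length after cleanup, as done here, means a message made only of whitespace is rejected as empty. Checking the raw input instead would be an equally defensible choice.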