Spaces:

bobbyni819
/

HickeyLabSocialMedia

Sleeping

Extracts token counts from every Gemini API response
Calculates costs based on Gemini 2.5 Flash pricing ($0.075 per 1M input tokens, $0.30 per 1M output tokens)
Logs all usage to logs/usage.jsonl with timestamps
Tracks daily and monthly usage statistics
Enforces budget caps (blocks service when exceeded)
Generates usage reports

How to use it:

Set budget limits in config.py:
- DAILY_QUERY_LIMIT: Maximum queries per day (default: 200)
- MONTHLY_BUDGET_USD: Monthly budget cap (default: $50)
- DAILY_BUDGET_WARNING: Warning threshold (default: $5)
View usage stats in the sidebar by checking "📊 Show Usage Stats"

Generate reports manually:

from utils.cost_tracker import CostTracker
tracker = CostTracker()
print(tracker.generate_daily_report())
print(tracker.generate_monthly_report(2024, 12))

2. Rate Limiting System (`utils/rate_limiter.py`)

Prevents abuse through configurable rate limits.

What it does:

Tracks queries per session using sliding time windows
Enforces hourly limits (default: 20 queries per hour)
Enforces daily limits (default: 200 queries per 24 hours)
Shows warnings when approaching limits (at 80% by default)
Blocks queries when limits exceeded with friendly messages
Logs rate limit violations

How to use it:

Configure limits in config.py:
- RATE_LIMIT_PER_HOUR: Queries per hour (default: 20)
- RATE_LIMIT_PER_DAY: Queries per day (default: 200)
- RATE_LIMIT_WARNING_THRESHOLD: When to warn (default: 0.8 = 80%)
Users will automatically see warnings like:
- "⚠️ You have 4 questions remaining this hour"
- "🕐 Rate limit reached! Please wait 15 minutes..."

3. Security Module (`utils/security.py`)

Validates and sanitizes user input to prevent attacks.

What it does:

Checks input length (1-2000 characters by default)
Detects prompt injection attempts ("ignore previous instructions", etc.)
Blocks suspicious patterns (script tags, template injection, etc.)
Detects excessive special characters
Logs all security violations for review

How to use it:

Configure limits in config.py:
- MAX_INPUT_LENGTH: Maximum characters (default: 2000)
- MIN_INPUT_LENGTH: Minimum characters (default: 1)
Security is automatic - invalid inputs are rejected with user-friendly messages
Review security logs in logs/security.jsonl to monitor threats

4. Alert System (`utils/alerts.py`)

Sends push notifications for critical events using ntfy.sh.

What it does:

Sends push notifications to your phone/browser via ntfy.sh (free, no signup)
Alerts for rate limit violations
Alerts for cost threshold breaches
Alerts for suspicious activity
Alerts for error spikes
Supports priority levels (min, low, default, high, urgent)

How to set it up:

Subscribe to notifications:
- Option A (Browser): Go to https://ntfy.sh/YOUR-TOPIC-NAME and click "Subscribe"
- Option B (Mobile App):
  - Install ntfy app (iOS/Android)
  - Add subscription with your topic name
Choose a SECURE topic name:
- ⚠️ IMPORTANT: Use a random, hard-to-guess name for security!
- ✅ Good: hickeylab-alerts-x9k2m7a4
- ❌ Bad: hickeylab-alerts (anyone can subscribe)
Configure the topic:
- Set in config.py: NTFY_TOPIC = "your-topic-name"
- Or set environment variable: NTFY_TOPIC=your-topic-name

Test it:

python -c "from utils.alerts import AlertSystem; AlertSystem().test_alert()"

Or:

curl -d "Test alert" ntfy.sh/your-topic-name

What you'll be notified about:

⚠️ User hits rate limit
💰 Daily/monthly cost thresholds (80%, 100%)
🔍 Suspicious activity detected
🚨 Service paused due to budget limits

✅ Phase 2: Monitoring & Quality (Medium Priority 🟡)

5. Enhanced Logging

All queries are logged with metadata for analysis.

What's logged:

Timestamp
Session ID (truncated for privacy)
Question length
Token counts (prompt, response, total)
Estimated cost
Response time
Success/failure status
Error messages (if any)

Log files:

logs/usage.jsonl - All API usage
logs/rate_limits.jsonl - Rate limit violations
logs/security.jsonl - Security violations

6. Conversation Context

Maintains context across multiple messages for better responses.

What it does:

Includes last 5 exchanges in each query (configurable)
Allows follow-up questions to reference previous messages
Example:
- User: "What is CODEX?"
- Assistant: [explains CODEX]
- User: "How does it compare to IBEX?"
- Assistant: [compares CODEX (from context) to IBEX]

How to configure:

Adjust CONVERSATION_HISTORY_LENGTH in config.py (default: 5)

7. Enhanced System Prompt

Improved instructions for better response quality.

What's improved:

Conversation context awareness
Response structure guidelines (2-4 paragraphs for complex topics)
Specific citation instructions
Technical term explanation requirements
Grounding in knowledge base (no hallucinations)

✅ Phase 3: User Experience (Low Priority 🟢)

8. Suggested Questions

Shows starter questions when chat is empty.

What it does:

Displays 4 suggested questions as clickable buttons
Questions are configured in config.py
Helps new users get started

How to customize:

Edit SUGGESTED_QUESTIONS in config.py

9. Privacy Notice

Displays privacy and usage information.

What it shows:

Data processing information
Usage limits
Privacy policy

How to customize:

Edit PRIVACY_NOTICE in config.py

10. Usage Statistics Dashboard

Shows real-time usage stats in sidebar.

What it shows:

Today's query count and cost
This month's query count and cost
Optional display (checkbox in sidebar)

11. Mobile Responsive Design

Improved CSS for mobile devices.

What's improved:

Touch-friendly button sizes (44px minimum)
Appropriate font sizes
No iOS zoom on input focus
Responsive layout

🚀 Deployment Instructions

For HuggingFace Spaces:

Set up secrets:
- Go to Space Settings → Variables and secrets
- Add GEMINI_API_KEY as a Secret
- (Optional) Add NTFY_TOPIC for notifications
Upload files:
- Upload the entire outreach/pipelines/gemini_file_search/ directory
- Ensure all files are included:
  - app.py
  - config.py
  - requirements.txt
  - utils/ directory with all modules
The app will automatically:
- Install dependencies from requirements.txt
- Start the Streamlit app
- Create logs/ directory when first query is made

Environment Variables:

Variable	Required	Description
`GEMINI_API_KEY`	✅ Yes	Your Google Gemini API key
`NTFY_TOPIC`	❌ Optional	Your ntfy.sh topic for push notifications

First-Time Setup:

Test the app with a few queries
Subscribe to notifications if you set up ntfy.sh
Check logs in logs/ directory (if accessible)
Adjust limits in config.py if needed

📊 Monitoring & Maintenance

Daily Tasks:

Check usage stats in the sidebar
Watch for notification alerts on your phone/browser

Weekly Tasks:

Review logs/usage.jsonl for usage patterns
Check logs/security.jsonl for any threats
Adjust rate limits if needed

Monthly Tasks:

Generate monthly cost report
Review budget and adjust if needed
Update system prompt based on user feedback

Generating Reports:

from utils.cost_tracker import CostTracker

tracker = CostTracker()

# Daily report
print(tracker.generate_daily_report())

# Monthly report
print(tracker.generate_monthly_report(2024, 12))

# Custom date
from datetime import datetime
print(tracker.generate_daily_report(datetime(2024, 12, 15)))

⚙️ Configuration Reference

All configuration is in config.py. Key settings:

Cost Management:

DAILY_QUERY_LIMIT = 200           # Max queries per day
MONTHLY_BUDGET_USD = 50.0         # Hard budget cap
DAILY_BUDGET_WARNING = 5.0        # Alert threshold

Rate Limiting:

RATE_LIMIT_PER_HOUR = 20          # Queries per hour
RATE_LIMIT_PER_DAY = 200          # Queries per 24 hours
RATE_LIMIT_WARNING_THRESHOLD = 0.8  # Warn at 80%

Security:

MAX_INPUT_LENGTH = 2000           # Max characters
MIN_INPUT_LENGTH = 1              # Min characters

Alerts:

NTFY_TOPIC = ""                   # Your ntfy.sh topic
ALERTS_ENABLED = True             # Enable/disable

Response Quality:

CONVERSATION_HISTORY_LENGTH = 5   # Messages of context
ENHANCED_SYSTEM_PROMPT = "..."   # Full prompt in file

UI/UX:

SUGGESTED_QUESTIONS = [...]       # Starter questions
PRIVACY_NOTICE = "..."           # Privacy text

🔧 Troubleshooting

Logs not being created:

Check file permissions
Ensure logs/ directory is not in .gitignore for deployment
HuggingFace Spaces may not persist logs across restarts

Notifications not working:

Verify NTFY_TOPIC is set correctly
Test with: curl -d "test" ntfy.sh/your-topic
Check you're subscribed to the right topic
Ensure ALERTS_ENABLED = True in config

Rate limits too strict/lenient:

Adjust RATE_LIMIT_PER_HOUR and RATE_LIMIT_PER_DAY in config.py
Changes take effect on app restart

Budget exceeded too quickly:

Review logs/usage.jsonl for unusual activity
Check if there's an attack (many rapid queries)
Adjust MONTHLY_BUDGET_USD if legitimate traffic

Conversation context not working:

Verify CONVERSATION_HISTORY_LENGTH > 0
Check that messages are being stored in st.session_state.messages

📚 Additional Resources

Gemini API Pricing: https://ai.google.dev/pricing
ntfy.sh Documentation: https://ntfy.sh
HuggingFace Spaces: https://huggingface.co/docs/hub/spaces
Streamlit Documentation: https://docs.streamlit.io

🎯 What You Need to Do

Required:

✅ Deploy the updated code to HuggingFace Spaces
✅ Set GEMINI_API_KEY secret in HuggingFace
✅ Test with a few queries to verify it works

Optional but Recommended:

📱 Set up ntfy.sh notifications:
- Pick a random topic name
- Subscribe on your phone/browser
- Set NTFY_TOPIC in HuggingFace secrets
- Test it works
⚙️ Adjust configuration in config.py:
- Set appropriate rate limits
- Set monthly budget
- Customize suggested questions
📊 Monitor usage:
- Check sidebar stats regularly
- Watch for notification alerts
- Review logs if accessible

📞 Support

If you encounter any issues:

Check the troubleshooting section above
Review the logs (if accessible)
Check HuggingFace Spaces logs for errors
Verify environment variables are set correctly

That's it! All the production-ready features from the roadmap have been implemented. The system is now protected against cost overruns, abuse, and security threats, with monitoring and alerting in place.

Production Features Implementation Guide

📦 What Has Been Implemented

✅ Phase 1: Foundation - Cost & Security Controls (High Priority 🔴)

1. Cost Management Module (utils/cost_tracker.py)

2. Rate Limiting System (utils/rate_limiter.py)

3. Security Module (utils/security.py)

4. Alert System (utils/alerts.py)

✅ Phase 2: Monitoring & Quality (Medium Priority 🟡)

5. Enhanced Logging

6. Conversation Context

7. Enhanced System Prompt

✅ Phase 3: User Experience (Low Priority 🟢)

8. Suggested Questions

9. Privacy Notice

10. Usage Statistics Dashboard

11. Mobile Responsive Design

🚀 Deployment Instructions

For HuggingFace Spaces:

Environment Variables:

First-Time Setup:

📊 Monitoring & Maintenance

Daily Tasks:

Weekly Tasks:

Monthly Tasks:

Generating Reports:

⚙️ Configuration Reference

Cost Management:

Rate Limiting:

Security:

Alerts:

Response Quality:

UI/UX:

🔧 Troubleshooting

Logs not being created:

Notifications not working:

Rate limits too strict/lenient:

Budget exceeded too quickly:

Conversation context not working:

📚 Additional Resources

🎯 What You Need to Do

Required:

Optional but Recommended:

📞 Support

1. Cost Management Module (`utils/cost_tracker.py`)

2. Rate Limiting System (`utils/rate_limiter.py`)

3. Security Module (`utils/security.py`)

4. Alert System (`utils/alerts.py`)