Spaces:

arjitmat
/

crypto-compliance-agent

Sleeping

App Files Files Community

arjitmat commited on Oct 23, 2025

Commit

13e7acd

verified ·

1 Parent(s): 59b84e5

Upload 72 files

Browse files

This view is limited to 50 files because it contains too many changes. See raw diff

Files changed (50) hide show

.gitattributes +1 -0
README.md +315 -8
app.py +17 -0
app/__init__.py +0 -0
app/__pycache__/__init__.cpython-313.pyc +0 -0
app/__pycache__/streamlit_app.cpython-313.pyc +0 -0
app/components/__init__.py +15 -0
app/components/__pycache__/__init__.cpython-313.pyc +0 -0
app/components/__pycache__/input_form.cpython-313.pyc +0 -0
app/components/__pycache__/monitoring_setup.cpython-313.pyc +0 -0
app/components/__pycache__/results_display.cpython-313.pyc +0 -0
app/components/input_form.py +223 -0
app/components/monitoring_setup.py +247 -0
app/components/results_display.py +374 -0
app/streamlit_app.py +208 -0
chroma_db/c30b23e7-6559-433a-b87d-7290f0978053/data_level0.bin +3 -0
chroma_db/c30b23e7-6559-433a-b87d-7290f0978053/header.bin +3 -0
chroma_db/c30b23e7-6559-433a-b87d-7290f0978053/length.bin +3 -0
chroma_db/c30b23e7-6559-433a-b87d-7290f0978053/link_lists.bin +3 -0
chroma_db/chroma.sqlite3 +3 -0
config.toml +15 -0
requirements.txt +32 -0
src/__init__.py +0 -0
src/__pycache__/__init__.cpython-313.pyc +0 -0
src/__pycache__/config.cpython-313.pyc +0 -0
src/agents/__init__.py +0 -0
src/agents/__pycache__/__init__.cpython-313.pyc +0 -0
src/agents/__pycache__/orchestrator.cpython-313.pyc +0 -0
src/agents/__pycache__/tools.cpython-313.pyc +0 -0
src/agents/orchestrator.py +621 -0
src/agents/tools.py +419 -0
src/analysis/__init__.py +0 -0
src/analysis/__pycache__/__init__.cpython-313.pyc +0 -0
src/analysis/__pycache__/compliance_engine.cpython-313.pyc +0 -0
src/analysis/__pycache__/cost_calculator.cpython-313.pyc +0 -0
src/analysis/__pycache__/risk_scorer.cpython-313.pyc +0 -0
src/analysis/__pycache__/token_classifier.cpython-313.pyc +0 -0
src/analysis/compliance_engine.py +486 -0
src/analysis/cost_calculator.py +544 -0
src/analysis/risk_scorer.py +454 -0
src/analysis/token_classifier.py +487 -0
src/config.py +70 -0
src/data/__init__.py +0 -0
src/data/__pycache__/__init__.cpython-313.pyc +0 -0
src/data/__pycache__/vectordb.cpython-313.pyc +0 -0
src/data/regulations/eu/mica-asset-referenced-tokens-2024.json +266 -0
src/data/regulations/schema.json +178 -0
src/data/regulations/singapore/mas-cmp-real-estate-tokens-2024.json +275 -0
src/data/regulations/uae/vara-sto-real-estate-2024.json +229 -0
src/data/regulations/uk/fca-cis-property-tokens-2024.json +282 -0

.gitattributes CHANGED Viewed

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text

 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+chroma_db/chroma.sqlite3 filter=lfs diff=lfs merge=lfs -text

README.md CHANGED Viewed

@@ -1,12 +1,319 @@
 ---
-title: Crypto Compliance Agent
-emoji: 📉
-colorFrom: yellow
-colorTo: green
-sdk: gradio
-sdk_version: 5.49.1
-app_file: app.py
 pinned: false
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

 ---
+title: Crypto Compliance Intelligence Agent
+emoji: 🔒
+colorFrom: blue
+colorTo: indigo
+sdk: streamlit
+sdk_version: 1.29.0
+app_file: app/streamlit_app.py
 pinned: false
+license: mit
 ---
+# Crypto Compliance Intelligence Agent
+> Multi-jurisdiction crypto regulatory compliance analysis powered by AI agents and RAG
+---
+## Overview
+An AI-powered system that helps crypto businesses navigate complex regulatory requirements across multiple jurisdictions. Get instant compliance analysis, risk scoring, and cost estimates in under 60 seconds.
+**Supported Jurisdictions**: US (SEC + State MTLs) | EU (MiCA) | Singapore (MAS) | UK (FCA) | UAE (VARA)
+---
+## Features
+- **Multi-Jurisdiction Analysis**: Comprehensive compliance mapping across US, EU, Singapore, and UK
+- **AI Agent Architecture**: LangGraph-powered agent with tool use and reasoning
+- **RAG-Based Retrieval**: Semantic search over regulatory documents using ChromaDB
+- **Token Classification**: Automated Howey Test analysis for security determination
+- **Risk Scoring**: 0-100 weighted risk assessment based on gaps and severity
+- **Cost Estimation**: First-year and ongoing compliance cost breakdowns
+- **Regulatory Monitoring**: Automated scraping of SEC, MiCA, MAS, and FCA sources
+- **Explainable AI**: Full agent reasoning chain visible in results
+---
+## Tech Stack
+| Component | Technology |
+|-----------|------------|
+| **LLM** | Google Gemini Flash 2.5 (2M context window) |
+| **Agent Framework** | LangGraph (state machines + tool use) |
+| **Vector Database** | ChromaDB (local persistence) |
+| **Embeddings** | sentence-transformers/all-MiniLM-L6-v2 |
+| **NLP Models** | FinBERT, Legal-BERT |
+| **Web Framework** | Streamlit |
+| **Data Processing** | pandas, pdfplumber, BeautifulSoup4 |
+| **Deployment** | HuggingFace Spaces |
+---
+## Quick Start
+### Prerequisites
+- Python 3.11+ (tested on 3.13.5)
+- Google Gemini API key ([Get it here](https://makersuite.google.com/app/apikey))
+- HuggingFace token (optional, for model downloads)
+### Installation
+```bash
+# Clone repository
+git clone https://github.com/yourusername/crypto-compliance-agent.git
+cd crypto-compliance-agent
+# Create virtual environment
+python -m venv venv
+source venv/bin/activate  # Windows: venv\Scripts\activate
+# Install dependencies
+pip install -r requirements.txt
+# Configure environment
+cp .env.example .env
+# Edit .env and add your GEMINI_API_KEY
+```
+### Initialize Database
+```bash
+# Populate ChromaDB with regulatory data
+python scripts/setup_vectordb.py
+```
+### Run Locally
+```bash
+streamlit run app/streamlit_app.py
+```
+Navigate to `http://localhost:8501`
+---
+## Project Structure
+```
+crypto-compliance-agent/
+├── src/
+│   ├── config.py                # Configuration management
+│   ├── models/                  # LLM and embeddings
+│   │   ├── llm.py              # Gemini integration
+│   │   └── embeddings.py       # Sentence transformers
+│   ├── agents/                  # LangGraph agents
+│   │   ├── orchestrator.py     # Main agent coordinator
+│   │   ├── tools.py            # Agent tools (search, calculate, etc.)
+│   │   ├── classifier.py       # Activity/token classification
+│   │   └── analyzer.py         # Compliance analysis
+│   ├── data/                    # Data layer
+│   │   ├── vectordb.py         # ChromaDB interface
+│   │   ├── scrapers/           # Regulatory source scrapers
+│   │   │   ├── sec.py          # US SEC
+│   │   │   ├── mica.py         # EU MiCA
+│   │   │   ├── mas.py          # Singapore MAS
+│   │   │   └── fca.py          # UK FCA
+│   │   └── regulations/        # Regulatory knowledge base
+│   │       ├── us/
+│   │       ├── eu/
+│   │       ├── singapore/
+│   │       └── uk/
+│   ├── processors/              # Document processing
+│   │   ├── document_parser.py  # PDF/text extraction
+│   │   └── entity_extraction.py # FinBERT NER
+│   ├── analysis/                # Analysis engines
+│   │   ├── compliance_engine.py # Rule matching
+│   │   ├── risk_scorer.py      # Risk calculation
+│   │   ├── token_classifier.py # Howey Test
+│   │   └── cost_calculator.py  # Cost estimation
+│   └── utils/                   # Utilities
+├── app/
+│   ├── streamlit_app.py        # Main UI entry point
+│   └── components/              # UI components
+│       ├── input_form.py
+│       ├── results_display.py
+│       └── monitoring_setup.py
+├── docs/
+│   ├── system_index.md         # Technical documentation
+│   ├── navigation_guide.md     # Non-technical guide
+│   └── project_explanation.md  # Detailed explanation
+├── context/
+│   ├── development_log.md      # Project history
+│   ├── resume_context.md       # Session recovery (not committed)
+│   └── portfolio_doc.md        # Career documentation (not committed)
+├── tests/                       # Unit tests
+├── scripts/                     # Utility scripts
+│   ├── setup_vectordb.py
+│   ├── update_regulations.py
+│   └── test_agent.py
+└── requirements.txt
+```
+---
+## Usage Example
+### Input
+```python
+{
+  "jurisdictions": ["United States", "European Union"],
+  "activities": ["NFT marketplace", "Crypto payment processor"],
+  "description": "Users buy and sell NFTs using credit cards and crypto. We take a 2% fee.",
+  "token_info": None
+}
+```
+### Output
+```
+Risk Score: 65/100 (Medium-High)
+Compliance Gaps: 4
+Estimated First-Year Cost: $250,000 - $550,000
+US Compliance:
+❌ FinCEN MSB Registration Required ($5k-$10k)
+❌ State MTL Licenses (15 states targeted: $750k-$2.25M)
+✅ No SEC registration (NFTs not securities)
+EU Compliance:
+❌ MiCA CASP Authorization (€100k-€300k)
+❌ AML/KYC Compliance (€50k/year ongoing)
+Recommendations:
+1. File FinCEN MSB within 180 days
+2. Prioritize MTL applications for high-volume states (CA, NY, TX)
+3. Engage MiCA legal counsel in EU member state of choice
+4. Implement KYC/AML procedures before EU launch
+```
+---
+## Documentation
+- **[System Index](docs/system_index.md)**: Technical architecture and API reference
+- **[Navigation Guide](docs/navigation_guide.md)**: Non-technical codebase guide
+- **[Project Explanation](docs/project_explanation.md)**: Detailed project overview
+---
+## Development
+### Running Tests
+```bash
+pytest tests/
+```
+### Code Quality
+```bash
+# Format code
+black src/ app/ tests/
+# Lint
+flake8 src/ app/ tests/
+```
+### Update Regulations
+```bash
+# Run weekly to fetch latest regulations
+python scripts/update_regulations.py
+```
+---
+## Deployment
+### HuggingFace Spaces
+1. Create new Space: [https://huggingface.co/new-space](https://huggingface.co/new-space)
+2. Select **Streamlit** SDK
+3. Link GitHub repository
+4. Add secrets in Space settings:
+   - `GEMINI_API_KEY`
+   - `HF_TOKEN` (optional)
+5. Deploy
+### Environment Variables
+Required:
+- `GEMINI_API_KEY`: Google Gemini API key
+Optional:
+- `HF_TOKEN`: HuggingFace token
+- `GITHUB_TOKEN`: GitHub access (for scrapers)
+- `LOG_LEVEL`: Logging level (default: INFO)
+---
+## Important Disclaimers
+### ⚠️ NOT LEGAL ADVICE
+This tool provides **general information only** and does NOT constitute legal, financial, or regulatory advice.
+- **No Warranties**: Accuracy not guaranteed. Regulations change frequently.
+- **No Liability**: Users assume all risk. Creators not liable for compliance failures, fines, or legal issues.
+- **Consult Lawyers**: Always consult qualified legal counsel before making compliance decisions.
+Use at your own risk.
+---
+## Roadmap
+- [x] Phase 1: Foundation setup
+- [x] Phase 2: Regulatory data layer (5 jurisdictions, ChromaDB)
+- [x] Phase 3: NLP & entity extraction (Howey Test, entity recognition)
+- [x] Phase 4: Agent architecture (LangGraph, 6 nodes, 6 tools)
+- [x] Phase 5: Analysis & scoring engine (ComplianceEngine, RiskScorer, CostCalculator)
+- [x] Phase 6: Streamlit UI (3-tab interface, Plotly visualizations)
+- [x] Phase 7: Testing & optimization (Integration tests, performance validation)
+- [ ] Phase 8: Deployment to HuggingFace Spaces (IN PROGRESS)
+See `context/development_log.md` for detailed progress.
+---
+## Contributing
+Contributions welcome! Please:
+1. Fork the repository
+2. Create a feature branch (`git checkout -b feature/amazing-feature`)
+3. Commit changes (`git commit -m 'Add amazing feature'`)
+4. Push to branch (`git push origin feature/amazing-feature`)
+5. Open a Pull Request
+---
+## License
+[To be determined]
+---
+## Contact
+- **Issues**: [GitHub Issues](https://github.com/yourusername/crypto-compliance-agent/issues)
+- **Documentation**: `docs/` folder
+---
+## Acknowledgments
+- **Google Gemini**: LLM provider
+- **LangChain/LangGraph**: Agent framework
+- **ChromaDB**: Vector database
+- **HuggingFace**: Model hosting and deployment
+- **Regulatory Sources**: SEC, EU Official Journal, MAS, FCA
+---
+**Built with AI assistance** (Claude Code) | **Status**: Phases 1-7 Complete - Ready for Deployment | **Last Updated**: 2025-10-22

app.py ADDED Viewed

	@@ -0,0 +1,17 @@

+"""
+Entry point for HuggingFace Spaces deployment.
+Redirects to the main Streamlit app.
+"""
+import subprocess
+import sys
+if __name__ == "__main__":
+    # Run the Streamlit app
+    subprocess.run([
+        sys.executable, "-m", "streamlit", "run",
+        "app/streamlit_app.py",
+        "--server.headless=true",
+        "--server.port=7860",
+        "--server.enableCORS=false"
+    ])

app/__init__.py ADDED Viewed

File without changes

app/__pycache__/__init__.cpython-313.pyc ADDED Viewed

Binary file (191 Bytes). View file

app/__pycache__/streamlit_app.cpython-313.pyc ADDED Viewed

Binary file (8.81 kB). View file

app/components/__init__.py ADDED Viewed

	@@ -0,0 +1,15 @@

+"""
+UI Components for Streamlit app
+"""
+from app.components.input_form import render_input_form, get_jurisdiction_display_name, get_activity_display_name
+from app.components.results_display import render_results
+from app.components.monitoring_setup import render_monitoring_setup
+__all__ = [
+    'render_input_form',
+    'render_results',
+    'render_monitoring_setup',
+    'get_jurisdiction_display_name',
+    'get_activity_display_name'
+]

app/components/__pycache__/__init__.cpython-313.pyc ADDED Viewed

Binary file (608 Bytes). View file

app/components/__pycache__/input_form.cpython-313.pyc ADDED Viewed

Binary file (9.75 kB). View file

app/components/__pycache__/monitoring_setup.cpython-313.pyc ADDED Viewed

Binary file (10.4 kB). View file

app/components/__pycache__/results_display.cpython-313.pyc ADDED Viewed

Binary file (17.3 kB). View file

app/components/input_form.py ADDED Viewed

	@@ -0,0 +1,223 @@

+"""
+Input Form Component
+Renders the user input form for compliance analysis.
+"""
+import streamlit as st
+from typing import Dict, Optional
+def render_input_form() -> Optional[Dict]:
+    """
+    Render input form for compliance analysis.
+    Returns:
+        Dictionary with form data if submitted, None otherwise
+    """
+    with st.form("compliance_form"):
+        st.markdown("### Business Information")
+        # Jurisdiction selection
+        st.markdown("#### 1. Select Target Jurisdictions")
+        st.caption("Choose all jurisdictions where you plan to operate")
+        col1, col2 = st.columns(2)
+        with col1:
+            us_selected = st.checkbox("🇺🇸 United States (SEC)", value=False)
+            eu_selected = st.checkbox("🇪🇺 European Union (MiCA)", value=False)
+            singapore_selected = st.checkbox("🇸🇬 Singapore (MAS)", value=False)
+        with col2:
+            uk_selected = st.checkbox("🇬🇧 United Kingdom (FCA)", value=False)
+            uae_selected = st.checkbox("🇦🇪 UAE (VARA)", value=False)
+        # Build jurisdictions list
+        jurisdictions = []
+        if us_selected:
+            jurisdictions.append('us')
+        if eu_selected:
+            jurisdictions.append('eu')
+        if singapore_selected:
+            jurisdictions.append('singapore')
+        if uk_selected:
+            jurisdictions.append('uk')
+        if uae_selected:
+            jurisdictions.append('uae')
+        st.markdown("---")
+        # Activity selection
+        st.markdown("#### 2. Select Crypto Activities")
+        st.caption("Check all activities your business will perform")
+        col1, col2, col3 = st.columns(3)
+        with col1:
+            exchange = st.checkbox("💱 Exchange/Trading Platform", value=False)
+            custody = st.checkbox("🔐 Custody Services", value=False)
+            staking = st.checkbox("📊 Staking Services", value=False)
+            lending = st.checkbox("💰 Lending", value=False)
+            borrowing = st.checkbox("📈 Borrowing", value=False)
+        with col2:
+            payment = st.checkbox("💳 Payment Processing", value=False)
+            token_issuance = st.checkbox("🪙 Token Issuance", value=False)
+            mining = st.checkbox("⛏️ Mining", value=False)
+            nft = st.checkbox("🖼️ NFT Marketplace", value=False)
+        with col3:
+            defi = st.checkbox("🔄 DeFi Protocol", value=False)
+            derivatives = st.checkbox("📉 Derivatives", value=False)
+            otc = st.checkbox("🤝 OTC Trading", value=False)
+            wallet = st.checkbox("👛 Wallet Services", value=False)
+        # Build activities list
+        activities = []
+        if exchange:
+            activities.append('exchange')
+        if custody:
+            activities.append('custody')
+        if staking:
+            activities.append('staking')
+        if lending:
+            activities.append('lending')
+        if borrowing:
+            activities.append('borrowing')
+        if payment:
+            activities.append('payment_processing')
+        if token_issuance:
+            activities.append('token_issuance')
+        if mining:
+            activities.append('mining')
+        if nft:
+            activities.append('nft_marketplace')
+        if defi:
+            activities.append('defi_protocol')
+        if derivatives:
+            activities.append('derivatives')
+        if otc:
+            activities.append('otc_trading')
+        if wallet:
+            activities.append('wallet_services')
+        st.markdown("---")
+        # Business description
+        st.markdown("#### 3. Describe Your Business")
+        st.caption("Provide a detailed description of your business model, target users, and how your platform works")
+        business_description = st.text_area(
+            "Business Description",
+            height=150,
+            placeholder="Example: We're building a DeFi lending protocol where users can deposit USDC and earn 6% APY. "
+                       "Borrowers can take loans at 8% interest rate. We'll issue a governance token to early users.",
+            label_visibility="collapsed"
+        )
+        st.markdown("---")
+        # Token information (optional)
+        st.markdown("#### 4. Token Information (Optional)")
+        st.caption("If you're issuing a token, provide details for classification analysis")
+        issue_token = st.checkbox("I am issuing a token", value=False)
+        token_description = None
+        if issue_token:
+            col1, col2 = st.columns(2)
+            with col1:
+                token_name = st.text_input("Token Name", placeholder="e.g., MyToken")
+                token_symbol = st.text_input("Token Symbol", placeholder="e.g., MTK")
+            with col2:
+                token_use_case = st.selectbox(
+                    "Primary Use Case",
+                    ["Utility", "Governance", "Payment", "Security/Investment", "Hybrid"]
+                )
+            token_details = st.text_area(
+                "Token Details",
+                height=100,
+                placeholder="Describe token distribution, vesting, utility, and how holders benefit",
+                help="Include: Distribution model, vesting schedules, utility within platform, rights/benefits for holders"
+            )
+            # Build token description
+            if token_name or token_details:
+                token_description = f"Token Name: {token_name or 'N/A'}\n"
+                token_description += f"Token Symbol: {token_symbol or 'N/A'}\n"
+                token_description += f"Primary Use Case: {token_use_case}\n"
+                token_description += f"Details: {token_details or 'N/A'}"
+        st.markdown("---")
+        # Submit button
+        submitted = st.form_submit_button("🚀 Analyze Compliance", use_container_width=True)
+        if submitted:
+            # Validation
+            errors = []
+            if not jurisdictions:
+                errors.append("⚠️ Please select at least one jurisdiction")
+            if not activities:
+                errors.append("⚠️ Please select at least one crypto activity")
+            if not business_description or len(business_description) < 50:
+                errors.append("⚠️ Please provide a detailed business description (at least 50 characters)")
+            if issue_token and not token_details:
+                errors.append("⚠️ Please provide token details if you're issuing a token")
+            if errors:
+                for error in errors:
+                    st.error(error)
+                return None
+            # Return form data
+            return {
+                'jurisdictions': jurisdictions,
+                'activities': activities,
+                'business_description': business_description,
+                'token_description': token_description,
+                'issue_token': issue_token
+            }
+    return None
+def get_jurisdiction_display_name(jurisdiction: str) -> str:
+    """Get display name for jurisdiction."""
+    mapping = {
+        'us': '🇺🇸 United States',
+        'eu': '🇪🇺 European Union',
+        'singapore': '🇸🇬 Singapore',
+        'uk': '🇬🇧 United Kingdom',
+        'uae': '🇦🇪 UAE'
+    }
+    return mapping.get(jurisdiction, jurisdiction.upper())
+def get_activity_display_name(activity: str) -> str:
+    """Get display name for activity."""
+    mapping = {
+        'exchange': '💱 Exchange/Trading',
+        'custody': '🔐 Custody',
+        'staking': '📊 Staking',
+        'lending': '💰 Lending',
+        'borrowing': '📈 Borrowing',
+        'payment_processing': '💳 Payment Processing',
+        'token_issuance': '🪙 Token Issuance',
+        'mining': '⛏️ Mining',
+        'nft_marketplace': '🖼️ NFT Marketplace',
+        'defi_protocol': '🔄 DeFi Protocol',
+        'derivatives': '📉 Derivatives',
+        'otc_trading': '🤝 OTC Trading',
+        'wallet_services': '👛 Wallet Services'
+    }
+    return mapping.get(activity, activity.replace('_', ' ').title())

app/components/monitoring_setup.py ADDED Viewed

	@@ -0,0 +1,247 @@

+"""
+Monitoring Setup Component
+Allows users to set up alerts for regulatory changes.
+"""
+import streamlit as st
+from datetime import datetime
+def render_monitoring_setup():
+    """
+    Render monitoring and alert configuration interface.
+    """
+    st.markdown("""
+    Stay informed about regulatory changes that may affect your business.
+    Configure alerts to receive notifications about new regulations, deadlines, and compliance updates.
+    """)
+    st.markdown("---")
+    # Email alerts section
+    st.markdown("### 📧 Email Alerts")
+    with st.form("email_alerts_form"):
+        st.markdown("Receive email notifications for regulatory updates")
+        email = st.text_input(
+            "Email Address",
+            placeholder="your.email@company.com"
+        )
+        st.markdown("**Alert Frequency:**")
+        frequency = st.radio(
+            "How often would you like to receive alerts?",
+            ["Immediate (as updates occur)", "Daily digest", "Weekly summary"],
+            label_visibility="collapsed"
+        )
+        st.markdown("**Jurisdictions to Monitor:**")
+        col1, col2 = st.columns(2)
+        with col1:
+            monitor_us = st.checkbox("🇺🇸 United States", value=False)
+            monitor_eu = st.checkbox("🇪🇺 European Union", value=False)
+            monitor_singapore = st.checkbox("🇸🇬 Singapore", value=False)
+        with col2:
+            monitor_uk = st.checkbox("🇬🇧 United Kingdom", value=False)
+            monitor_uae = st.checkbox("🇦🇪 UAE", value=False)
+        st.markdown("**Activity Types:**")
+        activities = st.multiselect(
+            "Select activities to monitor",
+            [
+                "Exchange/Trading",
+                "Custody",
+                "Staking",
+                "Lending/Borrowing",
+                "Token Issuance",
+                "DeFi",
+                "NFTs",
+                "All Activities"
+            ],
+            label_visibility="collapsed"
+        )
+        submit_email = st.form_submit_button("🔔 Subscribe to Alerts", use_container_width=True)
+        if submit_email:
+            if email and '@' in email:
+                st.success(f"✅ Successfully subscribed {email} to regulatory alerts!")
+                st.info("📬 You'll receive a confirmation email shortly.")
+            else:
+                st.error("⚠️ Please enter a valid email address")
+    st.markdown("---")
+    # RSS Feed section
+    st.markdown("### 📰 RSS Feed")
+    st.markdown("""
+    Subscribe to our RSS feed to stay updated using your preferred feed reader.
+    The feed includes:
+    - New regulations and proposed rules
+    - Effective date changes
+    - Enforcement actions
+    - Compliance deadlines
+    """)
+    # Generate RSS feed URL (placeholder)
+    rss_url = "https://compliance-agent.example.com/feed/regulations.xml"
+    col1, col2 = st.columns([3, 1])
+    with col1:
+        st.code(rss_url, language=None)
+    with col2:
+        st.button("📋 Copy URL", use_container_width=True)
+    st.caption("Popular RSS readers: Feedly, Inoreader, NewsBlur, RSS Reader")
+    st.markdown("---")
+    # Webhook section
+    st.markdown("### 🔗 Webhook Integration")
+    st.markdown("""
+    Set up webhooks to receive real-time regulatory updates in your own systems.
+    Ideal for enterprise integrations and custom alerting workflows.
+    """)
+    with st.form("webhook_form"):
+        webhook_url = st.text_input(
+            "Webhook URL",
+            placeholder="https://your-domain.com/api/webhooks/compliance"
+        )
+        webhook_secret = st.text_input(
+            "Secret Key (optional)",
+            type="password",
+            placeholder="Used to verify webhook authenticity"
+        )
+        st.markdown("**Trigger Events:**")
+        col1, col2 = st.columns(2)
+        with col1:
+            trigger_new = st.checkbox("New regulations published", value=True)
+            trigger_updated = st.checkbox("Regulations updated", value=True)
+        with col2:
+            trigger_deadline = st.checkbox("Upcoming deadlines (7 days)", value=True)
+            trigger_enforcement = st.checkbox("Enforcement actions", value=False)
+        submit_webhook = st.form_submit_button("🔌 Configure Webhook", use_container_width=True)
+        if submit_webhook:
+            if webhook_url and webhook_url.startswith('http'):
+                st.success("✅ Webhook configured successfully!")
+                st.json({
+                    "url": webhook_url,
+                    "events": [
+                        e for e, enabled in [
+                            ("regulation.new", trigger_new),
+                            ("regulation.updated", trigger_updated),
+                            ("deadline.upcoming", trigger_deadline),
+                            ("enforcement.action", trigger_enforcement)
+                        ] if enabled
+                    ],
+                    "created_at": datetime.now().isoformat()
+                })
+            else:
+                st.error("⚠️ Please enter a valid HTTPS webhook URL")
+    st.markdown("---")
+    # Recent updates section
+    st.markdown("### 📊 Recent Regulatory Updates")
+    st.markdown("View the latest regulatory changes across all jurisdictions:")
+    # Mock recent updates (in production, this would query the database)
+    recent_updates = [
+        {
+            "date": "2025-10-20",
+            "jurisdiction": "🇪🇺 EU",
+            "title": "MiCA: Final technical standards published",
+            "type": "New Regulation",
+            "impact": "High"
+        },
+        {
+            "date": "2025-10-18",
+            "jurisdiction": "🇺🇸 US",
+            "title": "SEC updates custody rule guidance",
+            "type": "Guidance",
+            "impact": "Medium"
+        },
+        {
+            "date": "2025-10-15",
+            "jurisdiction": "🇸🇬 Singapore",
+            "title": "MAS clarifies staking service requirements",
+            "type": "Clarification",
+            "impact": "Medium"
+        },
+        {
+            "date": "2025-10-12",
+            "jurisdiction": "🇬🇧 UK",
+            "title": "FCA financial promotions regime deadline extended",
+            "type": "Deadline Change",
+            "impact": "Low"
+        },
+        {
+            "date": "2025-10-10",
+            "jurisdiction": "🇦🇪 UAE",
+            "title": "VARA issues marketing and promotions rulebook",
+            "type": "New Regulation",
+            "impact": "High"
+        }
+    ]
+    for update in recent_updates:
+        with st.container():
+            col1, col2, col3, col4 = st.columns([1, 3, 2, 1])
+            with col1:
+                st.markdown(f"**{update['date']}**")
+            with col2:
+                st.markdown(f"{update['jurisdiction']} • {update['title']}")
+            with col3:
+                st.markdown(f"*{update['type']}*")
+            with col4:
+                impact_colors = {
+                    'High': 'red',
+                    'Medium': 'orange',
+                    'Low': 'green'
+                }
+                color = impact_colors.get(update['impact'], 'blue')
+                st.markdown(f":{color}[{update['impact']}]")
+            st.markdown("---")
+    # Statistics
+    st.markdown("### 📈 Update Statistics")
+    col1, col2, col3, col4 = st.columns(4)
+    with col1:
+        st.metric("Updates This Week", "12")
+    with col2:
+        st.metric("Active Subscribers", "247")
+    with col3:
+        st.metric("Avg. Updates/Month", "48")
+    with col4:
+        st.metric("Coverage", "5 jurisdictions")
+    st.markdown("---")
+    st.caption("💡 Tip: Enable notifications for multiple jurisdictions to stay informed about regulatory arbitrage opportunities")

app/components/results_display.py ADDED Viewed

	@@ -0,0 +1,374 @@

+"""
+Results Display Component
+Renders analysis results with visualizations and detailed breakdowns.
+"""
+import streamlit as st
+import plotly.graph_objects as go
+from typing import Dict, Any
+from datetime import datetime, timedelta
+def render_results(result: Dict[Any, Any], form_data: Dict[str, Any]):
+    """
+    Render comprehensive analysis results.
+    Args:
+        result: Agent analysis result
+        form_data: Original form data
+    """
+    # Summary section
+    st.markdown("### 📊 Executive Summary")
+    col1, col2, col3, col4 = st.columns(4)
+    with col1:
+        risk_score = result.get('risk_score', 0)
+        if risk_score:
+            st.metric("Risk Score", f"{risk_score:.1f}/100")
+        else:
+            st.metric("Risk Score", "N/A")
+    with col2:
+        gaps = result.get('compliance_gaps', [])
+        st.metric("Compliance Gaps", len(gaps) if gaps else 0)
+    with col3:
+        cost_estimate = result.get('cost_estimate', {})
+        # Calculate total from jurisdiction costs
+        first_year_cost = 0
+        if cost_estimate:
+            for jur_cost in cost_estimate.values():
+                if isinstance(jur_cost, dict) and 'first_year' in jur_cost:
+                    first_year_cost += jur_cost['first_year'].get('estimate', 0)
+        st.metric("First Year Cost", f"${first_year_cost:,.0f}" if first_year_cost > 0 else "N/A")
+    with col4:
+        jurisdictions = form_data.get('jurisdictions', [])
+        st.metric("Jurisdictions", len(jurisdictions))
+    st.markdown("---")
+    # Risk gauge
+    render_risk_gauge(risk_score)
+    st.markdown("---")
+    # Compliance gaps
+    if gaps:
+        render_compliance_gaps(gaps, form_data.get('jurisdictions', []))
+    else:
+        st.success("✅ No major compliance gaps identified!")
+    st.markdown("---")
+    # Token analysis (if applicable)
+    if form_data.get('issue_token') and result.get('token_classification'):
+        render_token_analysis(result['token_classification'])
+        st.markdown("---")
+    # Cost breakdown
+    if cost_estimate:
+        render_cost_breakdown(cost_estimate, form_data.get('jurisdictions', []))
+        st.markdown("---")
+    # Recommendations
+    recommendations = result.get('recommendations', [])
+    if recommendations:
+        render_recommendations(recommendations)
+        st.markdown("---")
+    # Agent reasoning - hidden from user view (moved to logs only)
+    # Users don't need to see internal agent workflow
+    # reasoning = result.get('reasoning', [])
+    # if reasoning:
+    #     render_reasoning_chain(reasoning)
+def render_risk_gauge(risk_score: float):
+    """Render risk score gauge chart."""
+    st.markdown("### 🎯 Risk Assessment")
+    # Determine risk level and color
+    if risk_score >= 86:
+        risk_level = "CRITICAL"
+        color = "darkred"
+    elif risk_score >= 71:
+        risk_level = "HIGH"
+        color = "red"
+    elif risk_score >= 41:
+        risk_level = "MEDIUM"
+        color = "orange"
+    else:
+        risk_level = "LOW"
+        color = "green"
+    # Create gauge chart
+    fig = go.Figure(go.Indicator(
+        mode="gauge+number",
+        value=risk_score,
+        domain={'x': [0, 1], 'y': [0, 1]},
+        title={'text': f"Risk Level: {risk_level}", 'font': {'size': 24}},
+        gauge={
+            'axis': {'range': [None, 100], 'tickwidth': 1, 'tickcolor': "darkblue"},
+            'bar': {'color': color},
+            'bgcolor': "white",
+            'borderwidth': 2,
+            'bordercolor': "gray",
+            'steps': [
+                {'range': [0, 40], 'color': 'lightgreen'},
+                {'range': [40, 70], 'color': 'lightyellow'},
+                {'range': [70, 85], 'color': 'lightcoral'},
+                {'range': [85, 100], 'color': 'darkred'}
+            ],
+            'threshold': {
+                'line': {'color': "red", 'width': 4},
+                'thickness': 0.75,
+                'value': 85
+            }
+        }
+    ))
+    fig.update_layout(
+        height=300,
+        margin=dict(l=20, r=20, t=60, b=20)
+    )
+    st.plotly_chart(fig, use_container_width=True)
+    # Risk explanation
+    col1, col2 = st.columns([1, 3])
+    with col1:
+        if risk_level == "CRITICAL":
+            st.error("🚨 CRITICAL RISK")
+        elif risk_level == "HIGH":
+            st.warning("⚠️ HIGH RISK")
+        elif risk_level == "MEDIUM":
+            st.info("ℹ️ MEDIUM RISK")
+        else:
+            st.success("✅ LOW RISK")
+    with col2:
+        explanations = {
+            "CRITICAL": "Immediate action required. Multiple critical compliance gaps that could result in severe penalties or business shutdown.",
+            "HIGH": "Urgent compliance measures needed. Significant gaps that pose substantial legal and financial risk.",
+            "MEDIUM": "Compliance improvements recommended. Several gaps identified that should be addressed systematically.",
+            "LOW": "Minimal compliance concerns. Continue monitoring and address minor items as identified."
+        }
+        st.markdown(explanations[risk_level])
+def render_compliance_gaps(gaps: list, jurisdictions: list):
+    """Render compliance gaps table by jurisdiction."""
+    st.markdown("### ⚠️ Compliance Gaps")
+    # Agent returns gaps as simple list without jurisdiction field
+    # Display all gaps in a single section for now
+    if gaps:
+        st.markdown(f"**{len(gaps)} compliance gaps identified across all jurisdictions**")
+        for gap in gaps:
+            render_gap_card(gap)
+    else:
+        st.success("✅ No major compliance gaps identified!")
+def render_gap_card(gap: Dict[str, Any]):
+    """Render individual gap card - always expanded for immediate visibility."""
+    # Agent returns gaps as simple dicts with 'description' and 'severity'
+    severity = gap.get('severity', 'medium')
+    description = gap.get('description', 'Compliance requirement')
+    # Severity badge with icons
+    severity_config = {
+        'critical': {'color': 'red', 'icon': '🚨', 'bg': '#ffe6e6'},
+        'high': {'color': 'orange', 'icon': '⚠️', 'bg': '#fff3e0'},
+        'medium': {'color': 'blue', 'icon': 'ℹ️', 'bg': '#e3f2fd'},
+        'low': {'color': 'green', 'icon': '✓', 'bg': '#e8f5e9'}
+    }
+    config = severity_config.get(severity, severity_config['medium'])
+    # Parse description - Gemini returns multi-line format:
+    # Line 1: **[SEVERITY] Requirement Name (Jurisdiction) - Deadline: X, Cost: $Y**
+    # Line 2+: Description: [details]
+    lines = description.strip().split('\n', 1)
+    title_line = lines[0].strip()
+    detail_text = lines[1].strip() if len(lines) > 1 else ""
+    # Remove markdown bold from title if present
+    title_clean = title_line.replace('**', '').replace('*', '')
+    # Render as card (no expander - always visible)
+    st.markdown(f"""
+    <div style="background-color: {config['bg']}; padding: 1rem; border-left: 5px solid {config['color']}; border-radius: 5px; margin-bottom: 1rem;">
+        <div style="font-weight: bold; margin-bottom: 0.5rem; font-size: 1.1em;">
+            {config['icon']} {title_clean}
+        </div>
+        <div style="color: #666; margin-bottom: 0.5rem;">
+            <strong>Severity:</strong> <span style="color: {config['color']};">{severity.upper()}</span>
+        </div>
+        <div style="margin-top: 0.5rem; line-height: 1.6;">
+            {detail_text if detail_text else title_clean}
+        </div>
+    </div>
+    """, unsafe_allow_html=True)
+def render_token_analysis(token_classification: Dict[str, Any]):
+    """Render token classification analysis."""
+    st.markdown("### 🪙 Token Classification")
+    st.info("Token classification determines applicable regulatory frameworks and compliance requirements.")
+    # Create columns for different jurisdictions
+    for jurisdiction, classification in token_classification.items():
+        st.markdown(f"#### {get_jurisdiction_name(jurisdiction)}")
+        col1, col2 = st.columns([2, 1])
+        with col1:
+            classification_type = classification.get('classification', 'Unknown')
+            confidence = classification.get('confidence', 0) * 100
+            if classification_type in ['security', 'capital markets product']:
+                st.error(f"**Classification:** {classification_type.title()}")
+                st.warning("⚠️ Securities regulations apply. SEC/MAS registration may be required.")
+            else:
+                st.success(f"**Classification:** {classification_type.title()}")
+            st.progress(confidence / 100, text=f"Confidence: {confidence:.0f}%")
+        with col2:
+            # Howey Test results (if available)
+            howey = classification.get('howey_test', {})
+            if howey:
+                st.markdown("**Howey Test:**")
+                for prong, result in howey.items():
+                    icon = "✅" if result else "❌"
+                    st.markdown(f"{icon} {prong.replace('_', ' ').title()}")
+        # Implications
+        implications = classification.get('implications', [])
+        if implications:
+            st.markdown("**Regulatory Implications:**")
+            for implication in implications:
+                st.markdown(f"- {implication}")
+        st.markdown("---")
+def render_cost_breakdown(cost_estimate: Dict[str, Any], jurisdictions: list):
+    """Render cost breakdown and projections."""
+    st.markdown("### 💰 Cost Breakdown")
+    if not cost_estimate:
+        st.info("Cost estimation not available")
+        return
+    # Calculate grand totals from jurisdiction costs
+    total_first_year = 0
+    total_annual = 0
+    for jur_cost in cost_estimate.values():
+        if isinstance(jur_cost, dict):
+            if 'first_year' in jur_cost and isinstance(jur_cost['first_year'], dict):
+                total_first_year += jur_cost['first_year'].get('estimate', 0)
+            if 'annual' in jur_cost and isinstance(jur_cost['annual'], dict):
+                total_annual += jur_cost['annual'].get('estimate', 0)
+    # Grand totals
+    if total_first_year > 0 or total_annual > 0:
+        st.markdown("#### Total Estimated Costs")
+        col1, col2, col3 = st.columns(3)
+        with col1:
+            st.metric("First Year", f"${total_first_year:,.0f}")
+        with col2:
+            st.metric("Annual Ongoing", f"${total_annual:,.0f}")
+        with col3:
+            three_year = total_first_year + (total_annual * 2)
+            st.metric("3-Year Total", f"${three_year:,.0f}")
+        st.markdown("---")
+    # By jurisdiction
+    st.markdown("#### Cost by Jurisdiction")
+    for jurisdiction in jurisdictions:
+        jur_cost = cost_estimate.get(jurisdiction, {})
+        if jur_cost and isinstance(jur_cost, dict):
+            with st.expander(f"{get_jurisdiction_name(jurisdiction)}", expanded=True):
+                col1, col2 = st.columns(2)
+                with col1:
+                    first_year_dict = jur_cost.get('first_year', {})
+                    if isinstance(first_year_dict, dict):
+                        first_year = first_year_dict.get('estimate', 0)
+                        st.markdown(f"**First Year:** ${first_year:,.0f}")
+                with col2:
+                    annual_dict = jur_cost.get('annual', {})
+                    if isinstance(annual_dict, dict):
+                        annual = annual_dict.get('estimate', 0)
+                        st.markdown(f"**Annual:** ${annual:,.0f}")
+def render_recommendations(recommendations: list):
+    """Render prioritized recommendations."""
+    st.markdown("### 📋 Recommendations")
+    st.info("Prioritized action items to achieve compliance")
+    for i, rec in enumerate(recommendations, 1):
+        if isinstance(rec, dict):
+            title = rec.get('title', f'Recommendation {i}')
+            description = rec.get('description', '')
+            priority = rec.get('priority', 'medium')
+            timeline = rec.get('timeline', '')
+            priority_icons = {'high': '🔴', 'medium': '🟡', 'low': '🟢'}
+            icon = priority_icons.get(priority, '🔵')
+            with st.expander(f"{icon} **{i}. {title}**", expanded=(i <= 3)):
+                if description:
+                    st.markdown(description)
+                if timeline:
+                    st.markdown(f"**Timeline:** {timeline}")
+        else:
+            # Simple string recommendation
+            st.markdown(f"{i}. {rec}")
+def render_reasoning_chain(reasoning: list):
+    """Render agent reasoning chain for transparency."""
+    with st.expander("🧠 Agent Reasoning Chain (Explainability)", expanded=False):
+        st.markdown("The AI agent's decision-making process:")
+        for i, step in enumerate(reasoning, 1):
+            st.markdown(f"**Step {i}:** {step}")
+def get_jurisdiction_name(jurisdiction: str) -> str:
+    """Get display name for jurisdiction."""
+    mapping = {
+        'us': '🇺🇸 United States',
+        'eu': '🇪🇺 European Union',
+        'singapore': '🇸🇬 Singapore',
+        'uk': '🇬🇧 United Kingdom',
+        'uae': '🇦🇪 UAE'
+    }
+    return mapping.get(jurisdiction, jurisdiction.upper())

app/streamlit_app.py ADDED Viewed

	@@ -0,0 +1,208 @@

+"""
+Crypto Compliance Intelligence Agent - Streamlit Web Interface
+Main application entry point for the compliance analysis web app.
+"""
+import streamlit as st
+import sys
+from pathlib import Path
+# Add src to path
+sys.path.insert(0, str(Path(__file__).parent.parent))
+from src.config import Config
+from src.agents.orchestrator import ComplianceAgent
+from app.components.input_form import render_input_form
+from app.components.results_display import render_results
+from app.components.monitoring_setup import render_monitoring_setup
+# Page configuration
+st.set_page_config(
+    page_title="Crypto Compliance Intelligence Agent",
+    page_icon="🔒",
+    layout="wide",
+    initial_sidebar_state="expanded"
+)
+# Custom CSS
+st.markdown("""
+<style>
+    .main-header {
+        font-size: 2.5rem;
+        font-weight: 700;
+        color: #1f77b4;
+        margin-bottom: 0.5rem;
+    }
+    .sub-header {
+        font-size: 1.2rem;
+        color: #666;
+        margin-bottom: 2rem;
+    }
+    .disclaimer-box {
+        background-color: #fff3cd;
+        border-left: 5px solid #ffc107;
+        padding: 1rem;
+        margin: 1rem 0;
+        border-radius: 0.25rem;
+    }
+    .stButton>button {
+        width: 100%;
+        background-color: #1f77b4;
+        color: white;
+        font-weight: 600;
+        padding: 0.75rem;
+        border-radius: 0.5rem;
+    }
+    .stButton>button:hover {
+        background-color: #1557a0;
+    }
+</style>
+""", unsafe_allow_html=True)
+@st.cache_resource(ttl=600)  # Cache for 10 minutes, then refresh
+def initialize_agent():
+    """Initialize the compliance agent (cached for performance)."""
+    try:
+        agent = ComplianceAgent()
+        return agent
+    except Exception as e:
+        st.error(f"Failed to initialize agent: {e}")
+        return None
+def render_header():
+    """Render application header."""
+    st.markdown('<div class="main-header">🔒 Crypto Compliance Intelligence Agent</div>', unsafe_allow_html=True)
+    st.markdown(
+        '<div class="sub-header">Multi-jurisdiction compliance analysis powered by AI</div>',
+        unsafe_allow_html=True
+    )
+def render_disclaimer():
+    """Render legal disclaimer."""
+    st.markdown("""
+    <div class="disclaimer-box">
+        <h4>⚠️ Important Disclaimer</h4>
+        <p>This tool provides <strong>general information only</strong> and does <strong>NOT</strong> constitute legal, financial, or regulatory advice. The analysis is based on publicly available information and AI models, which may contain errors or be outdated.</p>
+        <p><strong>Always consult qualified legal counsel</strong> before making compliance decisions. The creators assume no liability for decisions made based on this tool.</p>
+    </div>
+    """, unsafe_allow_html=True)
+def render_sidebar():
+    """Render sidebar with information and settings."""
+    with st.sidebar:
+        st.image("https://via.placeholder.com/300x100/1f77b4/ffffff?text=Compliance+Agent", use_column_width=True)
+        st.markdown("### About")
+        st.markdown("""
+        This AI-powered system analyzes crypto compliance requirements across:
+        - 🇺🇸 United States (SEC)
+        - 🇪🇺 European Union (MiCA)
+        - 🇸🇬 Singapore (MAS)
+        - 🇬🇧 United Kingdom (FCA)
+        - 🇦🇪 UAE (VARA)
+        """)
+        st.markdown("### Features")
+        st.markdown("""
+        ✅ Multi-jurisdiction analysis
+        ✅ Token classification (Howey Test)
+        ✅ Risk scoring (0-100)
+        ✅ Cost estimation
+        ✅ Compliance gap identification
+        ✅ Actionable recommendations
+        """)
+        st.markdown("### How It Works")
+        st.markdown("""
+        1. Describe your business and activities
+        2. Select target jurisdictions
+        3. AI agent analyzes compliance requirements
+        4. Receive comprehensive report (30-60s)
+        """)
+        st.markdown("---")
+        st.markdown("**Powered by:**")
+        st.markdown("- Google Gemini Flash 2.5")
+        st.markdown("- LangGraph Agents")
+        st.markdown("- ChromaDB Vector Search")
+        st.markdown("---")
+        st.caption("Version 1.0 | Phase 6")
+def main():
+    """Main application logic."""
+    # Render header and disclaimer
+    render_header()
+    render_disclaimer()
+    # Render sidebar
+    render_sidebar()
+    # Initialize agent
+    agent = initialize_agent()
+    if agent is None:
+        st.error("Failed to initialize compliance agent. Please check your configuration and API keys.")
+        st.stop()
+    # Create tabs
+    tab1, tab2, tab3 = st.tabs(["📝 Analysis", "📊 Results", "🔔 Monitoring"])
+    with tab1:
+        st.markdown("## Compliance Analysis")
+        st.markdown("Provide information about your crypto business to get a comprehensive compliance analysis.")
+        # Render input form
+        form_data = render_input_form()
+        if form_data:
+            # User submitted the form
+            st.markdown("---")
+            with st.spinner("🤖 AI Agent is analyzing your compliance requirements... This may take 30-60 seconds."):
+                try:
+                    # Run agent analysis
+                    result = agent.run(
+                        user_input=form_data['business_description'],
+                        jurisdictions=form_data['jurisdictions'],
+                        activities=form_data['activities'],
+                        token_description=form_data.get('token_description')
+                    )
+                    # Store result in session state
+                    st.session_state['analysis_result'] = result
+                    st.session_state['form_data'] = form_data
+                    st.success("✅ Analysis complete! View results in the **Results** tab.")
+                except Exception as e:
+                    st.error(f"❌ Analysis failed: {str(e)}")
+                    st.exception(e)
+    with tab2:
+        st.markdown("## Analysis Results")
+        if 'analysis_result' in st.session_state:
+            result = st.session_state['analysis_result']
+            form_data = st.session_state.get('form_data', {})
+            render_results(result, form_data)
+        else:
+            st.info("👈 Complete the analysis in the **Analysis** tab to see results here.")
+    with tab3:
+        st.markdown("## Compliance Monitoring")
+        st.markdown("Set up alerts to stay informed about regulatory changes.")
+        render_monitoring_setup()
+if __name__ == "__main__":
+    main()

chroma_db/c30b23e7-6559-433a-b87d-7290f0978053/data_level0.bin ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:571c95cfab76b4001dd6f0a0510f8b5741632272159d49f94759e4a66493ed90
+size 1676000

chroma_db/c30b23e7-6559-433a-b87d-7290f0978053/header.bin ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:e87a1dc8bcae6f2c4bea6d5dd5005454d4dace8637dae29bff3c037ea771411e
+size 100

chroma_db/c30b23e7-6559-433a-b87d-7290f0978053/length.bin ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:fc19b1997119425765295aeab72d76faa6927d4f83985d328c26f20468d6cc76
+size 4000

chroma_db/c30b23e7-6559-433a-b87d-7290f0978053/link_lists.bin ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
+size 0

chroma_db/chroma.sqlite3 ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:dd2c9e8fc0f11da0446bdf60f58c0bd05fbc048dc036cb5717c62d9ed9f10ce5
+size 819200

config.toml ADDED Viewed

	@@ -0,0 +1,15 @@

+[theme]
+primaryColor = "#1f77b4"
+backgroundColor = "#FFFFFF"
+secondaryBackgroundColor = "#F0F2F6"
+textColor = "#262730"
+font = "sans serif"
+[server]
+headless = true
+port = 7860
+enableCORS = false
+enableXsrfProtection = true
+[browser]
+gatherUsageStats = false

requirements.txt ADDED Viewed

	@@ -0,0 +1,32 @@

+# Web Framework
+streamlit==1.29.0
+# LLM and Agent Framework
+langchain>=0.1.0
+langgraph>=0.0.20
+langchain-google-genai>=1.0.0
+google-generativeai>=0.4.0
+# Vector Database and Embeddings (lightweight)
+chromadb==0.4.22
+sentence-transformers==2.3.1
+# Data Processing (minimal)
+pandas>=2.2.0
+numpy>=1.26.0
+pdfplumber>=0.10.0
+beautifulsoup4>=4.12.0
+lxml>=5.1.0
+# HTTP and Web Scraping
+requests==2.31.0
+# Utilities
+python-dotenv==1.0.0
+python-dateutil==2.8.2
+# Plotting
+plotly>=5.18.0
+# Note: FinBERT and Legal-BERT accessed via HuggingFace Inference API
+# No need to download torch/transformers locally - saves storage!

src/__init__.py ADDED Viewed

File without changes

src/__pycache__/__init__.cpython-313.pyc ADDED Viewed

Binary file (191 Bytes). View file

src/__pycache__/config.cpython-313.pyc ADDED Viewed

Binary file (3.17 kB). View file

src/agents/__init__.py ADDED Viewed

File without changes

src/agents/__pycache__/__init__.cpython-313.pyc ADDED Viewed

Binary file (198 Bytes). View file

src/agents/__pycache__/orchestrator.cpython-313.pyc ADDED Viewed

Binary file (27.2 kB). View file

src/agents/__pycache__/tools.cpython-313.pyc ADDED Viewed

Binary file (15 kB). View file

src/agents/orchestrator.py ADDED Viewed

	@@ -0,0 +1,621 @@

+"""
+LangGraph orchestrator for compliance analysis workflow.
+Implements a state machine with 6 nodes for end-to-end analysis.
+"""
+import logging
+from typing import Dict, List, Optional, TypedDict, Annotated
+from datetime import datetime
+import operator
+from langgraph.graph import StateGraph, END
+from langchain_google_genai import ChatGoogleGenerativeAI
+from src.config import Config
+from src.agents.tools import (
+    search_regulations,
+    calculate_compliance_cost,
+    analyze_token_security,
+    extract_entities_from_text
+)
+logger = logging.getLogger(__name__)
+# Agent State Definition
+class ComplianceState(TypedDict):
+    """State that flows through the agent workflow."""
+    # Input
+    user_input: str
+    jurisdictions: List[str]
+    activities: List[str]
+    token_description: Optional[str]
+    # Intermediate results
+    business_type: Optional[str]
+    extracted_entities: Optional[Dict]
+    token_classification: Optional[Dict]
+    relevant_regulations: Optional[List[Dict]]
+    compliance_gaps: Optional[List[Dict]]
+    risk_score: Optional[float]
+    cost_estimate: Optional[Dict]
+    # Output
+    recommendations: Optional[List[str]]
+    reasoning: Annotated[List[str], operator.add]  # Accumulate reasoning steps
+    # Metadata
+    timestamp: str
+    errors: Annotated[List[str], operator.add]
+class ComplianceAgent:
+    """
+    LangGraph-based compliance agent orchestrator.
+    Workflow:
+    1. classify_input -> Determine business model type
+    2. extract_entities -> Extract key information from description
+    3. retrieve_regulations -> Search vector DB for relevant rules
+    4. analyze_compliance -> Identify gaps and requirements
+    5. calculate_risk -> Generate risk score
+    6. generate_recommendations -> Create actionable steps
+    """
+    def __init__(self):
+        """Initialize the compliance agent."""
+        # Initialize LLM
+        self.llm = ChatGoogleGenerativeAI(
+            model=Config.GEMINI_MODEL,
+            google_api_key=Config.GEMINI_API_KEY,
+            temperature=Config.GEMINI_TEMPERATURE,
+            max_tokens=Config.GEMINI_MAX_TOKENS
+        )
+        # Initialize VectorDB (once, reused for all searches)
+        from src.data.vectordb import RegulatoryVectorDB
+        self.vectordb = RegulatoryVectorDB()
+        logger.info(f"VectorDB initialized with collection: {self.vectordb.collection_name}")
+        # Build workflow graph
+        self.workflow = self._build_workflow()
+        self.app = self.workflow.compile()
+        logger.info("ComplianceAgent initialized with LangGraph workflow")
+    def _build_workflow(self) -> StateGraph:
+        """Build the LangGraph state machine workflow."""
+        workflow = StateGraph(ComplianceState)
+        # Add nodes
+        workflow.add_node("classify_input", self.classify_input_node)
+        workflow.add_node("extract_entities", self.extract_entities_node)
+        workflow.add_node("retrieve_regulations", self.retrieve_regulations_node)
+        workflow.add_node("analyze_compliance", self.analyze_compliance_node)
+        workflow.add_node("calculate_risk", self.calculate_risk_node)
+        workflow.add_node("generate_recommendations", self.generate_recommendations_node)
+        # Define edges
+        workflow.set_entry_point("classify_input")
+        workflow.add_edge("classify_input", "extract_entities")
+        workflow.add_edge("extract_entities", "retrieve_regulations")
+        workflow.add_edge("retrieve_regulations", "analyze_compliance")
+        workflow.add_edge("analyze_compliance", "calculate_risk")
+        workflow.add_edge("calculate_risk", "generate_recommendations")
+        workflow.add_edge("generate_recommendations", END)
+        return workflow
+    def classify_input_node(self, state: ComplianceState) -> ComplianceState:
+        """
+        Node 1: Classify the business model type from user input.
+        Args:
+            state: Current agent state
+        Returns:
+            Updated state with business_type
+        """
+        logger.info("Node 1: Classifying business model...")
+        try:
+            prompt = f"""
+            Analyze this crypto business description and classify the business model type.
+            Description: {state['user_input']}
+            Activities: {', '.join(state['activities'])}
+            Classify into ONE of these categories:
+            - Exchange/Trading Platform
+            - Custody/Wallet Service
+            - DeFi Protocol (lending, staking, yield)
+            - Token/NFT Platform
+            - Payment Processor
+            - Mining/Validator Service
+            - Other (specify)
+            Respond with just the category name.
+            """
+            response = self.llm.invoke(prompt)
+            business_type = response.content.strip()
+            state['business_type'] = business_type
+            state['reasoning'].append(f"Classified business as: {business_type}")
+            logger.info(f"Business classified as: {business_type}")
+        except Exception as e:
+            logger.error(f"Error in classify_input_node: {e}")
+            state['errors'].append(f"Classification error: {str(e)}")
+            state['business_type'] = "Unknown"
+        return state
+    def extract_entities_node(self, state: ComplianceState) -> ComplianceState:
+        """
+        Node 2: Extract entities and classify token if applicable.
+        Args:
+            state: Current agent state
+        Returns:
+            Updated state with extracted_entities and token_classification
+        """
+        logger.info("Node 2: Extracting entities...")
+        try:
+            # Extract entities from user input
+            entities = extract_entities_from_text.invoke({"text": state['user_input']})
+            state['extracted_entities'] = entities
+            reasoning = f"Extracted entities: {entities['summary'].get('total_entities', 0)} total"
+            state['reasoning'].append(reasoning)
+            # Classify token if description provided
+            if state.get('token_description'):
+                # Classify in each requested jurisdiction
+                classifications = {}
+                for jurisdiction in state['jurisdictions']:
+                    if jurisdiction in ['us', 'eu', 'singapore']:
+                        result = analyze_token_security.invoke({
+                            "token_description": state['token_description'],
+                            "jurisdiction": jurisdiction
+                        })
+                        classifications[jurisdiction] = result
+                state['token_classification'] = classifications
+                # Check if token is security in any jurisdiction
+                is_security_anywhere = any(
+                    c.get('classification') == 'security' or
+                    c.get('classification') == 'capital markets product'
+                    for c in classifications.values()
+                )
+                reasoning = f"Token classified - Security: {is_security_anywhere}"
+                state['reasoning'].append(reasoning)
+                logger.info(f"Token classified - Security anywhere: {is_security_anywhere}")
+        except Exception as e:
+            logger.error(f"Error in extract_entities_node: {e}")
+            state['errors'].append(f"Entity extraction error: {str(e)}")
+        return state
+    def retrieve_regulations_node(self, state: ComplianceState) -> ComplianceState:
+        """
+        Node 3: Retrieve relevant regulations from vector database.
+        Args:
+            state: Current agent state
+        Returns:
+            Updated state with relevant_regulations
+        """
+        logger.info("Node 3: Retrieving regulations...")
+        try:
+            all_regulations = []
+            # Build search query from activities and entities
+            activities_str = ', '.join(state['activities'])
+            query = f"{activities_str} compliance requirements regulations"
+            # Search each jurisdiction using agent's VectorDB instance
+            for jurisdiction in state['jurisdictions']:
+                results = self.vectordb.search_relevant_regulations(
+                    query=query,
+                    jurisdiction=jurisdiction,
+                    top_k=5
+                )
+                logger.info(f"VectorDB search ({jurisdiction}): {len(results)} results")
+                all_regulations.extend(results)
+            state['relevant_regulations'] = all_regulations
+            reasoning = f"Retrieved {len(all_regulations)} relevant regulations across {len(state['jurisdictions'])} jurisdictions"
+            state['reasoning'].append(reasoning)
+            logger.info(f"Retrieved {len(all_regulations)} regulations total")
+        except Exception as e:
+            logger.error(f"Error in retrieve_regulations_node: {e}")
+            state['errors'].append(f"Regulation retrieval error: {str(e)}")
+            state['relevant_regulations'] = []
+        return state
+    def analyze_compliance_node(self, state: ComplianceState) -> ComplianceState:
+        """
+        Node 4: Analyze compliance gaps by matching activities to regulations.
+        Args:
+            state: Current agent state
+        Returns:
+            Updated state with compliance_gaps
+        """
+        logger.info("Node 4: Analyzing compliance...")
+        try:
+            # Build detailed regulations context with specific requirements
+            regulations_detail = []
+            for reg in state.get('relevant_regulations', [])[:5]:  # Top 5 most relevant
+                reg_text = f"\n**{reg.get('title', 'Unknown')} ({reg.get('jurisdiction', '').upper()})**\n"
+                reg_text += f"Summary: {reg.get('summary', '')}\n"
+                # Include specific requirements from regulation
+                requirements = reg.get('requirements', [])
+                if requirements:
+                    reg_text += f"\nKey Requirements ({len(requirements)} total):\n"
+                    for i, req in enumerate(requirements[:8], 1):  # Top 8 requirements per regulation
+                        req_name = req.get('requirement', 'Unknown requirement')
+                        req_desc = req.get('description', '')[:200]
+                        severity = req.get('severity', 'medium')
+                        cost = req.get('estimated_cost_usd', {})
+                        cost_range = f"${cost.get('min', 0):,.0f}-${cost.get('max', 0):,.0f}" if cost else "N/A"
+                        deadline = req.get('deadline_days', 'N/A')
+                        reg_text += f"{i}. {req_name} [{severity.upper()}]\n"
+                        reg_text += f"   Description: {req_desc}...\n"
+                        reg_text += f"   Cost: {cost_range} | Deadline: {deadline} days\n"
+                regulations_detail.append(reg_text)
+            regulations_context = "\n".join(regulations_detail)
+            # Get token classification for relevant jurisdiction
+            token_class = "N/A"
+            if state.get('token_classification'):
+                # Get first jurisdiction's classification
+                first_jur = state['jurisdictions'][0] if state['jurisdictions'] else None
+                if first_jur and first_jur in state['token_classification']:
+                    token_class = state['token_classification'][first_jur].get('classification', 'N/A')
+            prompt = f"""
+            You are a crypto compliance expert. Analyze compliance gaps for this specific business:
+            **Business Details:**
+            - Business Type: {state.get('business_type', 'Unknown')}
+            - Activities: {', '.join(state['activities'])}
+            - Jurisdictions: {', '.join(state['jurisdictions'])}
+            - Description: {state['user_input'][:300]}
+            - Token Classification: {token_class}
+            **Relevant Regulatory Requirements:**
+            {regulations_context}
+            **Task:**
+            Based on the SPECIFIC requirements listed above, identify compliance gaps for this business.
+            For each gap, provide:
+            1. The EXACT requirement name from the regulations above
+            2. Specific details (costs, deadlines, severity) from the regulation
+            3. Why this applies to THIS specific business model
+            Be SPECIFIC - reference actual requirement names, costs, and timelines from the regulations provided.
+            Do NOT make up generic requirements.
+            Format each gap as:
+            [SEVERITY] Requirement Name (Jurisdiction) - Deadline: X days, Cost: $X-Y
+            Description: [Why this applies + key details from regulation]
+            Provide 5-10 most critical gaps.
+            """
+            response = self.llm.invoke(prompt)
+            gaps_text = response.content.strip()
+            # DEBUG: Log the raw response
+            logger.debug(f"LLM Response (first 500 chars): {gaps_text[:500]}")
+            # Parse into structured format - more flexible parsing
+            gaps = []
+            lines = gaps_text.split('\n')
+            for line in lines:
+                line = line.strip()
+                # Skip empty lines
+                if not line:
+                    continue
+                # Match lines that look like requirements (numbered, bulleted, or bracketed)
+                if (line[0].isdigit() and '. ' in line[:5]) or \
+                   line.startswith('[') or \
+                   line.startswith('-') or \
+                   line.startswith('*') or \
+                   ('CRITICAL' in line.upper() or 'HIGH' in line.upper() or 'MEDIUM' in line.upper()):
+                    gaps.append({
+                        'description': line,
+                        'severity': self._extract_severity(line)
+                    })
+            # If no gaps parsed but we have content, add full response as single gap for debugging
+            if not gaps and gaps_text:
+                logger.warning(f"Failed to parse gaps from LLM response. Adding full response as single gap.")
+                gaps.append({
+                    'description': gaps_text[:500],
+                    'severity': 'medium'
+                })
+            state['compliance_gaps'] = gaps
+            state['reasoning'].append(f"Identified {len(gaps)} compliance gaps from {len(regulations_detail)} regulations")
+            logger.info(f"Identified {len(gaps)} compliance gaps")
+        except Exception as e:
+            logger.error(f"Error in analyze_compliance_node: {e}")
+            state['errors'].append(f"Compliance analysis error: {str(e)}")
+            state['compliance_gaps'] = []
+        return state
+    def calculate_risk_node(self, state: ComplianceState) -> ComplianceState:
+        """
+        Node 5: Calculate overall risk score and cost estimates.
+        Args:
+            state: Current agent state
+        Returns:
+            Updated state with risk_score and cost_estimate
+        """
+        logger.info("Node 5: Calculating risk and costs...")
+        try:
+            # Calculate risk score based on gaps
+            gaps = state.get('compliance_gaps', [])
+            gap_count = len(gaps)
+            # Simple risk scoring (0-100)
+            # Base score from gap count
+            risk_from_gaps = min(gap_count * 15, 60)
+            # Add risk for security token
+            risk_from_token = 0
+            if state.get('token_classification'):
+                classifications = state['token_classification']
+                if any(c.get('classification') == 'security' for c in classifications.values()):
+                    risk_from_token = 25
+            # Add risk for severity
+            risk_from_severity = 0
+            critical_gaps = sum(1 for g in gaps if 'critical' in g.get('description', '').lower())
+            risk_from_severity = min(critical_gaps * 5, 15)
+            total_risk = min(risk_from_gaps + risk_from_token + risk_from_severity, 100)
+            state['risk_score'] = total_risk
+            state['reasoning'].append(f"Risk score: {total_risk}/100")
+            # Calculate costs
+            token_class = state.get('token_classification')
+            is_security = any(
+                c.get('classification') == 'security'
+                for c in token_class.values()
+            ) if token_class else False
+            costs = calculate_compliance_cost.invoke({
+                "jurisdictions": state['jurisdictions'],
+                "activities": state['activities'],
+                "is_security_token": is_security
+            })
+            state['cost_estimate'] = costs
+            total_first_year = sum(
+                c['first_year']['estimate']
+                for c in costs.values()
+            )
+            state['reasoning'].append(f"Estimated first-year cost: ${total_first_year:,}")
+            logger.info(f"Risk score: {total_risk}, Est. cost: ${total_first_year:,}")
+        except Exception as e:
+            logger.error(f"Error in calculate_risk_node: {e}")
+            state['errors'].append(f"Risk calculation error: {str(e)}")
+            state['risk_score'] = 50.0  # Default medium risk
+        return state
+    def generate_recommendations_node(self, state: ComplianceState) -> ComplianceState:
+        """
+        Node 6: Generate actionable recommendations.
+        Args:
+            state: Current agent state
+        Returns:
+            Updated state with recommendations
+        """
+        logger.info("Node 6: Generating recommendations...")
+        try:
+            # Use LLM to generate recommendations
+            gaps_summary = "\n".join([
+                f"- {gap['description']}"
+                for gap in state.get('compliance_gaps', [])[:10]
+            ])
+            prompt = f"""
+            Generate prioritized compliance recommendations for this crypto business:
+            Business Type: {state.get('business_type', 'Unknown')}
+            Risk Score: {state.get('risk_score', 'Unknown')}/100
+            Jurisdictions: {', '.join(state['jurisdictions'])}
+            Compliance Gaps:
+            {gaps_summary}
+            Provide 5-8 prioritized, actionable recommendations. Each should:
+            1. Be specific and actionable
+            2. Include estimated timeline
+            3. Indicate priority (P0/P1/P2)
+            Format as numbered list with priority labels.
+            """
+            response = self.llm.invoke(prompt)
+            recommendations_text = response.content.strip()
+            # Parse into list
+            recommendations = [
+                line.strip()
+                for line in recommendations_text.split('\n')
+                if line.strip() and (line[0].isdigit() or line.startswith('-') or line.startswith('P'))
+            ]
+            state['recommendations'] = recommendations
+            state['reasoning'].append(f"Generated {len(recommendations)} recommendations")
+            logger.info(f"Generated {len(recommendations)} recommendations")
+        except Exception as e:
+            logger.error(f"Error in generate_recommendations_node: {e}")
+            state['errors'].append(f"Recommendation generation error: {str(e)}")
+            state['recommendations'] = ["Consult with legal counsel for compliance guidance"]
+        return state
+    def _extract_severity(self, text: str) -> str:
+        """Extract severity level from text."""
+        text_lower = text.lower()
+        if 'critical' in text_lower:
+            return 'critical'
+        elif 'high' in text_lower:
+            return 'high'
+        elif 'medium' in text_lower:
+            return 'medium'
+        elif 'low' in text_lower:
+            return 'low'
+        return 'medium'
+    def run(
+        self,
+        user_input: str,
+        jurisdictions: List[str],
+        activities: List[str],
+        token_description: Optional[str] = None
+    ) -> ComplianceState:
+        """
+        Run the compliance analysis workflow.
+        Args:
+            user_input: Business description
+            jurisdictions: List of jurisdictions to analyze
+            activities: List of crypto activities
+            token_description: Optional token description for classification
+        Returns:
+            Final state with complete analysis
+        """
+        logger.info(f"Starting compliance analysis for {len(jurisdictions)} jurisdictions, {len(activities)} activities")
+        # Initialize state
+        initial_state: ComplianceState = {
+            'user_input': user_input,
+            'jurisdictions': jurisdictions,
+            'activities': activities,
+            'token_description': token_description,
+            'business_type': None,
+            'extracted_entities': None,
+            'token_classification': None,
+            'relevant_regulations': None,
+            'compliance_gaps': None,
+            'risk_score': None,
+            'cost_estimate': None,
+            'recommendations': None,
+            'reasoning': [],
+            'timestamp': datetime.now().isoformat(),
+            'errors': []
+        }
+        # Run workflow
+        try:
+            final_state = self.app.invoke(initial_state)
+            logger.info("Workflow completed successfully")
+            return final_state
+        except Exception as e:
+            logger.error(f"Workflow execution error: {e}")
+            initial_state['errors'].append(f"Workflow error: {str(e)}")
+            return initial_state
+# Convenience function
+def analyze_compliance(
+    user_input: str,
+    jurisdictions: List[str],
+    activities: List[str],
+    token_description: Optional[str] = None
+) -> Dict:
+    """
+    Quick compliance analysis.
+    Args:
+        user_input: Business description
+        jurisdictions: List of jurisdictions
+        activities: List of activities
+        token_description: Optional token description
+    Returns:
+        Analysis results dictionary
+    """
+    agent = ComplianceAgent()
+    result = agent.run(user_input, jurisdictions, activities, token_description)
+    return dict(result)
+if __name__ == "__main__":
+    # Test the agent
+    print("\n=== Testing Compliance Agent ===\n")
+    test_input = """
+    We are launching a crypto exchange platform that allows users to trade
+    Bitcoin, Ethereum, and other cryptocurrencies. We will provide custody
+    services and allow users to stake their tokens to earn yields. The platform
+    will operate in the US and EU.
+    """
+    result = analyze_compliance(
+        user_input=test_input,
+        jurisdictions=['us', 'eu'],
+        activities=['exchange', 'custody', 'staking'],
+        token_description=None
+    )
+    print(f"Business Type: {result['business_type']}")
+    print(f"Risk Score: {result['risk_score']}/100")
+    print(f"\nCompliance Gaps: {len(result.get('compliance_gaps', []))}")
+    print(f"\nRecommendations:")
+    for rec in result.get('recommendations', [])[:5]:
+        print(f"  - {rec}")
+    print(f"\nReasoning Chain:")
+    for step in result.get('reasoning', []):
+        print(f"  → {step}")
+    print("\n=== Agent test complete! ===\n")

src/agents/tools.py ADDED Viewed

	@@ -0,0 +1,419 @@

+"""
+Agent tools for LangChain integration.
+Tools that the compliance agent can use to perform tasks.
+"""
+import logging
+from typing import Dict, List, Optional
+from datetime import datetime, timedelta
+from langchain.tools import tool
+logger = logging.getLogger(__name__)
+@tool
+def search_regulations(
+    query: str,
+    jurisdiction: Optional[str] = None,
+    top_k: int = 5
+) -> List[Dict]:
+    """
+    Search for relevant regulations in the vector database.
+    Args:
+        query: Search query describing compliance requirements
+        jurisdiction: Filter by jurisdiction (us/eu/singapore/uk/uae) or None for all
+        top_k: Number of results to return
+    Returns:
+        List of relevant regulations with metadata
+    """
+    try:
+        from src.data.vectordb import RegulatoryVectorDB
+        db = RegulatoryVectorDB()
+        results = db.search_relevant_regulations(
+            query=query,
+            jurisdiction=jurisdiction,
+            top_k=top_k
+        )
+        logger.info(
+            f"Found {len(results)} regulations for query: '{query}' "
+            f"(jurisdiction: {jurisdiction or 'all'})"
+        )
+        return results
+    except Exception as e:
+        logger.error(f"Error searching regulations: {e}")
+        return []
+@tool
+def calculate_compliance_cost(
+    jurisdictions: List[str],
+    activities: List[str],
+    is_security_token: bool = False
+) -> Dict:
+    """
+    Calculate estimated compliance costs for given jurisdictions and activities.
+    Args:
+        jurisdictions: List of jurisdictions (e.g., ['us', 'eu'])
+        activities: List of crypto activities (e.g., ['exchange', 'custody'])
+        is_security_token: Whether token is classified as a security
+    Returns:
+        Dictionary with cost estimates per jurisdiction
+    """
+    # Cost database (approximate ranges in USD)
+    COST_DATABASE = {
+        'us': {
+            'exchange': {'first_year': (200000, 500000), 'annual': (100000, 250000)},
+            'custody': {'first_year': (100000, 300000), 'annual': (50000, 150000)},
+            'staking': {'first_year': (50000, 150000), 'annual': (25000, 75000)},
+            'lending': {'first_year': (100000, 250000), 'annual': (50000, 125000)},
+            'token_issuance': {'first_year': (50000, 200000), 'annual': (25000, 100000)},
+            'security_token_premium': {'first_year': (200000, 500000), 'annual': (100000, 250000)},
+            'base': {'first_year': (50000, 100000), 'annual': (25000, 50000)}
+        },
+        'eu': {
+            'exchange': {'first_year': (150000, 400000), 'annual': (75000, 200000)},
+            'custody': {'first_year': (100000, 250000), 'annual': (50000, 125000)},
+            'staking': {'first_year': (50000, 125000), 'annual': (25000, 60000)},
+            'lending': {'first_year': (75000, 200000), 'annual': (40000, 100000)},
+            'token_issuance': {'first_year': (100000, 300000), 'annual': (50000, 150000)},
+            'base': {'first_year': (50000, 100000), 'annual': (25000, 50000)}
+        },
+        'singapore': {
+            'exchange': {'first_year': (150000, 350000), 'annual': (75000, 175000)},
+            'custody': {'first_year': (75000, 200000), 'annual': (40000, 100000)},
+            'staking': {'first_year': (40000, 100000), 'annual': (20000, 50000)},
+            'lending': {'first_year': (60000, 150000), 'annual': (30000, 75000)},
+            'token_issuance': {'first_year': (75000, 250000), 'annual': (40000, 125000)},
+            'base': {'first_year': (40000, 80000), 'annual': (20000, 40000)}
+        },
+        'uk': {
+            'exchange': {'first_year': (125000, 300000), 'annual': (60000, 150000)},
+            'custody': {'first_year': (75000, 200000), 'annual': (40000, 100000)},
+            'staking': {'first_year': (40000, 100000), 'annual': (20000, 50000)},
+            'lending': {'first_year': (60000, 150000), 'annual': (30000, 75000)},
+            'token_issuance': {'first_year': (50000, 150000), 'annual': (25000, 75000)},
+            'base': {'first_year': (40000, 75000), 'annual': (20000, 40000)}
+        },
+        'uae': {
+            'exchange': {'first_year': (100000, 250000), 'annual': (50000, 125000)},
+            'custody': {'first_year': (60000, 150000), 'annual': (30000, 75000)},
+            'staking': {'first_year': (30000, 80000), 'annual': (15000, 40000)},
+            'lending': {'first_year': (50000, 125000), 'annual': (25000, 60000)},
+            'token_issuance': {'first_year': (50000, 150000), 'annual': (25000, 75000)},
+            'base': {'first_year': (30000, 60000), 'annual': (15000, 30000)}
+        }
+    }
+    results = {}
+    for jurisdiction in jurisdictions:
+        if jurisdiction not in COST_DATABASE:
+            logger.warning(f"Unknown jurisdiction: {jurisdiction}")
+            continue
+        costs = COST_DATABASE[jurisdiction]
+        first_year_min = 0
+        first_year_max = 0
+        annual_min = 0
+        annual_max = 0
+        # Base costs
+        first_year_min += costs['base']['first_year'][0]
+        first_year_max += costs['base']['first_year'][1]
+        annual_min += costs['base']['annual'][0]
+        annual_max += costs['base']['annual'][1]
+        # Activity-specific costs
+        for activity in activities:
+            if activity in costs:
+                first_year_min += costs[activity]['first_year'][0]
+                first_year_max += costs[activity]['first_year'][1]
+                annual_min += costs[activity]['annual'][0]
+                annual_max += costs[activity]['annual'][1]
+        # Security token premium (US)
+        if is_security_token and jurisdiction == 'us':
+            first_year_min += costs['security_token_premium']['first_year'][0]
+            first_year_max += costs['security_token_premium']['first_year'][1]
+            annual_min += costs['security_token_premium']['annual'][0]
+            annual_max += costs['security_token_premium']['annual'][1]
+        results[jurisdiction] = {
+            'first_year': {
+                'min': first_year_min,
+                'max': first_year_max,
+                'estimate': (first_year_min + first_year_max) // 2
+            },
+            'annual_ongoing': {
+                'min': annual_min,
+                'max': annual_max,
+                'estimate': (annual_min + annual_max) // 2
+            },
+            'breakdown': {
+                'base_compliance': costs['base'],
+                'activities': [
+                    {'activity': act, 'cost': costs.get(act, {'first_year': (0, 0), 'annual': (0, 0)})}
+                    for act in activities if act in costs
+                ]
+            }
+        }
+    logger.info(
+        f"Calculated costs for {len(results)} jurisdictions, "
+        f"{len(activities)} activities, security_token={is_security_token}"
+    )
+    return results
+@tool
+def check_effective_date(regulation_id: str) -> Dict:
+    """
+    Check if a regulation is currently effective or pending.
+    Args:
+        regulation_id: ID of the regulation to check
+    Returns:
+        Dictionary with status information
+    """
+    try:
+        from src.data.vectordb import RegulatoryVectorDB
+        db = RegulatoryVectorDB()
+        regulation = db.get_regulation_by_id(regulation_id)
+        if not regulation:
+            return {
+                'found': False,
+                'message': f"Regulation {regulation_id} not found"
+            }
+        status = regulation.get('status', 'unknown')
+        effective_date_str = regulation.get('effective_date', '')
+        announced_date_str = regulation.get('announced_date', '')
+        result = {
+            'found': True,
+            'regulation_id': regulation_id,
+            'title': regulation.get('title', ''),
+            'status': status,
+            'announced_date': announced_date_str,
+            'effective_date': effective_date_str,
+            'is_effective': status == 'effective',
+            'is_proposed': status == 'proposed',
+            'is_repealed': status == 'repealed'
+        }
+        # Calculate time until effective (if applicable)
+        if effective_date_str and status == 'proposed':
+            try:
+                effective_date = datetime.fromisoformat(effective_date_str.replace('Z', '+00:00'))
+                now = datetime.now(effective_date.tzinfo) if effective_date.tzinfo else datetime.now()
+                days_until = (effective_date - now).days
+                result['days_until_effective'] = days_until
+                result['time_to_comply'] = f"{days_until} days" if days_until > 0 else "Overdue"
+            except Exception as e:
+                logger.warning(f"Could not parse effective date: {e}")
+        logger.info(f"Checked regulation {regulation_id}: status={status}")
+        return result
+    except Exception as e:
+        logger.error(f"Error checking effective date: {e}")
+        return {
+            'found': False,
+            'error': str(e)
+        }
+@tool
+def analyze_token_security(token_description: str, jurisdiction: str = 'us') -> Dict:
+    """
+    Analyze whether a token is a security using the Howey Test or other frameworks.
+    Args:
+        token_description: Description of the token's functionality and economics
+        jurisdiction: Jurisdiction for classification (us/eu/singapore)
+    Returns:
+        Token classification result with confidence score
+    """
+    try:
+        from src.analysis.token_classifier import classify_token
+        result = classify_token(token_description, jurisdiction=jurisdiction)
+        # Format for agent consumption
+        formatted = {
+            'jurisdiction': jurisdiction,
+            'classification': result.get('classification', 'unknown'),
+            'confidence': result.get('confidence', 0.0),
+            'framework': result.get('framework', ''),
+            'implications': result.get('regulatory_implications', []),
+        }
+        # Add Howey Test details for US
+        if jurisdiction == 'us' and 'howey_test' in result:
+            howey = result['howey_test']
+            formatted['howey_test'] = {
+                'prongs_met': howey.get('prongs_met', 0),
+                'is_security': howey.get('is_security', False),
+                'prongs': {
+                    name: data.get('met', False)
+                    for name, data in howey.get('prongs', {}).items()
+                }
+            }
+        logger.info(
+            f"Token analysis ({jurisdiction}): {formatted['classification']} "
+            f"(confidence: {formatted['confidence']:.2f})"
+        )
+        return formatted
+    except Exception as e:
+        logger.error(f"Error analyzing token: {e}")
+        return {
+            'error': str(e),
+            'classification': 'error'
+        }
+@tool
+def extract_entities_from_text(text: str) -> Dict:
+    """
+    Extract financial and crypto entities from text.
+    Args:
+        text: Input text to analyze
+    Returns:
+        Dictionary of extracted entities
+    """
+    try:
+        from src.processors.entity_extraction import extract_entities
+        entities = extract_entities(text)
+        # Simplify for agent consumption
+        simplified = {
+            'summary': entities.get('summary', {}),
+            'amounts': [e['text'] for e in entities.get('financial', {}).get('amounts', [])],
+            'dates': [e['text'] for e in entities.get('financial', {}).get('dates', [])],
+            'institutions': [e['text'] for e in entities.get('financial', {}).get('institutions', [])],
+            'tokens': [e['name'] for e in entities.get('crypto', {}).get('tokens', [])],
+            'protocols': [e['name'] for e in entities.get('crypto', {}).get('protocols', [])],
+            'activities': list(set([e['activity'] for e in entities.get('crypto', {}).get('activities', [])]))
+        }
+        logger.info(f"Extracted {entities['summary']['total_entities']} entities from text")
+        return simplified
+    except Exception as e:
+        logger.error(f"Error extracting entities: {e}")
+        return {'error': str(e)}
+@tool
+def parse_document(file_path: str) -> Dict:
+    """
+    Parse a document and extract text and metadata.
+    Args:
+        file_path: Path to document file
+    Returns:
+        Parsed document with text, type, and metadata
+    """
+    try:
+        from src.processors.document_parser import parse_document as parse_doc
+        result = parse_doc(file_path)
+        # Simplify for agent
+        simplified = {
+            'text': result.get('text', ''),
+            'document_type': result.get('document_type', 'unknown'),
+            'confidence': result.get('type_confidence', 0.0),
+            'word_count': result.get('word_count', 0),
+            'char_count': result.get('char_count', 0),
+            'filename': result.get('metadata', {}).get('filename', '')
+        }
+        logger.info(
+            f"Parsed document: {simplified['filename']} "
+            f"(type: {simplified['document_type']}, {simplified['word_count']} words)"
+        )
+        return simplified
+    except Exception as e:
+        logger.error(f"Error parsing document: {e}")
+        return {'error': str(e)}
+# Tool list for LangChain agent
+COMPLIANCE_TOOLS = [
+    search_regulations,
+    calculate_compliance_cost,
+    check_effective_date,
+    analyze_token_security,
+    extract_entities_from_text,
+    parse_document
+]
+if __name__ == "__main__":
+    # Test tools
+    print("\n=== Testing Agent Tools ===\n")
+    # Test 1: Search regulations
+    print("1. Testing search_regulations...")
+    results = search_regulations.invoke({
+        "query": "crypto custody requirements",
+        "jurisdiction": "us",
+        "top_k": 3
+    })
+    print(f"   Found {len(results)} regulations")
+    # Test 2: Calculate costs
+    print("\n2. Testing calculate_compliance_cost...")
+    costs = calculate_compliance_cost.invoke({
+        "jurisdictions": ["us", "eu"],
+        "activities": ["exchange", "custody"],
+        "is_security_token": False
+    })
+    for jur, cost_data in costs.items():
+        print(f"   {jur.upper()}: ${cost_data['first_year']['estimate']:,} first year")
+    # Test 3: Token analysis
+    print("\n3. Testing analyze_token_security...")
+    token_result = analyze_token_security.invoke({
+        "token_description": "Investors buy tokens to earn profits from platform growth",
+        "jurisdiction": "us"
+    })
+    print(f"   Classification: {token_result['classification']}")
+    print(f"   Confidence: {token_result['confidence']:.2f}")
+    # Test 4: Entity extraction
+    print("\n4. Testing extract_entities_from_text...")
+    text = "SEC announced $10 million fine for Coinbase on January 15, 2024"
+    entities = extract_entities_from_text.invoke({"text": text})
+    print(f"   Amounts: {entities['amounts']}")
+    print(f"   Dates: {entities['dates']}")
+    print(f"   Institutions: {entities['institutions']}")
+    print("\n=== All tools tested successfully! ===\n")

src/analysis/__init__.py ADDED Viewed

File without changes

src/analysis/__pycache__/__init__.cpython-313.pyc ADDED Viewed

Binary file (200 Bytes). View file

src/analysis/__pycache__/compliance_engine.cpython-313.pyc ADDED Viewed

Binary file (15.7 kB). View file

src/analysis/__pycache__/cost_calculator.cpython-313.pyc ADDED Viewed

Binary file (11.8 kB). View file

src/analysis/__pycache__/risk_scorer.cpython-313.pyc ADDED Viewed

Binary file (17.2 kB). View file

src/analysis/__pycache__/token_classifier.cpython-313.pyc ADDED Viewed

Binary file (15.7 kB). View file

src/analysis/compliance_engine.py ADDED Viewed

	@@ -0,0 +1,486 @@

+"""
+Compliance Engine for structured rule matching and gap identification.
+Provides deterministic compliance analysis without relying solely on LLM prompts.
+"""
+import logging
+from typing import Dict, List, Optional, Set
+from datetime import datetime, timedelta
+from enum import Enum
+logger = logging.getLogger(__name__)
+class Severity(Enum):
+    """Severity levels for compliance gaps."""
+    CRITICAL = "critical"  # Immediate action required, high enforcement risk
+    HIGH = "high"          # Action needed within 3 months
+    MEDIUM = "medium"      # Action needed within 6 months
+    LOW = "low"            # Monitor, action within 1 year
+class Urgency(Enum):
+    """Urgency levels based on deadlines."""
+    IMMEDIATE = "immediate"    # < 30 days
+    URGENT = "urgent"          # 30-90 days
+    MODERATE = "moderate"      # 90-180 days
+    PLANNING = "planning"      # > 180 days
+class ComplianceGap:
+    """Represents a single compliance gap."""
+    def __init__(
+        self,
+        requirement: str,
+        jurisdiction: str,
+        activity: str,
+        severity: Severity,
+        urgency: Urgency,
+        regulation_id: Optional[str] = None,
+        deadline: Optional[str] = None,
+        cost_estimate: Optional[Dict] = None,
+        remediation_steps: Optional[List[str]] = None
+    ):
+        self.requirement = requirement
+        self.jurisdiction = jurisdiction
+        self.activity = activity
+        self.severity = severity
+        self.urgency = urgency
+        self.regulation_id = regulation_id
+        self.deadline = deadline
+        self.cost_estimate = cost_estimate
+        self.remediation_steps = remediation_steps or []
+    def to_dict(self) -> Dict:
+        """Convert to dictionary."""
+        return {
+            'requirement': self.requirement,
+            'jurisdiction': self.jurisdiction,
+            'activity': self.activity,
+            'severity': self.severity.value,
+            'urgency': self.urgency.value,
+            'regulation_id': self.regulation_id,
+            'deadline': self.deadline,
+            'cost_estimate': self.cost_estimate,
+            'remediation_steps': self.remediation_steps
+        }
+class ComplianceEngine:
+    """
+    Structured compliance engine for rule matching and gap analysis.
+    Maps activities to requirements per jurisdiction and identifies gaps.
+    """
+    def __init__(self):
+        """Initialize compliance engine with requirement mappings."""
+        self.requirements = self._build_requirements_database()
+        logger.info("ComplianceEngine initialized with requirements database")
+    def _build_requirements_database(self) -> Dict:
+        """
+        Build requirements database mapping activities to compliance requirements.
+        Structure: {jurisdiction: {activity: [requirements]}}
+        """
+        return {
+            'us': {
+                'exchange': [
+                    {
+                        'requirement': 'FinCEN MSB Registration',
+                        'severity': Severity.CRITICAL,
+                        'deadline_days': 180,
+                        'description': 'Register as Money Services Business with FinCEN',
+                        'cost_range': (10000, 25000),
+                        'steps': [
+                            'File FinCEN Form 107',
+                            'Implement AML/KYC program',
+                            'Appoint compliance officer'
+                        ]
+                    },
+                    {
+                        'requirement': 'State Money Transmitter Licenses',
+                        'severity': Severity.CRITICAL,
+                        'deadline_days': 365,
+                        'description': 'Obtain MTL in each state of operation',
+                        'cost_range': (50000, 150000),
+                        'steps': [
+                            'File applications per state',
+                            'Post surety bonds',
+                            'Meet minimum capital requirements'
+                        ]
+                    }
+                ],
+                'custody': [
+                    {
+                        'requirement': 'SEC Custody Rule Compliance',
+                        'severity': Severity.HIGH,
+                        'deadline_days': 180,
+                        'description': 'Comply with SEC custody and safeguarding rules',
+                        'cost_range': (75000, 200000),
+                        'steps': [
+                            'Implement qualified custody solution',
+                            'Annual surprise audits',
+                            'Insurance requirements'
+                        ]
+                    }
+                ],
+                'staking': [
+                    {
+                        'requirement': 'Securities Law Review',
+                        'severity': Severity.HIGH,
+                        'deadline_days': 90,
+                        'description': 'Determine if staking constitutes securities offering',
+                        'cost_range': (25000, 75000),
+                        'steps': [
+                            'Legal counsel review',
+                            'Howey Test analysis',
+                            'Consider SEC exemptions'
+                        ]
+                    }
+                ],
+                'token_issuance': [
+                    {
+                        'requirement': 'SEC Registration or Exemption',
+                        'severity': Severity.CRITICAL,
+                        'deadline_days': 180,
+                        'description': 'Register securities or qualify for exemption',
+                        'cost_range': (100000, 500000),
+                        'steps': [
+                            'Determine if token is security',
+                            'File Form D (Reg D) or Form 1-A (Reg A+)',
+                            'Accredited investor verification'
+                        ]
+                    }
+                ]
+            },
+            'eu': {
+                'exchange': [
+                    {
+                        'requirement': 'MiCA CASP Authorization',
+                        'severity': Severity.CRITICAL,
+                        'deadline_days': 365,
+                        'description': 'Obtain Crypto-Asset Service Provider authorization',
+                        'cost_range': (100000, 300000),
+                        'steps': [
+                            'Submit application to national regulator',
+                            'Meet capital requirements',
+                            'Implement governance framework'
+                        ]
+                    },
+                    {
+                        'requirement': 'AMLD5 Compliance',
+                        'severity': Severity.CRITICAL,
+                        'deadline_days': 180,
+                        'description': 'Anti-Money Laundering Directive compliance',
+                        'cost_range': (50000, 150000),
+                        'steps': [
+                            'Customer due diligence procedures',
+                            'Transaction monitoring',
+                            'Suspicious activity reporting'
+                        ]
+                    }
+                ],
+                'custody': [
+                    {
+                        'requirement': 'MiCA Custody Requirements',
+                        'severity': Severity.HIGH,
+                        'deadline_days': 180,
+                        'description': 'Safeguarding of client crypto-assets',
+                        'cost_range': (75000, 200000),
+                        'steps': [
+                            'Segregation of client assets',
+                            'Professional indemnity insurance',
+                            'Custodian arrangements'
+                        ]
+                    }
+                ],
+                'token_issuance': [
+                    {
+                        'requirement': 'MiCA White Paper',
+                        'severity': Severity.HIGH,
+                        'deadline_days': 90,
+                        'description': 'Publish and notify white paper to regulator',
+                        'cost_range': (25000, 75000),
+                        'steps': [
+                            'Draft comprehensive white paper',
+                            'Notify competent authority',
+                            'Ongoing disclosure obligations'
+                        ]
+                    }
+                ]
+            },
+            'singapore': {
+                'exchange': [
+                    {
+                        'requirement': 'MAS DPT License',
+                        'severity': Severity.CRITICAL,
+                        'deadline_days': 365,
+                        'description': 'Digital Payment Token service license',
+                        'cost_range': (75000, 250000),
+                        'steps': [
+                            'Submit MAS license application',
+                            'Meet fit and proper criteria',
+                            'Implement technology risk management'
+                        ]
+                    }
+                ],
+                'custody': [
+                    {
+                        'requirement': 'DPT Custody Standards',
+                        'severity': Severity.HIGH,
+                        'deadline_days': 180,
+                        'description': 'MAS guidelines for DPT custody',
+                        'cost_range': (50000, 150000),
+                        'steps': [
+                            'Segregation of customer DPTs',
+                            'Cold storage requirements',
+                            'Insurance or capital reserves'
+                        ]
+                    }
+                ]
+            },
+            'uk': {
+                'exchange': [
+                    {
+                        'requirement': 'FCA Cryptoasset Registration',
+                        'severity': Severity.CRITICAL,
+                        'deadline_days': 365,
+                        'description': 'Register with FCA for AML/CTF',
+                        'cost_range': (50000, 150000),
+                        'steps': [
+                            'FCA registration application',
+                            'AML/CTF procedures',
+                            'Senior management approval'
+                        ]
+                    },
+                    {
+                        'requirement': 'Financial Promotions Compliance',
+                        'severity': Severity.HIGH,
+                        'deadline_days': 90,
+                        'description': 'Comply with cryptoasset promotions regime',
+                        'cost_range': (15000, 50000),
+                        'steps': [
+                            'Ensure promotions are fair, clear, not misleading',
+                            'Risk warnings required',
+                            'Approval by authorized firm'
+                        ]
+                    }
+                ]
+            },
+            'uae': {
+                'exchange': [
+                    {
+                        'requirement': 'VARA VASP License',
+                        'severity': Severity.CRITICAL,
+                        'deadline_days': 365,
+                        'description': 'Virtual Asset Service Provider license from VARA',
+                        'cost_range': (75000, 200000),
+                        'steps': [
+                            'Submit VARA application',
+                            'Meet minimum capital (AED 50k)',
+                            'Comply with VARA rulebook'
+                        ]
+                    }
+                ]
+            }
+        }
+    def analyze_compliance(
+        self,
+        jurisdictions: List[str],
+        activities: List[str],
+        is_security_token: bool = False
+    ) -> Dict:
+        """
+        Analyze compliance requirements and identify gaps.
+        Args:
+            jurisdictions: List of jurisdictions to analyze
+            activities: List of crypto activities
+            is_security_token: Whether token is classified as security
+        Returns:
+            Dictionary with gaps, compliant items, and summary
+        """
+        gaps = []
+        compliant = []
+        warnings = []
+        for jurisdiction in jurisdictions:
+            if jurisdiction not in self.requirements:
+                warnings.append(f"No requirements database for jurisdiction: {jurisdiction}")
+                continue
+            jurisdiction_reqs = self.requirements[jurisdiction]
+            for activity in activities:
+                if activity not in jurisdiction_reqs:
+                    # No specific requirements for this activity in this jurisdiction
+                    warnings.append(
+                        f"No specific requirements found for {activity} in {jurisdiction}"
+                    )
+                    continue
+                requirements = jurisdiction_reqs[activity]
+                for req in requirements:
+                    # Create compliance gap
+                    urgency = self._calculate_urgency(req.get('deadline_days', 365))
+                    gap = ComplianceGap(
+                        requirement=req['requirement'],
+                        jurisdiction=jurisdiction,
+                        activity=activity,
+                        severity=req['severity'],
+                        urgency=urgency,
+                        deadline=self._calculate_deadline(req.get('deadline_days', 365)),
+                        cost_estimate={
+                            'min': req['cost_range'][0],
+                            'max': req['cost_range'][1],
+                            'estimate': sum(req['cost_range']) // 2
+                        },
+                        remediation_steps=req.get('steps', [])
+                    )
+                    gaps.append(gap)
+        # Additional check for security tokens
+        if is_security_token and 'us' in jurisdictions:
+            if 'token_issuance' not in activities:
+                # Add securities registration requirement
+                sec_gap = ComplianceGap(
+                    requirement='SEC Securities Registration',
+                    jurisdiction='us',
+                    activity='token_issuance',
+                    severity=Severity.CRITICAL,
+                    urgency=Urgency.IMMEDIATE,
+                    deadline=self._calculate_deadline(60),
+                    cost_estimate={'min': 200000, 'max': 500000, 'estimate': 350000},
+                    remediation_steps=[
+                        'Immediate legal counsel consultation',
+                        'Howey Test confirms security status',
+                        'File registration or seek exemption',
+                        'Consider Regulation D or A+'
+                    ]
+                )
+                gaps.append(sec_gap)
+        # Sort gaps by severity and urgency
+        gaps.sort(key=lambda g: (
+            ['critical', 'high', 'medium', 'low'].index(g.severity.value),
+            ['immediate', 'urgent', 'moderate', 'planning'].index(g.urgency.value)
+        ))
+        summary = {
+            'total_gaps': len(gaps),
+            'critical_gaps': sum(1 for g in gaps if g.severity == Severity.CRITICAL),
+            'high_gaps': sum(1 for g in gaps if g.severity == Severity.HIGH),
+            'immediate_action': sum(1 for g in gaps if g.urgency == Urgency.IMMEDIATE),
+            'estimated_total_cost': sum(
+                g.cost_estimate['estimate'] for g in gaps if g.cost_estimate
+            ),
+            'jurisdictions_analyzed': len(jurisdictions),
+            'activities_analyzed': len(activities),
+            'warnings': warnings
+        }
+        logger.info(
+            f"Compliance analysis: {len(gaps)} gaps found across "
+            f"{len(jurisdictions)} jurisdictions, {len(activities)} activities"
+        )
+        return {
+            'gaps': [g.to_dict() for g in gaps],
+            'compliant': compliant,
+            'summary': summary,
+            'analyzed_at': datetime.now().isoformat()
+        }
+    def _calculate_urgency(self, deadline_days: int) -> Urgency:
+        """Calculate urgency based on deadline."""
+        if deadline_days <= 30:
+            return Urgency.IMMEDIATE
+        elif deadline_days <= 90:
+            return Urgency.URGENT
+        elif deadline_days <= 180:
+            return Urgency.MODERATE
+        else:
+            return Urgency.PLANNING
+    def _calculate_deadline(self, days: int) -> str:
+        """Calculate deadline date from days."""
+        deadline = datetime.now() + timedelta(days=days)
+        return deadline.strftime('%Y-%m-%d')
+    def get_requirement_details(
+        self,
+        jurisdiction: str,
+        activity: str
+    ) -> Optional[List[Dict]]:
+        """
+        Get detailed requirements for a specific jurisdiction and activity.
+        Args:
+            jurisdiction: Jurisdiction code
+            activity: Activity type
+        Returns:
+            List of requirement dictionaries or None
+        """
+        if jurisdiction not in self.requirements:
+            return None
+        if activity not in self.requirements[jurisdiction]:
+            return None
+        return self.requirements[jurisdiction][activity]
+# Convenience function
+def analyze_compliance(
+    jurisdictions: List[str],
+    activities: List[str],
+    is_security_token: bool = False
+) -> Dict:
+    """
+    Quick compliance analysis.
+    Args:
+        jurisdictions: List of jurisdictions
+        activities: List of activities
+        is_security_token: Whether token is a security
+    Returns:
+        Analysis results
+    """
+    engine = ComplianceEngine()
+    return engine.analyze_compliance(jurisdictions, activities, is_security_token)
+if __name__ == "__main__":
+    # Test the engine
+    print("\n=== Testing Compliance Engine ===\n")
+    result = analyze_compliance(
+        jurisdictions=['us', 'eu'],
+        activities=['exchange', 'custody'],
+        is_security_token=False
+    )
+    print(f"Total gaps: {result['summary']['total_gaps']}")
+    print(f"Critical gaps: {result['summary']['critical_gaps']}")
+    print(f"Estimated cost: ${result['summary']['estimated_total_cost']:,}")
+    print(f"\nTop 3 gaps:")
+    for i, gap in enumerate(result['gaps'][:3], 1):
+        print(f"\n{i}. {gap['requirement']}")
+        print(f"   Jurisdiction: {gap['jurisdiction'].upper()}")
+        print(f"   Severity: {gap['severity']}")
+        print(f"   Urgency: {gap['urgency']}")
+        print(f"   Deadline: {gap['deadline']}")
+        print(f"   Cost: ${gap['cost_estimate']['estimate']:,}")
+    print("\n=== Engine test complete! ===\n")

src/analysis/cost_calculator.py ADDED Viewed

	@@ -0,0 +1,544 @@

+"""
+Cost Calculator for compliance cost estimation.
+Migrated from tools.py to provide dedicated class with advanced features.
+"""
+import logging
+from typing import Dict, List, Optional
+from datetime import datetime
+logger = logging.getLogger(__name__)
+class CostCalculator:
+    """
+    Calculate estimated compliance costs for crypto businesses.
+    Provides cost estimates across jurisdictions, activities, and time periods.
+    """
+    def __init__(self):
+        """Initialize cost calculator with comprehensive cost database."""
+        self.cost_database = self._build_cost_database()
+        logger.info("CostCalculator initialized with cost database")
+    def _build_cost_database(self) -> Dict:
+        """
+        Build comprehensive cost database.
+        Structure: {jurisdiction: {activity: {first_year, annual}}}
+        All costs in USD.
+        """
+        return {
+            'us': {
+                'exchange': {
+                    'first_year': (200000, 500000),
+                    'annual': (100000, 250000),
+                    'breakdown': {
+                        'FinCEN MSB registration': (10000, 25000),
+                        'State MTL licenses': (150000, 400000),
+                        'Legal counsel': (25000, 50000),
+                        'Compliance staff': (75000, 150000),
+                        'AML/KYC systems': (50000, 100000)
+                    }
+                },
+                'custody': {
+                    'first_year': (100000, 300000),
+                    'annual': (50000, 150000),
+                    'breakdown': {
+                        'Custody infrastructure': (50000, 150000),
+                        'Insurance': (25000, 75000),
+                        'Audits': (15000, 50000),
+                        'Compliance program': (10000, 25000)
+                    }
+                },
+                'staking': {
+                    'first_year': (50000, 150000),
+                    'annual': (25000, 75000),
+                    'breakdown': {
+                        'Legal analysis': (25000, 75000),
+                        'Compliance monitoring': (15000, 50000),
+                        'Disclosure requirements': (10000, 25000)
+                    }
+                },
+                'lending': {
+                    'first_year': (100000, 250000),
+                    'annual': (50000, 125000),
+                    'breakdown': {
+                        'Securities review': (50000, 125000),
+                        'State lending licenses': (30000, 75000),
+                        'Compliance program': (20000, 50000)
+                    }
+                },
+                'token_issuance': {
+                    'first_year': (50000, 200000),
+                    'annual': (25000, 100000),
+                    'breakdown': {
+                        'Legal counsel': (30000, 100000),
+                        'SEC filing (if required)': (15000, 75000),
+                        'Ongoing reporting': (5000, 25000)
+                    }
+                },
+                'security_token_premium': {
+                    'first_year': (200000, 500000),
+                    'annual': (100000, 250000),
+                    'breakdown': {
+                        'SEC registration': (100000, 250000),
+                        'Transfer agent': (30000, 75000),
+                        'Legal opinions': (50000, 125000),
+                        'Compliance officer': (20000, 50000)
+                    }
+                },
+                'payment_processing': {
+                    'first_year': (75000, 200000),
+                    'annual': (40000, 100000),
+                    'breakdown': {
+                        'FinCEN registration': (10000, 20000),
+                        'State licenses': (50000, 150000),
+                        'AML compliance': (15000, 30000)
+                    }
+                },
+                'mining': {
+                    'first_year': (25000, 75000),
+                    'annual': (15000, 40000),
+                    'breakdown': {
+                        'Energy regulation compliance': (15000, 50000),
+                        'Tax reporting': (10000, 25000)
+                    }
+                },
+                'nft_marketplace': {
+                    'first_year': (40000, 100000),
+                    'annual': (20000, 50000),
+                    'breakdown': {
+                        'IP compliance': (15000, 40000),
+                        'Consumer protection': (15000, 40000),
+                        'Payment processing': (10000, 20000)
+                    }
+                },
+                'defi_protocol': {
+                    'first_year': (75000, 250000),
+                    'annual': (40000, 125000),
+                    'breakdown': {
+                        'Securities analysis': (50000, 150000),
+                        'Smart contract audits': (15000, 75000),
+                        'Legal counsel': (10000, 25000)
+                    }
+                },
+                'base': {
+                    'first_year': (50000, 100000),
+                    'annual': (25000, 50000),
+                    'breakdown': {
+                        'General counsel': (25000, 50000),
+                        'Compliance monitoring': (15000, 30000),
+                        'Record keeping': (10000, 20000)
+                    }
+                }
+            },
+            'eu': {
+                'exchange': {
+                    'first_year': (150000, 400000),
+                    'annual': (75000, 200000),
+                    'breakdown': {
+                        'MiCA CASP authorization': (100000, 300000),
+                        'AMLD5 compliance': (30000, 75000),
+                        'Legal counsel': (20000, 25000)
+                    }
+                },
+                'custody': {
+                    'first_year': (100000, 250000),
+                    'annual': (50000, 125000),
+                    'breakdown': {
+                        'Safeguarding requirements': (50000, 150000),
+                        'Insurance': (30000, 75000),
+                        'Compliance program': (20000, 25000)
+                    }
+                },
+                'staking': {
+                    'first_year': (50000, 125000),
+                    'annual': (25000, 60000),
+                    'breakdown': {
+                        'MiCA classification': (25000, 75000),
+                        'Disclosure requirements': (15000, 35000),
+                        'Ongoing monitoring': (10000, 15000)
+                    }
+                },
+                'lending': {
+                    'first_year': (75000, 200000),
+                    'annual': (40000, 100000),
+                    'breakdown': {
+                        'MiCA compliance': (50000, 125000),
+                        'Consumer credit rules': (15000, 50000),
+                        'Legal counsel': (10000, 25000)
+                    }
+                },
+                'token_issuance': {
+                    'first_year': (100000, 300000),
+                    'annual': (50000, 150000),
+                    'breakdown': {
+                        'White paper preparation': (50000, 150000),
+                        'Regulatory notification': (30000, 100000),
+                        'Ongoing disclosures': (20000, 50000)
+                    }
+                },
+                'base': {
+                    'first_year': (50000, 100000),
+                    'annual': (25000, 50000),
+                    'breakdown': {
+                        'General compliance': (30000, 60000),
+                        'Data protection (GDPR)': (20000, 40000)
+                    }
+                }
+            },
+            'singapore': {
+                'exchange': {
+                    'first_year': (150000, 350000),
+                    'annual': (75000, 175000),
+                    'breakdown': {
+                        'MAS DPT license': (100000, 250000),
+                        'Technology risk management': (30000, 75000),
+                        'Compliance program': (20000, 25000)
+                    }
+                },
+                'custody': {
+                    'first_year': (75000, 200000),
+                    'annual': (40000, 100000),
+                    'breakdown': {
+                        'Custody standards': (50000, 125000),
+                        'Insurance/reserves': (15000, 50000),
+                        'Audits': (10000, 25000)
+                    }
+                },
+                'staking': {
+                    'first_year': (40000, 100000),
+                    'annual': (20000, 50000),
+                    'breakdown': {
+                        'MAS guidelines': (25000, 60000),
+                        'Disclosure requirements': (10000, 30000),
+                        'Ongoing compliance': (5000, 10000)
+                    }
+                },
+                'lending': {
+                    'first_year': (60000, 150000),
+                    'annual': (30000, 75000),
+                    'breakdown': {
+                        'Regulatory assessment': (30000, 75000),
+                        'Compliance program': (20000, 50000),
+                        'Legal counsel': (10000, 25000)
+                    }
+                },
+                'token_issuance': {
+                    'first_year': (75000, 250000),
+                    'annual': (40000, 125000),
+                    'breakdown': {
+                        'MAS exemption/license': (50000, 175000),
+                        'Legal counsel': (15000, 50000),
+                        'Prospectus preparation': (10000, 25000)
+                    }
+                },
+                'base': {
+                    'first_year': (40000, 80000),
+                    'annual': (20000, 40000),
+                    'breakdown': {
+                        'General compliance': (25000, 50000),
+                        'Tax advisory': (15000, 30000)
+                    }
+                }
+            },
+            'uk': {
+                'exchange': {
+                    'first_year': (125000, 300000),
+                    'annual': (60000, 150000),
+                    'breakdown': {
+                        'FCA registration': (75000, 200000),
+                        'AML/CTF compliance': (30000, 75000),
+                        'Financial promotions': (20000, 25000)
+                    }
+                },
+                'custody': {
+                    'first_year': (75000, 200000),
+                    'annual': (40000, 100000),
+                    'breakdown': {
+                        'Custody requirements': (50000, 125000),
+                        'Client money rules': (15000, 50000),
+                        'Audits': (10000, 25000)
+                    }
+                },
+                'staking': {
+                    'first_year': (40000, 100000),
+                    'annual': (20000, 50000),
+                    'breakdown': {
+                        'FCA guidance': (20000, 50000),
+                        'Disclosure requirements': (15000, 35000),
+                        'Monitoring': (5000, 15000)
+                    }
+                },
+                'lending': {
+                    'first_year': (60000, 150000),
+                    'annual': (30000, 75000),
+                    'breakdown': {
+                        'Consumer credit rules': (35000, 90000),
+                        'FCA compliance': (15000, 40000),
+                        'Legal counsel': (10000, 20000)
+                    }
+                },
+                'token_issuance': {
+                    'first_year': (50000, 150000),
+                    'annual': (25000, 75000),
+                    'breakdown': {
+                        'Regulatory assessment': (30000, 90000),
+                        'Promotions compliance': (15000, 45000),
+                        'Legal opinions': (5000, 15000)
+                    }
+                },
+                'base': {
+                    'first_year': (40000, 75000),
+                    'annual': (20000, 40000),
+                    'breakdown': {
+                        'General compliance': (25000, 50000),
+                        'AML monitoring': (15000, 25000)
+                    }
+                }
+            },
+            'uae': {
+                'exchange': {
+                    'first_year': (100000, 250000),
+                    'annual': (50000, 125000),
+                    'breakdown': {
+                        'VARA VASP license': (60000, 150000),
+                        'Rulebook compliance': (25000, 75000),
+                        'Compliance program': (15000, 25000)
+                    }
+                },
+                'custody': {
+                    'first_year': (60000, 150000),
+                    'annual': (30000, 75000),
+                    'breakdown': {
+                        'VARA custody rules': (40000, 100000),
+                        'Insurance': (15000, 35000),
+                        'Monitoring': (5000, 15000)
+                    }
+                },
+                'staking': {
+                    'first_year': (30000, 80000),
+                    'annual': (15000, 40000),
+                    'breakdown': {
+                        'VARA guidance': (20000, 50000),
+                        'Disclosure': (10000, 25000)
+                    }
+                },
+                'lending': {
+                    'first_year': (50000, 125000),
+                    'annual': (25000, 60000),
+                    'breakdown': {
+                        'Regulatory compliance': (30000, 75000),
+                        'Legal counsel': (15000, 40000),
+                        'Monitoring': (5000, 10000)
+                    }
+                },
+                'token_issuance': {
+                    'first_year': (50000, 150000),
+                    'annual': (25000, 75000),
+                    'breakdown': {
+                        'VARA token rules': (35000, 100000),
+                        'White paper': (10000, 40000),
+                        'Ongoing disclosure': (5000, 10000)
+                    }
+                },
+                'base': {
+                    'first_year': (30000, 60000),
+                    'annual': (15000, 30000),
+                    'breakdown': {
+                        'General compliance': (20000, 40000),
+                        'AML/CTF': (10000, 20000)
+                    }
+                }
+            }
+        }
+    def calculate_costs(
+        self,
+        jurisdictions: List[str],
+        activities: List[str],
+        is_security_token: bool = False,
+        years: int = 3
+    ) -> Dict:
+        """
+        Calculate comprehensive cost estimates.
+        Args:
+            jurisdictions: List of jurisdictions
+            activities: List of activities
+            is_security_token: Whether token is a security
+            years: Number of years to project (default 3)
+        Returns:
+            Dictionary with cost breakdown
+        """
+        results = {}
+        for jurisdiction in jurisdictions:
+            if jurisdiction not in self.cost_database:
+                logger.warning(f"No cost data for jurisdiction: {jurisdiction}")
+                continue
+            costs = self.cost_database[jurisdiction]
+            # Initialize totals
+            first_year_min = 0
+            first_year_max = 0
+            annual_min = 0
+            annual_max = 0
+            all_breakdowns = []
+            # Base costs
+            if 'base' in costs:
+                first_year_min += costs['base']['first_year'][0]
+                first_year_max += costs['base']['first_year'][1]
+                annual_min += costs['base']['annual'][0]
+                annual_max += costs['base']['annual'][1]
+                all_breakdowns.append({
+                    'category': 'Base Compliance',
+                    'first_year': costs['base']['first_year'],
+                    'annual': costs['base']['annual'],
+                    'breakdown': costs['base'].get('breakdown', {})
+                })
+            # Activity-specific costs
+            for activity in activities:
+                if activity in costs:
+                    first_year_min += costs[activity]['first_year'][0]
+                    first_year_max += costs[activity]['first_year'][1]
+                    annual_min += costs[activity]['annual'][0]
+                    annual_max += costs[activity]['annual'][1]
+                    all_breakdowns.append({
+                        'category': activity.replace('_', ' ').title(),
+                        'first_year': costs[activity]['first_year'],
+                        'annual': costs[activity]['annual'],
+                        'breakdown': costs[activity].get('breakdown', {})
+                    })
+            # Security token premium (US only)
+            if is_security_token and jurisdiction == 'us' and 'security_token_premium' in costs:
+                premium = costs['security_token_premium']
+                first_year_min += premium['first_year'][0]
+                first_year_max += premium['first_year'][1]
+                annual_min += premium['annual'][0]
+                annual_max += premium['annual'][1]
+                all_breakdowns.append({
+                    'category': 'Security Token Premium',
+                    'first_year': premium['first_year'],
+                    'annual': premium['annual'],
+                    'breakdown': premium.get('breakdown', {})
+                })
+            # Multi-year projection
+            multi_year = []
+            for year in range(1, years + 1):
+                if year == 1:
+                    year_min = first_year_min
+                    year_max = first_year_max
+                else:
+                    year_min = annual_min
+                    year_max = annual_max
+                multi_year.append({
+                    'year': year,
+                    'min': year_min,
+                    'max': year_max,
+                    'estimate': (year_min + year_max) // 2
+                })
+            # Calculate totals
+            total_min = first_year_min + (annual_min * (years - 1))
+            total_max = first_year_max + (annual_max * (years - 1))
+            results[jurisdiction] = {
+                'first_year': {
+                    'min': first_year_min,
+                    'max': first_year_max,
+                    'estimate': (first_year_min + first_year_max) // 2
+                },
+                'annual_ongoing': {
+                    'min': annual_min,
+                    'max': annual_max,
+                    'estimate': (annual_min + annual_max) // 2
+                },
+                f'total_{years}_years': {
+                    'min': total_min,
+                    'max': total_max,
+                    'estimate': (total_min + total_max) // 2
+                },
+                'breakdown': all_breakdowns,
+                'multi_year_projection': multi_year
+            }
+        # Calculate grand total
+        grand_total_first = sum(
+            j['first_year']['estimate'] for j in results.values()
+        )
+        grand_total_annual = sum(
+            j['annual_ongoing']['estimate'] for j in results.values()
+        )
+        grand_total_multi = sum(
+            j[f'total_{years}_years']['estimate'] for j in results.values()
+        )
+        logger.info(
+            f"Cost calculation: {len(results)} jurisdictions, "
+            f"first year: ${grand_total_first:,}, "
+            f"{years}-year total: ${grand_total_multi:,}"
+        )
+        return {
+            'by_jurisdiction': results,
+            'grand_totals': {
+                'first_year': grand_total_first,
+                'annual_ongoing': grand_total_annual,
+                f'total_{years}_years': grand_total_multi
+            },
+            'parameters': {
+                'jurisdictions': jurisdictions,
+                'activities': activities,
+                'is_security_token': is_security_token,
+                'projection_years': years
+            },
+            'calculated_at': datetime.now().isoformat()
+        }
+# Convenience function
+def calculate_costs(
+    jurisdictions: List[str],
+    activities: List[str],
+    is_security_token: bool = False,
+    years: int = 3
+) -> Dict:
+    """Quick cost calculation."""
+    calculator = CostCalculator()
+    return calculator.calculate_costs(jurisdictions, activities, is_security_token, years)
+if __name__ == "__main__":
+    # Test the calculator
+    print("\n=== Testing Cost Calculator ===\n")
+    result = calculate_costs(
+        jurisdictions=['us', 'eu', 'singapore'],
+        activities=['exchange', 'custody'],
+        is_security_token=False,
+        years=3
+    )
+    print(f"Grand Totals:")
+    print(f"  First year: ${result['grand_totals']['first_year']:,}")
+    print(f"  Annual ongoing: ${result['grand_totals']['annual_ongoing']:,}")
+    print(f"  3-year total: ${result['grand_totals']['total_3_years']:,}")
+    print(f"\nBy Jurisdiction:")
+    for jur, data in result['by_jurisdiction'].items():
+        print(f"\n{jur.upper()}:")
+        print(f"  First year: ${data['first_year']['estimate']:,}")
+        print(f"  Annual: ${data['annual_ongoing']['estimate']:,}")
+        print(f"  3-year total: ${data['total_3_years']['estimate']:,}")
+    print("\n=== Calculator test complete! ===\n")

src/analysis/risk_scorer.py ADDED Viewed

	@@ -0,0 +1,454 @@

+"""
+Risk Scorer with configurable weights for compliance risk assessment.
+Provides structured, explainable risk scoring from 0-100.
+"""
+import logging
+from typing import Dict, List, Optional
+from datetime import datetime
+from enum import Enum
+logger = logging.getLogger(__name__)
+class RiskLevel(Enum):
+    """Risk level categories."""
+    LOW = "low"              # 0-40
+    MEDIUM = "medium"        # 41-70
+    HIGH = "high"            # 71-85
+    CRITICAL = "critical"    # 86-100
+class RiskFactor:
+    """Represents a single risk factor contribution."""
+    def __init__(
+        self,
+        name: str,
+        score: float,
+        weight: float,
+        weighted_score: float,
+        description: str
+    ):
+        self.name = name
+        self.score = score  # 0-100 for this factor
+        self.weight = weight  # Weight in final score
+        self.weighted_score = weighted_score  # score * weight
+        self.description = description
+    def to_dict(self) -> Dict:
+        """Convert to dictionary."""
+        return {
+            'name': self.name,
+            'score': round(self.score, 2),
+            'weight': round(self.weight, 2),
+            'weighted_score': round(self.weighted_score, 2),
+            'description': self.description
+        }
+class RiskScorer:
+    """
+    Configurable risk scorer for compliance analysis.
+    Calculates risk score (0-100) based on multiple weighted factors.
+    """
+    def __init__(self, custom_weights: Optional[Dict[str, float]] = None):
+        """
+        Initialize risk scorer.
+        Args:
+            custom_weights: Optional custom weights for factors
+                Format: {'gap_severity': 0.35, 'gap_count': 0.25, ...}
+        """
+        # Default weights (must sum to 1.0)
+        self.weights = {
+            'gap_severity': 0.35,      # Severity of compliance gaps
+            'gap_count': 0.25,         # Number of gaps
+            'token_classification': 0.15,  # Security token risk
+            'urgency': 0.15,           # Deadline urgency
+            'enforcement_history': 0.10    # Jurisdiction enforcement risk
+        }
+        # Override with custom weights if provided
+        if custom_weights:
+            self.weights.update(custom_weights)
+        # Normalize weights to sum to 1.0
+        total_weight = sum(self.weights.values())
+        if total_weight != 1.0:
+            logger.warning(f"Weights sum to {total_weight}, normalizing to 1.0")
+            self.weights = {k: v / total_weight for k, v in self.weights.items()}
+        # Enforcement risk scores by jurisdiction (0-100)
+        self.enforcement_risk = {
+            'us': 85,       # US: High enforcement, especially SEC
+            'eu': 70,       # EU: Moderate-high, MiCA implementation
+            'singapore': 60,  # Singapore: Moderate, clear framework
+            'uk': 65,       # UK: Moderate, post-Brexit evolution
+            'uae': 50       # UAE: Moderate-low, emerging framework
+        }
+        logger.info(f"RiskScorer initialized with weights: {self.weights}")
+    def calculate_risk(
+        self,
+        gaps: List[Dict],
+        token_classification: Optional[Dict] = None,
+        jurisdictions: Optional[List[str]] = None
+    ) -> Dict:
+        """
+        Calculate overall risk score from compliance gaps.
+        Args:
+            gaps: List of compliance gaps from ComplianceEngine
+            token_classification: Token classification result (if applicable)
+            jurisdictions: List of jurisdictions being analyzed
+        Returns:
+            Dictionary with risk score, level, and breakdown
+        """
+        factors = []
+        # Factor 1: Gap Severity
+        severity_score, severity_desc = self._score_gap_severity(gaps)
+        factors.append(RiskFactor(
+            name='Gap Severity',
+            score=severity_score,
+            weight=self.weights['gap_severity'],
+            weighted_score=severity_score * self.weights['gap_severity'],
+            description=severity_desc
+        ))
+        # Factor 2: Gap Count
+        count_score, count_desc = self._score_gap_count(gaps)
+        factors.append(RiskFactor(
+            name='Gap Count',
+            score=count_score,
+            weight=self.weights['gap_count'],
+            weighted_score=count_score * self.weights['gap_count'],
+            description=count_desc
+        ))
+        # Factor 3: Token Classification
+        token_score, token_desc = self._score_token_classification(token_classification)
+        factors.append(RiskFactor(
+            name='Token Classification',
+            score=token_score,
+            weight=self.weights['token_classification'],
+            weighted_score=token_score * self.weights['token_classification'],
+            description=token_desc
+        ))
+        # Factor 4: Urgency
+        urgency_score, urgency_desc = self._score_urgency(gaps)
+        factors.append(RiskFactor(
+            name='Urgency',
+            score=urgency_score,
+            weight=self.weights['urgency'],
+            weighted_score=urgency_score * self.weights['urgency'],
+            description=urgency_desc
+        ))
+        # Factor 5: Enforcement History
+        enforcement_score, enforcement_desc = self._score_enforcement(jurisdictions or [])
+        factors.append(RiskFactor(
+            name='Enforcement Risk',
+            score=enforcement_score,
+            weight=self.weights['enforcement_history'],
+            weighted_score=enforcement_score * self.weights['enforcement_history'],
+            description=enforcement_desc
+        ))
+        # Calculate total weighted score
+        total_score = sum(f.weighted_score for f in factors)
+        total_score = min(max(total_score, 0), 100)  # Clamp to 0-100
+        # Determine risk level
+        risk_level = self._get_risk_level(total_score)
+        result = {
+            'risk_score': round(total_score, 2),
+            'risk_level': risk_level.value,
+            'risk_category': risk_level.name,
+            'factors': [f.to_dict() for f in factors],
+            'summary': self._generate_summary(total_score, risk_level, factors),
+            'recommendations': self._generate_risk_recommendations(total_score, factors),
+            'calculated_at': datetime.now().isoformat()
+        }
+        logger.info(
+            f"Risk calculated: {total_score:.2f} ({risk_level.value}) from "
+            f"{len(gaps)} gaps, {len(jurisdictions or [])} jurisdictions"
+        )
+        return result
+    def _score_gap_severity(self, gaps: List[Dict]) -> tuple[float, str]:
+        """Score based on gap severity distribution."""
+        if not gaps:
+            return 0.0, "No compliance gaps identified"
+        severity_weights = {
+            'critical': 100,
+            'high': 75,
+            'medium': 50,
+            'low': 25
+        }
+        # Calculate weighted average severity
+        total_weight = 0
+        weighted_sum = 0
+        severity_counts = {'critical': 0, 'high': 0, 'medium': 0, 'low': 0}
+        for gap in gaps:
+            severity = gap.get('severity', 'medium')
+            severity_counts[severity] = severity_counts.get(severity, 0) + 1
+            weight = severity_weights.get(severity, 50)
+            weighted_sum += weight
+            total_weight += 1
+        score = weighted_sum / total_weight if total_weight > 0 else 0
+        desc = f"{severity_counts['critical']} critical, {severity_counts['high']} high, " \
+               f"{severity_counts['medium']} medium, {severity_counts['low']} low severity gaps"
+        return score, desc
+    def _score_gap_count(self, gaps: List[Dict]) -> tuple[float, str]:
+        """Score based on number of gaps."""
+        count = len(gaps)
+        # Score curve: 0 gaps = 0, 1-2 = 20, 3-4 = 40, 5-6 = 60, 7-8 = 80, 9+ = 100
+        if count == 0:
+            score = 0
+        elif count <= 2:
+            score = count * 10
+        elif count <= 4:
+            score = 20 + (count - 2) * 10
+        elif count <= 6:
+            score = 40 + (count - 4) * 10
+        elif count <= 8:
+            score = 60 + (count - 6) * 10
+        else:
+            score = min(80 + (count - 8) * 5, 100)
+        desc = f"{count} compliance gaps require remediation"
+        return score, desc
+    def _score_token_classification(
+        self,
+        classification: Optional[Dict]
+    ) -> tuple[float, str]:
+        """Score based on token classification."""
+        if not classification:
+            return 0.0, "No token issuance"
+        # Check if classified as security in any jurisdiction
+        is_security = False
+        security_jurisdictions = []
+        for jurisdiction, result in classification.items():
+            if result.get('classification') in ['security', 'capital markets product']:
+                is_security = True
+                security_jurisdictions.append(jurisdiction.upper())
+        if is_security:
+            score = 90.0  # Very high risk for unregistered securities
+            desc = f"Token classified as security in {', '.join(security_jurisdictions)}"
+        else:
+            score = 20.0  # Still some risk for utility tokens
+            desc = "Token classified as utility/non-security"
+        return score, desc
+    def _score_urgency(self, gaps: List[Dict]) -> tuple[float, str]:
+        """Score based on deadline urgency."""
+        if not gaps:
+            return 0.0, "No urgent deadlines"
+        urgency_weights = {
+            'immediate': 100,
+            'urgent': 75,
+            'moderate': 50,
+            'planning': 25
+        }
+        # Calculate weighted average urgency
+        total_weight = 0
+        weighted_sum = 0
+        urgency_counts = {'immediate': 0, 'urgent': 0, 'moderate': 0, 'planning': 0}
+        for gap in gaps:
+            urgency = gap.get('urgency', 'moderate')
+            urgency_counts[urgency] = urgency_counts.get(urgency, 0) + 1
+            weight = urgency_weights.get(urgency, 50)
+            weighted_sum += weight
+            total_weight += 1
+        score = weighted_sum / total_weight if total_weight > 0 else 0
+        desc = f"{urgency_counts['immediate']} immediate, {urgency_counts['urgent']} urgent deadlines"
+        return score, desc
+    def _score_enforcement(self, jurisdictions: List[str]) -> tuple[float, str]:
+        """Score based on enforcement risk of jurisdictions."""
+        if not jurisdictions:
+            return 0.0, "No jurisdictions specified"
+        # Average enforcement risk across jurisdictions
+        enforcement_scores = [
+            self.enforcement_risk.get(j, 50) for j in jurisdictions
+        ]
+        score = sum(enforcement_scores) / len(enforcement_scores)
+        highest_risk = max(jurisdictions, key=lambda j: self.enforcement_risk.get(j, 0))
+        desc = f"Operating in {len(jurisdictions)} jurisdictions, highest risk: {highest_risk.upper()}"
+        return score, desc
+    def _get_risk_level(self, score: float) -> RiskLevel:
+        """Determine risk level from score."""
+        if score >= 86:
+            return RiskLevel.CRITICAL
+        elif score >= 71:
+            return RiskLevel.HIGH
+        elif score >= 41:
+            return RiskLevel.MEDIUM
+        else:
+            return RiskLevel.LOW
+    def _generate_summary(
+        self,
+        score: float,
+        level: RiskLevel,
+        factors: List[RiskFactor]
+    ) -> str:
+        """Generate human-readable summary."""
+        top_factor = max(factors, key=lambda f: f.weighted_score)
+        if level == RiskLevel.CRITICAL:
+            return f"CRITICAL RISK ({score:.0f}/100): Immediate action required. " \
+                   f"Primary risk: {top_factor.name} ({top_factor.score:.0f}/100)."
+        elif level == RiskLevel.HIGH:
+            return f"HIGH RISK ({score:.0f}/100): Urgent compliance measures needed. " \
+                   f"Primary concern: {top_factor.name}."
+        elif level == RiskLevel.MEDIUM:
+            return f"MEDIUM RISK ({score:.0f}/100): Compliance improvements recommended. " \
+                   f"Focus on: {top_factor.name}."
+        else:
+            return f"LOW RISK ({score:.0f}/100): Minimal compliance concerns. " \
+                   f"Continue monitoring: {top_factor.name}."
+    def _generate_risk_recommendations(
+        self,
+        score: float,
+        factors: List[RiskFactor]
+    ) -> List[str]:
+        """Generate recommendations based on risk factors."""
+        recommendations = []
+        # Sort factors by weighted contribution
+        sorted_factors = sorted(factors, key=lambda f: f.weighted_score, reverse=True)
+        for factor in sorted_factors:
+            if factor.weighted_score > 15:  # Significant contributor
+                if factor.name == 'Gap Severity':
+                    recommendations.append(
+                        "Address critical severity gaps immediately with legal counsel"
+                    )
+                elif factor.name == 'Gap Count':
+                    recommendations.append(
+                        "Develop systematic compliance roadmap to address multiple gaps"
+                    )
+                elif factor.name == 'Token Classification':
+                    recommendations.append(
+                        "Consult securities lawyer immediately for token registration strategy"
+                    )
+                elif factor.name == 'Urgency':
+                    recommendations.append(
+                        "Prioritize urgent deadlines to avoid enforcement action"
+                    )
+                elif factor.name == 'Enforcement Risk':
+                    recommendations.append(
+                        "Consider jurisdiction risk in business strategy and timeline"
+                    )
+        # Overall recommendation based on score
+        if score >= 71:
+            recommendations.insert(
+                0,
+                "URGENT: Engage compliance counsel and consider delaying launch until gaps addressed"
+            )
+        elif score >= 41:
+            recommendations.insert(
+                0,
+                "Develop 90-day compliance action plan with clear milestones"
+            )
+        return recommendations[:5]  # Top 5 recommendations
+# Convenience function
+def calculate_risk(
+    gaps: List[Dict],
+    token_classification: Optional[Dict] = None,
+    jurisdictions: Optional[List[str]] = None
+) -> Dict:
+    """Quick risk calculation."""
+    scorer = RiskScorer()
+    return scorer.calculate_risk(gaps, token_classification, jurisdictions)
+if __name__ == "__main__":
+    # Test the scorer
+    print("\n=== Testing Risk Scorer ===\n")
+    sample_gaps = [
+        {
+            'requirement': 'FinCEN MSB Registration',
+            'severity': 'critical',
+            'urgency': 'immediate',
+            'jurisdiction': 'us'
+        },
+        {
+            'requirement': 'State MTL Licenses',
+            'severity': 'critical',
+            'urgency': 'urgent',
+            'jurisdiction': 'us'
+        },
+        {
+            'requirement': 'MiCA CASP Authorization',
+            'severity': 'high',
+            'urgency': 'moderate',
+            'jurisdiction': 'eu'
+        }
+    ]
+    result = calculate_risk(
+        gaps=sample_gaps,
+        token_classification={'us': {'classification': 'security'}},
+        jurisdictions=['us', 'eu']
+    )
+    print(f"Risk Score: {result['risk_score']}/100")
+    print(f"Risk Level: {result['risk_level'].upper()}")
+    print(f"\nSummary: {result['summary']}")
+    print(f"\nRisk Factors:")
+    for factor in result['factors']:
+        print(f"  - {factor['name']}: {factor['score']:.0f}/100 "
+              f"(weight: {factor['weight']:.0%}, contribution: {factor['weighted_score']:.1f})")
+    print(f"\nRecommendations:")
+    for i, rec in enumerate(result['recommendations'], 1):
+        print(f"  {i}. {rec}")
+    print("\n=== Scorer test complete! ===\n")

src/analysis/token_classifier.py ADDED Viewed

	@@ -0,0 +1,487 @@

+"""
+Token classification using the Howey Test and other regulatory frameworks.
+Determines if a crypto token is a security or utility token.
+"""
+import logging
+from typing import Dict, List, Optional, Tuple
+from datetime import datetime
+import re
+logger = logging.getLogger(__name__)
+class HoweyTestAnalyzer:
+    """
+    Analyzes tokens using the SEC's Howey Test.
+    The Howey Test has 4 prongs:
+    1. Investment of money
+    2. In a common enterprise
+    3. With an expectation of profits
+    4. Derived from the efforts of others
+    If all 4 are met, the token is likely a security.
+    """
+    def __init__(self):
+        """Initialize Howey Test analyzer."""
+        self.test_criteria = {
+            'investment_of_money': {
+                'keywords': [
+                    'purchase', 'buy', 'invest', 'sale', 'ico', 'token sale',
+                    'presale', 'crowdsale', 'fundraising', 'payment', 'contribute'
+                ],
+                'weight': 0.25
+            },
+            'common_enterprise': {
+                'keywords': [
+                    'pool', 'pooled', 'combined', 'collective', 'together',
+                    'treasury', 'ecosystem', 'platform', 'network', 'protocol'
+                ],
+                'weight': 0.25
+            },
+            'expectation_of_profits': {
+                'keywords': [
+                    'profit', 'returns', 'gains', 'appreciation', 'yield',
+                    'rewards', 'earnings', 'income', 'dividend', 'interest',
+                    'roi', 'return on investment', 'price increase'
+                ],
+                'weight': 0.25
+            },
+            'efforts_of_others': {
+                'keywords': [
+                    'team', 'development', 'management', 'founders', 'developers',
+                    'operated by', 'managed by', 'governance', 'roadmap',
+                    'build', 'create', 'maintain', 'improve', 'update'
+                ],
+                'weight': 0.25
+            }
+        }
+    def analyze_prong(self, text: str, prong_name: str) -> Tuple[bool, float, List[str]]:
+        """
+        Analyze a single Howey Test prong.
+        Args:
+            text: Token description/whitepaper text
+            prong_name: Name of the prong to analyze
+        Returns:
+            Tuple of (prong_met, confidence, evidence_keywords)
+        """
+        if prong_name not in self.test_criteria:
+            raise ValueError(f"Invalid prong: {prong_name}")
+        criteria = self.test_criteria[prong_name]
+        keywords = criteria['keywords']
+        text_lower = text.lower()
+        # Find matching keywords
+        matches = []
+        for keyword in keywords:
+            pattern = r'\b' + re.escape(keyword) + r'\b'
+            if re.search(pattern, text_lower):
+                matches.append(keyword)
+        # Calculate confidence based on match density
+        match_count = len(matches)
+        word_count = len(text_lower.split())
+        match_density = (match_count / (word_count / 100)) if word_count > 0 else 0
+        # Prong is "met" if we have multiple keyword matches
+        prong_met = match_count >= 2
+        confidence = min(match_density / 5, 1.0)  # Normalize to 0-1
+        return prong_met, confidence, matches
+    def run_howey_test(self, text: str) -> Dict:
+        """
+        Run full Howey Test analysis on token description.
+        Args:
+            text: Token description/whitepaper text
+        Returns:
+            Dictionary with test results
+        """
+        results = {
+            'prongs': {},
+            'prongs_met': 0,
+            'is_security': False,
+            'overall_confidence': 0.0,
+            'evidence': {},
+            'analysis_timestamp': datetime.now().isoformat()
+        }
+        # Analyze each prong
+        for prong_name in self.test_criteria.keys():
+            met, confidence, evidence = self.analyze_prong(text, prong_name)
+            results['prongs'][prong_name] = {
+                'met': met,
+                'confidence': confidence,
+                'evidence_count': len(evidence)
+            }
+            results['evidence'][prong_name] = evidence
+            if met:
+                results['prongs_met'] += 1
+        # Token is a security if all 4 prongs are met
+        results['is_security'] = results['prongs_met'] == 4
+        # Calculate overall confidence (average of prong confidences)
+        confidences = [p['confidence'] for p in results['prongs'].values()]
+        results['overall_confidence'] = sum(confidences) / len(confidences)
+        # Adjust confidence based on prongs met
+        if results['prongs_met'] < 4:
+            # Reduce confidence if not all prongs met
+            results['overall_confidence'] *= (results['prongs_met'] / 4)
+        logger.info(
+            f"Howey Test: {results['prongs_met']}/4 prongs met, "
+            f"is_security={results['is_security']}, "
+            f"confidence={results['overall_confidence']:.2f}"
+        )
+        return results
+class TokenClassifier:
+    """
+    Comprehensive token classifier using multiple frameworks.
+    - US: Howey Test
+    - EU: MiCA classification
+    - Singapore: DPT classification
+    """
+    def __init__(self):
+        """Initialize token classifier."""
+        self.howey_analyzer = HoweyTestAnalyzer()
+        logger.info("TokenClassifier initialized")
+    def classify_us(self, token_description: str) -> Dict:
+        """
+        Classify token under US law (SEC Howey Test).
+        Args:
+            token_description: Description of token mechanics
+        Returns:
+            Classification result
+        """
+        howey_result = self.howey_analyzer.run_howey_test(token_description)
+        classification = {
+            'jurisdiction': 'us',
+            'framework': 'SEC Howey Test',
+            'classification': 'security' if howey_result['is_security'] else 'utility',
+            'confidence': howey_result['overall_confidence'],
+            'howey_test': howey_result,
+            'regulatory_implications': self._get_us_implications(howey_result)
+        }
+        return classification
+    def classify_eu(self, token_description: str) -> Dict:
+        """
+        Classify token under EU MiCA framework.
+        Args:
+            token_description: Token description
+        Returns:
+            Classification result
+        """
+        text_lower = token_description.lower()
+        # MiCA categories
+        is_utility_token = any([
+            'access' in text_lower,
+            'usage' in text_lower,
+            'service' in text_lower,
+            'platform access' in text_lower
+        ])
+        is_asset_referenced = any([
+            'backed' in text_lower,
+            'pegged' in text_lower,
+            'collateralized' in text_lower,
+            'reserve' in text_lower
+        ])
+        is_e_money = any([
+            'fiat' in text_lower,
+            'currency' in text_lower,
+            'stablecoin' in text_lower,
+            'payment' in text_lower
+        ])
+        # Determine primary category
+        if is_e_money:
+            category = 'e-money token'
+        elif is_asset_referenced:
+            category = 'asset-referenced token'
+        elif is_utility_token:
+            category = 'utility token'
+        else:
+            category = 'crypto-asset'  # Default MiCA category
+        classification = {
+            'jurisdiction': 'eu',
+            'framework': 'MiCA',
+            'classification': category,
+            'confidence': 0.6,  # Lower confidence for heuristic classification
+            'mica_categories': {
+                'utility_token': is_utility_token,
+                'asset_referenced_token': is_asset_referenced,
+                'e_money_token': is_e_money
+            },
+            'regulatory_implications': self._get_eu_implications(category)
+        }
+        return classification
+    def classify_singapore(self, token_description: str) -> Dict:
+        """
+        Classify token under Singapore MAS framework.
+        Args:
+            token_description: Token description
+        Returns:
+            Classification result
+        """
+        text_lower = token_description.lower()
+        # MAS Payment Services Act - Digital Payment Token (DPT)
+        is_dpt = any([
+            'payment' in text_lower,
+            'medium of exchange' in text_lower,
+            'store of value' in text_lower,
+            'transfer' in text_lower
+        ])
+        is_capital_markets_product = any([
+            'security' in text_lower,
+            'investment' in text_lower,
+            'profit' in text_lower,
+            'return' in text_lower,
+            'dividend' in text_lower
+        ])
+        # Determine category
+        if is_capital_markets_product:
+            category = 'capital markets product'
+        elif is_dpt:
+            category = 'digital payment token'
+        else:
+            category = 'unregulated token'
+        classification = {
+            'jurisdiction': 'singapore',
+            'framework': 'MAS PSA',
+            'classification': category,
+            'confidence': 0.6,
+            'mas_categories': {
+                'digital_payment_token': is_dpt,
+                'capital_markets_product': is_capital_markets_product
+            },
+            'regulatory_implications': self._get_singapore_implications(category)
+        }
+        return classification
+    def classify_all_jurisdictions(self, token_description: str) -> Dict:
+        """
+        Classify token across all supported jurisdictions.
+        Args:
+            token_description: Token description/whitepaper text
+        Returns:
+            Dictionary of classifications per jurisdiction
+        """
+        return {
+            'us': self.classify_us(token_description),
+            'eu': self.classify_eu(token_description),
+            'singapore': self.classify_singapore(token_description),
+            'summary': self._generate_summary(token_description)
+        }
+    def _get_us_implications(self, howey_result: Dict) -> List[str]:
+        """Get regulatory implications for US classification."""
+        implications = []
+        if howey_result['is_security']:
+            implications.extend([
+                "Token is likely a security under US law",
+                "Must register with SEC or qualify for exemption",
+                "Consider Regulation D (private placement) or Regulation A+",
+                "Must comply with securities laws for trading",
+                "May need broker-dealer registration for exchanges"
+            ])
+        else:
+            implications.extend([
+                "Token may be a utility token (not a security)",
+                "Still subject to FinCEN MSB registration if used for payments",
+                "State money transmitter licenses may be required",
+                "Consumer protection laws still apply",
+                "Monitor SEC guidance - classification can change"
+            ])
+        return implications
+    def _get_eu_implications(self, category: str) -> List[str]:
+        """Get regulatory implications for EU classification."""
+        implications_map = {
+            'e-money token': [
+                "Subject to strict MiCA e-money token requirements",
+                "Need authorization as e-money institution",
+                "Must maintain 1:1 backing with fiat reserves",
+                "Enhanced consumer protection requirements",
+                "Effective from June 2024"
+            ],
+            'asset-referenced token': [
+                "Subject to MiCA asset-referenced token regime",
+                "Must maintain reserve of referenced assets",
+                "Requires authorization from regulator",
+                "Ongoing reporting and transparency requirements",
+                "White paper must be approved"
+            ],
+            'utility token': [
+                "Lower regulatory burden under MiCA",
+                "Still requires white paper publication",
+                "Consumer protection rules apply",
+                "Marketing restrictions apply",
+                "Effective from July 2024"
+            ],
+            'crypto-asset': [
+                "General MiCA crypto-asset rules apply",
+                "CASP authorization needed for services",
+                "White paper required for public offerings",
+                "AML/CTF compliance mandatory"
+            ]
+        }
+        return implications_map.get(category, ["MiCA framework applies"])
+    def _get_singapore_implications(self, category: str) -> List[str]:
+        """Get regulatory implications for Singapore classification."""
+        implications_map = {
+            'digital payment token': [
+                "Requires DPT service provider license from MAS",
+                "Must comply with Payment Services Act",
+                "AML/CFT requirements apply",
+                "Technology risk management guidelines",
+                "Fit and proper criteria for operators"
+            ],
+            'capital markets product': [
+                "Subject to Securities and Futures Act",
+                "Requires CMS license from MAS",
+                "Prospectus or exemption required",
+                "Ongoing reporting obligations",
+                "Higher regulatory scrutiny"
+            ],
+            'unregulated token': [
+                "May not require MAS licensing",
+                "Still subject to general laws",
+                "Monitor for regulatory changes",
+                "Consumer protection laws apply"
+            ]
+        }
+        return implications_map.get(category, ["Review MAS guidelines"])
+    def _generate_summary(self, token_description: str) -> Dict:
+        """Generate summary across jurisdictions."""
+        us_result = self.classify_us(token_description)
+        eu_result = self.classify_eu(token_description)
+        sg_result = self.classify_singapore(token_description)
+        is_security_anywhere = (
+            us_result['classification'] == 'security' or
+            sg_result['classification'] == 'capital markets product'
+        )
+        return {
+            'is_security_anywhere': is_security_anywhere,
+            'most_restrictive_jurisdiction': 'us' if us_result['classification'] == 'security' else 'eu',
+            'classifications': {
+                'us': us_result['classification'],
+                'eu': eu_result['classification'],
+                'singapore': sg_result['classification']
+            },
+            'recommendation': (
+                "Consult securities lawyer immediately - token appears to be a security"
+                if is_security_anywhere else
+                "Token may qualify as utility token, but verify with legal counsel"
+            )
+        }
+# Convenience function
+def classify_token(token_description: str, jurisdiction: Optional[str] = None) -> Dict:
+    """
+    Quick classify a token.
+    Args:
+        token_description: Token description/whitepaper
+        jurisdiction: Specific jurisdiction ('us', 'eu', 'singapore') or None for all
+    Returns:
+        Classification result
+    """
+    classifier = TokenClassifier()
+    if jurisdiction:
+        if jurisdiction == 'us':
+            return classifier.classify_us(token_description)
+        elif jurisdiction == 'eu':
+            return classifier.classify_eu(token_description)
+        elif jurisdiction == 'singapore':
+            return classifier.classify_singapore(token_description)
+        else:
+            raise ValueError(f"Unsupported jurisdiction: {jurisdiction}")
+    else:
+        return classifier.classify_all_jurisdictions(token_description)
+if __name__ == "__main__":
+    # Example usage
+    sample_token = """
+    Our governance token allows holders to vote on protocol upgrades and earn
+    rewards from transaction fees. Tokens are sold in a public sale at $0.50 each.
+    The development team will use funds to build the platform and market the product.
+    Early investors expect significant returns as the platform grows and token value
+    appreciates. The team manages the treasury and executes the roadmap.
+    """
+    print("\n=== Token Classification ===\n")
+    # Full analysis
+    results = classify_token(sample_token)
+    print("US Classification:")
+    us = results['us']
+    print(f"  Classification: {us['classification']}")
+    print(f"  Confidence: {us['confidence']:.2f}")
+    print(f"  Howey Test: {us['howey_test']['prongs_met']}/4 prongs met")
+    print("\nEU Classification:")
+    eu = results['eu']
+    print(f"  Classification: {eu['classification']}")
+    print(f"  Confidence: {eu['confidence']:.2f}")
+    print("\nSingapore Classification:")
+    sg = results['singapore']
+    print(f"  Classification: {sg['classification']}")
+    print(f"  Confidence: {sg['confidence']:.2f}")
+    print("\nSummary:")
+    summary = results['summary']
+    print(f"  Is security anywhere: {summary['is_security_anywhere']}")
+    print(f"  Recommendation: {summary['recommendation']}")

src/config.py ADDED Viewed

	@@ -0,0 +1,70 @@

+"""
+Configuration module for the Crypto Compliance Intelligence Agent.
+Loads environment variables and provides centralized configuration.
+"""
+import os
+from dotenv import load_dotenv
+from pathlib import Path
+# Load environment variables from .env file
+load_dotenv()
+class Config:
+    """
+    Central configuration class for the application.
+    All configuration parameters are loaded from environment variables.
+    """
+    # API Keys
+    GEMINI_API_KEY = os.getenv("GEMINI_API_KEY")
+    HF_TOKEN = os.getenv("HF_TOKEN")
+    GITHUB_TOKEN = os.getenv("GITHUB_TOKEN")
+    # Model Configuration
+    EMBEDDING_MODEL = os.getenv("EMBEDDING_MODEL", "sentence-transformers/all-MiniLM-L6-v2")
+    FINBERT_MODEL = os.getenv("FINBERT_MODEL", "ProsusAI/finbert")
+    LEGAL_BERT_MODEL = os.getenv("LEGAL_BERT_MODEL", "nlpaueb/legal-bert-base-uncased")
+    # ChromaDB Configuration
+    CHROMA_PERSIST_DIR = os.getenv("CHROMA_PERSIST_DIR", "./data/chroma_db")
+    COLLECTION_NAME = os.getenv("COLLECTION_NAME", "regulations_kb")
+    # Gemini Configuration
+    GEMINI_MODEL = os.getenv("GEMINI_MODEL", "gemini-2.0-flash-exp")
+    GEMINI_TEMPERATURE = float(os.getenv("GEMINI_TEMPERATURE", "0.1"))
+    GEMINI_MAX_TOKENS = int(os.getenv("GEMINI_MAX_TOKENS", "8192"))
+    # Application Settings
+    PROJECT_ROOT = Path(__file__).parent.parent
+    DATA_DIR = PROJECT_ROOT / "data"
+    REGULATIONS_DIR = DATA_DIR / "regulations"
+    # Jurisdictions supported
+    SUPPORTED_JURISDICTIONS = ["us", "eu", "singapore", "uk", "uae"]
+    # Logging Configuration
+    LOG_LEVEL = os.getenv("LOG_LEVEL", "INFO")
+    LOG_FORMAT = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
+    @classmethod
+    def validate(cls):
+        """
+        Validate that all required configuration is present.
+        Raises ValueError if critical configuration is missing.
+        """
+        if not cls.GEMINI_API_KEY:
+            raise ValueError("GEMINI_API_KEY is required but not set in environment variables")
+        # Ensure required directories exist
+        cls.DATA_DIR.mkdir(exist_ok=True, parents=True)
+        cls.REGULATIONS_DIR.mkdir(exist_ok=True, parents=True)
+        for jurisdiction in cls.SUPPORTED_JURISDICTIONS:
+            jurisdiction_dir = cls.REGULATIONS_DIR / jurisdiction
+            jurisdiction_dir.mkdir(exist_ok=True, parents=True)
+        return True
+# Validate configuration on import
+Config.validate()

src/data/__init__.py ADDED Viewed

File without changes

src/data/__pycache__/__init__.cpython-313.pyc ADDED Viewed

Binary file (196 Bytes). View file

src/data/__pycache__/vectordb.cpython-313.pyc ADDED Viewed

Binary file (14.7 kB). View file

src/data/regulations/eu/mica-asset-referenced-tokens-2024.json ADDED Viewed

	@@ -0,0 +1,266 @@

+{
+  "id": "eu-mica-asset-referenced-tokens-2024",
+  "jurisdiction": "eu",
+  "agency": "ESMA (European Securities and Markets Authority) / MiCA Framework",
+  "title": "MiCA Regulation - Asset-Referenced Tokens (ARTs) for Real Estate",
+  "summary": "EU Markets in Crypto-Assets (MiCA) regulation requirements for asset-referenced tokens backed by real estate or other assets. Covers authorization requirements, reserve management, white paper publication, and ongoing supervision for ARTs. Effective June 30, 2024 with transitional period through December 2024.",
+  "status": "effective",
+  "announced_date": "2023-05-31",
+  "effective_date": "2024-06-30",
+  "last_updated": "2024-09-01",
+  "source_url": "https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32023R1114",
+  "full_text_url": "https://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:32023R1114",
+  "crypto_activities_affected": [
+    "tokenization",
+    "asset-referenced-tokens",
+    "issuance",
+    "custody",
+    "secondary-markets"
+  ],
+  "tags": [
+    "mica",
+    "asset-referenced-tokens",
+    "real-estate",
+    "authorization",
+    "white-paper",
+    "reserve-requirements",
+    "esma"
+  ],
+  "requirements": [
+    {
+      "requirement": "ART Issuer Authorization from National Competent Authority",
+      "description": "Must obtain authorization as ART issuer from national competent authority (NCA) in home member state (e.g., BaFin in Germany, AMF in France). Application requires: business plan, governance arrangements, risk management framework, IT systems description, recovery plan. Authorization process: 6-12 months. Application fee varies by country (€10K-50K). Issuers established in third countries must appoint legal representative in EU.",
+      "mandatory": true,
+      "deadline_days": 0,
+      "estimated_cost_usd": {
+        "min": 75000,
+        "max": 150000,
+        "currency": "EUR",
+        "notes": "NCA application fee (€10K-50K) + legal preparation (€50K-100K) + consultancy fees"
+      },
+      "severity": "critical",
+      "exemptions": ["Credit institutions and electronic money institutions already authorized under existing EU framework"]
+    },
+    {
+      "requirement": "Minimum Own Funds Requirement",
+      "description": "ART issuers must maintain own funds equal to greater of: (1) €350,000, OR (2) 2% of average amount of reserve assets, OR (3) 25% of fixed overheads of preceding year. Own funds must be held in cash or highly liquid financial instruments. For real estate-backed ARTs exceeding €100M circulation, requirement may increase to 3% of reserves. Annual audit required.",
+      "mandatory": true,
+      "deadline_days": 0,
+      "estimated_cost_usd": {
+        "min": 350000,
+        "max": 2000000,
+        "currency": "EUR",
+        "notes": "Minimum €350K but scales with reserve size - real estate ARTs typically €500K-2M"
+      },
+      "severity": "critical",
+      "exemptions": []
+    },
+    {
+      "requirement": "Crypto-Asset White Paper Publication and Notification",
+      "description": "Must prepare and publish white paper containing: (1) issuer information and governance, (2) detailed description of real estate assets backing tokens, (3) rights and obligations of token holders, (4) stabilization mechanism, (5) reserve management policy, (6) comprehensive risk warnings (minimum 15 categories), (7) environmental/climate impact. White paper must be notified to NCA 20 business days before publication. NCA may prohibit or suspend offering. White paper valid for 12 months. Filing fee: €5K-20K per jurisdiction.",
+      "mandatory": true,
+      "deadline_days": 60,
+      "estimated_cost_usd": {
+        "min": 40000,
+        "max": 100000,
+        "currency": "EUR",
+        "notes": "Legal drafting (€30K-70K) + NCA notification fees (€5K-20K) + translations for multi-country offerings"
+      },
+      "severity": "critical",
+      "exemptions": ["Offerings to qualified investors only (but must still file white paper with NCA)"]
+    },
+    {
+      "requirement": "Reserve of Assets Management and Segregation",
+      "description": "Must maintain reserve of assets with value equal to at least 100% of circulation value of ARTs. Reserve composition: (1) at least 30% in cash deposits at credit institutions, (2) remainder in highly liquid financial instruments with minimal credit risk. For real estate ARTs: property valuations must be updated quarterly by independent appraiser. Reserve assets must be legally segregated and held by authorized custodian. Daily reconciliation and monthly reports to NCA required. Reserve cannot be pledged or encumbered.",
+      "mandatory": true,
+      "deadline_days": 90,
+      "estimated_cost_usd": {
+        "min": 60000,
+        "max": 150000,
+        "currency": "EUR",
+        "notes": "Custodian setup + quarterly property valuations (€15K-40K each) + compliance systems + legal segregation structure"
+      },
+      "severity": "critical",
+      "exemptions": []
+    },
+    {
+      "requirement": "Redemption Rights and Mechanisms",
+      "description": "Token holders must have right to redeem ARTs for reserve assets or fiat equivalent at any time, at least once per month. Redemption price based on fair market value of underlying real estate (minus reasonable fees up to 5%). Must maintain liquid reserves (30% of total) to meet redemption requests. If redemptions exceed 25% of reserves in 7-day period, issuer may suspend redemptions for up to 3 months with NCA approval. Smart contract must enforce redemption rights automatically.",
+      "mandatory": true,
+      "deadline_days": 90,
+      "estimated_cost_usd": {
+        "min": 25000,
+        "max": 60000,
+        "currency": "EUR",
+        "notes": "Smart contract development for redemption mechanism + liquidity management systems"
+      },
+      "severity": "high",
+      "exemptions": []
+    },
+    {
+      "requirement": "Conflicts of Interest and Governance Requirements",
+      "description": "Must establish management body with at least 2 members (4 if ART circulation exceeds €100M). Independent audit committee required. Conflict of interest policy must address: related party transactions, property valuations, reserve management. AML/CFT compliance officer required (separate from management). Internal audit function required for issuers exceeding €50M circulation. Fit and proper assessments for all board members.",
+      "mandatory": true,
+      "deadline_days": 120,
+      "estimated_cost_usd": {
+        "min": 80000,
+        "max": 200000,
+        "currency": "EUR",
+        "notes": "First year: board member recruitment + compliance officer salary (€60K-120K) + governance setup + fit-and-proper assessments"
+      },
+      "severity": "high",
+      "exemptions": []
+    },
+    {
+      "requirement": "Marketing Communications and Advertising Rules",
+      "description": "All marketing must: (1) be clearly identifiable as such, (2) be fair, clear, not misleading, (3) include risk warnings, (4) state ART issuer name and authorization status, (5) reference white paper. Prohibited claims: capital protection, guaranteed returns, comparison to securities unless fair. Marketing via social media influencers must disclose commercial relationship. NCA may require pre-approval of marketing materials. Non-compliant ads subject to €5M fines or 3% of annual turnover.",
+      "mandatory": true,
+      "deadline_days": 45,
+      "estimated_cost_usd": {
+        "min": 20000,
+        "max": 50000,
+        "currency": "EUR",
+        "notes": "Legal review of all marketing materials + compliance procedures + risk warning templates"
+      },
+      "severity": "medium",
+      "exemptions": []
+    },
+    {
+      "requirement": "Orderly Wind-Down Plan and Recovery Arrangements",
+      "description": "Must maintain detailed plan for orderly wind-down in case of insolvency or license revocation. Plan must include: token redemption process, reserve asset liquidation timeline (realistic for real estate: 12-24 months), communication to token holders, appointment of liquidator. Recovery plan required for ARTs exceeding €100M circulation, including stress testing scenarios (property value decline, mass redemptions, custody failure). Annual plan updates required.",
+      "mandatory": true,
+      "deadline_days": 180,
+      "estimated_cost_usd": {
+        "min": 30000,
+        "max": 80000,
+        "currency": "EUR",
+        "notes": "Legal and financial advisors for wind-down and recovery planning + stress testing"
+      },
+      "severity": "high",
+      "exemptions": []
+    },
+    {
+      "requirement": "Reporting and Transparency Obligations",
+      "description": "Ongoing reporting requirements: (1) quarterly reports to NCA (reserve composition, token circulation, redemptions), (2) annual audited financial statements, (3) publish quarterly reserve attestation on website, (4) immediate notification of material events (property damage, valuation changes >10%, technical failures, security breaches). Token holder reporting: quarterly updates on property performance and valuations. Public disclosure of any changes to white paper within 24 hours.",
+      "mandatory": true,
+      "deadline_days": 0,
+      "estimated_cost_usd": {
+        "min": 40000,
+        "max": 100000,
+        "currency": "EUR",
+        "notes": "Annual ongoing: external audit (€20K-50K) + quarterly reporting + compliance monitoring + disclosure systems"
+      },
+      "severity": "medium",
+      "exemptions": []
+    },
+    {
+      "requirement": "Cybersecurity and Operational Resilience (DORA Compliance)",
+      "description": "Must comply with Digital Operational Resilience Act (DORA) requirements: (1) ICT risk management framework, (2) incident reporting (within 4 hours for major incidents), (3) digital operational resilience testing (including penetration tests), (4) third-party ICT risk management, (5) threat-led penetration testing (TLPT) for significant ARTs. DORA compliance deadline: January 17, 2025. Non-compliance penalties up to ���10M or 2% of global turnover.",
+      "mandatory": true,
+      "deadline_days": 365,
+      "estimated_cost_usd": {
+        "min": 75000,
+        "max": 200000,
+        "currency": "EUR",
+        "notes": "DORA compliance program + penetration testing + incident response systems + third-party risk management"
+      },
+      "severity": "high",
+      "exemptions": []
+    },
+    {
+      "requirement": "AML/CFT Compliance Under 6AMLD",
+      "description": "Must comply with EU's 6th Anti-Money Laundering Directive (6AMLD) and Transfer of Funds Regulation (TFR). Requirements: (1) customer due diligence (CDD) for all token holders, (2) enhanced due diligence (EDD) for high-risk customers (PEPs, high-value transactions >€1000), (3) transaction monitoring and suspicious activity reporting (SAR) to FIU, (4) Travel Rule compliance for transfers >€1000, (5) sanctions screening (EU, OFAC, UN), (6) record retention 5 years. AML officer and independent audit required.",
+      "mandatory": true,
+      "deadline_days": 120,
+      "estimated_cost_usd": {
+        "min": 60000,
+        "max": 150000,
+        "currency": "EUR",
+        "notes": "First year: AML/KYC system (€30K-70K) + compliance officer + training + transaction monitoring tools"
+      },
+      "severity": "critical",
+      "exemptions": []
+    }
+  ],
+  "penalties": [
+    {
+      "violation": "Operating Without ART Issuer Authorization",
+      "penalty_type": "Administrative Fines + Criminal Prosecution",
+      "amount_usd": {
+        "min": 500000,
+        "max": 5000000,
+        "notes": "Administrative fines up to €5M or 3% of total annual turnover + criminal prosecution in some member states (imprisonment up to 5 years) + disgorgement of profits"
+      },
+      "additional_consequences": [
+        "Immediate cease-and-desist order",
+        "Token holder redemption rights at issuance price",
+        "Criminal prosecution (varies by member state)",
+        "Permanent ban from crypto-asset activities in EU",
+        "Asset seizure and freezing orders"
+      ]
+    },
+    {
+      "violation": "Inadequate Reserve Management or Commingling",
+      "penalty_type": "Administrative Fines + License Revocation",
+      "amount_usd": {
+        "min": 250000,
+        "max": 2500000,
+        "notes": "Fines up to €2.5M or 2% of annual turnover + license revocation + mandatory wind-down + civil liability to token holders"
+      },
+      "additional_consequences": [
+        "Authorization revocation and mandatory wind-down",
+        "Civil liability for token holder losses",
+        "Enhanced supervision and restrictions",
+        "NCA-appointed special administrator"
+      ]
+    },
+    {
+      "violation": "False or Misleading White Paper Information",
+      "penalty_type": "Administrative Fines + Civil Liability",
+      "amount_usd": {
+        "min": 100000,
+        "max": 1000000,
+        "notes": "Fines up to €1M or 1% of annual turnover + civil liability to investors who relied on false information + potential criminal fraud charges"
+      },
+      "additional_consequences": [
+        "Suspension of ART offerings",
+        "Civil lawsuits from token holders (rescission + damages)",
+        "Mandatory white paper corrections and re-publication",
+        "Reputational damage and loss of authorization"
+      ]
+    },
+    {
+      "violation": "Failure to Comply with AML/CFT Requirements",
+      "penalty_type": "Administrative Fines + Criminal Penalties",
+      "amount_usd": {
+        "min": 500000,
+        "max": 5000000,
+        "notes": "AML fines up to €5M or 10% of annual turnover (whichever higher) + criminal prosecution for money laundering facilitation + license revocation"
+      },
+      "additional_consequences": [
+        "Criminal prosecution for money laundering (imprisonment up to 10 years)",
+        "Immediate license suspension or revocation",
+        "Financial intelligence unit (FIU) investigation",
+        "Permanent industry ban",
+        "Sanctions screening violations may trigger additional EU/UN penalties"
+      ]
+    }
+  ],
+  "regulatory_guidance": [
+    "MiCA distinguishes ARTs from e-money tokens (EMTs) - real estate tokens typically classified as ARTs",
+    "If ART is also deemed financial instrument under MiFID II, additional requirements may apply",
+    "Issuers may choose home member state for authorization (jurisdiction shopping permitted)",
+    "Once authorized in one EU member state, passporting rights allow operations across entire EU/EEA",
+    "Real estate ARTs face challenges with reserve management due to property illiquidity - NCAs may require higher liquid reserves (40-50%)",
+    "ESMA developing regulatory technical standards (RTS) for reserve management - expected Q1 2025",
+    "Transitional period until December 30, 2024 for existing issuers to achieve compliance"
+  ],
+  "related_regulations": [
+    "eu-mica-casp-requirements-2024",
+    "eu-dora-operational-resilience-2025",
+    "eu-6amld-aml-2024",
+    "eu-tfr-travel-rule-2024"
+  ],
+  "confidence": 0.92,
+  "notes": "MiCA's ART framework presents significant challenges for real estate tokenization due to illiquid nature of property assets versus required liquid reserves and monthly redemption rights. Total first-year compliance costs for real estate ART: €800K-2M. Ongoing annual costs: €300K-600K. Many real estate tokenization projects may instead pursue MiFID II 'financial instrument' classification or limit offerings to qualified investors to reduce compliance burden."
+}

src/data/regulations/schema.json ADDED Viewed

	@@ -0,0 +1,178 @@

+{
+  "$schema": "http://json-schema.org/draft-07/schema#",
+  "title": "Regulatory Rule Schema",
+  "description": "Schema for crypto regulatory rules across jurisdictions",
+  "type": "object",
+  "required": ["id", "title", "jurisdiction", "agency", "status", "summary"],
+  "properties": {
+    "id": {
+      "type": "string",
+      "description": "Unique identifier: {jurisdiction}-{year}-{short-name}",
+      "pattern": "^(us|eu|singapore|uk|uae)-[0-9]{4}-[a-z0-9-]+$",
+      "examples": ["us-2024-custody", "eu-2024-mica-implementation"]
+    },
+    "title": {
+      "type": "string",
+      "description": "Official title of the regulation",
+      "examples": ["Crypto Custody Rule", "Markets in Crypto-Assets Regulation"]
+    },
+    "jurisdiction": {
+      "type": "string",
+      "enum": ["us", "eu", "singapore", "uk", "uae"],
+      "description": "Jurisdiction where regulation applies"
+    },
+    "agency": {
+      "type": "string",
+      "description": "Regulatory agency issuing the rule",
+      "examples": ["SEC", "MiCA", "MAS", "FCA", "VARA"]
+    },
+    "status": {
+      "type": "string",
+      "enum": ["proposed", "effective", "repealed", "under_review"],
+      "description": "Current status of the regulation"
+    },
+    "announced_date": {
+      "type": "string",
+      "format": "date",
+      "description": "Date regulation was announced (ISO 8601)"
+    },
+    "comment_deadline": {
+      "type": "string",
+      "format": "date",
+      "description": "Deadline for public comments (if applicable)"
+    },
+    "effective_date": {
+      "type": "string",
+      "description": "When regulation becomes effective (ISO date or 'Q1 2025')"
+    },
+    "summary": {
+      "type": "string",
+      "description": "Brief summary of the regulation (2-3 sentences)"
+    },
+    "full_text_url": {
+      "type": "string",
+      "format": "uri",
+      "description": "URL to official regulation text"
+    },
+    "crypto_activities_affected": {
+      "type": "array",
+      "items": {
+        "type": "string",
+        "enum": [
+          "exchange",
+          "custody",
+          "staking",
+          "lending",
+          "borrowing",
+          "nft_marketplace",
+          "defi_protocol",
+          "payment_processing",
+          "token_issuance",
+          "mining",
+          "wallet_services",
+          "otc_trading",
+          "derivatives",
+          "dao"
+        ]
+      },
+      "description": "List of crypto activities this regulation affects"
+    },
+    "requirements": {
+      "type": "array",
+      "items": {
+        "type": "object",
+        "properties": {
+          "requirement": {
+            "type": "string",
+            "description": "Specific requirement text"
+          },
+          "applies_to": {
+            "type": "array",
+            "items": { "type": "string" },
+            "description": "Which activities this requirement applies to"
+          },
+          "deadline": {
+            "type": "string",
+            "description": "Deadline for compliance"
+          }
+        }
+      },
+      "description": "Specific compliance requirements"
+    },
+    "penalties": {
+      "type": "object",
+      "properties": {
+        "fines_min": {
+          "type": "number",
+          "description": "Minimum fine amount (USD)"
+        },
+        "fines_max": {
+          "type": "number",
+          "description": "Maximum fine amount (USD)"
+        },
+        "other_penalties": {
+          "type": "array",
+          "items": { "type": "string" },
+          "description": "Non-monetary penalties (e.g., 'Business shutdown', 'Criminal charges')"
+        }
+      },
+      "description": "Penalties for non-compliance"
+    },
+    "exemptions": {
+      "type": "array",
+      "items": {
+        "type": "object",
+        "properties": {
+          "exemption_type": { "type": "string" },
+          "description": { "type": "string" },
+          "eligibility_criteria": {
+            "type": "array",
+            "items": { "type": "string" }
+          }
+        }
+      },
+      "description": "Available exemptions or safe harbors"
+    },
+    "confidence": {
+      "type": "number",
+      "minimum": 0.0,
+      "maximum": 1.0,
+      "description": "Confidence score for this regulation data (0.0-1.0)"
+    },
+    "source": {
+      "type": "object",
+      "properties": {
+        "type": {
+          "type": "string",
+          "enum": ["official", "news", "analysis", "scraped"],
+          "description": "Source type"
+        },
+        "url": {
+          "type": "string",
+          "format": "uri"
+        },
+        "retrieved_date": {
+          "type": "string",
+          "format": "date-time"
+        }
+      },
+      "description": "Data source information"
+    },
+    "related_regulations": {
+      "type": "array",
+      "items": {
+        "type": "string",
+        "description": "IDs of related regulations"
+      },
+      "description": "References to related regulatory rules"
+    },
+    "tags": {
+      "type": "array",
+      "items": {
+        "type": "string"
+      },
+      "description": "Keywords/tags for categorization",
+      "examples": [["kyc", "aml", "reporting"], ["securities", "howey-test"]]
+    }
+  }
+}

src/data/regulations/singapore/mas-cmp-real-estate-tokens-2024.json ADDED Viewed

	@@ -0,0 +1,275 @@

+{
+  "id": "singapore-mas-cmp-real-estate-tokens-2024",
+  "jurisdiction": "singapore",
+  "agency": "MAS (Monetary Authority of Singapore)",
+  "title": "Capital Markets Products Framework for Real Estate Security Tokens",
+  "summary": "MAS regulatory framework for digital tokens representing interests in real estate under Securities and Futures Act (SFA). Covers licensing requirements for offers, custody, and secondary trading of real estate security tokens. Includes guidelines on prospectus requirements, exemptions for sophisticated investors, and DPT service licensing.",
+  "status": "effective",
+  "announced_date": "2020-01-28",
+  "effective_date": "2020-01-28",
+  "last_updated": "2024-08-30",
+  "source_url": "https://www.mas.gov.sg/regulation/capital-markets/digital-tokens",
+  "full_text_url": "https://www.mas.gov.sg/-/media/mas/regulations-and-financial-stability/regulations-guidance-and-licensing/securities-futures-and-fund-management/guidelines/a-guide-to-digital-token-offerings.pdf",
+  "crypto_activities_affected": [
+    "tokenization",
+    "securities-offering",
+    "custody",
+    "secondary-trading",
+    "payment-services"
+  ],
+  "tags": [
+    "capital-markets-products",
+    "security-tokens",
+    "real-estate",
+    "sfa",
+    "prospectus",
+    "cms-license",
+    "dpt-services"
+  ],
+  "requirements": [
+    {
+      "requirement": "Capital Markets Product (CMP) Classification and Legal Opinion",
+      "description": "Real estate tokens representing ownership, profit-sharing, or derivative interests are regulated as 'capital markets products' (CMP) under SFA. Must obtain legal opinion from Singapore law firm confirming: (1) token is CMP (typically 'units in collective investment scheme' or 'debentures'), (2) appropriate exemptions apply, (3) compliance roadmap. Legal opinion required for MAS submissions. Cost: SGD $30K-60K ($22K-45K USD).",
+      "mandatory": true,
+      "deadline_days": 0,
+      "estimated_cost_usd": {
+        "min": 22000,
+        "max": 45000,
+        "currency": "USD",
+        "notes": "Singapore securities law firm legal opinion on CMP classification"
+      },
+      "severity": "critical",
+      "exemptions": ["Tokens that are pure payment tokens or utility tokens (not investment products)"]
+    },
+    {
+      "requirement": "Capital Markets Services (CMS) License for Issuance Activities",
+      "description": "Entities making offers of real estate security tokens must hold CMS license for 'dealing in capital markets products' OR rely on prospectus exemptions. CMS license application requires: (1) SGD $250K base capital + SGD $150K liquid assets, (2) fit and proper directors/shareholders, (3) business plan, (4) compliance arrangements, (5) office in Singapore. Application process: 6-9 months. Application fee: SGD $10K. Annual license fee: SGD $8K-15K based on activity.",
+      "mandatory": true,
+      "deadline_days": 0,
+      "estimated_cost_usd": {
+        "min": 300000,
+        "max": 400000,
+        "currency": "USD",
+        "notes": "SGD $250K base capital + SGD $150K liquid capital + SGD $10K application fee + legal/consultancy (SGD $50K-100K)"
+      },
+      "severity": "critical",
+      "exemptions": [
+        "Private placement to institutional/accredited investors only (up to 50 persons in 12 months)",
+        "Offers via licensed intermediary who holds CMS license"
+      ]
+    },
+    {
+      "requirement": "Prospectus Registration or Exemption Reliance",
+      "description": "Public offers of real estate security tokens require prospectus registered with MAS unless exempt. Prospectus must include: property details and valuations, financial forecasts, risk factors (minimum 20 factors), legal structure, use of proceeds, management bios, audited financials. Registration process: 3-6 months. Most issuers rely on exemptions: (1) offers to institutional/accredited investors only, (2) private placement exemption (up to 50 persons), (3) small offers exemption (up to SGD $5M in 12 months). Exemption reliance requires filing Form 45 with MAS.",
+      "mandatory": true,
+      "deadline_days": 14,
+      "estimated_cost_usd": {
+        "min": 8000,
+        "max": 80000,
+        "currency": "USD",
+        "notes": "If full prospectus: SGD $80K-120K. If exemption: Form 45 filing + legal (SGD $10K-15K)"
+      },
+      "severity": "critical",
+      "exemptions": [
+        "Offers to institutional investors only",
+        "Offers to accredited investors (individual net assets >SGD $2M OR income >SGD $300K)",
+        "Private placement (≤50 persons in 12 months)"
+      ]
+    },
+    {
+      "requirement": "Digital Payment Token (DPT) Service License (if applicable)",
+      "description": "If real estate tokens can be used for payment or exchange (not purely investment), may require DPT service license under Payment Services Act. Activities requiring DPT license: (1) operating exchange for DPT trading, (2) facilitating DPT transfers, (3) providing DPT custody wallet services. License requires: SGD $250K base capital (higher for exchange/custody: up to SGD $1.5M), fit and proper, AML/CFT compliance, technology risk management. Application: 6-12 months. Most real estate tokens qualify for exemption as they are purely securities, not payment instruments.",
+      "mandatory": false,
+      "deadline_days": 90,
+      "estimated_cost_usd": {
+        "min": 200000,
+        "max": 1200000,
+        "currency": "USD",
+        "notes": "If DPT license required: SGD $250K-1.5M capital + application and compliance costs (SGD $50K-100K)"
+      },
+      "severity": "high",
+      "exemptions": ["Security tokens used solely for investment (not payment) are exempt from PSA"]
+    },
+    {
+      "requirement": "Technology Risk Management (TRM) Guidelines Compliance",
+      "description": "Must comply with MAS Technology Risk Management Guidelines including: (1) cybersecurity controls and testing, (2) system availability targets (>99.5% for critical systems), (3) data security and encryption, (4) incident management and reporting (notify MAS within 1 hour for severe incidents), (5) business continuity plan (RTO <4 hours), (6) change management procedures, (7) third-party vendor risk management. Annual independent audit required. TRM non-compliance can result in license suspension.",
+      "mandatory": true,
+      "deadline_days": 180,
+      "estimated_cost_usd": {
+        "min": 40000,
+        "max": 100000,
+        "currency": "USD",
+        "notes": "First year: cybersecurity assessment + penetration testing + incident response + BCP development. Annual: SGD $30K-60K"
+      },
+      "severity": "high",
+      "exemptions": []
+    },
+    {
+      "requirement": "Anti-Money Laundering and Countering Financing of Terrorism (AML/CFT)",
+      "description": "Must comply with MAS Notice SFA04-N02 (AML/CFT for Capital Markets) including: (1) customer due diligence (CDD) - verify identity, source of funds, (2) enhanced due diligence (EDD) for high-risk customers (PEPs, countries on FATF list, transactions >SGD $20K), (3) ongoing monitoring and transaction screening, (4) suspicious transaction reporting (STR) to STRO within 15 days, (5) record keeping (6 years), (6) AML/CFT officer appointment, (7) regular staff training (minimum annually). Independent audit every 2 years.",
+      "mandatory": true,
+      "deadline_days": 90,
+      "estimated_cost_usd": {
+        "min": 35000,
+        "max": 85000,
+        "currency": "USD",
+        "notes": "First year: AML/CFT system setup (SGD $25K-50K) + compliance officer + training + screening tools. Annual ongoing: SGD $20K-40K"
+      },
+      "severity": "critical",
+      "exemptions": []
+    },
+    {
+      "requirement": "Custody and Safekeeping Arrangements (if providing custody)",
+      "description": "If issuer provides custody of tokens (holding private keys on behalf of investors), must either: (1) obtain CMS license for 'providing custodian services for securities', OR (2) appoint licensed custodian. Licensed custody requires: SGD $1M base capital + SGD $500K liquid capital, segregation of client assets, insurance coverage (minimum SGD $1M or 5% of AUM), annual audit, cybersecurity controls. Alternative: use third-party licensed custodian (fees 0.1-0.5% of AUM annually).",
+      "mandatory": true,
+      "deadline_days": 120,
+      "estimated_cost_usd": {
+        "min": 50000,
+        "max": 1200000,
+        "currency": "USD",
+        "notes": "If self-custody: SGD $1.5M capital + compliance infrastructure. If third-party: integration + fees (0.1-0.5% AUM)"
+      },
+      "severity": "high",
+      "exemptions": ["If investors hold their own private keys (non-custodial model) - but must provide clear disclosures"]
+    },
+    {
+      "requirement": "Approved Exchange or Recognized Market Operator (if secondary trading)",
+      "description": "Secondary trading of real estate security tokens must occur on: (1) Approved Exchange (AE) licensed by MAS, OR (2) Recognized Market Operator (RMO), OR (3) exempt organized market. Operating unlicensed exchange is criminal offense. AE/RMO license requires: SGD $5M base capital, technology infrastructure, market surveillance, clearing arrangements, business rules approved by MAS. Application: 12-24 months. Most issuers restrict secondary trading or partner with licensed exchanges like iSTOX, Fundnel.",
+      "mandatory": true,
+      "deadline_days": 0,
+      "estimated_cost_usd": {
+        "min": 30000,
+        "max": 5000000,
+        "currency": "USD",
+        "notes": "If operate own exchange: SGD $5M+ capital + infrastructure. If use third-party: integration fees (SGD $30K-100K) + listing fees"
+      },
+      "severity": "high",
+      "exemptions": ["Transfers restricted to original purchasers/affiliates", "Over-the-counter bilateral transfers (must still comply with securities laws)"]
+    },
+    {
+      "requirement": "Advertising and Marketing Guidelines",
+      "description": "All marketing materials must comply with MAS FAA Notice FAA-N03 (Advertising Guidelines). Requirements: (1) fair, balanced, not misleading, (2) risk warnings prominently displayed, (3) past performance disclaimers, (4) avoid unsubstantiated claims or guarantees, (5) specify target investor type (retail/accredited/institutional), (6) include license number and regulatory status. Marketing to retail investors requires additional disclosures and may require prospectus. Social media posts and influencer marketing subject to same rules. Non-compliant ads can result in SGD $50K-250K fines.",
+      "mandatory": true,
+      "deadline_days": 30,
+      "estimated_cost_usd": {
+        "min": 15000,
+        "max": 40000,
+        "currency": "USD",
+        "notes": "Legal review of all marketing materials + compliance procedures + staff training"
+      },
+      "severity": "medium",
+      "exemptions": []
+    },
+    {
+      "requirement": "Continuous Disclosure and Ongoing Reporting",
+      "description": "Ongoing obligations after offering: (1) material event disclosure within 24 hours (property damage, valuation changes >15%, management changes, breaches), (2) semi-annual unaudited financial statements, (3) annual audited financial statements within 5 months of FY-end, (4) annual property valuations by independent valuer, (5) quarterly updates to token holders on property performance. If token holders exceed 50, must appoint share registrar. Records must be kept for 5 years. Failure to disclose material events can trigger civil liability.",
+      "mandatory": true,
+      "deadline_days": 0,
+      "estimated_cost_usd": {
+        "min": 30000,
+        "max": 80000,
+        "currency": "USD",
+        "notes": "Annual ongoing: audit (SGD $20K-50K) + property valuation (SGD $10K-30K) + reporting systems + compliance staff"
+      },
+      "severity": "medium",
+      "exemptions": []
+    },
+    {
+      "requirement": "Product Due Diligence and Suitability Assessment",
+      "description": "If offering to non-institutional investors, must conduct: (1) product due diligence to assess risks and suitability, (2) customer knowledge assessment (KYA) to understand investor profile, (3) suitability assessment matching product to customer, (4) enhanced warnings for complex products or retail investors, (5) cooling-off period (7 days for retail investors). Must document all assessments. Mis-selling can result in investor restitution orders + MAS penalties (up to SGD $2M).",
+      "mandatory": true,
+      "deadline_days": 0,
+      "estimated_cost_usd": {
+        "min": 20000,
+        "max": 50000,
+        "currency": "USD",
+        "notes": "Suitability assessment system development + compliance procedures + staff training"
+      },
+      "severity": "high",
+      "exemptions": ["Offers to institutional and accredited investors only (exempt from suitability requirements)"]
+    }
+  ],
+  "penalties": [
+    {
+      "violation": "Unlicensed Capital Markets Services (Unauthorized Dealing)",
+      "penalty_type": "Criminal Prosecution + Civil Penalties",
+      "amount_usd": {
+        "min": 75000,
+        "max": 150000,
+        "notes": "Criminal fine up to SGD $150K OR imprisonment up to 3 years + civil penalties up to SGD $2M + disgorgement of profits + investor restitution"
+      },
+      "additional_consequences": [
+        "Criminal conviction and imprisonment (up to 3 years)",
+        "Civil penalty orders up to SGD $2M",
+        "Director disqualification orders (up to 5 years)",
+        "Investor rescission rights (full refund + interest)",
+        "Permanent ban from financial services industry in Singapore"
+      ]
+    },
+    {
+      "violation": "False or Misleading Prospectus/Offering Document",
+      "penalty_type": "Criminal Prosecution + Civil Liability",
+      "amount_usd": {
+        "min": 112500,
+        "max": 225000,
+        "notes": "Criminal fine up to SGD $150K OR imprisonment up to 2 years + civil compensation to investors + MAS enforcement action"
+      },
+      "additional_consequences": [
+        "Criminal prosecution (up to 2 years imprisonment)",
+        "Civil liability to all investors who relied on document",
+        "MAS prohibition orders preventing future fundraising",
+        "Director personal liability for losses",
+        "Reputational damage and business closure"
+      ]
+    },
+    {
+      "violation": "AML/CFT Breaches or Inadequate Controls",
+      "penalty_type": "Civil Penalties + License Revocation",
+      "amount_usd": {
+        "min": 75000,
+        "max": 750000,
+        "notes": "Civil penalties up to SGD $1M + license suspension or revocation + criminal prosecution for willful breaches (up to SGD $500K fine + 10 years imprisonment)"
+      },
+      "additional_consequences": [
+        "License suspension (30-90 days) or revocation",
+        "Criminal prosecution for willful AML violations",
+        "Enhanced supervision and remediation orders",
+        "Mandatory independent compliance audit at issuer's expense",
+        "Reputational damage and loss of banking relationships"
+      ]
+    },
+    {
+      "violation": "Operating Unlicensed Exchange or Market",
+      "penalty_type": "Criminal Prosecution + Shutdown Order",
+      "amount_usd": {
+        "min": 37500,
+        "max": 112500,
+        "notes": "Criminal fine up to SGD $150K OR imprisonment up to 2 years + immediate shutdown order + disgorgement of trading fees + investor restitution"
+      },
+      "additional_consequences": [
+        "Criminal conviction and imprisonment (up to 2 years)",
+        "Immediate cease-and-desist and platform shutdown",
+        "Asset freezing and seizure orders",
+        "Disgorgement of all trading fees and profits",
+        "Permanent industry ban"
+      ]
+    }
+  ],
+  "regulatory_guidance": [
+    "MAS 'Guide to Digital Token Offerings' (2020) is primary guidance for token classification",
+    "Real estate tokens typically classified as 'units in collective investment scheme' or 'debentures' under SFA",
+    "Most issuers rely on private placement exemption (≤50 investors) or accredited investor exemption to avoid prospectus",
+    "MAS takes substance-over-form approach - classification based on economic reality, not label",
+    "Secondary trading restrictions common - many issuers prohibit transfers or limit to accredited investors",
+    "Singapore has several licensed digital securities platforms: iSTOX, Fundnel, ADDX - recommended to partner rather than build own",
+    "MAS Project Guardian exploring use cases for tokenized assets including real estate - potential regulatory sandbox participation",
+    "Singapore-based issuers targeting global investors must comply with both SFA and foreign securities laws"
+  ],
+  "related_regulations": [
+    "singapore-mas-psa-dpt-2024",
+    "singapore-mas-cms-custody-2024",
+    "singapore-mas-trm-guidelines-2024"
+  ],
+  "confidence": 0.94,
+  "notes": "Singapore's regulatory framework for real estate security tokens is well-developed and clear. Total first-year costs for compliant real estate token offering: SGD $500K-1.5M (USD $375K-1.1M) depending on licensing path. Private placements to accredited investors offer most cost-effective route (SGD $100K-200K compliance costs). Retail offerings require full prospectus and CMS license (SGD $800K-1.5M). Singapore's Project Guardian and supportive regulatory approach make it attractive jurisdiction for tokenization pilots."
+}

src/data/regulations/uae/vara-sto-real-estate-2024.json ADDED Viewed

	@@ -0,0 +1,229 @@

+{
+  "id": "uae-vara-sto-real-estate-2024",
+  "jurisdiction": "uae",
+  "agency": "VARA (Virtual Asset Regulatory Authority)",
+  "title": "Security Token Offering (STO) Requirements for Real Estate Tokenization",
+  "summary": "VARA regulations for tokenizing real estate assets through security token offerings in Dubai. Covers licensing, capital requirements, property valuation, and investor protection measures for fractionalized real estate ownership.",
+  "status": "effective",
+  "announced_date": "2023-06-15",
+  "effective_date": "2023-10-01",
+  "last_updated": "2024-08-20",
+  "source_url": "https://www.vara.ae/en/regulations/sto-framework",
+  "full_text_url": "https://www.vara.ae/media/regulations/sto-guidance.pdf",
+  "crypto_activities_affected": [
+    "tokenization",
+    "custody",
+    "exchange",
+    "advisory"
+  ],
+  "tags": [
+    "security-tokens",
+    "real-estate",
+    "sto",
+    "licensing",
+    "capital-requirements",
+    "property-valuation"
+  ],
+  "requirements": [
+    {
+      "requirement": "VARA STO License Application",
+      "description": "Entities offering security tokens backed by real estate must obtain a VARA Security Token Offering (STO) License, not the standard Virtual Asset License. Application process takes 12-18 months and requires detailed business plan, technical infrastructure assessment, and compliance framework.",
+      "mandatory": true,
+      "deadline_days": 0,
+      "estimated_cost_usd": {
+        "min": 50000,
+        "max": 75000,
+        "currency": "USD",
+        "notes": "License application fee: AED 50,000 + legal preparation AED 100,000-200,000"
+      },
+      "severity": "critical",
+      "exemptions": []
+    },
+    {
+      "requirement": "Minimum Capital Requirement",
+      "description": "AED 2,000,000 (approximately USD $545,000) minimum paid-up capital required for STO license holders. Capital must be held in UAE-based bank and cannot be used for operational expenses during first 12 months.",
+      "mandatory": true,
+      "deadline_days": 0,
+      "estimated_cost_usd": {
+        "min": 545000,
+        "max": 545000,
+        "currency": "USD",
+        "notes": "AED 2M regulatory capital requirement"
+      },
+      "severity": "critical",
+      "exemptions": []
+    },
+    {
+      "requirement": "Independent Property Valuation",
+      "description": "All real estate assets must be independently valued by VARA-approved property valuers within 90 days of tokenization. Revaluation required annually. Valuation report must include: fair market value, rental income analysis, comparable sales, and tokenization suitability assessment.",
+      "mandatory": true,
+      "deadline_days": 90,
+      "estimated_cost_usd": {
+        "min": 25000,
+        "max": 75000,
+        "currency": "USD",
+        "notes": "Per property valuation: AED 25K-75K depending on property value. Annual revaluation: AED 15K-40K"
+      },
+      "severity": "high",
+      "exemptions": []
+    },
+    {
+      "requirement": "Token Structure Documentation",
+      "description": "Detailed token economics documentation required: rights attached to tokens (voting, dividend, redemption), smart contract audit by VARA-approved auditor, token supply management, and lock-up periods for founding team and insiders.",
+      "mandatory": true,
+      "deadline_days": 60,
+      "estimated_cost_usd": {
+        "min": 40000,
+        "max": 100000,
+        "currency": "USD",
+        "notes": "Smart contract audit: AED 50K-150K + legal documentation: AED 75K-200K"
+      },
+      "severity": "high",
+      "exemptions": []
+    },
+    {
+      "requirement": "Investor Protection Framework",
+      "description": "Implement investor protection measures including: minimum investment thresholds (AED 50,000 for retail, AED 500,000 for qualified investors), suitability assessments, risk disclosure documents (minimum 25 pages), cooling-off period (14 days), and investor complaint resolution mechanism.",
+      "mandatory": true,
+      "deadline_days": 90,
+      "estimated_cost_usd": {
+        "min": 30000,
+        "max": 60000,
+        "currency": "USD",
+        "notes": "Documentation preparation + compliance systems setup"
+      },
+      "severity": "high",
+      "exemptions": []
+    },
+    {
+      "requirement": "AML/CTF Compliance Program",
+      "description": "Comprehensive Anti-Money Laundering and Counter-Terrorist Financing program aligned with FATF recommendations. Must include: KYC procedures, transaction monitoring system, suspicious activity reporting (SAR), record retention (7 years), and annual independent audit.",
+      "mandatory": true,
+      "deadline_days": 120,
+      "estimated_cost_usd": {
+        "min": 75000,
+        "max": 150000,
+        "currency": "USD",
+        "notes": "First year: system setup + compliance officer + training. Annual ongoing: AED 100K-200K"
+      },
+      "severity": "critical",
+      "exemptions": []
+    },
+    {
+      "requirement": "Custody and Escrow Arrangements",
+      "description": "Property title deeds must be held in escrow by UAE-licensed escrow agent. Digital tokens must be custodied by VARA-licensed custodian with insurance coverage (minimum 125% of token value). Multi-signature wallet arrangements required with at least 3-of-5 key holders.",
+      "mandatory": true,
+      "deadline_days": 60,
+      "estimated_cost_usd": {
+        "min": 50000,
+        "max": 100000,
+        "currency": "USD",
+        "notes": "Escrow setup + custody fees (0.5-1% annually of AUM) + insurance premiums"
+      },
+      "severity": "high",
+      "exemptions": []
+    },
+    {
+      "requirement": "Marketing and Disclosure Requirements",
+      "description": "All marketing materials must be pre-approved by VARA (15-day review period). Required disclosures: property location and details, token economics, historical performance, fees and charges, liquidity risks, property management details, and regulatory status. False or misleading statements prohibited (penalties up to AED 10M).",
+      "mandatory": true,
+      "deadline_days": 45,
+      "estimated_cost_usd": {
+        "min": 20000,
+        "max": 40000,
+        "currency": "USD",
+        "notes": "Legal review + marketing compliance + VARA review fees"
+      },
+      "severity": "medium",
+      "exemptions": []
+    },
+    {
+      "requirement": "Ongoing Reporting Obligations",
+      "description": "Quarterly financial reports, semi-annual property performance updates, annual audited financial statements, token holder registry updates (within 5 business days of transfers), and material event notifications (within 24 hours). All reports submitted via VARA's digital portal.",
+      "mandatory": true,
+      "deadline_days": 0,
+      "estimated_cost_usd": {
+        "min": 40000,
+        "max": 80000,
+        "currency": "USD",
+        "notes": "Annual ongoing: compliance staff + audit fees + reporting systems"
+      },
+      "severity": "medium",
+      "exemptions": []
+    },
+    {
+      "requirement": "Technology and Cybersecurity Standards",
+      "description": "ISO 27001 certification required within 12 months of license grant. Systems must have: penetration testing (quarterly), disaster recovery plan (RTO < 4 hours), encryption of all customer data, and incident response procedures. Annual external cybersecurity audit mandatory.",
+      "mandatory": true,
+      "deadline_days": 365,
+      "estimated_cost_usd": {
+        "min": 60000,
+        "max": 120000,
+        "currency": "USD",
+        "notes": "ISO certification + penetration testing + ongoing security measures"
+      },
+      "severity": "high",
+      "exemptions": []
+    }
+  ],
+  "penalties": [
+    {
+      "violation": "Operating without STO License",
+      "penalty_type": "Administrative fine + Criminal prosecution",
+      "amount_usd": {
+        "min": 270000,
+        "max": 2700000,
+        "notes": "AED 1M-10M fine + possible imprisonment (up to 10 years) + asset seizure"
+      },
+      "additional_consequences": [
+        "Permanent ban from UAE virtual asset sector",
+        "Seizure of all tokenized assets",
+        "Investor restitution orders",
+        "Public disclosure of violation"
+      ]
+    },
+    {
+      "violation": "Inadequate AML/CTF controls",
+      "penalty_type": "Administrative fine + Remediation order",
+      "amount_usd": {
+        "min": 135000,
+        "max": 1350000,
+        "notes": "AED 500K-5M depending on severity + mandatory compliance officer replacement"
+      },
+      "additional_consequences": [
+        "Enhanced supervision (12-24 months)",
+        "License suspension (30-90 days)",
+        "Mandatory independent compliance review"
+      ]
+    },
+    {
+      "violation": "False or misleading marketing",
+      "penalty_type": "Administrative fine + Corrective disclosure",
+      "amount_usd": {
+        "min": 54000,
+        "max": 540000,
+        "notes": "AED 200K-2M + mandatory corrective advertising at issuer's expense"
+      },
+      "additional_consequences": [
+        "Marketing pre-approval required for 24 months",
+        "Investor compensation for losses",
+        "Public censure"
+      ]
+    }
+  ],
+  "regulatory_guidance": [
+    "VARA considers real estate tokenization as securities offering, not virtual asset trading",
+    "Tokens representing fractional property ownership are 'Capital Market Products' under UAE law",
+    "Cross-border offerings require additional approvals from UAE Securities and Commodities Authority (SCA)",
+    "Property ownership transfer must comply with Dubai Land Department (DLD) requirements",
+    "Rental income distribution to token holders subject to UAE corporate tax (9% from June 2023)"
+  ],
+  "related_regulations": [
+    "uae-vara-custody-2023",
+    "uae-sca-securities-2020",
+    "uae-dld-property-tokenization-2024",
+    "uae-aml-ctf-2022"
+  ],
+  "confidence": 0.95,
+  "notes": "Real estate tokenization in UAE requires coordination between VARA (digital asset regulation), SCA (securities regulation), and DLD (property registration). Most stringent requirements apply to retail investor offerings. Qualified investor offerings may have reduced requirements."
+}

src/data/regulations/uk/fca-cis-property-tokens-2024.json ADDED Viewed

	@@ -0,0 +1,282 @@

+{
+  "id": "uk-fca-cis-property-tokens-2024",
+  "jurisdiction": "uk",
+  "agency": "FCA (Financial Conduct Authority)",
+  "title": "Collective Investment Schemes Regime for Tokenized Property",
+  "summary": "FCA regulatory framework for tokenized real estate under collective investment schemes (CIS) regime and Financial Services and Markets Act 2000 (FSMA). Covers authorization requirements, prospectus rules, restrictions on retail promotion, and ongoing supervision. Applies to tokens representing fractional property ownership offered to UK investors.",
+  "status": "effective",
+  "announced_date": "2019-01-23",
+  "effective_date": "2019-01-23",
+  "last_updated": "2024-10-01",
+  "source_url": "https://www.fca.org.uk/publication/policy/ps19-22.pdf",
+  "full_text_url": "https://www.fca.org.uk/firms/financial-promotions-regime",
+  "crypto_activities_affected": [
+    "tokenization",
+    "collective-investment-schemes",
+    "securities-offering",
+    "custody",
+    "financial-promotions"
+  ],
+  "tags": [
+    "collective-investment-schemes",
+    "real-estate",
+    "security-tokens",
+    "specified-investments",
+    "financial-promotions",
+    "prospectus",
+    "fca-authorization"
+  ],
+  "requirements": [
+    {
+      "requirement": "Collective Investment Scheme (CIS) Classification Assessment",
+      "description": "Tokenized real estate structures representing pooled investments where investors do not have day-to-day control are likely 'collective investment schemes' (CIS) under FSMA Section 235. CIS characteristics: (1) participants pool contributions, (2) property acquired/managed as whole, (3) participants do not have day-to-day control, (4) profits/income pooled and shared. Must obtain legal opinion from UK solicitor confirming CIS status and regulatory treatment. Cost: £25K-50K. Alternative structures: direct property ownership tokens (not CIS) or unregulated alternative investment fund (but restricted to sophisticated/high net worth investors only).",
+      "mandatory": true,
+      "deadline_days": 0,
+      "estimated_cost_usd": {
+        "min": 31000,
+        "max": 62000,
+        "currency": "USD",
+        "notes": "UK law firm legal opinion on CIS classification and FSMA compliance roadmap (£25K-50K at £1.24/GBP)"
+      },
+      "severity": "critical",
+      "exemptions": ["Single property direct ownership tokens where investors have day-to-day control (not CIS)"]
+    },
+    {
+      "requirement": "FCA Authorization as CIS Operator or AIFM",
+      "description": "Operating a CIS requires FCA authorization as: (1) Operator of CIS (if UCITS-compliant fund), OR (2) Alternative Investment Fund Manager (AIFM) if Alternative Investment Fund (AIF). Real estate funds typically AIFs. AIFM authorization requires: £125K initial capital (full-scope) or £50K (sub-threshold), fit and proper senior managers, compliance and risk functions, depositaries, valuation procedures, AIFMD compliance. Application process: 6-12 months. Application fee: £25K. Annual fees: £10K-50K based on AUM.",
+      "mandatory": true,
+      "deadline_days": 0,
+      "estimated_cost_usd": {
+        "min": 155000,
+        "max": 310000,
+        "currency": "USD",
+        "notes": "£125K regulatory capital + £25K application fee + legal/consultancy (£50K-100K). Sub-threshold AIFM only £50K capital if AUM <£100M."
+      },
+      "severity": "critical",
+      "exemptions": [
+        "Sub-threshold AIFM (AUM <€100M for unleveraged funds OR <€500M for leveraged funds) - reduced capital and disclosure requirements",
+        "Marketing only to professional/high net worth investors (still need authorization but simplified process)"
+      ]
+    },
+    {
+      "requirement": "Prospectus or Exempted Document",
+      "description": "Offers of CIS units to UK public require prospectus approved by FCA unless exempt. Prospectus must comply with UK Prospectus Regulation including: property details and independent valuations, financial information (3 years audited financials), risk factors (minimum 20 categories), management team, use of proceeds, token structure and rights, taxation. Prospectus review and approval: 3-6 months. Most tokenized property offerings use exemptions: (1) qualified investors only, (2) offer to <150 persons (excluding qualified investors), (3) minimum investment ≥£100K, (4) total consideration <€8M in 12 months. Exempt offers still require clear and compliant information memorandum.",
+      "mandatory": true,
+      "deadline_days": 60,
+      "estimated_cost_usd": {
+        "min": 12400,
+        "max": 124000,
+        "currency": "USD",
+        "notes": "Full prospectus: £80K-100K. Exempt offer information memorandum: £10K-30K"
+      },
+      "severity": "critical",
+      "exemptions": [
+        "Offers to professional/qualified investors only",
+        "Offers to <150 persons (excluding qualified investors) per 12 months",
+        "Minimum investment ≥£100,000 per investor",
+        "Total consideration <€8M in 12 months"
+      ]
+    },
+    {
+      "requirement": "Financial Promotions Approval and Restrictions",
+      "description": "All financial promotions (marketing) for CIS units must be: (1) approved by FCA-authorized firm, OR (2) fall within exemption (e.g., to certified high net worth, sophisticated investors, or investment professionals only). Retail promotion of real estate tokens is PROHIBITED since October 2023 under FCA PS23/6 unless: (a) promoted as security token admitted to trading on recognized investment exchange, OR (b) issuer holds relevant FCA permissions. Promotion to retail without approval: criminal offense (up to 2 years imprisonment + unlimited fines). High net worth certification: individual net assets >£250K (excluding primary residence) OR income >£100K.",
+      "mandatory": true,
+      "deadline_days": 0,
+      "estimated_cost_usd": {
+        "min": 18600,
+        "max": 62000,
+        "currency": "USD",
+        "notes": "FCA-authorized firm approval fees (£5K-20K per campaign) + legal review (£10K-30K) + compliance procedures"
+      },
+      "severity": "critical",
+      "exemptions": [
+        "Promotions to certified high net worth individuals (net assets >£250K or income >£100K)",
+        "Promotions to certified sophisticated investors (self-certified knowledge and experience)",
+        "Promotions to investment professionals only"
+      ]
+    },
+    {
+      "requirement": "Depositary Appointment (AIFMD Requirement)",
+      "description": "AIFs must appoint eligible depositary (credit institution, MiFID investment firm, or authorized AIFM depositary) to: (1) hold scheme assets or verify ownership, (2) monitor cash flows, (3) oversight of valuation, (4) carry out depositary's instructions unless unlawful. Depositary liable for loss of financial instruments held in custody. For tokenized real estate: depositary holds property title deeds and oversees token ledger integrity. Depositary fees: 0.02-0.10% of NAV annually (minimum £25K-50K annually). Depositary agreement and appointment required before fund launch.",
+      "mandatory": true,
+      "deadline_days": 90,
+      "estimated_cost_usd": {
+        "min": 31000,
+        "max": 124000,
+        "currency": "USD",
+        "notes": "Annual depositary fees (£25K-100K depending on NAV) + setup and legal agreements (£10K-20K)"
+      },
+      "severity": "high",
+      "exemptions": ["Sub-threshold AIFMs with simplified AIFMD compliance may have reduced depositary requirements"]
+    },
+    {
+      "requirement": "Independent Valuation and Valuer Appointment",
+      "description": "Real estate assets must be independently valued: (1) before initial investment or property acquisition, (2) at least annually thereafter, (3) when material event affects valuation (damage, rezoning, market changes >10%). Valuer must be: (a) independent external valuer (e.g., RICS-qualified surveyor), OR (b) internal valuer functionally independent of portfolio management. Valuation must follow RICS Red Book standards. Valuation reports must be provided to depositary and disclosed to investors. Valuation frequency for daily-traded funds: monthly. Cost: £5K-30K per property per valuation.",
+      "mandatory": true,
+      "deadline_days": 90,
+      "estimated_cost_usd": {
+        "min": 12400,
+        "max": 62000,
+        "currency": "USD",
+        "notes": "Initial valuations (£10K-30K per property) + annual revaluations (£5K-20K). Multi-property portfolios: £50K-100K annually."
+      },
+      "severity": "high",
+      "exemptions": []
+    },
+    {
+      "requirement": "Senior Managers and Certification Regime (SMCR) Compliance",
+      "description": "AIFM must comply with SMCR including: (1) identify and allocate Senior Management Functions (SMFs) - e.g., CEO, compliance oversight, money laundering reporting officer (MLRO), (2) obtain regulatory approval for SMF appointments (3-6 months), (3) certify other staff performing Certification Functions annually, (4) implement Conduct Rules for all staff, (5) maintain records (responsibilities maps, handover procedures). Fit and proper assessments for all SMFs. Penalties for SMCR breaches: up to £1M per individual + prohibition orders.",
+      "mandatory": true,
+      "deadline_days": 120,
+      "estimated_cost_usd": {
+        "min": 31000,
+        "max": 93000,
+        "currency": "USD",
+        "notes": "First year: SMCR gap analysis + SMF approval applications (£10K-30K) + governance framework + training (£15K-45K)"
+      },
+      "severity": "high",
+      "exemptions": []
+    },
+    {
+      "requirement": "Anti-Money Laundering (AML) and Counter-Terrorism Financing (CTF)",
+      "description": "Must comply with Money Laundering Regulations 2017 (MLR 2017) including: (1) risk-based approach and enterprise-wide risk assessment, (2) customer due diligence (CDD) - verify identity and source of funds, (3) enhanced due diligence (EDD) for PEPs and high-risk customers (investments >£10K), (4) ongoing monitoring, (5) suspicious activity reports (SARs) to National Crime Agency (NCA), (6) sanctions screening (OFSI, EU, UN lists), (7) appoint nominated officer (Money Laundering Reporting Officer), (8) staff training (annual), (9) record retention (5 years). Independent AML audit every 2 years.",
+      "mandatory": true,
+      "deadline_days": 90,
+      "estimated_cost_usd": {
+        "min": 37200,
+        "max": 93000,
+        "currency": "USD",
+        "notes": "First year: AML/CTF system (£20K-50K) + MLRO appointment + training + screening tools. Annual ongoing: £15K-40K"
+      },
+      "severity": "critical",
+      "exemptions": []
+    },
+    {
+      "requirement": "Ongoing Reporting and Disclosure Obligations",
+      "description": "Authorized AIFMs must provide: (1) annual report to investors within 6 months of year-end (audited accounts, valuation reports, remuneration disclosure), (2) semi-annual reports (if required by fund rules), (3) quarterly investor updates on property performance, (4) AIFMD Annex IV reporting to FCA (annually or quarterly if AUM >€1B), (5) material event notifications within 14 days, (6) respond to investor information requests within reasonable time. Retail funds: additional COLL Sourcebook disclosure requirements. Failure to report triggers supervisory action.",
+      "mandatory": true,
+      "deadline_days": 0,
+      "estimated_cost_usd": {
+        "min": 31000,
+        "max": 93000,
+        "currency": "USD",
+        "notes": "Annual ongoing: external audit (£20K-50K) + valuations + investor reporting systems + compliance staff (£25K-75K)"
+      },
+      "severity": "medium",
+      "exemptions": []
+    },
+    {
+      "requirement": "Operational Resilience and Cybersecurity",
+      "description": "Must comply with FCA operational resilience requirements (effective March 2022) including: (1) identify important business services (e.g., token custody, investor reporting), (2) set impact tolerances (maximum disruption before unacceptable harm), (3) mapping and testing (scenario testing, including severe but plausible disruptions), (4) communication plans, (5) self-assessment annually. Cybersecurity: implement controls aligned with NIST/ISO 27001, penetration testing (annually), incident response plan, data encryption. Operational incidents must be reported to FCA within defined timeframes (critical: immediately).",
+      "mandatory": true,
+      "deadline_days": 180,
+      "estimated_cost_usd": {
+        "min": 37200,
+        "max": 124000,
+        "currency": "USD",
+        "notes": "First year: operational resilience framework (£15K-40K) + cybersecurity assessment + pen testing + BCP (£15K-60K)"
+      },
+      "severity": "high",
+      "exemptions": []
+    },
+    {
+      "requirement": "Secondary Market and Transfer Restrictions",
+      "description": "If providing secondary market for CIS units/tokens, must either: (1) list on FCA-recognized investment exchange (e.g., LSE, Aquis), OR (2) operate as Multilateral Trading Facility (MTF) or Organized Trading Facility (OTF) - requires MiFID investment firm authorization (£50K-500K costs). Most tokenized real estate offerings restrict transfers: (a) 12-month lock-up, (b) transfers only with manager approval, (c) minimum holding period. Smart contract must enforce transfer restrictions. Unlisted CIS units difficult to transfer - liquidity risk must be disclosed prominently.",
+      "mandatory": false,
+      "deadline_days": 0,
+      "estimated_cost_usd": {
+        "min": 18600,
+        "max": 620000,
+        "currency": "USD",
+        "notes": "If exchange listing: £50K-150K setup + ongoing fees. If operate MTF: £100K-500K+ for MiFID authorization. If restricted transfers: £15K legal documentation."
+      },
+      "severity": "medium",
+      "exemptions": ["Transfers restricted to original investors or with manager approval - no MTF/exchange needed"]
+    }
+  ],
+  "penalties": [
+    {
+      "violation": "Operating Unauthorized CIS or AIFM",
+      "penalty_type": "Criminal Prosecution + Unlimited Fines",
+      "amount_usd": {
+        "min": 0,
+        "max": 99999999,
+        "notes": "Criminal offense under FSMA Section 23: unlimited fines + imprisonment up to 2 years + investor restitution orders + disgorgement of all fees"
+      },
+      "additional_consequences": [
+        "Criminal conviction and imprisonment (up to 2 years)",
+        "Unlimited fines (no statutory cap)",
+        "Investor restitution orders (full refund of investments)",
+        "Director disqualification (2-15 years)",
+        "Permanent ban from UK financial services",
+        "Asset freezing and restraint orders"
+      ]
+    },
+    {
+      "violation": "Illegal Financial Promotion to Retail Investors",
+      "penalty_type": "Criminal Prosecution + Civil Penalties",
+      "amount_usd": {
+        "min": 0,
+        "max": 99999999,
+        "notes": "Criminal offense under FSMA Section 21: unlimited fines + imprisonment up to 2 years + FCA civil penalties up to £1M or higher of disgorgement"
+      },
+      "additional_consequences": [
+        "Criminal prosecution (up to 2 years imprisonment)",
+        "Unlimited criminal fines",
+        "FCA public censure and financial penalties",
+        "Investor compensation orders",
+        "Prohibition orders preventing industry participation"
+      ]
+    },
+    {
+      "violation": "Misleading Prospectus or Material Omissions",
+      "penalty_type": "Criminal Prosecution + Civil Liability",
+      "amount_usd": {
+        "min": 62000,
+        "max": 99999999,
+        "notes": "Criminal liability under FSMA: unlimited fines + up to 7 years imprisonment + civil compensation to investors who relied on prospectus"
+      },
+      "additional_consequences": [
+        "Criminal prosecution under FSMA s.90/s.397 (up to 7 years imprisonment)",
+        "Civil liability to all investors (rescission + damages)",
+        "FCA enforcement action and fines",
+        "Director personal liability",
+        "Fraud Act 2006 prosecution if dishonest"
+      ]
+    },
+    {
+      "violation": "AML/CTF Breaches or Inadequate Controls",
+      "penalty_type": "Criminal Prosecution + Civil Penalties",
+      "amount_usd": {
+        "min": 124000,
+        "max": 99999999,
+        "notes": "Criminal penalties under MLR 2017: unlimited fines + imprisonment up to 2 years + FCA financial penalties (up to £5M or higher for serious breaches) + authorization withdrawal"
+      },
+      "additional_consequences": [
+        "Criminal prosecution (up to 2 years imprisonment)",
+        "FCA authorization revocation or suspension",
+        "Serious Fraud Office (SFO) investigation if money laundering facilitated",
+        "SMCR prohibition orders for senior managers",
+        "Enhanced supervision and mandatory remediation"
+      ]
+    }
+  ],
+  "regulatory_guidance": [
+    "FCA PS19/22 (2019) confirms cryptoassets representing security tokens are regulated under FSMA",
+    "Most tokenized real estate structures are CIS under Section 235 FSMA - requires authorization",
+    "FCA PS23/6 (October 2023) banned retail promotion of most cryptoassets including real estate tokens - high net worth/sophisticated investors only",
+    "Real estate tokens typically classified as 'units in collective investment scheme' or 'alternative investment fund'",
+    "FCA Perimeter Guidance PERG 9 provides guidance on CIS classification",
+    "Direct property ownership tokens (where investor has day-to-day control) may escape CIS classification but rare in practice",
+    "UK regulatory approach conservative compared to Singapore/Switzerland - retail access highly restricted",
+    "Brexit impact: UK Prospectus Regulation post-Brexit allows offers <£8M without prospectus (previously €8M)"
+  ],
+  "related_regulations": [
+    "uk-fca-cryptoasset-promotion-ban-2023",
+    "uk-fca-smcr-2024",
+    "uk-mlr-aml-2024",
+    "uk-fca-operational-resilience-2022"
+  ],
+  "confidence": 0.93,
+  "notes": "UK's regulatory framework for tokenized real estate is stringent and protective of retail investors. Post-October 2023, retail promotion of real estate tokens effectively banned unless listed on recognized exchange. Total first-year compliance costs for UK real estate token offering: £500K-1.2M (USD $620K-1.5M). Ongoing annual costs: £150K-400K. Most cost-effective approach: market only to high net worth/sophisticated investors (reduces prospectus and authorization complexity). Full retail authorization path extremely expensive and time-consuming (12-24 months)."
+}