Spaces:
Build error
Deploy Whale_Arbitrum on HF Spaces
- .env +11 -0
- README.md +135 -0
- app.py +719 -0
- modules/__init__.py +1 -0
- modules/__pycache__/__init__.cpython-312.pyc +0 -0
- modules/__pycache__/api_client.cpython-312.pyc +0 -0
- modules/__pycache__/crew_system.cpython-312.pyc +0 -0
- modules/__pycache__/crew_tools.cpython-312.pyc +0 -0
- modules/__pycache__/data_processor.cpython-312.pyc +0 -0
- modules/__pycache__/detection.cpython-312.pyc +0 -0
- modules/__pycache__/visualizer.cpython-312.pyc +0 -0
- modules/api_client.py +768 -0
- modules/crew_system.py +1117 -0
- modules/crew_tools.py +362 -0
- modules/data_processor.py +1425 -0
- modules/detection.py +684 -0
- modules/tools.py +373 -0
- modules/visualizer.py +638 -0
- requirements.txt +12 -0
- test_api.py +205 -0
.env
ADDED
@@ -0,0 +1,11 @@
+# Your current API key appears to be having issues
+# Please replace it with your own key from https://arbiscan.io/myapikey
+# Uncomment one of the API keys below or add your own
+ARBISCAN_API_KEY=4YEN1UTUEZ8I8ZBWSZW5NH6ZDFYEUVKQ5U
+# ARBISCAN_API_KEY=HVZC2W3IZWCGJWS8QDBZ56D1GZZNDJMZ25
+
+# Gemini API key for price data
+GEMINI_API_KEY=AIzaSyCyble5D3dlgPxDXWLlaZmu8hOM_nt-V6M
+
+# OpenAI API key for CrewAI functionality
+OPENAI_API_KEY=your-openai-api-key
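These `KEY=VALUE` lines are what `python-dotenv`'s `load_dotenv()` reads into the process environment before the app builds its API clients. As a rough illustration of those parsing semantics only (the app itself should keep using `python-dotenv`), a minimal stdlib-only sketch:

```python
import os

def load_env(path: str = ".env") -> None:
    """Minimal .env loader sketch: one KEY=VALUE per line.

    Blank lines and '#' comments are skipped, and existing environment
    variables are not overwritten (matching load_dotenv's default).
    """
    with open(path) as f:
        for raw in f:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            # Split on the first '=' only, so values may themselves contain '='
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```

This is why the commented-out `# ARBISCAN_API_KEY=...` line above is inert: it is dropped by the comment check before any key is set.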
README.md
ADDED
@@ -0,0 +1,135 @@
+# Whale Wallet AI - Market Manipulation Detection
+
+A powerful Streamlit-based tool that tracks large holders ("whales") on the Arbitrum network to uncover potential market manipulation tactics.
+
+## 1. Prerequisites & Setup
+
+### 1.1. Python & Dependencies
+- Ensure you have Python 3.8+ installed.
+- Install required packages via:
+```bash
+pip install -r requirements.txt
+```
+
+### 1.2. API Keys
+You need API keys to fetch on-chain data and real-time prices:
+- **ARBISCAN_API_KEY**: For fetching Arbitrum transaction data
+- **GEMINI_API_KEY**: For retrieving live token prices
+- **OPENAI_API_KEY**: For powering the CrewAI agents
+
+Save these in a file named `.env` at the project root:
+```env
+ARBISCAN_API_KEY=your_arbiscan_key
+GEMINI_API_KEY=your_gemini_key
+OPENAI_API_KEY=your_openai_key
+```
+Note: Sample API keys are provided in the default `.env` file, but you should replace them with your own for production use.
+
+### 1.3. Run the App
+Launch the web interface with:
+```bash
+streamlit run app.py
+```
+
+## 2. Core Features & How to Use Them
+
+### 2.1 Track Large Buy/Sell Transactions
+
+**What it does:**
+Monitors on-chain transfers exceeding a configurable threshold (e.g., 1,000 tokens or $100K) for any wallet or contract you specify.
+
+**How to use:**
+1. In the sidebar, enter one or more wallet addresses
+2. Set your minimum token or USD value filter
+3. Click **Track Transactions**
+4. The dashboard will list incoming/outgoing transfers above the threshold.
+
+### 2.2 Identify Trading Patterns of Whale Wallets
+
+**What it does:**
+Uses time-series clustering and sequence analysis to surface recurring behaviors (e.g., cyclical dumping, accumulation bursts).
+
+**How to use:**
+1. Select a wallet address
+2. Choose a time period (e.g., last 7 days)
+3. Click **Analyze Patterns**
+4. View a summary of detected clusters and drill down into individual events.
+
+### 2.3 Analyze Impact of Whale Transactions on Token Prices
+
+**What it does:**
+Correlates large trades against minute-by-minute price ticks to quantify slippage, price spikes, or dumps.
+
+**How to use:**
+1. Enable **Price Impact** analysis in settings
+2. Specify lookback/lookahead windows (e.g., 5 minutes)
+3. Click **Run Impact Analysis**
+4. See interactive line charts and slippage metrics.
+
+### 2.4 Detect Potential Market Manipulation Techniques
+
+**What it does:**
+Automatically flags suspicious behaviors such as:
+- **Pump-and-Dump:** Rapid buys followed by coordinated sell-offs
+- **Wash Trading:** Self-trading across multiple addresses
+- **Spoofing:** Large orders placed then canceled
+
+**How to use:**
+1. Toggle **Manipulation Detection** on
+2. Adjust the sensitivity slider (Low/Medium/High)
+3. Click **Detect**
+4. Examine the **Alerts** panel for flagged events.
+
+### 2.5 Generate Reports & Visualizations
+
+**What it does:**
+Compiles whale activity into PDF/CSV summaries and interactive charts.
+
+**How to use:**
+1. Select **Export** in the top menu
+2. Choose **CSV**, **PDF**, or **PNG**
+3. Specify the time range and wallets to include
+4. Click **Download**
+5. The saved file will appear in your browser's download folder.
+
+## 3. Advanced Features: CrewAI Integration
+
+This application leverages CrewAI to provide advanced analysis through specialized AI agents:
+
+- **Blockchain Data Collector**: Extracts and organizes on-chain data
+- **Price Impact Analyst**: Correlates trading activity with price movements
+- **Trading Pattern Detector**: Identifies recurring behavioral patterns
+- **Market Manipulation Investigator**: Detects potential market abuse
+- **Insights Reporter**: Transforms data into actionable intelligence
+
+## 4. Project Structure
+
+```
+/Whale_Arbitrum/
+├── app.py                   # Main Streamlit application entry point
+├── requirements.txt         # Dependencies and package versions
+├── .env                     # API keys and environment variables
+├── modules/
+│   ├── api_client.py        # Arbiscan and Gemini API clients
+│   ├── data_processor.py    # Data processing and analysis
+│   ├── detection.py         # Market manipulation detection algorithms
+│   ├── visualizer.py        # Visualization and report generation
+│   └── crew_system.py       # CrewAI agentic system
+```
+
+## 5. Use Cases
+
+- **Regulatory Compliance & Fraud Detection**
+  Auditors and regulators can monitor DeFi markets for wash trades and suspicious dumps.
+
+- **Investment Strategy Optimization**
+  Traders gain insight into institutional flows and can calibrate entry/exit points.
+
+- **Market Research & Analysis**
+  Researchers study whale behavior to gauge token health and potential volatility.
+
+- **DeFi Protocol Security Monitoring**
+  Protocol teams receive alerts on large dumps that may destabilize liquidity pools.
+
+- **Token Project Risk Assessment**
+  Token issuers review top-holder actions to flag governance or distribution issues.
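The wash-trading check described in §2.4 of the README comes down to spotting transfers whose counterparties collapse to the same entity. A toy sketch of that idea (hypothetical field names; the app's actual, more involved logic lives in `modules/detection.py`):

```python
def flag_self_trades(transfers):
    """Return tx hashes where sender and receiver are the same address.

    This is the crudest wash-trading signal; real detection would also
    cluster addresses that are distinct but commonly controlled.
    """
    return [
        t["hash"]
        for t in transfers
        if t["from"].lower() == t["to"].lower()
    ]

txs = [
    {"hash": "0xaaa", "from": "0xABC", "to": "0xabc"},  # self-transfer
    {"hash": "0xbbb", "from": "0xABC", "to": "0xdef"},
]
flag_self_trades(txs)  # → ["0xaaa"]
```

Lowercasing both sides matters because hex addresses are case-insensitive (EIP-55 checksums vary the casing of the same address).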
app.py
ADDED
@@ -0,0 +1,719 @@
+import streamlit as st
+import pandas as pd
+import numpy as np
+import plotly.express as px
+import plotly.graph_objects as go
+import os
+import json
+import logging
+import time
+from datetime import datetime, timedelta
+from typing import Dict, List, Optional, Union, Any
+from dotenv import load_dotenv
+
+# Configure logging - Reduce verbosity and improve performance
+logging.basicConfig(
+    level=logging.WARNING,  # Only show warnings and errors by default
+    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
+)
+
+# Create a custom filter to suppress repetitive Gemini API errors
+class SuppressRepetitiveErrors(logging.Filter):
+    def __init__(self):
+        super().__init__()
+        self.error_counts = {}
+        self.max_errors = 3  # Show at most 3 instances of each error
+
+    def filter(self, record):
+        if record.levelno < logging.WARNING:
+            return True
+
+        # If it's a Gemini API error for non-existent tokens, suppress it after a few occurrences
+        if 'Error fetching historical prices from Gemini API' in record.getMessage():
+            key = 'gemini_api_error'
+            self.error_counts[key] = self.error_counts.get(key, 0) + 1
+
+            # Only allow the first few errors through
+            return self.error_counts[key] <= self.max_errors
+
+        return True
+
+# Apply the filter
+logging.getLogger().addFilter(SuppressRepetitiveErrors())
+
+from modules.api_client import ArbiscanClient, GeminiClient
+from modules.data_processor import DataProcessor
+from modules.visualizer import Visualizer
+from modules.detection import ManipulationDetector
+
+# Load environment variables
+load_dotenv()
+
+# Set page configuration
+st.set_page_config(
+    page_title="Whale Wallet AI - Market Manipulation Detection",
+    page_icon="🐳",
+    layout="wide",
+    initial_sidebar_state="expanded"
+)
+
+# Add custom CSS
+st.markdown("""
+<style>
+    .main-header {
+        font-size: 2.5rem;
+        color: #1E88E5;
+        text-align: center;
+        margin-bottom: 1rem;
+    }
+    .sub-header {
+        font-size: 1.5rem;
+        color: #424242;
+        margin-bottom: 1rem;
+    }
+    .info-text {
+        background-color: #E3F2FD;
+        padding: 1rem;
+        border-radius: 0.5rem;
+        margin-bottom: 1rem;
+    }
+    .stButton>button {
+        width: 100%;
+    }
+</style>
+""", unsafe_allow_html=True)
+
+# Initialize Streamlit session state for persisting data between tab navigation
+if 'transactions_data' not in st.session_state:
+    st.session_state.transactions_data = pd.DataFrame()
+
+if 'patterns_data' not in st.session_state:
+    st.session_state.patterns_data = None
+
+if 'price_impact_data' not in st.session_state:
+    st.session_state.price_impact_data = None
+
+# Performance metrics tracking
+if 'performance_metrics' not in st.session_state:
+    st.session_state.performance_metrics = {
+        'api_calls': 0,
+        'data_processing_time': 0,
+        'visualization_time': 0,
+        'last_refresh': None
+    }
+
+# Function to track performance
+def track_timing(category: str):
+    def timing_decorator(func):
+        def wrapper(*args, **kwargs):
+            start_time = time.time()
+            result = func(*args, **kwargs)
+            elapsed = time.time() - start_time
+
+            if category in st.session_state.performance_metrics:
+                st.session_state.performance_metrics[category] += elapsed
+            else:
+                st.session_state.performance_metrics[category] = elapsed
+
+            return result
+        return wrapper
+    return timing_decorator
+
+if 'alerts_data' not in st.session_state:
+    st.session_state.alerts_data = None
+
+# Initialize API clients
+arbiscan_client = ArbiscanClient(os.getenv("ARBISCAN_API_KEY"))
+# Set debug mode to False to reduce log output
+arbiscan_client.verbose_debug = False
+gemini_client = GeminiClient(os.getenv("GEMINI_API_KEY"))
+
+# Initialize data processor and visualizer
+data_processor = DataProcessor()
+visualizer = Visualizer()
+
+# Apply performance tracking to key instance methods after initialization
+original_fetch_whale = arbiscan_client.fetch_whale_transactions
+arbiscan_client.fetch_whale_transactions = track_timing('api_calls')(original_fetch_whale)
+
+original_identify_patterns = data_processor.identify_patterns
+data_processor.identify_patterns = track_timing('data_processing_time')(original_identify_patterns)
+
+original_analyze_price_impact = data_processor.analyze_price_impact
+data_processor.analyze_price_impact = track_timing('data_processing_time')(original_analyze_price_impact)
+detection = ManipulationDetector()
+
+# Initialize crew system (for AI-assisted analysis)
+try:
+    from modules.crew_system import WhaleAnalysisCrewSystem
+    crew_system = WhaleAnalysisCrewSystem(arbiscan_client, gemini_client, data_processor)
+    CREW_ENABLED = True
+    logging.info("CrewAI system loaded successfully")
+except Exception as e:
+    CREW_ENABLED = False
+    logging.error(f"Failed to load CrewAI system: {str(e)}")
+    st.sidebar.error("CrewAI features are disabled due to an error.")
+
+# Sidebar for inputs
+st.sidebar.header("Configuration")
+
+# Wallet tracking section
+st.sidebar.subheader("Track Wallets")
+wallet_addresses = st.sidebar.text_area(
+    "Enter wallet addresses (one per line)",
+    placeholder="0x1234abcd...\n0xabcd1234..."
+)
+
+threshold_type = st.sidebar.radio(
+    "Threshold Type",
+    ["Token Amount", "USD Value"]
+)
+
+if threshold_type == "Token Amount":
+    threshold_value = st.sidebar.number_input("Minimum Token Amount", min_value=0.0, value=1000.0)
+    token_symbol = st.sidebar.text_input("Token Symbol", placeholder="ETH")
+else:
+    threshold_value = st.sidebar.number_input("Minimum USD Value", min_value=0.0, value=100000.0)
+
+# Time period selection
+st.sidebar.subheader("Time Period")
+time_period = st.sidebar.selectbox(
+    "Select Time Period",
+    ["Last 24 hours", "Last 7 days", "Last 30 days", "Custom"]
+)
+
+if time_period == "Custom":
+    start_date = st.sidebar.date_input("Start Date", datetime.now() - timedelta(days=7))
+    end_date = st.sidebar.date_input("End Date", datetime.now())
+else:
+    # Calculate dates based on selection
+    end_date = datetime.now()
+    if time_period == "Last 24 hours":
+        start_date = end_date - timedelta(days=1)
+    elif time_period == "Last 7 days":
+        start_date = end_date - timedelta(days=7)
+    else:  # Last 30 days
+        start_date = end_date - timedelta(days=30)
+
+# Manipulation detection settings
+st.sidebar.subheader("Manipulation Detection")
+enable_manipulation_detection = st.sidebar.toggle("Enable Manipulation Detection", value=True)
+if enable_manipulation_detection:
+    sensitivity = st.sidebar.select_slider(
+        "Detection Sensitivity",
+        options=["Low", "Medium", "High"],
+        value="Medium"
+    )
+
+# Price impact analysis settings
+st.sidebar.subheader("Price Impact Analysis")
+enable_price_impact = st.sidebar.toggle("Enable Price Impact Analysis", value=True)
+if enable_price_impact:
+    lookback_minutes = st.sidebar.slider("Lookback (minutes)", 1, 60, 5)
+    lookahead_minutes = st.sidebar.slider("Lookahead (minutes)", 1, 60, 5)
+
+# Action buttons
+track_button = st.sidebar.button("Track Transactions", type="primary")
+pattern_button = st.sidebar.button("Analyze Patterns")
+if enable_manipulation_detection:
+    detect_button = st.sidebar.button("Detect Manipulation")
+
+# Main content area
+tab1, tab2, tab3, tab4, tab5 = st.tabs([
+    "Transactions", "Patterns", "Price Impact", "Alerts", "Reports"
+])
+
+with tab1:
+    st.header("Whale Transactions")
+    if track_button and wallet_addresses:
+        with st.spinner("Fetching whale transactions..."):
+            # Function to track whale transactions
+            def track_whale_transactions(wallets, start_date, end_date, threshold_value, threshold_type, token_symbol=None):
+                # Direct API call since CrewAI is temporarily disabled
+                try:
+                    min_token_amount = None
+                    min_usd_value = None
+                    if threshold_type == "Token Amount":
+                        min_token_amount = threshold_value
+                    else:
+                        min_usd_value = threshold_value
+
+                    # Limit pagination to prevent excessive API calls
+                    max_pages = 5
+                    transactions = arbiscan_client.fetch_whale_transactions(
+                        addresses=wallets,
+                        min_token_amount=min_token_amount,
+                        max_pages=max_pages,
+                        min_usd_value=min_usd_value
+                    )
+
+                    if transactions.empty:
+                        st.warning("No transactions found for the specified addresses")
+
+                    return transactions
+                except Exception as e:
+                    st.error(f"Error fetching transactions: {str(e)}")
+                    return pd.DataFrame()
+
+            wallet_list = [addr.strip() for addr in wallet_addresses.split("\n") if addr.strip()]
+
+            # Use cached data or fetch new if not available
+            if st.session_state.transactions_data is None or track_button:
+                with st.spinner("Fetching transactions..."):
+                    transactions = track_whale_transactions(
+                        wallets=wallet_list,
+                        start_date=start_date,
+                        end_date=end_date,
+                        threshold_value=threshold_value,
+                        threshold_type=threshold_type,
+                        token_symbol=token_symbol
+                    )
+                    # Store in session state
+                    st.session_state.transactions_data = transactions
+            else:
+                transactions = st.session_state.transactions_data
+
+            if not transactions.empty:
+                st.success(f"Found {len(transactions)} transactions matching your criteria")
+
+                # Display transactions
+                if len(transactions) > 0:
+                    st.dataframe(transactions, use_container_width=True)
+
+                    # Add download button
+                    csv = transactions.to_csv(index=False).encode('utf-8')
+                    st.download_button(
+                        "Download Transactions CSV",
+                        csv,
+                        "whale_transactions.csv",
+                        "text/csv",
+                        key='download-csv'
+                    )
+
+                    # Volume by day chart
+                    st.subheader("Transaction Volume by Day")
+                    try:
+                        st.plotly_chart(visualizer.plot_volume_by_day(transactions), use_container_width=True)
+                    except Exception as e:
+                        st.error(f"Error generating volume chart: {str(e)}")
+
+                    # Transaction flow visualization
+                    st.subheader("Transaction Flow")
+                    try:
+                        flow_chart = visualizer.plot_transaction_flow(transactions)
+                        st.plotly_chart(flow_chart, use_container_width=True)
+                    except Exception as e:
+                        st.error(f"Error generating flow chart: {str(e)}")
+            else:
+                st.warning("No transactions found matching your criteria. Try adjusting the parameters.")
+    else:
+        st.info("Enter wallet addresses and click 'Track Transactions' to view whale activity")
+
+with tab2:
+    st.header("Trading Patterns")
+    if track_button and wallet_addresses:
+        with st.spinner("Analyzing trading patterns..."):
+            # Function to analyze trading patterns
+            def analyze_trading_patterns(wallets, start_date, end_date):
+                # Direct analysis
+                try:
+                    transactions_df = arbiscan_client.fetch_whale_transactions(addresses=wallets, max_pages=5)
+                    if transactions_df.empty:
+                        st.warning("No transactions found for the specified addresses")
+                        return []
+
+                    return data_processor.identify_patterns(transactions_df)
+                except Exception as e:
+                    st.error(f"Error analyzing trading patterns: {str(e)}")
+                    return []
+
+            wallet_list = [addr.strip() for addr in wallet_addresses.split("\n") if addr.strip()]
+
+            # Use cached data or fetch new if not available
+            if st.session_state.patterns_data is None or track_button:
+                with st.spinner("Analyzing trading patterns..."):
+                    patterns = analyze_trading_patterns(
+                        wallets=wallet_list,
+                        start_date=start_date,
+                        end_date=end_date
+                    )
+                    # Store in session state
+                    st.session_state.patterns_data = patterns
+            else:
+                patterns = st.session_state.patterns_data
+
+            if patterns:
+                for i, pattern in enumerate(patterns):
+                    pattern_card = st.container()
+                    with pattern_card:
+                        # Pattern header with name and risk profile
+                        header_cols = st.columns([3, 1])
+                        with header_cols[0]:
+                            st.subheader(f"Pattern {i+1}: {pattern['name']}")
+                        with header_cols[1]:
+                            risk_color = "green"
+                            if pattern.get('risk_profile') == "Medium":
+                                risk_color = "orange"
+                            elif pattern.get('risk_profile') in ["High", "Very High"]:
+                                risk_color = "red"
+                            st.markdown(f"<h5 style='color:{risk_color};'>Risk: {pattern.get('risk_profile', 'Unknown')}</h5>", unsafe_allow_html=True)
+
+                        # Pattern description and details
+                        st.markdown(f"**Description:** {pattern['description']}")
+
+                        # Additional strategy information
+                        if 'strategy' in pattern:
+                            st.markdown(f"**Strategy:** {pattern['strategy']}")
+
+                        # Time insight
+                        if 'time_insight' in pattern:
+                            st.info(pattern['time_insight'])
+
+                        # Metrics
+                        metric_cols = st.columns(3)
+                        with metric_cols[0]:
+                            st.markdown(f"**Occurrences:** {pattern['occurrence_count']} instances")
+                        with metric_cols[1]:
+                            st.markdown(f"**Confidence:** {pattern.get('confidence', 0):.2f}")
+                        with metric_cols[2]:
+                            st.markdown(f"**Volume:** {pattern.get('volume_metric', 'N/A')}")
+
+                        # Display main chart first
+                        if 'charts' in pattern and 'main' in pattern['charts']:
+                            st.plotly_chart(pattern['charts']['main'], use_container_width=True)
+                        elif 'chart_data' in pattern and pattern['chart_data'] is not None:  # Fallback for old format
+                            st.plotly_chart(pattern['chart_data'], use_container_width=True)
+
+                        # Create two columns for additional charts
+                        if 'charts' in pattern and len(pattern['charts']) > 1:
+                            charts_col1, charts_col2 = st.columns(2)
+
+                            # Hourly distribution chart
+                            if 'hourly_distribution' in pattern['charts']:
+                                with charts_col1:
+                                    st.plotly_chart(pattern['charts']['hourly_distribution'], use_container_width=True)
+
+                            # Value distribution chart
+                            if 'value_distribution' in pattern['charts']:
+                                with charts_col2:
+                                    st.plotly_chart(pattern['charts']['value_distribution'], use_container_width=True)
+
+                        # Advanced metrics in expander
+                        if 'metrics' in pattern and pattern['metrics']:
+                            with st.expander("Detailed Metrics"):
+                                metrics_table = []
+                                for k, v in pattern['metrics'].items():
+                                    if v is not None:
+                                        if isinstance(v, float):
+                                            metrics_table.append([k.replace('_', ' ').title(), f"{v:.4f}"])
+                                        else:
+                                            metrics_table.append([k.replace('_', ' ').title(), v])
+
+                                if metrics_table:
+                                    st.table(pd.DataFrame(metrics_table, columns=["Metric", "Value"]))
+
+                        # Display example transactions
+                        if 'examples' in pattern and not pattern['examples'].empty:
+                            with st.expander("Example Transactions"):
+                                # Format the dataframe for better display
+                                display_df = pattern['examples'].copy()
+                                # Convert timestamp to readable format if needed
+                                if 'timeStamp' in display_df.columns and not pd.api.types.is_datetime64_any_dtype(display_df['timeStamp']):
+                                    display_df['timeStamp'] = pd.to_datetime(display_df['timeStamp'], unit='s')
+
+                                st.dataframe(display_df, use_container_width=True)
+
+                        st.markdown("---")
+            else:
+                st.info("No significant trading patterns detected. Try expanding the date range or adding more addresses.")
+    else:
+        st.info("Track transactions to analyze trading patterns")
+
+with tab3:
+    st.header("Price Impact Analysis")
+    if enable_price_impact and track_button and wallet_addresses:
+        with st.spinner("Analyzing price impact..."):
+            # Function to analyze price impact
+            def analyze_price_impact(wallets, start_date, end_date, lookback_minutes, lookahead_minutes):
+                # Direct analysis
+                transactions_df = arbiscan_client.fetch_whale_transactions(addresses=wallets, max_pages=5)
+                # Get token from first transaction
+                if not transactions_df.empty:
+                    token_symbol = transactions_df.iloc[0].get('tokenSymbol', 'ETH')
+                    # For each transaction, get price impact
+                    price_impacts = {}
+                    progress_bar = st.progress(0)
+                    for idx, row in transactions_df.iterrows():
+                        progress = int((idx + 1) / len(transactions_df) * 100)
+                        progress_bar.progress(progress, text=f"Analyzing transaction {idx+1} of {len(transactions_df)}")
+                        if 'timeStamp' in row:
+                            try:
+                                tx_time = datetime.fromtimestamp(int(row['timeStamp']))
+                                impact_data = gemini_client.get_price_impact(
+                                    symbol=f"{token_symbol}USD",
+                                    transaction_time=tx_time,
+                                    lookback_minutes=lookback_minutes,
+                                    lookahead_minutes=lookahead_minutes
+                                )
+                                price_impacts[row['hash']] = impact_data
+                            except Exception as e:
+                                st.warning(f"Could not get price data for transaction: {str(e)}")
+
+                    progress_bar.empty()
+                    if price_impacts:
+                        return data_processor.analyze_price_impact(transactions_df, price_impacts)
+
+                # Create an empty chart for the default case
+                empty_fig = go.Figure()
+                empty_fig.update_layout(
+                    title="No Price Impact Data Available",
+                    xaxis_title="Time",
+                    yaxis_title="Price Impact (%)",
+                    height=400,
+                    template="plotly_white"
+                )
+                empty_fig.add_annotation(
+                    text="No transactions found with price impact data",
+                    showarrow=False,
+                    font=dict(size=14)
+                )
+
+                return {
+                    "avg_impact_pct": 0,
+                    "max_impact_pct": 0,
+                    "min_impact_pct": 0,
+                    "significant_moves_count": 0,
+                    "total_transactions": 0,
+                    "transactions_with_impact": pd.DataFrame(),
+                    "charts": {
+                        "main_chart": empty_fig,
|
| 490 |
+
"impact_distribution": empty_fig,
|
| 491 |
+
"cumulative_impact": empty_fig,
|
| 492 |
+
"hourly_impact": empty_fig
|
| 493 |
+
},
|
| 494 |
+
"insights": [],
|
| 495 |
+
"impact_summary": "No price impact data available"
|
| 496 |
+
}
|
| 497 |
+
|
| 498 |
+
wallet_list = [addr.strip() for addr in wallet_addresses.split("\n") if addr.strip()]
|
| 499 |
+
|
| 500 |
+
# Use cached data or fetch new if not available
|
| 501 |
+
if st.session_state.price_impact_data is None or track_button:
|
| 502 |
+
with st.spinner("Analyzing price impact..."):
|
| 503 |
+
impact_analysis = analyze_price_impact(
|
| 504 |
+
wallets=wallet_list,
|
| 505 |
+
start_date=start_date,
|
| 506 |
+
end_date=end_date,
|
| 507 |
+
lookback_minutes=lookback_minutes,
|
| 508 |
+
lookahead_minutes=lookahead_minutes
|
| 509 |
+
)
|
| 510 |
+
# Store in session state
|
| 511 |
+
st.session_state.price_impact_data = impact_analysis
|
| 512 |
+
else:
|
| 513 |
+
impact_analysis = st.session_state.price_impact_data
|
| 514 |
+
|
| 515 |
+
if impact_analysis:
|
| 516 |
+
# Display impact summary
|
| 517 |
+
if 'impact_summary' in impact_analysis:
|
| 518 |
+
st.info(impact_analysis['impact_summary'])
|
| 519 |
+
|
| 520 |
+
# Summary metrics in two rows
|
| 521 |
+
metrics_row1 = st.columns(4)
|
| 522 |
+
with metrics_row1[0]:
|
| 523 |
+
st.metric("Avg. Price Impact (%)", f"{impact_analysis.get('avg_impact_pct', 0):.2f}%")
|
| 524 |
+
with metrics_row1[1]:
|
| 525 |
+
st.metric("Max Impact (%)", f"{impact_analysis.get('max_impact_pct', 0):.2f}%")
|
| 526 |
+
with metrics_row1[2]:
|
| 527 |
+
st.metric("Min Impact (%)", f"{impact_analysis.get('min_impact_pct', 0):.2f}%")
|
| 528 |
+
with metrics_row1[3]:
|
| 529 |
+
st.metric("Std Dev (%)", f"{impact_analysis.get('std_impact_pct', 0):.2f}%")
|
| 530 |
+
|
| 531 |
+
metrics_row2 = st.columns(4)
|
| 532 |
+
with metrics_row2[0]:
|
| 533 |
+
st.metric("Significant Moves", impact_analysis.get('significant_moves_count', 0))
|
| 534 |
+
with metrics_row2[1]:
|
| 535 |
+
st.metric("High Impact Moves", impact_analysis.get('high_impact_moves_count', 0))
|
| 536 |
+
with metrics_row2[2]:
|
| 537 |
+
st.metric("Positive/Negative", f"{impact_analysis.get('positive_impacts_count', 0)}/{impact_analysis.get('negative_impacts_count', 0)}")
|
| 538 |
+
with metrics_row2[3]:
|
| 539 |
+
st.metric("Total Transactions", impact_analysis.get('total_transactions', 0))
|
| 540 |
+
|
| 541 |
+
# Display insights if available
|
| 542 |
+
if 'insights' in impact_analysis and impact_analysis['insights']:
|
| 543 |
+
st.subheader("Key Insights")
|
| 544 |
+
for insight in impact_analysis['insights']:
|
| 545 |
+
st.markdown(f"**{insight['title']}**: {insight['description']}")
|
| 546 |
+
|
| 547 |
+
# Display the main chart
|
| 548 |
+
if 'charts' in impact_analysis and 'main_chart' in impact_analysis['charts']:
|
| 549 |
+
st.subheader("Price Impact Over Time")
|
| 550 |
+
st.plotly_chart(impact_analysis['charts']['main_chart'], use_container_width=True)
|
| 551 |
+
|
| 552 |
+
# Create two columns for secondary charts
|
| 553 |
+
col1, col2 = st.columns(2)
|
| 554 |
+
|
| 555 |
+
# Distribution chart
|
| 556 |
+
if 'charts' in impact_analysis and 'impact_distribution' in impact_analysis['charts']:
|
| 557 |
+
with col1:
|
| 558 |
+
st.plotly_chart(impact_analysis['charts']['impact_distribution'], use_container_width=True)
|
| 559 |
+
|
| 560 |
+
# Cumulative impact chart
|
| 561 |
+
if 'charts' in impact_analysis and 'cumulative_impact' in impact_analysis['charts']:
|
| 562 |
+
with col2:
|
| 563 |
+
st.plotly_chart(impact_analysis['charts']['cumulative_impact'], use_container_width=True)
|
| 564 |
+
|
| 565 |
+
# Hourly impact chart
|
| 566 |
+
if 'charts' in impact_analysis and 'hourly_impact' in impact_analysis['charts']:
|
| 567 |
+
st.plotly_chart(impact_analysis['charts']['hourly_impact'], use_container_width=True)
|
| 568 |
+
|
| 569 |
+
# Detailed transactions with impact
|
| 570 |
+
if not impact_analysis['transactions_with_impact'].empty:
|
| 571 |
+
st.subheader("Transactions with Price Impact")
|
| 572 |
+
# Convert numeric columns to have 2 decimal places for better display
|
| 573 |
+
display_df = impact_analysis['transactions_with_impact'].copy()
|
| 574 |
+
for col in ['impact_pct', 'pre_price', 'post_price', 'cumulative_impact']:
|
| 575 |
+
if col in display_df.columns:
|
| 576 |
+
display_df[col] = display_df[col].apply(lambda x: f"{float(x):.2f}%" if pd.notnull(x) else "N/A")
|
| 577 |
+
|
| 578 |
+
st.dataframe(display_df, use_container_width=True)
|
| 579 |
+
else:
|
| 580 |
+
st.info("No transaction-specific price impact data available")
|
| 581 |
+
else:
|
| 582 |
+
st.info("No price impact data available for the given parameters")
|
| 583 |
+
else:
|
| 584 |
+
st.info("Enable Price Impact Analysis and track transactions to see price effects")
|
| 585 |
+
|
| 586 |
+
with tab4:
|
| 587 |
+
st.header("Manipulation Alerts")
|
| 588 |
+
if enable_manipulation_detection and detect_button and wallet_addresses:
|
| 589 |
+
with st.spinner("Detecting potential manipulation..."):
|
| 590 |
+
wallet_list = [addr.strip() for addr in wallet_addresses.split("\n") if addr.strip()]
|
| 591 |
+
|
| 592 |
+
# Function to detect manipulation
|
| 593 |
+
def detect_manipulation(wallets, start_date, end_date, sensitivity):
|
| 594 |
+
try:
|
| 595 |
+
transactions_df = arbiscan_client.fetch_whale_transactions(addresses=wallets, max_pages=5)
|
| 596 |
+
if transactions_df.empty:
|
| 597 |
+
st.warning("No transactions found for the specified addresses")
|
| 598 |
+
return []
|
| 599 |
+
|
| 600 |
+
pump_dump = detection.detect_pump_and_dump(transactions_df, sensitivity)
|
| 601 |
+
wash_trades = detection.detect_wash_trading(transactions_df, wallets, sensitivity)
|
| 602 |
+
return pump_dump + wash_trades
|
| 603 |
+
except Exception as e:
|
| 604 |
+
st.error(f"Error detecting manipulation: {str(e)}")
|
| 605 |
+
return []
|
| 606 |
+
|
| 607 |
+
alerts = detect_manipulation(
|
| 608 |
+
wallets=wallet_list,
|
| 609 |
+
start_date=start_date,
|
| 610 |
+
end_date=end_date,
|
| 611 |
+
sensitivity=sensitivity
|
| 612 |
+
)
|
| 613 |
+
|
| 614 |
+
if alerts:
|
| 615 |
+
for i, alert in enumerate(alerts):
|
| 616 |
+
alert_color = "red" if alert['risk_level'] == "High" else "orange" if alert['risk_level'] == "Medium" else "blue"
|
| 617 |
+
|
| 618 |
+
with st.expander(f" {alert['type']} - Risk: {alert['risk_level']}", expanded=i==0):
|
| 619 |
+
st.markdown(f"<h4 style='color:{alert_color}'>{alert['title']}</h4>", unsafe_allow_html=True)
|
| 620 |
+
st.write(f"**Description:** {alert['description']}")
|
| 621 |
+
st.write(f"**Detection Time:** {alert['detection_time']}")
|
| 622 |
+
st.write(f"**Involved Addresses:** {', '.join(alert['addresses'])}")
|
| 623 |
+
|
| 624 |
+
# Display evidence
|
| 625 |
+
if 'evidence' in alert and alert['evidence'] is not None and not (isinstance(alert['evidence'], pd.DataFrame) and alert['evidence'].empty):
|
| 626 |
+
st.subheader("Evidence")
|
| 627 |
+
try:
|
| 628 |
+
evidence_df = alert['evidence']
|
| 629 |
+
if isinstance(evidence_df, str):
|
| 630 |
+
# Try to convert from JSON string if needed
|
| 631 |
+
evidence_df = pd.read_json(evidence_df)
|
| 632 |
+
st.dataframe(evidence_df, use_container_width=True)
|
| 633 |
+
except Exception as e:
|
| 634 |
+
st.error(f"Error displaying evidence: {str(e)}")
|
| 635 |
+
|
| 636 |
+
# Display chart if available
|
| 637 |
+
if 'chart' in alert and alert['chart'] is not None:
|
| 638 |
+
try:
|
| 639 |
+
st.plotly_chart(alert['chart'], use_container_width=True)
|
| 640 |
+
except Exception as e:
|
| 641 |
+
st.error(f"Error displaying chart: {str(e)}")
|
| 642 |
+
else:
|
| 643 |
+
st.success("No manipulation tactics detected for the given parameters")
|
| 644 |
+
else:
|
| 645 |
+
st.info("Enable Manipulation Detection and click 'Detect Manipulation' to scan for suspicious activity")
|
| 646 |
+
|
| 647 |
+
with tab5:
|
| 648 |
+
st.header("Reports & Visualizations")
|
| 649 |
+
|
| 650 |
+
# Report type selection
|
| 651 |
+
report_type = st.selectbox(
|
| 652 |
+
"Select Report Type",
|
| 653 |
+
["Transaction Summary", "Pattern Analysis", "Price Impact", "Manipulation Detection", "Complete Analysis"]
|
| 654 |
+
)
|
| 655 |
+
|
| 656 |
+
# Export format
|
| 657 |
+
export_format = st.radio(
|
| 658 |
+
"Export Format",
|
| 659 |
+
["CSV", "PDF", "PNG"],
|
| 660 |
+
horizontal=True
|
| 661 |
+
)
|
| 662 |
+
|
| 663 |
+
# Generate report button
|
| 664 |
+
if st.button("Generate Report"):
|
| 665 |
+
if wallet_addresses:
|
| 666 |
+
with st.spinner("Generating report..."):
|
| 667 |
+
wallet_list = [addr.strip() for addr in wallet_addresses.split("\n") if addr.strip()]
|
| 668 |
+
|
| 669 |
+
if CREW_ENABLED and crew_system is not None:
|
| 670 |
+
try:
|
| 671 |
+
with st.spinner("Generating AI analysis report..."):
|
| 672 |
+
# Check if crew_system has llm attribute defined
|
| 673 |
+
if not hasattr(crew_system, 'llm') or crew_system.llm is None:
|
| 674 |
+
raise ValueError("LLM not initialized in crew system")
|
| 675 |
+
|
| 676 |
+
report = crew_system.generate_market_manipulation_report(wallet_addresses=wallet_list)
|
| 677 |
+
st.markdown(f"## AI Analysis Report")
|
| 678 |
+
st.markdown(report['content'])
|
| 679 |
+
|
| 680 |
+
if 'charts' in report and report['charts']:
|
| 681 |
+
for i, chart in enumerate(report['charts']):
|
| 682 |
+
st.plotly_chart(chart, use_container_width=True)
|
| 683 |
+
except Exception as e:
|
| 684 |
+
st.error(f"CrewAI report generation failed: {str(e)}")
|
| 685 |
+
st.warning("Using direct analysis instead")
|
| 686 |
+
|
| 687 |
+
# Fallback to direct analysis
|
| 688 |
+
with st.spinner("Generating basic analysis..."):
|
| 689 |
+
insights = detection.generate_manipulation_insights(transactions=st.session_state.transactions_data)
|
| 690 |
+
st.markdown(f"## Potential Manipulation Insights")
|
| 691 |
+
|
| 692 |
+
for insight in insights:
|
| 693 |
+
st.markdown(f"**{insight['title']}**\n{insight['description']}")
|
| 694 |
+
else:
|
| 695 |
+
st.error("Failed to generate report: CrewAI is not enabled")
|
| 696 |
+
else:
|
| 697 |
+
st.error("Please enter wallet addresses to generate a report")
|
| 698 |
+
|
| 699 |
+
# Footer with instructions
|
| 700 |
+
st.markdown("---")
|
| 701 |
+
with st.expander("How to Use"):
|
| 702 |
+
st.markdown("""
|
| 703 |
+
### Typical Workflow
|
| 704 |
+
|
| 705 |
+
1. **Input wallet addresses** in the sidebar - these are the whale wallets you want to track
|
| 706 |
+
2. **Set the minimum threshold** for transaction size (token amount or USD value)
|
| 707 |
+
3. **Select time period** for analysis
|
| 708 |
+
4. **Click 'Track Transactions'** to see large transfers for these wallets
|
| 709 |
+
5. **Enable additional analysis** like pattern recognition or manipulation detection
|
| 710 |
+
6. **Export reports** for further analysis or record-keeping
|
| 711 |
+
|
| 712 |
+
### API Keys
|
| 713 |
+
|
| 714 |
+
This app requires two API keys to function properly:
|
| 715 |
+
- **ARBISCAN_API_KEY** - For accessing Arbitrum blockchain data
|
| 716 |
+
- **GEMINI_API_KEY** - For real-time token price data
|
| 717 |
+
|
| 718 |
+
These should be stored in a `.env` file in the project root.
|
| 719 |
+
""")
|
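The API-keys note above can be satisfied with a `.env` file in the project root. A minimal sketch with placeholder values (never commit real keys; the variable names match those read by the app, the values here are illustrative):

```shell
# .env (project root): placeholder values only
# Obtain an Arbiscan key from https://arbiscan.io/myapikey
ARBISCAN_API_KEY=your-arbiscan-api-key
GEMINI_API_KEY=your-gemini-api-key
OPENAI_API_KEY=your-openai-api-key
```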
modules/__init__.py ADDED
@@ -0,0 +1 @@
modules/__pycache__/__init__.cpython-312.pyc ADDED: Binary file (157 Bytes)
modules/__pycache__/api_client.cpython-312.pyc ADDED: Binary file (30.1 kB)
modules/__pycache__/crew_system.cpython-312.pyc ADDED: Binary file (36.2 kB)
modules/__pycache__/crew_tools.cpython-312.pyc ADDED: Binary file (18.3 kB)
modules/__pycache__/data_processor.cpython-312.pyc ADDED: Binary file (44.1 kB)
modules/__pycache__/detection.cpython-312.pyc ADDED: Binary file (22.3 kB)
modules/__pycache__/visualizer.cpython-312.pyc ADDED: Binary file (23.2 kB)
modules/api_client.py ADDED
@@ -0,0 +1,768 @@
import requests
import json
import time
import logging
from datetime import datetime
import pandas as pd
from typing import Dict, List, Optional, Union, Any

class ArbiscanClient:
    """
    Client to interact with the Arbiscan API for fetching on-chain data from Arbitrum
    """

    def __init__(self, api_key: str):
        self.api_key = api_key
        self.base_url = "https://api.arbiscan.io/api"
        self.rate_limit_delay = 0.2  # Delay between API calls to avoid rate limiting (200ms)

        # Add caching to improve performance
        self._transaction_cache = {}
        self._last_api_call_time = 0

        # Configure debug logging - set to True for verbose output, False for minimal output
        self.verbose_debug = False

    def _make_request(self, params: Dict[str, str]) -> Dict[str, Any]:
        """
        Make a request to the Arbiscan API with rate limiting
        """
        params["apikey"] = self.api_key

        # Implement rate limiting
        current_time = time.time()
        time_since_last_call = current_time - self._last_api_call_time
        if time_since_last_call < self.rate_limit_delay:
            time.sleep(self.rate_limit_delay - time_since_last_call)
        self._last_api_call_time = time.time()

        try:
            # Log the request details but only in verbose mode
            if self.verbose_debug:
                debug_params = params.copy()
                debug_params.pop("apikey", None)
                logging.debug(f"API Request: {self.base_url}")
                logging.debug(f"Params: {json.dumps(debug_params, indent=2)}")

            response = requests.get(self.base_url, params=params)

            # Print response status and URL only in verbose mode
            if self.verbose_debug:
                logging.debug(f"Response Status: {response.status_code}")
                logging.debug(f"Full URL: {response.url.replace(self.api_key, 'API_KEY_REDACTED')}")

            response.raise_for_status()

            # Parse the JSON response
            json_data = response.json()

            # Log the response structure but only in verbose mode
            if self.verbose_debug:
                result_preview = str(json_data.get('result', ''))[:100] + '...' if len(str(json_data.get('result', ''))) > 100 else str(json_data.get('result', ''))
                logging.debug(f"Response Status: {json_data.get('status')}")
                logging.debug(f"Response Message: {json_data.get('message', 'No message')}")
                logging.debug(f"Result Preview: {result_preview}")

            # Check for API-level errors in the response
            status = json_data.get('status')
            message = json_data.get('message', 'No message')
            if status == '0' and message != 'No transactions found':
                logging.warning(f"API Error: {message}")

            return json_data

        except requests.exceptions.HTTPError as e:
            logging.error(f"HTTP Error in API Request: {e.response.status_code}")
            raise

        except requests.exceptions.ConnectionError as e:
            logging.error(f"Connection Error in API Request: {str(e)}")
            raise

        except requests.exceptions.Timeout as e:
            logging.error(f"Timeout in API Request: {str(e)}")
            raise

        except requests.exceptions.RequestException as e:
            logging.error(f"API Request failed: {str(e)}")
            print(f"ERROR - URL: {self.base_url}")
            print(f"ERROR - Method: {params.get('module')}/{params.get('action')}")
            return {"status": "0", "message": f"Error: {str(e)}", "result": []}
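The inter-call throttling in `_make_request` reduces to a small amount of bookkeeping: remember when the last call was made and sleep only for the remainder of the delay window. A minimal standalone sketch (the `RateLimiter` class name and the timings below are illustrative, not part of this repository):

```python
import time

class RateLimiter:
    """Enforces a minimum delay between successive calls (same logic as _make_request)."""

    def __init__(self, min_delay: float = 0.2):
        self.min_delay = min_delay
        self._last_call = 0.0

    def wait(self) -> None:
        # Sleep only for the unexpired portion of the delay window
        elapsed = time.time() - self._last_call
        if elapsed < self.min_delay:
            time.sleep(self.min_delay - elapsed)
        self._last_call = time.time()

limiter = RateLimiter(min_delay=0.05)
start = time.time()
for _ in range(3):
    limiter.wait()
elapsed = time.time() - start
# First call passes immediately; the next two each wait ~0.05s
print(elapsed >= 0.09)
```

Sleeping only for the remainder (rather than a fixed delay before every request) means callers that are already slow pay no extra latency.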
def get_eth_balance(self, address: str) -> float:
|
| 93 |
+
"""
|
| 94 |
+
Get the ETH balance of an address
|
| 95 |
+
|
| 96 |
+
Args:
|
| 97 |
+
address: Wallet address
|
| 98 |
+
|
| 99 |
+
Returns:
|
| 100 |
+
ETH balance as a float
|
| 101 |
+
"""
|
| 102 |
+
params = {
|
| 103 |
+
"module": "account",
|
| 104 |
+
"action": "balance",
|
| 105 |
+
"address": address,
|
| 106 |
+
"tag": "latest"
|
| 107 |
+
}
|
| 108 |
+
|
| 109 |
+
result = self._make_request(params)
|
| 110 |
+
|
| 111 |
+
if result.get("status") == "1":
|
| 112 |
+
# Convert wei to ETH
|
| 113 |
+
wei_balance = int(result.get("result", "0"))
|
| 114 |
+
eth_balance = wei_balance / 10**18
|
| 115 |
+
return eth_balance
|
| 116 |
+
else:
|
| 117 |
+
return 0.0
|
| 118 |
+
|
| 119 |
+
def get_token_balance(self, address: str, token_address: str) -> float:
|
| 120 |
+
"""
|
| 121 |
+
Get the token balance of an address for a specific token
|
| 122 |
+
|
| 123 |
+
Args:
|
| 124 |
+
address: Wallet address
|
| 125 |
+
token_address: Token contract address
|
| 126 |
+
|
| 127 |
+
Returns:
|
| 128 |
+
Token balance as a float
|
| 129 |
+
"""
|
| 130 |
+
params = {
|
| 131 |
+
"module": "account",
|
| 132 |
+
"action": "tokenbalance",
|
| 133 |
+
"address": address,
|
| 134 |
+
"contractaddress": token_address,
|
| 135 |
+
"tag": "latest"
|
| 136 |
+
}
|
| 137 |
+
|
| 138 |
+
result = self._make_request(params)
|
| 139 |
+
|
| 140 |
+
if result.get("status") == "1":
|
| 141 |
+
# Get token decimals and convert to proper amount
|
| 142 |
+
decimals = self.get_token_decimals(token_address)
|
| 143 |
+
raw_balance = int(result.get("result", "0"))
|
| 144 |
+
token_balance = raw_balance / 10**decimals
|
| 145 |
+
return token_balance
|
| 146 |
+
else:
|
| 147 |
+
return 0.0
|
| 148 |
+
|
| 149 |
+
def get_token_decimals(self, token_address: str) -> int:
|
| 150 |
+
"""
|
| 151 |
+
Get the number of decimals for a token
|
| 152 |
+
|
| 153 |
+
Args:
|
| 154 |
+
token_address: Token contract address
|
| 155 |
+
|
| 156 |
+
Returns:
|
| 157 |
+
Number of decimals (default: 18)
|
| 158 |
+
"""
|
| 159 |
+
params = {
|
| 160 |
+
"module": "token",
|
| 161 |
+
"action": "getToken",
|
| 162 |
+
"contractaddress": token_address
|
| 163 |
+
}
|
| 164 |
+
|
| 165 |
+
result = self._make_request(params)
|
| 166 |
+
|
| 167 |
+
if result.get("status") == "1":
|
| 168 |
+
token_info = result.get("result", {})
|
| 169 |
+
return int(token_info.get("divisor", "18"))
|
| 170 |
+
else:
|
| 171 |
+
# Default to 18 decimals (most ERC-20 tokens)
|
| 172 |
+
return 18
|
| 173 |
+
|
| 174 |
+
def get_token_transfers(self,
|
| 175 |
+
address: str,
|
| 176 |
+
contract_address: Optional[str] = None,
|
| 177 |
+
start_block: int = 0,
|
| 178 |
+
end_block: int = 99999999,
|
| 179 |
+
page: int = 1,
|
| 180 |
+
offset: int = 100,
|
| 181 |
+
sort: str = "desc") -> List[Dict[str, Any]]:
|
| 182 |
+
"""
|
| 183 |
+
Get token transfers for an address
|
| 184 |
+
|
| 185 |
+
Args:
|
| 186 |
+
address: Wallet address
|
| 187 |
+
contract_address: Optional token contract address to filter by
|
| 188 |
+
start_block: Starting block number
|
| 189 |
+
end_block: Ending block number
|
| 190 |
+
page: Page number
|
| 191 |
+
offset: Number of results per page
|
| 192 |
+
sort: Sort order ("asc" or "desc")
|
| 193 |
+
|
| 194 |
+
Returns:
|
| 195 |
+
List of token transfers
|
| 196 |
+
"""
|
| 197 |
+
params = {
|
| 198 |
+
"module": "account",
|
| 199 |
+
"action": "tokentx",
|
| 200 |
+
"address": address,
|
| 201 |
+
"startblock": str(start_block),
|
| 202 |
+
"endblock": str(end_block),
|
| 203 |
+
"page": str(page),
|
| 204 |
+
"offset": str(offset),
|
| 205 |
+
"sort": sort
|
| 206 |
+
}
|
| 207 |
+
|
| 208 |
+
# Add contract address if specified
|
| 209 |
+
if contract_address:
|
| 210 |
+
params["contractaddress"] = contract_address
|
| 211 |
+
|
| 212 |
+
result = self._make_request(params)
|
| 213 |
+
|
| 214 |
+
if result.get("status") == "1":
|
| 215 |
+
return result.get("result", [])
|
| 216 |
+
else:
|
| 217 |
+
message = result.get("message", "Unknown error")
|
| 218 |
+
if "No transactions found" in message:
|
| 219 |
+
return []
|
| 220 |
+
else:
|
| 221 |
+
logging.warning(f"Error fetching token transfers: {message}")
|
| 222 |
+
return []
|
| 223 |
+
|
| 224 |
+
def fetch_all_token_transfers(self,
|
| 225 |
+
address: str,
|
| 226 |
+
contract_address: Optional[str] = None,
|
| 227 |
+
start_block: int = 0,
|
| 228 |
+
end_block: int = 99999999,
|
| 229 |
+
max_pages: int = 10) -> List[Dict[str, Any]]:
|
| 230 |
+
"""
|
| 231 |
+
Fetch all token transfers for an address, paginating through results
|
| 232 |
+
|
| 233 |
+
Args:
|
| 234 |
+
address: Wallet address
|
| 235 |
+
contract_address: Optional token contract address to filter by
|
| 236 |
+
start_block: Starting block number
|
| 237 |
+
end_block: Ending block number
|
| 238 |
+
max_pages: Maximum number of pages to fetch
|
| 239 |
+
|
| 240 |
+
Returns:
|
| 241 |
+
List of all token transfers
|
| 242 |
+
"""
|
| 243 |
+
all_transfers = []
|
| 244 |
+
offset = 100 # Results per page (API limit)
|
| 245 |
+
|
| 246 |
+
for page in range(1, max_pages + 1):
|
| 247 |
+
try:
|
| 248 |
+
transfers = self.get_token_transfers(
|
| 249 |
+
address=address,
|
| 250 |
+
contract_address=contract_address,
|
| 251 |
+
start_block=start_block,
|
| 252 |
+
end_block=end_block,
|
| 253 |
+
page=page,
|
| 254 |
+
offset=offset
|
| 255 |
+
)
|
| 256 |
+
|
| 257 |
+
# No more transfers, break the loop
|
| 258 |
+
if not transfers:
|
| 259 |
+
break
|
| 260 |
+
|
| 261 |
+
all_transfers.extend(transfers)
|
| 262 |
+
|
| 263 |
+
# If we got fewer results than the offset, we've reached the end
|
| 264 |
+
if len(transfers) < offset:
|
| 265 |
+
break
|
| 266 |
+
|
| 267 |
+
except Exception as e:
|
| 268 |
+
logging.error(f"Error fetching page {page} of token transfers: {str(e)}")
|
| 269 |
+
break
|
| 270 |
+
|
| 271 |
+
return all_transfers
|
| 272 |
+
|
| 273 |
+
def fetch_whale_transactions(self,
|
| 274 |
+
addresses: List[str],
|
| 275 |
+
token_address: Optional[str] = None,
|
| 276 |
+
min_token_amount: Optional[float] = None,
|
| 277 |
+
min_usd_value: Optional[float] = None,
|
| 278 |
+
start_block: int = 0,
|
| 279 |
+
end_block: int = 99999999,
|
| 280 |
+
max_pages: int = 10) -> pd.DataFrame:
|
| 281 |
+
"""
|
| 282 |
+
Fetch whale transactions for a list of addresses
|
| 283 |
+
|
| 284 |
+
Args:
|
| 285 |
+
addresses: List of wallet addresses
|
| 286 |
+
token_address: Optional token contract address to filter by
|
| 287 |
+
min_token_amount: Minimum token amount to be considered a whale transaction
|
| 288 |
+
min_usd_value: Minimum USD value to be considered a whale transaction
|
| 289 |
+
start_block: Starting block number
|
| 290 |
+
end_block: Ending block number
|
| 291 |
+
max_pages: Maximum number of pages to fetch per address (default: 10)
|
| 292 |
+
|
| 293 |
+
Returns:
|
| 294 |
+
DataFrame of whale transactions
|
| 295 |
+
"""
|
| 296 |
+
try:
|
| 297 |
+
# Create a cache key based on parameters
|
| 298 |
+
cache_key = f"{','.join(addresses)}_{token_address}_{min_token_amount}_{min_usd_value}_{start_block}_{end_block}_{max_pages}"
|
| 299 |
+
|
| 300 |
+
# Check if we have cached results
|
| 301 |
+
if cache_key in self._transaction_cache:
|
| 302 |
+
modules/api_client.py (continued):

```python
                logging.info(f"Using cached transactions for {len(addresses)} addresses")
                return self._transaction_cache[cache_key]

            all_transfers = []

            logging.info(f"Fetching whale transactions for {len(addresses)} addresses")
            logging.info(f"Token address filter: {token_address if token_address else 'None'}")
            logging.info(f"Min token amount: {min_token_amount}")
            logging.info(f"Min USD value: {min_usd_value}")

            for i, address in enumerate(addresses):
                try:
                    logging.info(f"Processing address {i+1}/{len(addresses)}: {address}")

                    # Create address-specific cache key
                    addr_cache_key = f"{address}_{token_address}_{start_block}_{end_block}_{max_pages}"

                    # Check if we have cached results for this specific address
                    if addr_cache_key in self._transaction_cache:
                        transfers = self._transaction_cache[addr_cache_key]
                        logging.info(f"Using cached {len(transfers)} transfers for address {address}")
                    else:
                        transfers = self.fetch_all_token_transfers(
                            address=address,
                            contract_address=token_address,
                            start_block=start_block,
                            end_block=end_block,
                            max_pages=max_pages
                        )
                        logging.info(f"Found {len(transfers)} transfers for address {address}")
                        # Cache the results for this address
                        self._transaction_cache[addr_cache_key] = transfers

                    all_transfers.extend(transfers)
                except Exception as e:
                    logging.error(f"Failed to fetch transactions for address {address}: {str(e)}")
                    continue

            logging.info(f"Total transfers found: {len(all_transfers)}")

            if not all_transfers:
                logging.warning("No whale transactions found for the specified addresses")
                return pd.DataFrame()

            # Convert to DataFrame
            logging.info("Converting transfers to DataFrame")
            df = pd.DataFrame(all_transfers)

            # Log the column names
            logging.info(f"DataFrame created with {len(df)} rows and {len(df.columns)} columns")
            logging.info(f"Columns: {', '.join(df.columns[:5])}...")

            # Apply token amount filter if specified
            if min_token_amount is not None:
                logging.info(f"Applying min token amount filter: {min_token_amount}")
                # Convert raw values to token amounts, then filter
                df['tokenAmount'] = df['value'].astype(float) / (10 ** df['tokenDecimal'].astype(int))
                df = df[df['tokenAmount'] >= min_token_amount]
                logging.info(f"After token amount filtering: {len(df)}/{len(all_transfers)} rows remain")

            # Apply USD value filter if specified (this would require price data)
            if min_usd_value is not None and 'tokenAmount' in df.columns:
                logging.info("USD value filtering is not implemented yet")
                # This would require token price data, which we don't have yet
                # df = df[df['usd_value'] >= min_usd_value]

            # Convert timestamp to datetime
            if 'timeStamp' in df.columns:
                logging.info("Converting timestamp to datetime")
                try:
                    df['timeStamp'] = pd.to_datetime(df['timeStamp'].astype(float), unit='s')
                except Exception as e:
                    logging.error(f"Error converting timestamp: {str(e)}")

            logging.info(f"Final DataFrame has {len(df)} rows")

            # Cache the final result
            self._transaction_cache[cache_key] = df

            return df

        except Exception as e:
            logging.error(f"Error fetching whale transactions: {str(e)}")
            return pd.DataFrame()
```
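The `min_token_amount` filter above divides the raw ERC-20 `value` string by `10 ** tokenDecimal` to recover human-readable token amounts before comparing against the threshold. A standalone sketch of that normalization (the sample rows are fabricated; the column names mirror Arbiscan's `tokentx` response):

```python
import pandas as pd

# Transfers shaped like Arbiscan's tokentx response: `value` is a raw integer
# string scaled by 10**tokenDecimal. These rows are illustrative only.
transfers = [
    {"hash": "0xaaa", "value": "5000000000000000000", "tokenDecimal": "18"},  # 5.0 tokens
    {"hash": "0xbbb", "value": "250000000", "tokenDecimal": "6"},             # 250.0 tokens
    {"hash": "0xccc", "value": "1000000000000000000", "tokenDecimal": "18"},  # 1.0 token
]

df = pd.DataFrame(transfers)

# Normalize raw values into token amounts, then filter on a threshold.
df["tokenAmount"] = df["value"].astype(float) / (10 ** df["tokenDecimal"].astype(int))
large = df[df["tokenAmount"] >= 2.0]

print(large["hash"].tolist())  # → ['0xaaa', '0xbbb']
```

One caveat of the float cast: `value` arrives as a decimal string, and doubles only represent integers exactly up to 2**53, so very large raw balances can lose precision; parsing with `decimal.Decimal` would be the lossless alternative.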
```python
    def get_internal_transactions(self,
                                  address: str,
                                  start_block: int = 0,
                                  end_block: int = 99999999,
                                  page: int = 1,
                                  offset: int = 100,
                                  sort: str = "desc") -> List[Dict[str, Any]]:
        """
        Get internal transactions for an address

        Args:
            address: Wallet address
            start_block: Starting block number
            end_block: Ending block number
            page: Page number
            offset: Number of results per page
            sort: Sort order ("asc" or "desc")

        Returns:
            List of internal transactions
        """
        params = {
            "module": "account",
            "action": "txlistinternal",
            "address": address,
            "startblock": str(start_block),
            "endblock": str(end_block),
            "page": str(page),
            "offset": str(offset),
            "sort": sort
        }

        result = self._make_request(params)

        if result.get("status") == "1":
            return result.get("result", [])
        else:
            message = result.get("message", "Unknown error")
            if "No transactions found" in message:
                return []
            else:
                logging.warning(f"Error fetching internal transactions: {message}")
                return []
```
```python
class GeminiClient:
    """
    Client to interact with the Gemini API for fetching token prices
    """

    def __init__(self, api_key: str):
        self.api_key = api_key
        self.base_url = "https://api.gemini.com/v1"
        # Cache prices to avoid repetitive API calls
        self._price_cache = {}
        # Track API errors to avoid flooding logs
        self._error_count = {}
        self._last_api_call = 0  # For rate limiting

    def get_current_price(self, symbol: str) -> Optional[float]:
        """
        Get the current price of a token

        Args:
            symbol: Token symbol (e.g., "ETHUSD")

        Returns:
            Current price as a float or None if not found
        """
        try:
            url = f"{self.base_url}/pubticker/{symbol}"
            response = requests.get(url)
            response.raise_for_status()
            data = response.json()
            return float(data.get("last", 0))
        except requests.exceptions.RequestException as e:
            logging.error(f"Error fetching price from Gemini API: {e}")
            return None

    def get_historical_prices(self,
                              symbol: str,
                              start_time: datetime,
                              end_time: datetime) -> Optional[pd.DataFrame]:
        """
        Get historical prices for a token within a time range

        Args:
            symbol: Token symbol (e.g., "ETHUSD")
            start_time: Start datetime
            end_time: End datetime

        Returns:
            DataFrame of historical prices with timestamps
        """
        # Implement simple rate limiting
        current_time = time.time()
        if current_time - self._last_api_call < 0.05:  # 50ms minimum between calls
            time.sleep(0.05)
        self._last_api_call = current_time

        # Create a cache key based on the parameters
        cache_key = f"{symbol}_{int(start_time.timestamp())}_{int(end_time.timestamp())}"

        # Check if we already have this data cached
        if cache_key in self._price_cache:
            return self._price_cache[cache_key]

        # Define the error key before the try block so both exception
        # handlers below can reference it safely
        error_key = f"error_{symbol}"

        try:
            # Convert datetime to milliseconds
            start_ms = int(start_time.timestamp() * 1000)
            end_ms = int(end_time.timestamp() * 1000)

            url = f"{self.base_url}/trades/{symbol}"
            params = {
                "limit_trades": 500,
                "timestamp": start_ms
            }

            # Check if we've seen too many errors for this symbol
            if self._error_count.get(error_key, 0) > 10:
                # If we've already had too many errors for this symbol, don't try again
                return None

            response = requests.get(url, params=params)
            response.raise_for_status()
            trades = response.json()

            # Reset error count on success
            self._error_count[error_key] = 0

            # Filter trades within the time range
            filtered_trades = [
                trade for trade in trades
                if start_ms <= trade.get("timestampms", 0) <= end_ms
            ]

            if not filtered_trades:
                # Cache negative result to avoid future lookups
                self._price_cache[cache_key] = None
                return None

            # Convert to DataFrame
            df = pd.DataFrame(filtered_trades)

            # Convert timestamp to datetime
            df['timestamp'] = pd.to_datetime(df['timestampms'], unit='ms')

            # Select and rename columns
            result_df = df[['timestamp', 'price', 'amount']].copy()
            result_df.columns = ['Timestamp', 'Price', 'Amount']

            # Convert price to float
            result_df['Price'] = result_df['Price'].astype(float)

            # Cache the result
            self._price_cache[cache_key] = result_df
            return result_df

        except requests.exceptions.HTTPError as e:
            # Handle HTTP errors more efficiently
            self._error_count[error_key] = self._error_count.get(error_key, 0) + 1

            # Only log the first few occurrences of each error
            if self._error_count[error_key] <= 3:
                logging.warning(f"HTTP error fetching price for {symbol}: {e.response.status_code}")
            return None

        except Exception as e:
            # For other errors, use a similar approach
            self._error_count[error_key] = self._error_count.get(error_key, 0) + 1

            if self._error_count[error_key] <= 3:
                logging.error(f"Error fetching prices for {symbol}: {str(e)}")
            return None

    def get_price_at_time(self,
                          symbol: str,
                          timestamp: datetime) -> Optional[float]:
        """
        Get the approximate price of a token at a specific time

        Args:
            symbol: Token symbol (e.g., "ETHUSD")
            timestamp: Target datetime

        Returns:
            Price at the specified time as a float or None if not found
        """
        # Look for prices 5 minutes before and after the target time
        start_time = timestamp - pd.Timedelta(minutes=5)
        end_time = timestamp + pd.Timedelta(minutes=5)

        prices_df = self.get_historical_prices(symbol, start_time, end_time)

        if prices_df is None or prices_df.empty:
            return None

        # Find the closest price
        prices_df['time_diff'] = abs(prices_df['Timestamp'] - timestamp)
        closest_price = prices_df.loc[prices_df['time_diff'].idxmin(), 'Price']

        return closest_price

    def get_price_impact(self,
                         symbol: str,
                         transaction_time: datetime,
                         lookback_minutes: int = 5,
                         lookahead_minutes: int = 5) -> Dict[str, Any]:
        """
        Analyze the price impact before and after a transaction

        Args:
            symbol: Token symbol (e.g., "ETHUSD")
            transaction_time: Transaction datetime
            lookback_minutes: Minutes to look back before the transaction
            lookahead_minutes: Minutes to look ahead after the transaction

        Returns:
            Dictionary with price impact metrics
        """
        start_time = transaction_time - pd.Timedelta(minutes=lookback_minutes)
        end_time = transaction_time + pd.Timedelta(minutes=lookahead_minutes)

        prices_df = self.get_historical_prices(symbol, start_time, end_time)

        if prices_df is None or prices_df.empty:
            return {
                "pre_price": None,
                "post_price": None,
                "impact_pct": None,
                "prices_df": None
            }

        # Find pre and post transaction prices
        pre_prices = prices_df[prices_df['Timestamp'] < transaction_time]
        post_prices = prices_df[prices_df['Timestamp'] >= transaction_time]

        pre_price = pre_prices['Price'].iloc[-1] if not pre_prices.empty else None
        post_price = post_prices['Price'].iloc[0] if not post_prices.empty else None

        # Calculate impact percentage
        impact_pct = None
        if pre_price is not None and post_price is not None:
            impact_pct = ((post_price - pre_price) / pre_price) * 100

        return {
            "pre_price": pre_price,
            "post_price": post_price,
            "impact_pct": impact_pct,
            "prices_df": prices_df
        }

    def fetch_historical_prices(self, token_symbol: str, timestamp) -> Dict[str, Any]:
        """Fetch historical price data for a token at a specific timestamp

        Args:
            token_symbol: Token symbol (e.g., "ETH")
            timestamp: Timestamp (can be int, float, datetime, or pandas Timestamp)

        Returns:
            Dictionary with price data
        """
        # Convert timestamp to integer if it's not already
        timestamp_value = 0
        try:
            # Handle different timestamp types
            if isinstance(timestamp, (int, float)):
                timestamp_value = int(timestamp)
            elif isinstance(timestamp, pd.Timestamp):
                timestamp_value = int(timestamp.timestamp())
            elif isinstance(timestamp, datetime):
                timestamp_value = int(timestamp.timestamp())
            elif isinstance(timestamp, str):
                # Try to parse string as timestamp
                dt = pd.to_datetime(timestamp)
                timestamp_value = int(dt.timestamp())
            else:
                # Default to current time if invalid type
                logging.warning(f"Invalid timestamp type: {type(timestamp)}, using current time")
                timestamp_value = int(time.time())
        except Exception as e:
            logging.warning(f"Error converting timestamp {timestamp}: {str(e)}, using current time")
            timestamp_value = int(time.time())

        # Check cache first
        cache_key = f"{token_symbol}_{timestamp_value}"
        if cache_key in self._price_cache:
            return self._price_cache[cache_key]

        # Implement rate limiting
        current_time = time.time()
        if current_time - self._last_api_call < 0.05:  # 50ms minimum between calls
            time.sleep(0.05)
        self._last_api_call = current_time

        # Check error count for this symbol
        error_key = f"error_{token_symbol}"
        if self._error_count.get(error_key, 0) > 10:
            # Too many errors, return cached failure
            return {
                'symbol': token_symbol,
                'timestamp': timestamp_value,
                'price': None,
                'status': 'error',
                'error': 'Too many previous errors'
            }

        try:
            url = f"{self.base_url}/trades/{token_symbol}USD"
            params = {
                'limit_trades': 500,
                'timestamp': timestamp_value * 1000  # Convert to milliseconds
            }

            response = requests.get(url, params=params)
            response.raise_for_status()
            data = response.json()

            # Reset error count on success
            self._error_count[error_key] = 0

            # Calculate average price from recent trades
            if data:
                prices = [float(trade['price']) for trade in data]
                avg_price = sum(prices) / len(prices)
                result = {
                    'symbol': token_symbol,
                    'timestamp': timestamp_value,
                    'price': avg_price,
                    'status': 'success'
                }
                # Cache success
                self._price_cache[cache_key] = result
                return result
            else:
                result = {
                    'symbol': token_symbol,
                    'timestamp': timestamp_value,
                    'price': None,
                    'status': 'no_data'
                }
                # Cache no data
                self._price_cache[cache_key] = result
                return result

        except requests.exceptions.HTTPError as e:
            # Handle HTTP errors efficiently
            self._error_count[error_key] = self._error_count.get(error_key, 0) + 1

            # Only log first few occurrences
            if self._error_count[error_key] <= 3:
                logging.warning(f"HTTP error fetching price for {token_symbol}: {e.response.status_code}")
            elif self._error_count[error_key] == 10:
                logging.warning(f"Suppressing further logs for {token_symbol} errors")

            result = {
                'symbol': token_symbol,
                'timestamp': timestamp_value,
                'price': None,
                'status': 'error',
                'error': f"HTTP {e.response.status_code}"
            }
            self._price_cache[cache_key] = result
            return result

        except Exception as e:
            # For other errors
            self._error_count[error_key] = self._error_count.get(error_key, 0) + 1

            if self._error_count[error_key] <= 3:
                logging.error(f"Error fetching prices for {token_symbol}: {str(e)}")

            result = {
                'symbol': token_symbol,
                'timestamp': timestamp_value,
                'price': None,
                'status': 'error',
                'error': str(e)
            }
            self._price_cache[cache_key] = result
            return result
```
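Three defenses recur throughout `GeminiClient`: a minimum spacing between HTTP calls, caching of both successes and failures, and a per-symbol error counter that stops retrying once a threshold is crossed. The counter behaves like a simple circuit breaker; a reduced sketch of just that part (with `MAX_ERRORS` lowered to 3 and `fetch_fn` standing in for the real HTTP call):

```python
class PriceFetcher:
    """Stop calling an endpoint for a symbol after repeated failures,
    and cache successful results to avoid repeat lookups."""

    MAX_ERRORS = 3

    def __init__(self, fetch_fn):
        self.fetch_fn = fetch_fn  # stand-in for the real HTTP request
        self.cache = {}
        self.error_count = {}

    def get(self, symbol):
        if symbol in self.cache:
            return self.cache[symbol]
        if self.error_count.get(symbol, 0) >= self.MAX_ERRORS:
            return None  # circuit open: don't even attempt the call
        try:
            price = self.fetch_fn(symbol)
            self.error_count[symbol] = 0  # reset on success
            self.cache[symbol] = price
            return price
        except Exception:
            self.error_count[symbol] = self.error_count.get(symbol, 0) + 1
            return None

calls = {"n": 0}

def always_fails(symbol):
    calls["n"] += 1
    raise RuntimeError("upstream down")

fetcher = PriceFetcher(always_fails)
for _ in range(10):
    fetcher.get("ETHUSD")

print(calls["n"])  # → 3: after MAX_ERRORS failures the breaker stops calling
```

The production class extends this with negative caching (failed lookups are cached too), which trades staleness for far fewer upstream calls when many transactions reference the same symbol and timestamp.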
modules/crew_system.py
ADDED
@@ -0,0 +1,1117 @@
````python
import os
import logging
from typing import Dict, List, Optional, Union, Any, Tuple
import pandas as pd
from datetime import datetime, timedelta
import io
import base64

from crewai import Agent, Task, Crew, Process
from langchain.tools import BaseTool
from langchain.chat_models import ChatOpenAI

from modules.api_client import ArbiscanClient, GeminiClient
from modules.data_processor import DataProcessor
from modules.crew_tools import (
    ArbiscanGetTokenTransfersTool,
    ArbiscanGetNormalTransactionsTool,
    ArbiscanGetInternalTransactionsTool,
    ArbiscanFetchWhaleTransactionsTool,
    GeminiGetCurrentPriceTool,
    GeminiGetHistoricalPricesTool,
    DataProcessorIdentifyPatternsTool,
    DataProcessorDetectAnomalousTransactionsTool,
    set_global_clients
)


class WhaleAnalysisCrewSystem:
    """
    CrewAI system for analyzing whale wallet activity and detecting market manipulation
    """

    def __init__(self, arbiscan_client: ArbiscanClient, gemini_client: GeminiClient, data_processor: DataProcessor):
        self.arbiscan_client = arbiscan_client
        self.gemini_client = gemini_client
        self.data_processor = data_processor

        # Initialize LLM
        try:
            self.llm = ChatOpenAI(
                model="gpt-4",
                temperature=0.2,
                api_key=os.getenv("OPENAI_API_KEY")
            )
        except Exception as e:
            logging.warning(f"Could not initialize LLM: {str(e)}")
            self.llm = None

        # Use a factory method to safely create tool instances
        self.setup_tools()

    def setup_tools(self):
        """Setup LangChain tools for the whale analysis crew"""
        try:
            # Setup clients
            arbiscan_client = ArbiscanClient(api_key=os.getenv("ARBISCAN_API_KEY"))
            gemini_client = GeminiClient(api_key=os.getenv("GEMINI_API_KEY"))
            data_processor = DataProcessor()

            # Set global clients first
            set_global_clients(
                arbiscan_client=arbiscan_client,
                gemini_client=gemini_client,
                data_processor=data_processor
            )

            # Create tools (no need to pass clients, they'll use globals)
            self.arbiscan_tools = [
                self._create_tool(ArbiscanGetTokenTransfersTool),
                self._create_tool(ArbiscanGetNormalTransactionsTool),
                self._create_tool(ArbiscanGetInternalTransactionsTool),
                self._create_tool(ArbiscanFetchWhaleTransactionsTool)
            ]

            self.gemini_tools = [
                self._create_tool(GeminiGetCurrentPriceTool),
                self._create_tool(GeminiGetHistoricalPricesTool)
            ]

            self.data_processor_tools = [
                self._create_tool(DataProcessorIdentifyPatternsTool),
                self._create_tool(DataProcessorDetectAnomalousTransactionsTool)
            ]

            logging.info(f"Successfully created {len(self.arbiscan_tools + self.gemini_tools + self.data_processor_tools)} tools")

        except Exception as e:
            logging.error(f"Error setting up tools: {str(e)}")
            raise Exception(f"Error setting up tools: {str(e)}")

    def _create_tool(self, tool_class, *args, **kwargs):
        """Factory method to safely create a tool with proper error handling"""
        try:
            tool = tool_class(*args, **kwargs)
            return tool
        except Exception as e:
            logging.error(f"Failed to create tool {tool_class.__name__}: {str(e)}")
            raise Exception(f"Failed to create tool {tool_class.__name__}: {str(e)}")

    def create_agents(self):
        """Create the agents for the crew"""

        # Data Collection Agent
        data_collector = Agent(
            role="Blockchain Data Collector",
            goal="Collect comprehensive whale transaction data from the blockchain",
            backstory="""You are a blockchain analytics expert specialized in extracting and
            organizing on-chain data from the Arbitrum network. You have deep knowledge of blockchain
            transaction structures and can efficiently query APIs to gather relevant whale activity.""",
            verbose=True,
            allow_delegation=True,
            tools=self.arbiscan_tools,
            llm=self.llm
        )

        # Price Analysis Agent
        price_analyst = Agent(
            role="Price Impact Analyst",
            goal="Analyze how whale transactions impact token prices",
            backstory="""You are a quantitative market analyst with expertise in correlating
            trading activity with price movements. You specialize in detecting how large trades
            influence market dynamics, and can identify unusual price patterns.""",
            verbose=True,
            allow_delegation=True,
            tools=self.gemini_tools,
            llm=self.llm
        )

        # Pattern Detection Agent
        pattern_detector = Agent(
            role="Trading Pattern Detector",
            goal="Identify recurring behavior patterns in whale trading activity",
            backstory="""You are a data scientist specialized in time-series analysis and behavioral
            pattern recognition. You excel at spotting cyclical behaviors, correlation patterns, and
            anomalous trading activities across multiple addresses.""",
            verbose=True,
            allow_delegation=True,
            tools=self.data_processor_tools,
            llm=self.llm
        )

        # Manipulation Detector Agent
        manipulation_detector = Agent(
            role="Market Manipulation Investigator",
            goal="Detect potential market manipulation in whale activity",
            backstory="""You are a financial forensics expert who has studied market manipulation
            techniques for years. You can identify pump-and-dump schemes, wash trading, spoofing,
            and other deceptive practices used by whale traders to manipulate market prices.""",
            verbose=True,
            allow_delegation=True,
            tools=self.data_processor_tools,
            llm=self.llm
        )

        # Report Generator Agent
        report_generator = Agent(
            role="Insights Reporter",
            goal="Create comprehensive, actionable reports on whale activity",
            backstory="""You are a financial data storyteller who excels at transforming complex
            blockchain data into clear, insightful narratives. You can distill technical findings
            into actionable intelligence for different audiences.""",
            verbose=True,
            allow_delegation=True,
            tools=[],
            llm=self.llm
        )

        return {
            "data_collector": data_collector,
            "price_analyst": price_analyst,
            "pattern_detector": pattern_detector,
            "manipulation_detector": manipulation_detector,
            "report_generator": report_generator
        }

    def track_large_transactions(self,
                                 wallets: List[str],
                                 start_date: datetime,
                                 end_date: datetime,
                                 threshold_value: float,
                                 threshold_type: str,
                                 token_symbol: Optional[str] = None) -> pd.DataFrame:
        """
        Track large buy/sell transactions for specified wallets

        Args:
            wallets: List of wallet addresses to track
            start_date: Start date for analysis
            end_date: End date for analysis
            threshold_value: Minimum value for transaction tracking
            threshold_type: Type of threshold ("Token Amount" or "USD Value")
            token_symbol: Symbol of token to track (only required if threshold_type is "Token Amount")

        Returns:
            DataFrame of large transactions
        """
        agents = self.create_agents()

        # Define tasks
        data_collection_task = Task(
            description=f"""
            Collect all transactions for the following wallets: {', '.join(wallets)}
            between {start_date.strftime('%Y-%m-%d')} and {end_date.strftime('%Y-%m-%d')}.

            Filter for transactions {'of ' + token_symbol if token_symbol else ''} with a
            {'token amount greater than ' + str(threshold_value) if threshold_type == 'Token Amount'
             else 'USD value greater than $' + str(threshold_value)}.

            Return the data in a well-structured format with timestamp, transaction hash,
            sender, recipient, token symbol, and amount.
            """,
            agent=agents["data_collector"],
            expected_output="""
            A comprehensive dataset of all large transactions for the specified wallets,
            properly filtered according to the threshold criteria.
            """
        )

        # Create and run the crew
        crew = Crew(
            agents=[agents["data_collector"]],
            tasks=[data_collection_task],
            verbose=2,
            process=Process.sequential
        )

        result = crew.kickoff()

        # Process the result
        import json
        import re
        try:
            # Try to extract JSON from the result
            json_match = re.search(r'```json\n([\s\S]*?)\n```', result)

            if json_match:
                json_str = json_match.group(1)
                transactions_data = json.loads(json_str)

                if isinstance(transactions_data, list):
                    return pd.DataFrame(transactions_data)
                else:
                    return pd.DataFrame()
            else:
                # Try to parse the entire result as JSON
                transactions_data = json.loads(result)

                if isinstance(transactions_data, list):
                    return pd.DataFrame(transactions_data)
````
|
| 251 |
+
else:
|
| 252 |
+
return pd.DataFrame()
|
| 253 |
+
except:
|
| 254 |
+
# Fallback to querying the API directly
|
| 255 |
+
token_address = None # Would need a mapping of symbol to address
|
| 256 |
+
|
| 257 |
+
transactions_df = self.arbiscan_client.fetch_whale_transactions(
|
| 258 |
+
addresses=wallets,
|
| 259 |
+
token_address=token_address,
|
| 260 |
+
min_token_amount=threshold_value if threshold_type == "Token Amount" else None,
|
| 261 |
+
min_usd_value=threshold_value if threshold_type == "USD Value" else None
|
| 262 |
+
)
|
| 263 |
+
|
| 264 |
+
return transactions_df
|
| 265 |
+
|
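Each analysis method below repeats the same extract-or-fallback parsing of the crew's raw output: look for a fenced json block first, otherwise try the whole string. A standalone sketch of that logic (the helper name `extract_json_payload` is illustrative, not part of this module):

```python
import json
import re
from typing import Any, Optional

FENCE = "`" * 3  # a literal triple backtick, built up to keep this example readable

def extract_json_payload(result: str) -> Optional[Any]:
    """Parse JSON from an LLM reply, preferring a fenced json block if present."""
    match = re.search(FENCE + r"json\n([\s\S]*?)\n" + FENCE, result)
    candidate = match.group(1) if match else result
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        return None

reply = FENCE + 'json\n[{"hash": "0xabc", "tokenAmount": 1200}]\n' + FENCE
print(extract_json_payload(reply))  # [{'hash': '0xabc', 'tokenAmount': 1200}]
```

Returning `None` on failure (instead of raising) lets callers fall through to the direct-API path, which is the shape each method here follows.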
    def identify_trading_patterns(self,
                                  wallets: List[str],
                                  start_date: datetime,
                                  end_date: datetime) -> List[Dict[str, Any]]:
        """
        Identify trading patterns for specified wallets

        Args:
            wallets: List of wallet addresses to analyze
            start_date: Start date for analysis
            end_date: End date for analysis

        Returns:
            List of identified patterns
        """
        agents = self.create_agents()

        # Define tasks
        data_collection_task = Task(
            description=f"""
            Collect all transactions for the following wallets: {', '.join(wallets)}
            between {start_date.strftime('%Y-%m-%d')} and {end_date.strftime('%Y-%m-%d')}.

            Include all token transfers, regardless of size.
            """,
            agent=agents["data_collector"],
            expected_output="""
            A comprehensive dataset of all transactions for the specified wallets.
            """
        )

        pattern_analysis_task = Task(
            description="""
            Analyze the transaction data to identify recurring trading patterns.
            Look for:
            1. Cyclical buying/selling behaviors
            2. Time-of-day patterns
            3. Accumulation/distribution phases
            4. Coordinated movements across multiple addresses

            Cluster similar behaviors and describe each pattern identified.
            """,
            agent=agents["pattern_detector"],
            expected_output="""
            A detailed analysis of trading patterns with:
            - Pattern name/type
            - Description of behavior
            - Frequency and confidence level
            - Example transactions showing the pattern
            """,
            context=[data_collection_task]
        )

        # Create and run the crew
        crew = Crew(
            agents=[agents["data_collector"], agents["pattern_detector"]],
            tasks=[data_collection_task, pattern_analysis_task],
            verbose=2,
            process=Process.sequential
        )

        result = crew.kickoff()

        # Process the result
        import json
        try:
            # Try to extract JSON from the result
            import re
            json_match = re.search(r'```json\n([\s\S]*?)\n```', result)

            if json_match:
                json_str = json_match.group(1)
                patterns_data = json.loads(json_str)

                # Convert the patterns to the expected format
                return self._convert_patterns_to_visual_format(patterns_data)
            else:
                # Fallback to a simple pattern analysis
                # First, get transaction data directly
                all_transactions = []

                for wallet in wallets:
                    transfers = self.arbiscan_client.fetch_all_token_transfers(
                        address=wallet
                    )
                    all_transactions.extend(transfers)

                if not all_transactions:
                    return []

                transactions_df = pd.DataFrame(all_transactions)

                # Use data processor to identify patterns
                patterns = self.data_processor.identify_patterns(transactions_df)

                return patterns
        except Exception as e:
            print(f"Error processing patterns: {str(e)}")
            return []

    def _convert_patterns_to_visual_format(self, patterns_data: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        """
        Convert pattern data from agents to visual format with charts

        Args:
            patterns_data: Pattern data from agents

        Returns:
            List of patterns with visualizations
        """
        visual_patterns = []

        for pattern in patterns_data:
            # Create chart
            if 'examples' in pattern and pattern['examples']:
                examples_data = []

                # Check if examples is a JSON string
                if isinstance(pattern['examples'], str):
                    try:
                        examples_data = pd.read_json(pattern['examples'])
                    except Exception:
                        examples_data = pd.DataFrame()
                else:
                    examples_data = pd.DataFrame(pattern['examples'])

                # Create visualization
                if not examples_data.empty:
                    import plotly.express as px

                    # Check for timestamp column
                    if 'Timestamp' in examples_data.columns:
                        time_col = 'Timestamp'
                    elif 'timeStamp' in examples_data.columns:
                        time_col = 'timeStamp'
                    else:
                        time_col = None

                    # Check for amount column
                    if 'Amount' in examples_data.columns:
                        amount_col = 'Amount'
                    elif 'tokenAmount' in examples_data.columns:
                        amount_col = 'tokenAmount'
                    elif 'value' in examples_data.columns:
                        amount_col = 'value'
                    else:
                        amount_col = None

                    if time_col and amount_col:
                        # Create time series chart
                        fig = px.line(
                            examples_data,
                            x=time_col,
                            y=amount_col,
                            title=f"Pattern: {pattern['name']}"
                        )
                    else:
                        fig = None
                else:
                    fig = None
            else:
                fig = None
                examples_data = pd.DataFrame()

            # Create visual pattern object
            visual_pattern = {
                "name": pattern.get("name", "Unknown Pattern"),
                "description": pattern.get("description", ""),
                "confidence": pattern.get("confidence", 0.5),
                "occurrence_count": pattern.get("occurrence_count", 0),
                "chart_data": fig,
                "examples": examples_data
            }

            visual_patterns.append(visual_pattern)

        return visual_patterns

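The column-resolution if/elif chains above (`Timestamp` vs `timeStamp`, `Amount` vs `tokenAmount` vs `value`) all implement the same "first matching column" idea. A small helper sketch, purely illustrative and not used by the module:

```python
from typing import Iterable, Optional

def first_present(columns: Iterable[str], candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate column name that exists, else None."""
    present = set(columns)
    return next((c for c in candidates if c in present), None)

# Mirrors the timestamp/amount column checks above:
print(first_present(["timeStamp", "value"], ["Timestamp", "timeStamp"]))  # timeStamp
print(first_present(["foo"], ["Amount", "tokenAmount", "value"]))         # None
```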
    def analyze_price_impact(self,
                             wallets: List[str],
                             start_date: datetime,
                             end_date: datetime,
                             lookback_minutes: int = 5,
                             lookahead_minutes: int = 5) -> Dict[str, Any]:
        """
        Analyze the impact of whale transactions on token prices

        Args:
            wallets: List of wallet addresses to analyze
            start_date: Start date for analysis
            end_date: End date for analysis
            lookback_minutes: Minutes to look back before transactions
            lookahead_minutes: Minutes to look ahead after transactions

        Returns:
            Dictionary with price impact analysis
        """
        agents = self.create_agents()

        # Define tasks
        data_collection_task = Task(
            description=f"""
            Collect all transactions for the following wallets: {', '.join(wallets)}
            between {start_date.strftime('%Y-%m-%d')} and {end_date.strftime('%Y-%m-%d')}.

            Focus on large transactions that might impact price.
            """,
            agent=agents["data_collector"],
            expected_output="""
            A comprehensive dataset of all significant transactions for the specified wallets.
            """
        )

        price_impact_task = Task(
            description=f"""
            Analyze the price impact of the whale transactions.
            For each transaction:
            1. Fetch price data for {lookback_minutes} minutes before and {lookahead_minutes} minutes after the transaction
            2. Calculate the percentage price change
            3. Identify transactions that caused significant price moves

            Summarize the overall price impact statistics and highlight notable instances.
            """,
            agent=agents["price_analyst"],
            expected_output="""
            A detailed analysis of price impacts with:
            - Average price impact percentage
            - Maximum price impact (positive and negative)
            - Count of significant price moves
            - List of transactions with their corresponding price impacts
            """,
            context=[data_collection_task]
        )

        # Create and run the crew
        crew = Crew(
            agents=[agents["data_collector"], agents["price_analyst"]],
            tasks=[data_collection_task, price_impact_task],
            verbose=2,
            process=Process.sequential
        )

        result = crew.kickoff()

        # Process the result
        import json
        try:
            # Try to extract JSON from the result
            import re
            json_match = re.search(r'```json\n([\s\S]*?)\n```', result)

            if json_match:
                json_str = json_match.group(1)
                impact_data = json.loads(json_str)

                # Convert the impact data to visual format
                return self._convert_impact_to_visual_format(impact_data)
            else:
                # Fallback to direct calculation
                # First, get transaction data
                all_transactions = []

                for wallet in wallets:
                    transfers = self.arbiscan_client.fetch_all_token_transfers(
                        address=wallet
                    )
                    all_transactions.extend(transfers)

                if not all_transactions:
                    return {}

                transactions_df = pd.DataFrame(all_transactions)

                # Calculate price impact for each transaction
                price_data = {}

                for idx, row in transactions_df.iterrows():
                    tx_hash = row.get('hash', '')

                    if not tx_hash:
                        continue

                    # Get symbol
                    symbol = row.get('tokenSymbol', '')
                    if not symbol:
                        continue

                    # Get timestamp
                    timestamp = row.get('timeStamp', 0)
                    if not timestamp:
                        continue

                    # Convert timestamp to datetime
                    if isinstance(timestamp, (int, float)):
                        tx_time = datetime.fromtimestamp(int(timestamp))
                    else:
                        tx_time = timestamp

                    # Get price impact
                    symbol_usd = f"{symbol}USD"
                    impact = self.gemini_client.get_price_impact(
                        symbol=symbol_usd,
                        transaction_time=tx_time,
                        lookback_minutes=lookback_minutes,
                        lookahead_minutes=lookahead_minutes
                    )

                    price_data[tx_hash] = impact

                # Use data processor to analyze price impact
                impact_analysis = self.data_processor.analyze_price_impact(
                    transactions_df=transactions_df,
                    price_data=price_data
                )

                return impact_analysis
        except Exception as e:
            print(f"Error processing price impact: {str(e)}")
            return {}

    def _convert_impact_to_visual_format(self, impact_data: Dict[str, Any]) -> Dict[str, Any]:
        """
        Convert price impact data to visual format with charts

        Args:
            impact_data: Price impact data

        Returns:
            Dictionary with price impact analysis and visualizations
        """
        # Convert transactions_with_impact to DataFrame if it's a string
        if 'transactions_with_impact' in impact_data and isinstance(impact_data['transactions_with_impact'], str):
            try:
                transactions_df = pd.read_json(impact_data['transactions_with_impact'])
            except Exception:
                transactions_df = pd.DataFrame()
        elif 'transactions_with_impact' in impact_data and isinstance(impact_data['transactions_with_impact'], list):
            transactions_df = pd.DataFrame(impact_data['transactions_with_impact'])
        else:
            transactions_df = pd.DataFrame()

        # Create impact chart
        if not transactions_df.empty and 'impact_pct' in transactions_df.columns and 'Timestamp' in transactions_df.columns:
            import plotly.graph_objects as go

            fig = go.Figure()

            fig.add_trace(go.Scatter(
                x=transactions_df['Timestamp'],
                y=transactions_df['impact_pct'],
                mode='markers+lines',
                name='Price Impact (%)',
                marker=dict(
                    size=10,
                    color=transactions_df['impact_pct'],
                    colorscale='RdBu',
                    cmin=-max(abs(transactions_df['impact_pct'])) if len(transactions_df) > 0 else -1,
                    cmax=max(abs(transactions_df['impact_pct'])) if len(transactions_df) > 0 else 1,
                    colorbar=dict(title='Impact %'),
                    symbol='circle'
                )
            ))

            fig.update_layout(
                title='Price Impact of Whale Transactions',
                xaxis_title='Timestamp',
                yaxis_title='Price Impact (%)',
                hovermode='closest'
            )

            # Add zero line
            fig.add_hline(y=0, line_dash="dash", line_color="gray")
        else:
            fig = None

        # Create visual impact analysis
        visual_impact = {
            'avg_impact_pct': impact_data.get('avg_impact_pct', 0),
            'max_impact_pct': impact_data.get('max_impact_pct', 0),
            'min_impact_pct': impact_data.get('min_impact_pct', 0),
            'significant_moves_count': impact_data.get('significant_moves_count', 0),
            'total_transactions': impact_data.get('total_transactions', 0),
            'impact_chart': fig,
            'transactions_with_impact': transactions_df
        }

        return visual_impact

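The fallback path above normalizes raw Arbiscan `timeStamp` fields and measures the percentage move around each transaction. A minimal standalone sketch of those two steps (helper names are illustrative; this sketch uses UTC, whereas the module itself uses local time via `datetime.fromtimestamp`):

```python
from datetime import datetime, timezone
from typing import Union

def to_tx_time(ts: Union[int, float, str, datetime]) -> datetime:
    """Normalize an epoch-seconds timeStamp (int/float/str) to a UTC datetime."""
    if isinstance(ts, datetime):
        return ts
    return datetime.fromtimestamp(int(ts), tz=timezone.utc)

def impact_pct(pre_price: float, post_price: float) -> float:
    """Percentage price change from just before to just after a transaction."""
    return (post_price - pre_price) / pre_price * 100.0

print(to_tx_time("1700000000").isoformat())
print(round(impact_pct(2.00, 2.05), 6))  # 2.5
```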
    def detect_manipulation(self,
                            wallets: List[str],
                            start_date: datetime,
                            end_date: datetime,
                            sensitivity: str = "Medium") -> List[Dict[str, Any]]:
        """
        Detect potential market manipulation by whale wallets

        Args:
            wallets: List of wallet addresses to analyze
            start_date: Start date for analysis
            end_date: End date for analysis
            sensitivity: Detection sensitivity ("Low", "Medium", "High")

        Returns:
            List of manipulation alerts
        """
        agents = self.create_agents()

        # Define tasks
        data_collection_task = Task(
            description=f"""
            Collect all transactions for the following wallets: {', '.join(wallets)}
            between {start_date.strftime('%Y-%m-%d')} and {end_date.strftime('%Y-%m-%d')}.

            Include all token transfers and also fetch price data if available.
            """,
            agent=agents["data_collector"],
            expected_output="""
            A comprehensive dataset of all transactions for the specified wallets.
            """
        )

        price_impact_task = Task(
            description="""
            Analyze the price impact of the whale transactions.
            For each significant transaction, fetch and analyze price data around the transaction time.
            """,
            agent=agents["price_analyst"],
            expected_output="""
            Price impact data for the transactions.
            """,
            context=[data_collection_task]
        )

        manipulation_detection_task = Task(
            description=f"""
            Detect potential market manipulation patterns in the transaction data with sensitivity level: {sensitivity}.
            Look for:
            1. Pump-and-Dump: Rapid buys followed by coordinated sell-offs
            2. Wash Trading: Self-trading across multiple addresses
            3. Spoofing: Large orders placed then canceled (if detectable)
            4. Momentum Ignition: Creating sharp price moves to trigger other participants' momentum-based trading

            For each potential manipulation, provide:
            - Type of manipulation
            - Involved addresses
            - Risk level (High, Medium, Low)
            - Description of the suspicious behavior
            - Evidence (transactions showing the pattern)
            """,
            agent=agents["manipulation_detector"],
            expected_output="""
            A detailed list of potential manipulation incidents with supporting evidence.
            """,
            context=[data_collection_task, price_impact_task]
        )

        # Create and run the crew
        crew = Crew(
            agents=[
                agents["data_collector"],
                agents["price_analyst"],
                agents["manipulation_detector"]
            ],
            tasks=[
                data_collection_task,
                price_impact_task,
                manipulation_detection_task
            ],
            verbose=2,
            process=Process.sequential
        )

        result = crew.kickoff()

        # Process the result
        import json
        try:
            # Try to extract JSON from the result
            import re
            json_match = re.search(r'```json\n([\s\S]*?)\n```', result)

            if json_match:
                json_str = json_match.group(1)
                alerts_data = json.loads(json_str)

                # Convert the alerts to visual format
                return self._convert_alerts_to_visual_format(alerts_data)
            else:
                # Fallback to direct detection
                # First, get transaction data
                all_transactions = []

                for wallet in wallets:
                    transfers = self.arbiscan_client.fetch_all_token_transfers(
                        address=wallet
                    )
                    all_transactions.extend(transfers)

                if not all_transactions:
                    return []

                transactions_df = pd.DataFrame(all_transactions)

                # Calculate price impact for each transaction
                price_data = {}

                for idx, row in transactions_df.iterrows():
                    tx_hash = row.get('hash', '')

                    if not tx_hash:
                        continue

                    # Get symbol
                    symbol = row.get('tokenSymbol', '')
                    if not symbol:
                        continue

                    # Get timestamp
                    timestamp = row.get('timeStamp', 0)
                    if not timestamp:
                        continue

                    # Convert timestamp to datetime
                    if isinstance(timestamp, (int, float)):
                        tx_time = datetime.fromtimestamp(int(timestamp))
                    else:
                        tx_time = timestamp

                    # Get price impact
                    symbol_usd = f"{symbol}USD"
                    impact = self.gemini_client.get_price_impact(
                        symbol=symbol_usd,
                        transaction_time=tx_time,
                        lookback_minutes=5,
                        lookahead_minutes=5
                    )

                    price_data[tx_hash] = impact

                # Detect wash trading
                wash_trading_alerts = self.data_processor.detect_wash_trading(
                    transactions_df=transactions_df,
                    addresses=wallets,
                    sensitivity=sensitivity
                )

                # Detect pump and dump
                pump_and_dump_alerts = self.data_processor.detect_pump_and_dump(
                    transactions_df=transactions_df,
                    price_data=price_data,
                    sensitivity=sensitivity
                )

                # Combine alerts
                all_alerts = wash_trading_alerts + pump_and_dump_alerts

                return all_alerts
        except Exception as e:
            print(f"Error detecting manipulation: {str(e)}")
            return []

    def _convert_alerts_to_visual_format(self, alerts_data: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        """
        Convert manipulation alerts data to visual format with charts

        Args:
            alerts_data: Alerts data from agents

        Returns:
            List of alerts with visualizations
        """
        visual_alerts = []

        for alert in alerts_data:
            # Create chart based on alert type
            if 'evidence' in alert and alert['evidence']:
                evidence_data = []

                # Check if evidence is a JSON string
                if isinstance(alert['evidence'], str):
                    try:
                        evidence_data = pd.read_json(alert['evidence'])
                    except Exception:
                        evidence_data = pd.DataFrame()
                else:
                    evidence_data = pd.DataFrame(alert['evidence'])

                # Create visualization based on alert type
                if not evidence_data.empty:
                    import plotly.graph_objects as go
                    import plotly.express as px

                    # Check for timestamp column
                    if 'Timestamp' in evidence_data.columns:
                        time_col = 'Timestamp'
                    elif 'timeStamp' in evidence_data.columns:
                        time_col = 'timeStamp'
                    elif 'timestamp' in evidence_data.columns:
                        time_col = 'timestamp'
                    else:
                        time_col = None

                    # Different visualizations based on alert type
                    if alert.get('type') == 'Wash Trading' and time_col:
                        # Create scatter plot of wash trading
                        fig = px.scatter(
                            evidence_data,
                            x=time_col,
                            y=evidence_data.get('Amount', evidence_data.get('tokenAmount', evidence_data.get('value', 0))),
                            color=evidence_data.get('From', evidence_data.get('from', 'Unknown')),
                            title=f"Wash Trading Evidence: {alert.get('title', '')}"
                        )
                    elif alert.get('type') == 'Pump and Dump' and time_col and 'pre_price' in evidence_data.columns:
                        # Create price line for pump and dump
                        fig = go.Figure()

                        # Plot price line
                        fig.add_trace(go.Scatter(
                            x=evidence_data[time_col],
                            y=evidence_data['pre_price'],
                            mode='lines+markers',
                            name='Price Before Transaction',
                            line=dict(color='blue')
                        ))

                        fig.add_trace(go.Scatter(
                            x=evidence_data[time_col],
                            y=evidence_data['post_price'],
                            mode='lines+markers',
                            name='Price After Transaction',
                            line=dict(color='red')
                        ))

                        fig.update_layout(
                            title=f"Pump and Dump Evidence: {alert.get('title', '')}",
                            xaxis_title='Time',
                            yaxis_title='Price',
                            hovermode='closest'
                        )
                    elif alert.get('type') == 'Momentum Ignition' and time_col and 'impact_pct' in evidence_data.columns:
                        # Create impact scatter for momentum ignition
                        fig = px.scatter(
                            evidence_data,
                            x=time_col,
                            y='impact_pct',
                            size=abs(evidence_data['impact_pct']),
                            color='impact_pct',
                            color_continuous_scale='RdBu',
                            title=f"Momentum Ignition Evidence: {alert.get('title', '')}"
                        )
                    else:
                        # Generic timeline view
                        if time_col:
                            fig = px.timeline(
                                evidence_data,
                                x_start=time_col,
                                x_end=time_col,
                                y=evidence_data.get('From', evidence_data.get('from', 'Unknown')),
                                color=alert.get('risk_level', 'Medium'),
                                title=f"Alert Evidence: {alert.get('title', '')}"
                            )
                        else:
                            fig = None
                else:
                    fig = None
            else:
                fig = None
                evidence_data = pd.DataFrame()

            # Create visual alert object
            visual_alert = {
                "type": alert.get("type", "Unknown"),
                "addresses": alert.get("addresses", []),
                "risk_level": alert.get("risk_level", "Medium"),
                "description": alert.get("description", ""),
                "detection_time": alert.get("detection_time", datetime.now().strftime("%Y-%m-%d %H:%M:%S")),
                "title": alert.get("title", "Alert"),
                "evidence": evidence_data,
                "chart": fig
            }

            visual_alerts.append(visual_alert)

        return visual_alerts

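The fallback path above delegates wash-trading detection to `data_processor.detect_wash_trading`, whose implementation lives in `modules/data_processor.py`. As a rough intuition only, a toy round-trip heuristic (field names, window, and the `round_trip_pairs` helper are all illustrative, not the module's actual algorithm):

```python
from itertools import combinations

def round_trip_pairs(transfers, window_s=3600):
    """Flag address pairs with opposite-direction transfers close in time
    (a crude wash-trading signal; the time window is an arbitrary example)."""
    addresses = {t["from"] for t in transfers} | {t["to"] for t in transfers}
    flagged = set()
    for a, b in combinations(addresses, 2):
        ab = [t for t in transfers if t["from"] == a and t["to"] == b]
        ba = [t for t in transfers if t["from"] == b and t["to"] == a]
        if any(abs(x["timeStamp"] - y["timeStamp"]) <= window_s for x in ab for y in ba):
            flagged.add(tuple(sorted((a, b))))
    return flagged

txs = [
    {"from": "0xA", "to": "0xB", "timeStamp": 1000},
    {"from": "0xB", "to": "0xA", "timeStamp": 1600},
    {"from": "0xA", "to": "0xC", "timeStamp": 2000},
]
print(round_trip_pairs(txs))  # {('0xA', '0xB')}
```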
    def generate_report(self,
                        wallets: List[str],
                        start_date: datetime,
                        end_date: datetime,
                        report_type: str = "Transaction Summary",
                        export_format: str = "PDF") -> Dict[str, Any]:
        """
        Generate a report of whale activity

        Args:
            wallets: List of wallet addresses to include in the report
            start_date: Start date for report period
            end_date: End date for report period
            report_type: Type of report to generate
            export_format: Format for the report (CSV, PDF, PNG)

        Returns:
            Dictionary with report data
        """
        from modules.visualizer import Visualizer
        visualizer = Visualizer()

        agents = self.create_agents()

        # Define tasks
        data_collection_task = Task(
            description=f"""
            Collect all transactions for the following wallets: {', '.join(wallets)}
            between {start_date.strftime('%Y-%m-%d')} and {end_date.strftime('%Y-%m-%d')}.
            """,
            agent=agents["data_collector"],
            expected_output="""
            A comprehensive dataset of all transactions for the specified wallets.
            """
        )

        report_task = Task(
            description=f"""
            Generate a {report_type} report in {export_format} format.
            The report should include:
            1. Executive summary of wallet activity
            2. Transaction analysis
            3. Pattern identification (if applicable)
            4. Price impact analysis (if applicable)
            5. Manipulation detection (if applicable)

            Organize the information clearly and provide actionable insights.
            """,
            agent=agents["report_generator"],
            expected_output=f"""
            A complete {export_format} report with all relevant analyses.
            """,
            context=[data_collection_task]
        )

        # Create and run the crew
        crew = Crew(
            agents=[agents["data_collector"], agents["report_generator"]],
            tasks=[data_collection_task, report_task],
            verbose=2,
            process=Process.sequential
        )

        result = crew.kickoff()

        # Process the result - for reports, we'll use our visualizer directly
        # First, get transaction data
        all_transactions = []

        for wallet in wallets:
            transfers = self.arbiscan_client.fetch_all_token_transfers(
                address=wallet
            )
            all_transactions.extend(transfers)

        if not all_transactions:
            return {
                "filename": f"no_data_report_{datetime.now().strftime('%Y%m%d_%H%M%S')}.{export_format.lower()}",
                "content": ""
            }

        transactions_df = pd.DataFrame(all_transactions)

        # Generate the report based on format
        filename = f"whale_report_{datetime.now().strftime('%Y%m%d_%H%M%S')}"

        if export_format == "CSV":
            content = visualizer.generate_csv_report(
                transactions_df=transactions_df,
                report_type=report_type
            )
            filename += ".csv"

            return {
                "filename": filename,
                "content": content
            }

        elif export_format == "PDF":
            # For PDF we need to get more data
            # Run pattern detection
            patterns = self.identify_trading_patterns(
                wallets=wallets,
                start_date=start_date,
                end_date=end_date
            )

            # Run price impact analysis
            price_impact = self.analyze_price_impact(
                wallets=wallets,
                start_date=start_date,
                end_date=end_date
            )

            # Run manipulation detection
            alerts = self.detect_manipulation(
                wallets=wallets,
                start_date=start_date,
                end_date=end_date
            )

            content = visualizer.generate_pdf_report(
                transactions_df=transactions_df,
                patterns=patterns,
                price_impact=price_impact,
                alerts=alerts,
                title=f"Whale Analysis Report: {report_type}",
                start_date=start_date,
                end_date=end_date
            )
            filename += ".pdf"

            return {
                "filename": filename,
                "content": content
            }

        elif export_format == "PNG":
            # For PNG we'll create a chart based on report type
            if report_type == "Transaction Summary":
                fig = visualizer.create_transaction_timeline(transactions_df)
            elif report_type == "Pattern Analysis":
                fig = visualizer.create_volume_chart(transactions_df)
            elif report_type == "Price Impact":
                # Run price impact analysis first
                price_impact = self.analyze_price_impact(
                    wallets=wallets,
                    start_date=start_date,
                    end_date=end_date
                )
                fig = price_impact.get('impact_chart', visualizer.create_transaction_timeline(transactions_df))
            else:  # "Manipulation Detection" or "Complete Analysis"
                fig = visualizer.create_network_graph(transactions_df)

            content = visualizer.generate_png_chart(fig)
            filename += ".png"

            return {
                "filename": filename,
                "content": content
            }

        else:
            return {
                "filename": f"unsupported_format_{datetime.now().strftime('%Y%m%d_%H%M%S')}.txt",
|
| 1116 |
+
"content": "Unsupported export format requested."
|
| 1117 |
+
}
|
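Every branch above returns the same `{"filename", "content"}` dict, so callers can persist any report the same way. A minimal sketch of that consumer side, assuming a text (CSV) payload; the `save_report` helper and the sample `report` dict are hypothetical, not part of the module:

```python
from datetime import datetime

def save_report(report: dict, mode: str = "w") -> str:
    """Persist a {'filename', 'content'} report dict as returned by
    the report generator. Binary formats (PDF/PNG bytes) would need
    mode='wb'; CSV content is plain text."""
    with open(report["filename"], mode) as f:
        f.write(report["content"])
    return report["filename"]

# Hypothetical CSV-style report payload:
report = {
    "filename": f"whale_report_{datetime.now().strftime('%Y%m%d_%H%M%S')}.csv",
    "content": "hash,from,to,value\n0xabc,0x1,0x2,100\n",
}
```

In a Streamlit/Gradio app the same dict could instead feed a download button directly, without touching disk.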
modules/crew_tools.py (ADDED, +362 lines)
"""
Properly implemented tools for the WhaleAnalysisCrewSystem
"""

import json
import pandas as pd
from datetime import datetime
from typing import Any, Dict, List, Optional, Type
from pydantic import BaseModel, Field
import logging

from modules.api_client import ArbiscanClient, GeminiClient
from modules.data_processor import DataProcessor
from langchain.tools import BaseTool


class GetTokenTransfersInput(BaseModel):
    """Input for the get_token_transfers tool."""
    address: str = Field(..., description="Wallet address to query")
    contract_address: Optional[str] = Field(None, description="Optional token contract address to filter by")


# Global clients that will be used by all tools
_GLOBAL_ARBISCAN_CLIENT = None
_GLOBAL_GEMINI_CLIENT = None
_GLOBAL_DATA_PROCESSOR = None

def set_global_clients(arbiscan_client=None, gemini_client=None, data_processor=None):
    """Set global client instances that will be used by all tools"""
    global _GLOBAL_ARBISCAN_CLIENT, _GLOBAL_GEMINI_CLIENT, _GLOBAL_DATA_PROCESSOR
    if arbiscan_client:
        _GLOBAL_ARBISCAN_CLIENT = arbiscan_client
    if gemini_client:
        _GLOBAL_GEMINI_CLIENT = gemini_client
    if data_processor:
        _GLOBAL_DATA_PROCESSOR = data_processor


class ArbiscanGetTokenTransfersTool(BaseTool):
    """Tool for fetching token transfers from Arbiscan."""
    name = "arbiscan_get_token_transfers"
    description = "Get ERC-20 token transfers for a specific address"
    args_schema: Type[BaseModel] = GetTokenTransfersInput

    def __init__(self, arbiscan_client=None):
        super().__init__()
        # Store reference to client if provided, otherwise use the global instance
        if arbiscan_client:
            set_global_clients(arbiscan_client=arbiscan_client)

    def _run(self, address: str, contract_address: Optional[str] = None) -> str:
        global _GLOBAL_ARBISCAN_CLIENT

        if not _GLOBAL_ARBISCAN_CLIENT:
            return json.dumps({"error": "Arbiscan client not initialized. Please set global client first."})

        try:
            transfers = _GLOBAL_ARBISCAN_CLIENT.get_token_transfers(
                address=address,
                contract_address=contract_address
            )
            return json.dumps(transfers)
        except Exception as e:
            logging.error(f"Error in ArbiscanGetTokenTransfersTool: {str(e)}")
            return json.dumps({"error": str(e)})


class GetNormalTransactionsInput(BaseModel):
    """Input for the get_normal_transactions tool."""
    address: str = Field(..., description="Wallet address to query")
    # Declared here so they match _run's signature; LangChain validates
    # tool arguments against this schema.
    startblock: int = Field(0, description="First block to search")
    endblock: int = Field(99999999, description="Last block to search")
    page: int = Field(1, description="Page number")
    offset: int = Field(10, description="Results per page")


class ArbiscanGetNormalTransactionsTool(BaseTool):
    """Tool for fetching normal transactions from Arbiscan."""
    name = "arbiscan_get_normal_transactions"
    description = "Get normal transactions (ETH/ARB transfers) for a specific address"
    args_schema: Type[BaseModel] = GetNormalTransactionsInput

    def __init__(self, arbiscan_client=None):
        super().__init__()
        # Store reference to client if provided, otherwise use the global instance
        if arbiscan_client:
            set_global_clients(arbiscan_client=arbiscan_client)

    def _run(self, address: str, startblock: int = 0, endblock: int = 99999999, page: int = 1, offset: int = 10) -> str:
        global _GLOBAL_ARBISCAN_CLIENT

        if not _GLOBAL_ARBISCAN_CLIENT:
            return json.dumps({"error": "Arbiscan client not initialized. Please set global client first."})

        try:
            txs = _GLOBAL_ARBISCAN_CLIENT.get_normal_transactions(
                address=address,
                start_block=startblock,
                end_block=endblock,
                page=page,
                offset=offset
            )
            return json.dumps(txs)
        except Exception as e:
            logging.error(f"Error in ArbiscanGetNormalTransactionsTool: {str(e)}")
            return json.dumps({"error": str(e)})


class GetInternalTransactionsInput(BaseModel):
    """Input for the get_internal_transactions tool."""
    address: str = Field(..., description="Wallet address to query")
    startblock: int = Field(0, description="First block to search")
    endblock: int = Field(99999999, description="Last block to search")
    page: int = Field(1, description="Page number")
    offset: int = Field(10, description="Results per page")


class ArbiscanGetInternalTransactionsTool(BaseTool):
    """Tool for fetching internal transactions from Arbiscan."""
    name = "arbiscan_get_internal_transactions"
    description = "Get internal transactions for a specific address"
    args_schema: Type[BaseModel] = GetInternalTransactionsInput

    def __init__(self, arbiscan_client=None):
        super().__init__()
        # Store reference to client if provided, otherwise use the global instance
        if arbiscan_client:
            set_global_clients(arbiscan_client=arbiscan_client)

    def _run(self, address: str, startblock: int = 0, endblock: int = 99999999, page: int = 1, offset: int = 10) -> str:
        global _GLOBAL_ARBISCAN_CLIENT

        if not _GLOBAL_ARBISCAN_CLIENT:
            return json.dumps({"error": "Arbiscan client not initialized. Please set global client first."})

        try:
            txs = _GLOBAL_ARBISCAN_CLIENT.get_internal_transactions(
                address=address,
                start_block=startblock,
                end_block=endblock,
                page=page,
                offset=offset
            )
            return json.dumps(txs)
        except Exception as e:
            logging.error(f"Error in ArbiscanGetInternalTransactionsTool: {str(e)}")
            return json.dumps({"error": str(e)})


class FetchWhaleTransactionsInput(BaseModel):
    """Input for the fetch_whale_transactions tool."""
    addresses: List[str] = Field(..., description="List of wallet addresses to query")
    token_address: Optional[str] = Field(None, description="Optional token contract address to filter by")
    min_token_amount: Optional[float] = Field(None, description="Minimum token amount")
    min_usd_value: Optional[float] = Field(None, description="Minimum USD value")


class ArbiscanFetchWhaleTransactionsTool(BaseTool):
    """Tool for fetching whale transactions from Arbiscan."""
    name = "arbiscan_fetch_whale_transactions"
    description = "Fetch whale transactions for a list of addresses"
    args_schema: Type[BaseModel] = FetchWhaleTransactionsInput

    def __init__(self, arbiscan_client=None):
        super().__init__()
        # Store reference to client if provided, otherwise use the global instance
        if arbiscan_client:
            set_global_clients(arbiscan_client=arbiscan_client)

    def _run(self, addresses: List[str], token_address: Optional[str] = None,
             min_token_amount: Optional[float] = None, min_usd_value: Optional[float] = None) -> str:
        global _GLOBAL_ARBISCAN_CLIENT

        if not _GLOBAL_ARBISCAN_CLIENT:
            return json.dumps({"error": "Arbiscan client not initialized. Please set global client first."})

        try:
            transactions_df = _GLOBAL_ARBISCAN_CLIENT.fetch_whale_transactions(
                addresses=addresses,
                token_address=token_address,
                min_token_amount=min_token_amount,
                min_usd_value=min_usd_value,
                max_pages=5  # Limit to 5 pages to prevent excessive API calls
            )
            return transactions_df.to_json(orient="records")
        except Exception as e:
            logging.error(f"Error in ArbiscanFetchWhaleTransactionsTool: {str(e)}")
            return json.dumps({"error": str(e)})


class GetCurrentPriceInput(BaseModel):
    """Input for the get_current_price tool."""
    symbol: str = Field(..., description="Token symbol (e.g., 'ETHUSD')")


class GeminiGetCurrentPriceTool(BaseTool):
    """Tool for getting current token price from Gemini."""
    name = "gemini_get_current_price"
    description = "Get the current price of a token"
    args_schema: Type[BaseModel] = GetCurrentPriceInput

    def __init__(self, gemini_client=None):
        super().__init__()
        # Store reference to client if provided, otherwise use the global instance
        if gemini_client:
            set_global_clients(gemini_client=gemini_client)

    def _run(self, symbol: str) -> str:
        global _GLOBAL_GEMINI_CLIENT

        if not _GLOBAL_GEMINI_CLIENT:
            return json.dumps({"error": "Gemini client not initialized. Please set global client first."})

        try:
            price = _GLOBAL_GEMINI_CLIENT.get_current_price(symbol)
            return json.dumps({"symbol": symbol, "price": price})
        except Exception as e:
            logging.error(f"Error in GeminiGetCurrentPriceTool: {str(e)}")
            return json.dumps({"error": str(e)})


class GetHistoricalPricesInput(BaseModel):
    """Input for the get_historical_prices tool."""
    symbol: str = Field(..., description="Token symbol (e.g., 'ETHUSD')")
    # Optional here to match _run, which tolerates missing bounds
    start_time: Optional[str] = Field(None, description="Start datetime in ISO format")
    end_time: Optional[str] = Field(None, description="End datetime in ISO format")
    interval: str = Field("15m", description="Candle interval (e.g., '15m')")


class GeminiGetHistoricalPricesTool(BaseTool):
    """Tool for getting historical token prices from Gemini."""
    name = "gemini_get_historical_prices"
    description = "Get historical prices for a token within a time range"
    args_schema: Type[BaseModel] = GetHistoricalPricesInput

    def __init__(self, gemini_client=None):
        super().__init__()
        # Store reference to client if provided, otherwise use the global instance
        if gemini_client:
            set_global_clients(gemini_client=gemini_client)

    def _run(
        self,
        symbol: str,
        start_time: Optional[str] = None,
        end_time: Optional[str] = None,
        interval: str = "15m"
    ) -> str:
        global _GLOBAL_GEMINI_CLIENT

        if not _GLOBAL_GEMINI_CLIENT:
            return json.dumps({"error": "Gemini client not initialized. Please set global client first."})

        try:
            # Convert string times to datetime if provided
            start_dt = None
            end_dt = None

            if start_time:
                start_dt = datetime.fromisoformat(start_time)
            if end_time:
                end_dt = datetime.fromisoformat(end_time)

            prices = _GLOBAL_GEMINI_CLIENT.get_historical_prices(
                symbol=symbol,
                start_time=start_dt,
                end_time=end_dt,
                interval=interval
            )

            return json.dumps(prices)
        except Exception as e:
            logging.error(f"Error in GeminiGetHistoricalPricesTool: {str(e)}")
            return json.dumps({"error": str(e)})


class IdentifyPatternsInput(BaseModel):
    """Input for the identify_patterns tool."""
    transactions_json: str = Field(..., description="JSON string of transactions")
    n_clusters: int = Field(3, description="Number of clusters for K-Means")


class DataProcessorIdentifyPatternsTool(BaseTool):
    """Tool for identifying trading patterns using the DataProcessor."""
    name = "data_processor_identify_patterns"
    description = "Identify trading patterns in a set of transactions"
    args_schema: Type[BaseModel] = IdentifyPatternsInput

    def __init__(self, data_processor=None):
        super().__init__()
        # Store reference to processor if provided, otherwise use the global instance
        if data_processor:
            set_global_clients(data_processor=data_processor)

    def _run(self, transactions_json: str, n_clusters: int = 3) -> str:
        global _GLOBAL_DATA_PROCESSOR

        if not _GLOBAL_DATA_PROCESSOR:
            return json.dumps({"error": "Data processor not initialized. Please set global processor first."})

        try:
            # Parse the JSON payload (the schema passes a string) and build a DataFrame
            transactions = json.loads(transactions_json) if isinstance(transactions_json, str) else transactions_json
            transactions_df = pd.DataFrame(transactions)

            # Ensure required columns exist
            required_columns = ['timeStamp', 'hash', 'from', 'to', 'value', 'tokenSymbol']
            for col in required_columns:
                if col not in transactions_df.columns:
                    return json.dumps({
                        "error": f"Missing required column: {col}",
                        "available_columns": list(transactions_df.columns)
                    })

            # Run pattern identification
            patterns = _GLOBAL_DATA_PROCESSOR.identify_patterns(
                transactions_df=transactions_df,
                n_clusters=n_clusters
            )

            return json.dumps(patterns)
        except Exception as e:
            logging.error(f"Error in DataProcessorIdentifyPatternsTool: {str(e)}")
            return json.dumps({"error": str(e)})


class DetectAnomalousTransactionsInput(BaseModel):
    """Input for the detect_anomalous_transactions tool."""
    transactions_json: str = Field(..., description="JSON string of transactions")
    sensitivity: str = Field("Medium", description="Detection sensitivity ('Low', 'Medium', 'High')")


class DataProcessorDetectAnomalousTransactionsTool(BaseTool):
    """Tool for detecting anomalous transactions using the DataProcessor."""
    name = "data_processor_detect_anomalies"
    description = "Detect anomalous transactions in a dataset"
    args_schema: Type[BaseModel] = DetectAnomalousTransactionsInput

    def __init__(self, data_processor=None):
        super().__init__()
        # Store reference to processor if provided, otherwise use the global instance
        if data_processor:
            set_global_clients(data_processor=data_processor)

    def _run(self, transactions_json: str, sensitivity: str = "Medium") -> str:
        global _GLOBAL_DATA_PROCESSOR

        if not _GLOBAL_DATA_PROCESSOR:
            return json.dumps({"error": "Data processor not initialized. Please set global processor first."})

        try:
            # Parse the JSON payload (the schema passes a string) and build a DataFrame
            transactions = json.loads(transactions_json) if isinstance(transactions_json, str) else transactions_json
            transactions_df = pd.DataFrame(transactions)

            # Ensure required columns exist
            required_columns = ['timeStamp', 'hash', 'from', 'to', 'value', 'tokenSymbol']
            for col in required_columns:
                if col not in transactions_df.columns:
                    return json.dumps({
                        "error": f"Missing required column: {col}",
                        "available_columns": list(transactions_df.columns)
                    })

            # Run anomaly detection
            anomalies = _GLOBAL_DATA_PROCESSOR.detect_anomalous_transactions(
                transactions_df=transactions_df,
                sensitivity=sensitivity
            )

            return json.dumps(anomalies)
        except Exception as e:
            logging.error(f"Error in DataProcessorDetectAnomalousTransactionsTool: {str(e)}")
            return json.dumps({"error": str(e)})
modules/data_processor.py (ADDED, +1425 lines; first 105 shown)
| 1 |
+
import pandas as pd
|
| 2 |
+
import numpy as np
|
| 3 |
+
from datetime import datetime, timedelta
|
| 4 |
+
from typing import Dict, List, Optional, Union, Any, Tuple
|
| 5 |
+
from sklearn.cluster import KMeans, DBSCAN
|
| 6 |
+
from sklearn.preprocessing import StandardScaler
|
| 7 |
+
import plotly.graph_objects as go
|
| 8 |
+
import plotly.express as px
|
| 9 |
+
import logging
|
| 10 |
+
import time
|
| 11 |
+
|
| 12 |
+
class DataProcessor:
|
| 13 |
+
"""
|
| 14 |
+
Process and analyze transaction data from blockchain APIs
|
| 15 |
+
"""
|
| 16 |
+
|
| 17 |
+
def __init__(self):
|
| 18 |
+
pass
|
| 19 |
+
|
| 20 |
+
    def aggregate_transactions(self,
                               transactions_df: pd.DataFrame,
                               time_window: str = 'D') -> pd.DataFrame:
        """
        Aggregate transactions by time window.

        Args:
            transactions_df: DataFrame of transactions
            time_window: Time window for aggregation (e.g., 'D' for day, 'H' for hour)

        Returns:
            Aggregated DataFrame with transaction counts and volumes
        """
        if transactions_df.empty:
            return pd.DataFrame()

        # Ensure a timestamp column exists
        if 'Timestamp' in transactions_df.columns:
            timestamp_col = 'Timestamp'
        elif 'timeStamp' in transactions_df.columns:
            timestamp_col = 'timeStamp'
        else:
            raise ValueError("Timestamp column not found in transactions DataFrame")

        # Ensure an amount column exists
        if 'Amount' in transactions_df.columns:
            amount_col = 'Amount'
        elif 'tokenAmount' in transactions_df.columns:
            amount_col = 'tokenAmount'
        elif 'value' in transactions_df.columns:
            # Adjust raw values for token decimals if 'tokenDecimal' exists
            if 'tokenDecimal' in transactions_df.columns:
                transactions_df['adjustedValue'] = transactions_df['value'].astype(float) / (10 ** transactions_df['tokenDecimal'].astype(int))
                amount_col = 'adjustedValue'
            else:
                amount_col = 'value'
        else:
            raise ValueError("Amount column not found in transactions DataFrame")

        # Resample by time window
        transactions_df = transactions_df.copy()
        try:
            transactions_df.set_index(pd.DatetimeIndex(transactions_df[timestamp_col]), inplace=True)
        except Exception as e:
            print(f"Error setting DatetimeIndex: {str(e)}")
            # Fall back to a synthetic hourly index
            transactions_df['safe_timestamp'] = pd.date_range(
                start='2025-01-01',
                periods=len(transactions_df),
                freq='H'
            )
            transactions_df.set_index('safe_timestamp', inplace=True)

        # Identify buy vs sell transactions based on 'from' and 'to' addresses
        if 'From' in transactions_df.columns and 'To' in transactions_df.columns:
            from_col, to_col = 'From', 'To'
        elif 'from' in transactions_df.columns and 'to' in transactions_df.columns:
            from_col, to_col = 'from', 'to'
        else:
            # If direction cannot be determined, aggregate total volume only
            agg_df = transactions_df.resample(time_window).agg({
                amount_col: 'sum',
                timestamp_col: 'count'
            })
            agg_df.columns = ['Volume', 'Count']
            return agg_df.reset_index()

        # Calculate net flow for each wallet address (positive = inflow, negative = outflow)
        wallet_addresses = set(transactions_df[from_col].unique()) | set(transactions_df[to_col].unique())

        results = []
        for wallet in wallet_addresses:
            wallet_df = transactions_df.copy()

            # Mark transactions as inflow or outflow
            wallet_df['Direction'] = 'Unknown'
            wallet_df.loc[wallet_df[to_col] == wallet, 'Direction'] = 'In'
            wallet_df.loc[wallet_df[from_col] == wallet, 'Direction'] = 'Out'

            # Calculate net flow
            wallet_df['NetFlow'] = wallet_df[amount_col]
            wallet_df.loc[wallet_df['Direction'] == 'Out', 'NetFlow'] = -wallet_df.loc[wallet_df['Direction'] == 'Out', amount_col]

            # Aggregate by time window
            wallet_agg = wallet_df.resample(time_window).agg({
                'NetFlow': 'sum',
                timestamp_col: 'count'
            })
            wallet_agg.columns = ['NetFlow', 'Count']
            wallet_agg['Wallet'] = wallet

            results.append(wallet_agg.reset_index())

        if not results:
            return pd.DataFrame()

        combined_df = pd.concat(results, ignore_index=True)
        return combined_df
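The time-window aggregation above boils down to a pandas `resample` over a `DatetimeIndex`. A minimal standalone sketch of the same idea (column names mirror the Arbiscan token-transfer schema used in this module; the values are invented):

```python
import pandas as pd

# Hypothetical mini transfer log mirroring the Arbiscan token-transfer schema.
df = pd.DataFrame({
    "timeStamp": pd.to_datetime([1700000000, 1700003600, 1700090000], unit="s"),
    "value": ["1000000000000000000", "2000000000000000000", "500000000000000000"],
    "tokenDecimal": [18, 18, 18],
})

# Adjust raw integer values for token decimals, as aggregate_transactions does.
df["adjustedValue"] = df["value"].astype(float) / (10 ** df["tokenDecimal"].astype(int))

# Resample to daily buckets: total volume and transaction count per day.
agg = df.set_index("timeStamp").resample("D").agg(
    Volume=("adjustedValue", "sum"),
    Count=("adjustedValue", "count"),
).reset_index()
print(agg)
```

The first two transfers fall on the same UTC day, so they collapse into one row with `Volume == 3.0` and `Count == 2`; the third lands in the next daily bucket.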
    # Cache for pattern identification to avoid repeating expensive calculations
    _pattern_cache = {}

    def identify_patterns(self,
                          transactions_df: pd.DataFrame,
                          n_clusters: int = 3) -> List[Dict[str, Any]]:
        """
        Identify trading patterns using clustering algorithms.

        Args:
            transactions_df: DataFrame of transactions
            n_clusters: Number of clusters to identify

        Returns:
            List of pattern dictionaries containing name, description, and confidence
        """
        # Check for empty data early to avoid unnecessary processing
        if transactions_df.empty:
            return []

        # Build a cache key from the column set, row count, and cluster count
        try:
            cache_key = f"{hash(tuple(transactions_df.columns))}_{len(transactions_df)}_{n_clusters}"
            if cache_key in self._pattern_cache:
                return self._pattern_cache[cache_key]
        except Exception:
            # If hashing fails, proceed without caching
            cache_key = None

        try:
            # Use a reference instead of a deep copy to reduce memory usage
            df = transactions_df

            # Find the timestamp column
            timestamp_cols = ['Timestamp', 'timeStamp']
            timestamp_col = next((col for col in timestamp_cols if col in df.columns), None)

            if timestamp_col:
                # Convert the timestamp only if needed
                if not pd.api.types.is_datetime64_any_dtype(df[timestamp_col]):
                    try:
                        if df[timestamp_col].dtype == 'object':
                            df[timestamp_col] = pd.to_datetime(df[timestamp_col], errors='coerce')
                        else:
                            df[timestamp_col] = pd.to_datetime(df[timestamp_col], unit='s', errors='coerce')
                    except Exception:
                        # Fall back to a synthetic hourly range
                        df['dummy_timestamp'] = pd.date_range(start='2025-01-01', periods=len(df), freq='H')
                        timestamp_col = 'dummy_timestamp'
            else:
                # No timestamp column at all: create a synthetic one
                df['dummy_timestamp'] = pd.date_range(start='2025-01-01', periods=len(df), freq='H')
                timestamp_col = 'dummy_timestamp'

            # Floor timestamps to the hour (vectorized)
            df['hour'] = df[timestamp_col].dt.floor('H')

            # Find the address columns
            if 'From' in df.columns and 'To' in df.columns:
                from_col, to_col = 'From', 'To'
            elif 'from' in df.columns and 'to' in df.columns:
                from_col, to_col = 'from', 'to'
            else:
                # Create dummy addresses only if necessary
                df['from'] = [f'0x{i:040x}' for i in range(len(df))]
                df['to'] = [f'0x{(i + 1):040x}' for i in range(len(df))]
                from_col, to_col = 'from', 'to'

            # Determine the amount column; prefer pre-adjusted columns,
            # then adjust raw 'value' for decimals if possible
            amount_cols = ['Amount', 'tokenAmount', 'adjustedValue']
            amount_col = next((col for col in amount_cols if col in df.columns), None)

            if not amount_col:
                if 'value' in df.columns and 'tokenDecimal' in df.columns:
                    # Vectorized decimal adjustment
                    try:
                        df['value_numeric'] = pd.to_numeric(df['value'], errors='coerce')
                        df['tokenDecimal_numeric'] = pd.to_numeric(df['tokenDecimal'], errors='coerce').fillna(18)
                        df['adjustedValue'] = df['value_numeric'] / (10 ** df['tokenDecimal_numeric'])
                        amount_col = 'adjustedValue'
                    except Exception as e:
                        logging.warning(f"Error converting values: {e}")
                        df['dummy_amount'] = 1.0
                        amount_col = 'dummy_amount'
                elif 'value' in df.columns:
                    amount_col = 'value'
                else:
                    # Fall back to dummy values
                    df['dummy_amount'] = 1.0
                    amount_col = 'dummy_amount'

            # Make sure the amount column is numeric
            try:
                if amount_col in df.columns:
                    df[f"{amount_col}_numeric"] = pd.to_numeric(df[amount_col], errors='coerce').fillna(0)
                    amount_col = f"{amount_col}_numeric"
            except Exception:
                # If conversion fails, fall back to a dummy numeric column
                df['safe_amount'] = 1.0
                amount_col = 'safe_amount'

            # Aggregate transaction counts per hour
            agg_df = df.groupby('hour').agg(
                Count=pd.NamedAgg(column=from_col, aggfunc='count'),
            ).reset_index()

            # NetFlow needs a second pass over each hourly group
            def calc_netflow(group):
                # Use the first row's addresses as the reference wallet for the group
                first_to = group[to_col].iloc[0] if len(group) > 0 else None
                first_from = group[from_col].iloc[0] if len(group) > 0 else None

                if first_to is not None and first_from is not None:
                    try:
                        # Coerce to numeric; errors become NaN
                        total_in = pd.to_numeric(group.loc[group[to_col] == first_to, amount_col], errors='coerce').sum()
                        total_out = pd.to_numeric(group.loc[group[from_col] == first_from, amount_col], errors='coerce').sum()
                        # Replace NaN with 0 to avoid propagation
                        if pd.isna(total_in):
                            total_in = 0.0
                        if pd.isna(total_out):
                            total_out = 0.0
                        return float(total_in) - float(total_out)
                    except Exception as e:
                        logging.debug(f"Error converting values to numeric: {e}")
                        return 0.0
                return 0.0

            # Calculate NetFlow per hour without an explicit loop
            netflows = df.groupby('hour').apply(calc_netflow)
            agg_df['NetFlow'] = netflows.values

            # Early return if there is not enough data for clustering
            if agg_df.empty or len(agg_df) < n_clusters:
                return []

            # Don't ask for more clusters than the data supports
            actual_n_clusters = min(n_clusters, max(2, len(agg_df) // 2))

            # Prepare clustering features with careful type handling
            try:
                if 'NetFlow' in agg_df.columns:
                    agg_df['NetFlow'] = pd.to_numeric(agg_df['NetFlow'], errors='coerce').fillna(0)
                    features = agg_df[['NetFlow', 'Count']].copy()
                    primary_metric = 'NetFlow'
                else:
                    # Compute Volume if it is missing
                    if 'Volume' not in agg_df.columns and amount_col in df.columns:
                        volume_by_hour = pd.to_numeric(df[amount_col], errors='coerce').fillna(0).groupby(df['hour']).sum()
                        agg_df['Volume'] = agg_df['hour'].map(volume_by_hour)

                    # Ensure Volume exists and is numeric
                    if 'Volume' not in agg_df.columns:
                        agg_df['Volume'] = 1.0  # Default if the calculation failed
                    else:
                        agg_df['Volume'] = pd.to_numeric(agg_df['Volume'], errors='coerce').fillna(1.0)

                    # Ensure Count is numeric
                    agg_df['Count'] = pd.to_numeric(agg_df['Count'], errors='coerce').fillna(1.0)

                    features = agg_df[['Volume', 'Count']].copy()
                    primary_metric = 'Volume'

                # Final pass to guarantee numeric features
                for col in features.columns:
                    features[col] = pd.to_numeric(features[col], errors='coerce').fillna(0)
            except Exception as e:
                logging.warning(f"Error preparing clustering features: {e}")
                # Safe dummy features if everything else fails
                agg_df['SafeFeature'] = 1.0
                agg_df['Count'] = 1.0
                features = agg_df[['SafeFeature', 'Count']].copy()
                primary_metric = 'SafeFeature'

            # Scale features; import lazily to keep module import cheap
            from sklearn.preprocessing import StandardScaler
            scaler = StandardScaler()
            scaled_features = scaler.fit_transform(features)

            # K-Means with a reduced iteration budget
            from sklearn.cluster import KMeans
            kmeans = KMeans(n_clusters=actual_n_clusters, random_state=42, n_init=10, max_iter=100)
            agg_df['Cluster'] = kmeans.fit_predict(scaled_features)

            # Time-of-day / day-of-week metrics from the hour column
            if 'hour' in agg_df.columns:
                try:
                    hour_series = pd.to_datetime(agg_df['hour'])
                    agg_df['Hour'] = hour_series.dt.hour
                    agg_df['Day'] = hour_series.dt.dayofweek
                except Exception:
                    # Fallback for non-convertible data
                    agg_df['Hour'] = 0
                    agg_df['Day'] = 0
            else:
                agg_df['Hour'] = 0
                agg_df['Day'] = 0

            # Build one pattern per cluster
            patterns = []
            for i in range(actual_n_clusters):
                # Boolean indexing is faster than repeated filtering
                cluster_mask = agg_df['Cluster'] == i
                cluster_df = agg_df[cluster_mask]

                if len(cluster_df) == 0:
                    continue

                if primary_metric == 'NetFlow':
                    avg_flow = cluster_df['NetFlow'].mean()
                    flow_std = cluster_df['NetFlow'].std()
                    behavior = "Accumulation" if avg_flow > 0 else "Distribution"
                    volume_metric = f"Net Flow: {avg_flow:.2f} ± {flow_std:.2f}"
                    pattern_metrics = {
                        "avg_flow": avg_flow,
                        "flow_std": flow_std,
                        "avg_count": cluster_df['Count'].mean(),
                        "max_flow": cluster_df['NetFlow'].max(),
                        "min_flow": cluster_df['NetFlow'].min(),
                        "common_hour": cluster_df['Hour'].mode()[0] if not cluster_df['Hour'].empty else None,
                        "common_day": cluster_df['Day'].mode()[0] if not cluster_df['Day'].empty else None
                    }
                else:
                    avg_volume = cluster_df['Volume'].mean() if 'Volume' in cluster_df else 0
                    volume_std = cluster_df['Volume'].std() if 'Volume' in cluster_df else 0
                    behavior = "High Volume" if 'Volume' in agg_df and avg_volume > agg_df['Volume'].mean() else "Low Volume"
                    volume_metric = f"Volume: {avg_volume:.2f} ± {volume_std:.2f}"
                    # Volume-based metrics (NetFlow stats are not meaningful here)
                    pattern_metrics = {
                        "avg_volume": avg_volume,
                        "volume_std": volume_std,
                        "avg_count": cluster_df['Count'].mean(),
                        "common_hour": cluster_df['Hour'].mode()[0] if not cluster_df['Hour'].empty else None,
                        "common_day": cluster_df['Day'].mode()[0] if not cluster_df['Day'].empty else None
                    }

                # Confidence: within-cluster variance as a share of total variance,
                # clamped to [0.4, 0.95]
                cluster_variance = cluster_df[primary_metric].var()
                total_variance = agg_df[primary_metric].var() or 1  # Avoid division by zero
                confidence = max(0.4, min(0.95, 1 - (cluster_variance / total_variance)))

                # Main pattern chart
                if primary_metric == 'NetFlow':
                    y_col, y_label = 'NetFlow', 'Net Token Flow'
                else:
                    y_col, y_label = 'Volume', 'Transaction Volume'

                main_fig = px.scatter(cluster_df, x=cluster_df.index, y=y_col,
                                      size='Count', color='Cluster',
                                      title=f"Pattern {i + 1}: {behavior}",
                                      labels={y_col: y_label, 'index': 'Time'},
                                      color_discrete_sequence=['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728', '#9467bd'])

                # Rolling-mean trend line
                main_fig.add_trace(go.Scatter(
                    x=cluster_df.index,
                    y=cluster_df[y_col].rolling(window=3, min_periods=1).mean(),
                    mode='lines',
                    name='Trend',
                    line=dict(width=2, dash='dash', color='rgba(0,0,0,0.5)')
                ))

                if primary_metric == 'NetFlow':
                    # Zero reference line separates accumulation from distribution
                    main_fig.add_shape(
                        type="line",
                        x0=cluster_df.index.min(), y0=0,
                        x1=cluster_df.index.max(), y1=0,
                        line=dict(color="red", width=1, dash="dot"),
                    )

                main_fig.update_layout(
                    template="plotly_white",
                    legend=dict(orientation="h", yanchor="bottom", y=1.02, xanchor="right", x=1),
                    margin=dict(l=20, r=20, t=50, b=20),
                    height=400
                )

                # Hourly distribution chart
                hour_counts = cluster_df.groupby('Hour')['Count'].sum().reindex(range(24), fill_value=0)
                hour_fig = px.bar(x=hour_counts.index, y=hour_counts.values,
                                  title="Hourly Distribution",
                                  labels={'x': 'Hour of Day', 'y': 'Transaction Count'},
                                  color_discrete_sequence=['#1f77b4'])
                hour_fig.update_layout(template="plotly_white", height=300)

                # Volume/flow distribution chart
                if primary_metric == 'NetFlow':
                    hist_data, hist_title, hist_label = cluster_df['NetFlow'], "Net Flow Distribution", "Net Flow"
                else:
                    hist_data, hist_title, hist_label = cluster_df['Volume'], "Volume Distribution", "Volume"

                dist_fig = px.histogram(hist_data,
                                        title=hist_title,
                                        labels={'value': hist_label, 'count': 'Frequency'},
                                        color_discrete_sequence=['#2ca02c'])
                dist_fig.update_layout(template="plotly_white", height=300)

                # Find example transactions near this cluster's time buckets
                if not transactions_df.empty:
                    # Use the hourly buckets themselves (the index is a plain
                    # RangeIndex after reset_index, so it cannot be converted)
                    cluster_times = pd.to_datetime(cluster_df['hour'])
                    # One-hour window around each cluster timestamp
                    time_windows = [(t - pd.Timedelta(hours=1), t + pd.Timedelta(hours=1)) for t in cluster_times]

                    # Find transactions within these time windows
                    pattern_txs = transactions_df[transactions_df[timestamp_col].apply(
                        lambda x: any((start <= x <= end) for start, end in time_windows)
                    )].copy()

                    # Cap at 10 examples
                    if len(pattern_txs) > 10:
                        pattern_txs = pattern_txs.sample(10)

                    # If too few matched, sample from all transactions instead
                    if len(pattern_txs) < 5 and len(transactions_df) >= 5:
                        pattern_txs = transactions_df.sample(min(5, len(transactions_df)))
                else:
                    pattern_txs = pd.DataFrame()

                # Assemble the pattern dictionary
                pattern = {
                    "name": behavior,
                    "description": f"This pattern shows {behavior.lower()} activity.",
                    "strategy": "Unknown",
                    "risk_profile": "Unknown",
                    "time_insight": "Unknown",
                    "cluster_id": i,
                    "metrics": pattern_metrics,
                    "occurrence_count": len(cluster_df),
                    "volume_metric": volume_metric,
                    "confidence": confidence,
                    "impact": 0.0,
                    "charts": {
                        "main": main_fig,
                        "hourly_distribution": hour_fig,
                        "value_distribution": dist_fig
                    },
                    "examples": pattern_txs
                }

                patterns.append(pattern)

            # Cache results for reuse
            if cache_key:
                self._pattern_cache[cache_key] = patterns

            return patterns

        except Exception as e:
            logging.warning(f"Error during pattern identification: {str(e)}")
            return []
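The per-cluster confidence score used in `identify_patterns` is one minus the ratio of within-cluster variance to total variance, clamped to [0.4, 0.95]. A standalone sketch of that formula (the sample values are invented; `ddof=1` matches pandas' `Series.var()` default):

```python
import numpy as np

def cluster_confidence(cluster_values, all_values):
    # Sample variance (ddof=1), matching pandas' Series.var() default.
    cluster_var = float(np.var(cluster_values, ddof=1))
    total_var = float(np.var(all_values, ddof=1)) or 1.0  # avoid division by zero
    # 1 - (within-cluster variance / total variance), clamped to [0.4, 0.95].
    return max(0.4, min(0.95, 1 - cluster_var / total_var))

# A tight cluster inside a widely spread population scores near the upper clamp.
population = [-50.0, -40.0, 1.0, 2.0, 3.0, 45.0, 55.0]
tight_cluster = [1.0, 2.0, 3.0]
print(cluster_confidence(tight_cluster, population))  # 0.95 (upper clamp)
```

A cluster that spans the extremes of the population has higher variance than the population itself, so the raw score goes negative and the lower clamp of 0.4 takes over.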
    def detect_anomalous_transactions(self,
                                      transactions_df: pd.DataFrame,
                                      sensitivity: str = "Medium") -> pd.DataFrame:
        """
        Detect anomalous transactions using statistical methods.

        Args:
            transactions_df: DataFrame of transactions
            sensitivity: Detection sensitivity ("Low", "Medium", "High")

        Returns:
            DataFrame of anomalous transactions
        """
        if transactions_df.empty:
            return pd.DataFrame()

        # Ensure an amount column exists
        if 'Amount' in transactions_df.columns:
            amount_col = 'Amount'
        elif 'tokenAmount' in transactions_df.columns:
            amount_col = 'tokenAmount'
        elif 'value' in transactions_df.columns:
            # Adjust raw values for token decimals if 'tokenDecimal' exists
            if 'tokenDecimal' in transactions_df.columns:
                transactions_df['adjustedValue'] = transactions_df['value'].astype(float) / (10 ** transactions_df['tokenDecimal'].astype(int))
                amount_col = 'adjustedValue'
            else:
                amount_col = 'value'
        else:
            raise ValueError("Amount column not found in transactions DataFrame")

        # Map sensitivity to a z-score threshold: higher sensitivity flags more outliers
        if sensitivity == "Low":
            z_threshold = 3.0  # Outliers beyond 3 standard deviations
        elif sensitivity == "Medium":
            z_threshold = 2.5  # Outliers beyond 2.5 standard deviations
        else:  # High
            z_threshold = 2.0  # Outliers beyond 2 standard deviations

        # Calculate the z-score of each amount
        mean_amount = transactions_df[amount_col].mean()
        std_amount = transactions_df[amount_col].std()

        if std_amount == 0:
            return pd.DataFrame()

        transactions_df['z_score'] = abs((transactions_df[amount_col] - mean_amount) / std_amount)

        # Flag anomalous transactions
        anomalies = transactions_df[transactions_df['z_score'] > z_threshold].copy()

        # Assign a risk level based on how extreme the z-score is
        anomalies['risk_level'] = 'Medium'
        anomalies.loc[anomalies['z_score'] > z_threshold * 1.5, 'risk_level'] = 'High'
        anomalies.loc[anomalies['z_score'] <= z_threshold * 1.2, 'risk_level'] = 'Low'

        return anomalies
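The anomaly detector above is a plain z-score filter on transfer amounts. A self-contained sketch with an invented amount series (the 2.0 threshold corresponds to the "High" sensitivity setting):

```python
import pandas as pd

# Hypothetical transfer amounts with one obvious outlier.
amounts = pd.Series([10.0, 12.0, 11.0, 9.0, 10.5, 11.5, 500.0])

# Absolute z-score of each amount (pandas .std() uses sample std, ddof=1).
z_scores = ((amounts - amounts.mean()) / amounts.std()).abs()

# "High" sensitivity: flag anything beyond 2 standard deviations.
anomalies = amounts[z_scores > 2.0]
print(anomalies.tolist())  # only the 500.0 transfer is flagged
```

Note that a single huge outlier inflates the standard deviation, which pulls its own z-score down; that is why even extreme values here only score around 2.3, and why the method returns early when the standard deviation is zero.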
    def analyze_price_impact(self,
                             transactions_df: pd.DataFrame,
                             price_data: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
        """
        Analyze the price impact of transactions with enhanced visualizations.

        Args:
            transactions_df: DataFrame of transactions
            price_data: Dictionary of price impact data for each transaction

        Returns:
            Dictionary with comprehensive price impact analysis and visualizations
        """
        if transactions_df.empty or not price_data:
            # Create an empty chart for the default case
            empty_fig = go.Figure()
            empty_fig.update_layout(
                title="No Price Impact Data Available",
                xaxis_title="Time",
                yaxis_title="Price Impact (%)",
                height=400,
                template="plotly_white"
            )
            empty_fig.add_annotation(
                text="No transactions found with price impact data",
                showarrow=False,
                font=dict(size=14)
            )

            return {
                'avg_impact_pct': 0,
                'max_impact_pct': 0,
                'min_impact_pct': 0,
                'significant_moves_count': 0,
                'total_transactions': 0,
                'charts': {
                    'main_chart': empty_fig,
                    'impact_distribution': empty_fig,
                    'cumulative_impact': empty_fig,
                    'hourly_impact': empty_fig
                },
                'transactions_with_impact': pd.DataFrame(),
                'insights': [],
                'impact_summary': "No price impact data available"
            }

        # Ensure the timestamp column is datetime
        if 'Timestamp' in transactions_df.columns:
            timestamp_col = 'Timestamp'
        elif 'timeStamp' in transactions_df.columns:
            timestamp_col = 'timeStamp'
            # Convert the timestamp to datetime if it is not already
            if not pd.api.types.is_datetime64_any_dtype(transactions_df[timestamp_col]):
                transactions_df[timestamp_col] = pd.to_datetime(transactions_df[timestamp_col], unit='s')
        else:
            raise ValueError("Timestamp column not found in transactions DataFrame")

        # Join price impact data onto the transactions
        impact_data = []

        for idx, row in transactions_df.iterrows():
            tx_hash = row.get('Transaction Hash', row.get('hash', None))
            if not tx_hash or tx_hash not in price_data:
                continue

            tx_impact = price_data[tx_hash]

            if tx_impact['impact_pct'] is None:
                continue

            # Get the token symbol and decimal-adjusted amount if available
            token_symbol = row.get('tokenSymbol', 'Unknown')
            token_amount = row.get('value', 0)
            if 'tokenDecimal' in row:
                try:
                    token_amount = float(token_amount) / (10 ** int(row.get('tokenDecimal', 0)))
                except (ValueError, TypeError):
                    token_amount = 0

            impact_data.append({
                'transaction_hash': tx_hash,
                'timestamp': row[timestamp_col],
                'pre_price': tx_impact['pre_price'],
                'post_price': tx_impact['post_price'],
                'impact_pct': tx_impact['impact_pct'],
                'token_symbol': token_symbol,
                'token_amount': token_amount,
                'from': row.get('from', ''),
                'to': row.get('to', ''),
                'hour': row[timestamp_col].hour if isinstance(row[timestamp_col], pd.Timestamp) else 0
            })

        if not impact_data:
            # Same empty-chart fallback as above
            empty_fig = go.Figure()
            empty_fig.update_layout(
                title="No Price Impact Data Available",
                xaxis_title="Time",
                yaxis_title="Price Impact (%)",
                height=400,
                template="plotly_white"
            )
            empty_fig.add_annotation(
                text="No transactions found with price impact data",
                showarrow=False,
                font=dict(size=14)
            )

            return {
                'avg_impact_pct': 0,
                'max_impact_pct': 0,
                'min_impact_pct': 0,
                'significant_moves_count': 0,
                'total_transactions': len(transactions_df) if not transactions_df.empty else 0,
                'charts': {
                    'main_chart': empty_fig,
                    'impact_distribution': empty_fig,
                    'cumulative_impact': empty_fig,
                    'hourly_impact': empty_fig
                },
                'transactions_with_impact': pd.DataFrame(),
                'insights': [],
                'impact_summary': "No price impact data available"
            }

        impact_df = pd.DataFrame(impact_data)

        # Aggregate impact metrics
        avg_impact = impact_df['impact_pct'].mean()
        max_impact = impact_df['impact_pct'].max()
        min_impact = impact_df['impact_pct'].min()
        median_impact = impact_df['impact_pct'].median()
        std_impact = impact_df['impact_pct'].std()

        # Count significant (>1%) and high-impact (>3%) moves
        significant_threshold = 1.0
        high_impact_threshold = 3.0
        significant_moves = len(impact_df[abs(impact_df['impact_pct']) > significant_threshold])
        high_impact_moves = len(impact_df[abs(impact_df['impact_pct']) > high_impact_threshold])
        positive_impacts = len(impact_df[impact_df['impact_pct'] > 0])
        negative_impacts = len(impact_df[impact_df['impact_pct'] < 0])

        # Cumulative impact over time
        impact_df = impact_df.sort_values('timestamp')
        impact_df['cumulative_impact'] = impact_df['impact_pct'].cumsum()

        # Generate insights
        insights = []

        # Market direction bias
        if avg_impact > 0.5:
            insights.append({
                "title": "Positive Price Pressure",
                "description": f"Transactions show an overall positive price impact of {avg_impact:.2f}%, suggesting accumulation or market strength."
            })
        elif avg_impact < -0.5:
            insights.append({
                "title": "Negative Price Pressure",
                "description": f"Transactions show an overall negative price impact of {avg_impact:.2f}%, suggesting distribution or market weakness."
|
| 837 |
+
})
|
| 838 |
+
|
| 839 |
+
# Volatility analysis
|
| 840 |
+
if std_impact > 2.0:
|
| 841 |
+
insights.append({
|
| 842 |
+
"title": "High Market Volatility",
|
| 843 |
+
"description": f"Price impact shows high volatility (std: {std_impact:.2f}%), indicating potential market manipulation or whipsaw conditions."
|
| 844 |
+
})
|
| 845 |
+
|
| 846 |
+
# Significant impacts
|
| 847 |
+
if high_impact_moves > 0:
|
| 848 |
+
insights.append({
|
| 849 |
+
"title": "High Impact Transactions",
|
| 850 |
+
"description": f"Detected {high_impact_moves} high-impact transactions (>{high_impact_threshold}% price change), indicating potential market-moving activity."
|
| 851 |
+
})
|
| 852 |
+
|
| 853 |
+
# Temporal patterns
|
| 854 |
+
hourly_impact = impact_df.groupby('hour')['impact_pct'].mean()
|
| 855 |
+
if len(hourly_impact) > 0:
|
| 856 |
+
max_hour = hourly_impact.abs().idxmax()
|
| 857 |
+
max_hour_impact = hourly_impact[max_hour]
|
| 858 |
+
insights.append({
|
| 859 |
+
"title": "Time-Based Pattern",
|
| 860 |
+
"description": f"Highest price impact occurs around {max_hour}:00 with an average of {max_hour_impact:.2f}%."
|
| 861 |
+
})
|
| 862 |
+
|
| 863 |
+
# Create impact summary text
|
| 864 |
+
impact_summary = f"Analysis of {len(impact_df)} price-impacting transactions shows an average impact of {avg_impact:.2f}% "
|
| 865 |
+
impact_summary += f"(range: {min_impact:.2f}% to {max_impact:.2f}%). "
|
| 866 |
+
impact_summary += f"Found {significant_moves} significant price moves and {high_impact_moves} high-impact transactions. "
|
| 867 |
+
if positive_impacts > negative_impacts:
|
| 868 |
+
impact_summary += f"There is a bias towards positive price impact ({positive_impacts} positive vs {negative_impacts} negative)."
|
| 869 |
+
elif negative_impacts > positive_impacts:
|
| 870 |
+
impact_summary += f"There is a bias towards negative price impact ({negative_impacts} negative vs {positive_impacts} positive)."
|
| 871 |
+
else:
|
| 872 |
+
impact_summary += "The price impact is balanced between positive and negative moves."
|
| 873 |
+
|
| 874 |
+
# Create enhanced main visualization
|
| 875 |
+
main_fig = go.Figure()
|
| 876 |
+
|
| 877 |
+
# Add scatter plot for impact
|
| 878 |
+
main_fig.add_trace(go.Scatter(
|
| 879 |
+
x=impact_df['timestamp'],
|
| 880 |
+
y=impact_df['impact_pct'],
|
| 881 |
+
mode='markers+lines',
|
| 882 |
+
marker=dict(
|
| 883 |
+
size=impact_df['impact_pct'].abs() * 1.5 + 5,
|
| 884 |
+
color=impact_df['impact_pct'],
|
| 885 |
+
colorscale='RdBu_r',
|
| 886 |
+
line=dict(width=1),
|
| 887 |
+
symbol=['circle' if val >= 0 else 'diamond' for val in impact_df['impact_pct']]
|
| 888 |
+
),
|
| 889 |
+
text=[
|
| 890 |
+
f"TX: {tx[:8]}...{tx[-6:]}<br>" +
|
| 891 |
+
f"Impact: {impact:.2f}%<br>" +
|
| 892 |
+
f"Token: {token} ({amount:.4f})<br>" +
|
| 893 |
+
f"From: {src[:6]}...{src[-4:]}<br>" +
|
| 894 |
+
f"To: {dst[:6]}...{dst[-4:]}"
|
| 895 |
+
for tx, impact, token, amount, src, dst in zip(
|
| 896 |
+
impact_df['transaction_hash'],
|
| 897 |
+
impact_df['impact_pct'],
|
| 898 |
+
impact_df['token_symbol'],
|
| 899 |
+
impact_df['token_amount'],
|
| 900 |
+
impact_df['from'],
|
| 901 |
+
impact_df['to']
|
| 902 |
+
)
|
| 903 |
+
],
|
| 904 |
+
hovertemplate='%{text}<br>Time: %{x}<extra></extra>',
|
| 905 |
+
name='Price Impact'
|
| 906 |
+
))
|
| 907 |
+
|
| 908 |
+
# Add a moving average trendline
|
| 909 |
+
window_size = max(3, len(impact_df) // 10) # Dynamic window size
|
| 910 |
+
if len(impact_df) >= window_size:
|
| 911 |
+
impact_df['ma'] = impact_df['impact_pct'].rolling(window=window_size, min_periods=1).mean()
|
| 912 |
+
main_fig.add_trace(go.Scatter(
|
| 913 |
+
x=impact_df['timestamp'],
|
| 914 |
+
y=impact_df['ma'],
|
| 915 |
+
mode='lines',
|
| 916 |
+
line=dict(width=2, color='rgba(255,165,0,0.7)'),
|
| 917 |
+
name=f'Moving Avg ({window_size} period)'
|
| 918 |
+
))
|
| 919 |
+
|
| 920 |
+
# Add a zero line for reference
|
| 921 |
+
main_fig.add_shape(
|
| 922 |
+
type='line',
|
| 923 |
+
x0=impact_df['timestamp'].min(),
|
| 924 |
+
y0=0,
|
| 925 |
+
x1=impact_df['timestamp'].max(),
|
| 926 |
+
y1=0,
|
| 927 |
+
line=dict(color='gray', width=1, dash='dash')
|
| 928 |
+
)
|
| 929 |
+
|
| 930 |
+
# Add colored regions for significant impact
|
| 931 |
+
|
| 932 |
+
# Add green band for normal price movement
|
| 933 |
+
main_fig.add_shape(
|
| 934 |
+
type='rect',
|
| 935 |
+
x0=impact_df['timestamp'].min(),
|
| 936 |
+
y0=-significant_threshold,
|
| 937 |
+
x1=impact_df['timestamp'].max(),
|
| 938 |
+
y1=significant_threshold,
|
| 939 |
+
fillcolor='rgba(0,255,0,0.1)',
|
| 940 |
+
line=dict(width=0),
|
| 941 |
+
layer='below'
|
| 942 |
+
)
|
| 943 |
+
|
| 944 |
+
# Add warning bands for higher impact movements
|
| 945 |
+
main_fig.add_shape(
|
| 946 |
+
type='rect',
|
| 947 |
+
x0=impact_df['timestamp'].min(),
|
| 948 |
+
y0=significant_threshold,
|
| 949 |
+
x1=impact_df['timestamp'].max(),
|
| 950 |
+
y1=high_impact_threshold,
|
| 951 |
+
fillcolor='rgba(255,255,0,0.1)',
|
| 952 |
+
line=dict(width=0),
|
| 953 |
+
layer='below'
|
| 954 |
+
)
|
| 955 |
+
|
| 956 |
+
main_fig.add_shape(
|
| 957 |
+
type='rect',
|
| 958 |
+
x0=impact_df['timestamp'].min(),
|
| 959 |
+
y0=-high_impact_threshold,
|
| 960 |
+
x1=impact_df['timestamp'].max(),
|
| 961 |
+
y1=-significant_threshold,
|
| 962 |
+
fillcolor='rgba(255,255,0,0.1)',
|
| 963 |
+
line=dict(width=0),
|
| 964 |
+
layer='below'
|
| 965 |
+
)
|
| 966 |
+
|
| 967 |
+
# Add high impact regions
|
| 968 |
+
main_fig.add_shape(
|
| 969 |
+
type='rect',
|
| 970 |
+
x0=impact_df['timestamp'].min(),
|
| 971 |
+
y0=high_impact_threshold,
|
| 972 |
+
x1=impact_df['timestamp'].max(),
|
| 973 |
+
y1=max(high_impact_threshold * 2, max_impact * 1.1),
|
| 974 |
+
fillcolor='rgba(255,0,0,0.1)',
|
| 975 |
+
line=dict(width=0),
|
| 976 |
+
layer='below'
|
| 977 |
+
)
|
| 978 |
+
|
| 979 |
+
main_fig.add_shape(
|
| 980 |
+
type='rect',
|
| 981 |
+
x0=impact_df['timestamp'].min(),
|
| 982 |
+
y0=min(high_impact_threshold * -2, min_impact * 1.1),
|
| 983 |
+
x1=impact_df['timestamp'].max(),
|
| 984 |
+
y1=-high_impact_threshold,
|
| 985 |
+
fillcolor='rgba(255,0,0,0.1)',
|
| 986 |
+
line=dict(width=0),
|
| 987 |
+
layer='below'
|
| 988 |
+
)
|
| 989 |
+
|
| 990 |
+
main_fig.update_layout(
|
| 991 |
+
title='Price Impact of Whale Transactions',
|
| 992 |
+
xaxis_title='Timestamp',
|
| 993 |
+
yaxis_title='Price Impact (%)',
|
| 994 |
+
hovermode='closest',
|
| 995 |
+
template="plotly_white",
|
| 996 |
+
legend=dict(orientation="h", yanchor="bottom", y=1.02, xanchor="right", x=1),
|
| 997 |
+
margin=dict(l=20, r=20, t=50, b=20)
|
| 998 |
+
)
|
| 999 |
+
|
| 1000 |
+
# Create impact distribution histogram
|
| 1001 |
+
dist_fig = px.histogram(
|
| 1002 |
+
impact_df['impact_pct'],
|
| 1003 |
+
nbins=20,
|
| 1004 |
+
labels={'value': 'Price Impact (%)', 'count': 'Frequency'},
|
| 1005 |
+
title='Distribution of Price Impact',
|
| 1006 |
+
color_discrete_sequence=['#3366CC']
|
| 1007 |
+
)
|
| 1008 |
+
|
| 1009 |
+
# Add a vertical line at the mean
|
| 1010 |
+
dist_fig.add_vline(x=avg_impact, line_dash="dash", line_color="red")
|
| 1011 |
+
dist_fig.add_annotation(x=avg_impact, y=0.85, yref="paper", text=f"Mean: {avg_impact:.2f}%",
|
| 1012 |
+
showarrow=True, arrowhead=2, arrowcolor="red", ax=40)
|
| 1013 |
+
|
| 1014 |
+
# Add a vertical line at zero
|
| 1015 |
+
dist_fig.add_vline(x=0, line_dash="solid", line_color="black")
|
| 1016 |
+
|
| 1017 |
+
dist_fig.update_layout(
|
| 1018 |
+
template="plotly_white",
|
| 1019 |
+
bargap=0.1,
|
| 1020 |
+
height=350
|
| 1021 |
+
)
|
| 1022 |
+
|
| 1023 |
+
# Create cumulative impact chart
|
| 1024 |
+
cumul_fig = go.Figure()
|
| 1025 |
+
cumul_fig.add_trace(go.Scatter(
|
| 1026 |
+
x=impact_df['timestamp'],
|
| 1027 |
+
y=impact_df['cumulative_impact'],
|
| 1028 |
+
mode='lines',
|
| 1029 |
+
fill='tozeroy',
|
| 1030 |
+
line=dict(width=2, color='#2ca02c'),
|
| 1031 |
+
name='Cumulative Impact'
|
| 1032 |
+
))
|
| 1033 |
+
|
| 1034 |
+
cumul_fig.update_layout(
|
| 1035 |
+
title='Cumulative Price Impact Over Time',
|
| 1036 |
+
xaxis_title='Timestamp',
|
| 1037 |
+
yaxis_title='Cumulative Price Impact (%)',
|
| 1038 |
+
template="plotly_white",
|
| 1039 |
+
height=350
|
| 1040 |
+
)
|
| 1041 |
+
|
| 1042 |
+
# Create hourly impact analysis
|
| 1043 |
+
hourly_impact = impact_df.groupby('hour')['impact_pct'].agg(['mean', 'count', 'std']).reset_index()
|
| 1044 |
+
hourly_impact = hourly_impact.sort_values('hour')
|
| 1045 |
+
|
| 1046 |
+
hour_fig = go.Figure()
|
| 1047 |
+
hour_fig.add_trace(go.Bar(
|
| 1048 |
+
x=hourly_impact['hour'],
|
| 1049 |
+
y=hourly_impact['mean'],
|
| 1050 |
+
error_y=dict(type='data', array=hourly_impact['std'], visible=True),
|
| 1051 |
+
marker_color=hourly_impact['mean'].apply(lambda x: 'green' if x > 0 else 'red'),
|
| 1052 |
+
name='Average Impact'
|
| 1053 |
+
))
|
| 1054 |
+
|
| 1055 |
+
hour_fig.update_layout(
|
| 1056 |
+
title='Price Impact by Hour of Day',
|
| 1057 |
+
xaxis_title='Hour of Day',
|
| 1058 |
+
yaxis_title='Average Price Impact (%)',
|
| 1059 |
+
template="plotly_white",
|
| 1060 |
+
height=350,
|
| 1061 |
+
xaxis=dict(tickmode='linear', tick0=0, dtick=2)
|
| 1062 |
+
)
|
| 1063 |
+
|
| 1064 |
+
# Join with original transactions
|
| 1065 |
+
transactions_df = transactions_df.copy()
|
| 1066 |
+
transactions_df['Timestamp_key'] = transactions_df[timestamp_col]
|
| 1067 |
+
impact_df['Timestamp_key'] = impact_df['timestamp']
|
| 1068 |
+
|
| 1069 |
+
merged_df = pd.merge(
|
| 1070 |
+
transactions_df,
|
| 1071 |
+
impact_df[['Timestamp_key', 'impact_pct', 'pre_price', 'post_price', 'cumulative_impact']],
|
| 1072 |
+
on='Timestamp_key',
|
| 1073 |
+
how='left'
|
| 1074 |
+
)
|
| 1075 |
+
|
| 1076 |
+
# Final result with enhanced output
|
| 1077 |
+
return {
|
| 1078 |
+
'avg_impact_pct': avg_impact,
|
| 1079 |
+
'max_impact_pct': max_impact,
|
| 1080 |
+
'min_impact_pct': min_impact,
|
| 1081 |
+
'median_impact_pct': median_impact,
|
| 1082 |
+
'std_impact_pct': std_impact,
|
| 1083 |
+
'significant_moves_count': significant_moves,
|
| 1084 |
+
'high_impact_moves_count': high_impact_moves,
|
| 1085 |
+
'positive_impacts_count': positive_impacts,
|
| 1086 |
+
'negative_impacts_count': negative_impacts,
|
| 1087 |
+
'total_transactions': len(transactions_df),
|
| 1088 |
+
'charts': {
|
| 1089 |
+
'main_chart': main_fig,
|
| 1090 |
+
'impact_distribution': dist_fig,
|
| 1091 |
+
'cumulative_impact': cumul_fig,
|
| 1092 |
+
'hourly_impact': hour_fig
|
| 1093 |
+
},
|
| 1094 |
+
'transactions_with_impact': merged_df,
|
| 1095 |
+
'insights': insights,
|
| 1096 |
+
'impact_summary': impact_summary
|
| 1097 |
+
}
|
| 1098 |
+
|
| 1099 |
+
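The aggregation step above (significant-move counts plus a running cumulative impact) reduces to a few pandas one-liners. A minimal, self-contained sketch with made-up impact values — the column name mirrors `impact_pct` from the method, but the data is illustrative only:

```python
import pandas as pd

# Hypothetical mini-dataset standing in for impact_df; values are made up.
impact_df = pd.DataFrame({"impact_pct": [0.4, -2.5, 1.2, 3.5, -0.1]})

significant_threshold = 1.0   # same thresholds as the method above
high_impact_threshold = 3.0

# Count moves whose absolute impact exceeds each threshold.
significant_moves = int((impact_df["impact_pct"].abs() > significant_threshold).sum())
high_impact_moves = int((impact_df["impact_pct"].abs() > high_impact_threshold).sum())

# Running sum of impact; the last element is the net cumulative impact.
cumulative = impact_df["impact_pct"].cumsum().iloc[-1]

print(significant_moves, high_impact_moves, cumulative)  # 3 1 2.5
```

With these toy values, three moves exceed 1% in magnitude, one exceeds 3%, and the net cumulative impact is +2.5%.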
    def detect_wash_trading(self,
                            transactions_df: pd.DataFrame,
                            addresses: List[str],
                            time_window_minutes: int = 60,
                            sensitivity: str = "Medium") -> List[Dict[str, Any]]:
        """
        Detect potential wash trading between addresses

        Args:
            transactions_df: DataFrame of transactions
            addresses: List of addresses to analyze
            time_window_minutes: Time window for detecting wash trades
            sensitivity: Detection sensitivity ("Low", "Medium", "High")

        Returns:
            List of potential wash trading incidents
        """
        if transactions_df.empty or not addresses:
            return []

        # Ensure from/to columns exist
        if 'From' in transactions_df.columns and 'To' in transactions_df.columns:
            from_col, to_col = 'From', 'To'
        elif 'from' in transactions_df.columns and 'to' in transactions_df.columns:
            from_col, to_col = 'from', 'to'
        else:
            raise ValueError("From/To columns not found in transactions DataFrame")

        # Ensure timestamp column exists
        if 'Timestamp' in transactions_df.columns:
            timestamp_col = 'Timestamp'
        elif 'timeStamp' in transactions_df.columns:
            timestamp_col = 'timeStamp'
        else:
            raise ValueError("Timestamp column not found in transactions DataFrame")

        # Ensure timestamp is datetime
        if not pd.api.types.is_datetime64_any_dtype(transactions_df[timestamp_col]):
            transactions_df[timestamp_col] = pd.to_datetime(transactions_df[timestamp_col])

        # Define sensitivity thresholds
        if sensitivity == "Low":
            min_cycles = 3       # Minimum number of back-and-forth transactions
            max_time_diff = 120  # Maximum minutes between transactions
        elif sensitivity == "Medium":
            min_cycles = 2
            max_time_diff = 60
        else:  # High
            min_cycles = 1
            max_time_diff = 30

        # Filter transactions involving the addresses
        address_txs = transactions_df[
            (transactions_df[from_col].isin(addresses)) |
            (transactions_df[to_col].isin(addresses))
        ].copy()

        if address_txs.empty:
            return []

        # Sort by timestamp
        address_txs = address_txs.sort_values(by=timestamp_col)

        # Detect cycles of transactions between same addresses
        wash_trades = []

        for addr1 in addresses:
            for addr2 in addresses:
                if addr1 == addr2:
                    continue

                # Find transactions from addr1 to addr2
                a1_to_a2 = address_txs[
                    (address_txs[from_col] == addr1) &
                    (address_txs[to_col] == addr2)
                ]

                # Find transactions from addr2 to addr1
                a2_to_a1 = address_txs[
                    (address_txs[from_col] == addr2) &
                    (address_txs[to_col] == addr1)
                ]

                if a1_to_a2.empty or a2_to_a1.empty:
                    continue

                # Check for back-and-forth patterns
                cycles = 0
                evidence = []

                for _, tx1 in a1_to_a2.iterrows():
                    tx1_time = tx1[timestamp_col]

                    # Find return transactions within the time window
                    return_txs = a2_to_a1[
                        (a2_to_a1[timestamp_col] > tx1_time) &
                        (a2_to_a1[timestamp_col] <= tx1_time + pd.Timedelta(minutes=max_time_diff))
                    ]

                    if not return_txs.empty:
                        cycles += 1
                        evidence.append(tx1)
                        evidence.append(return_txs.iloc[0])

                if cycles >= min_cycles:
                    # Create visualization
                    if evidence:
                        evidence_df = pd.DataFrame(evidence)
                        fig = px.scatter(
                            evidence_df,
                            x=timestamp_col,
                            y=evidence_df.get('Amount', evidence_df.get('tokenAmount', evidence_df.get('value', 0))),
                            color=from_col,
                            title=f"Potential Wash Trading Between {addr1[:8]}... and {addr2[:8]}..."
                        )
                    else:
                        fig = None

                    wash_trades.append({
                        "type": "Wash Trading",
                        "addresses": [addr1, addr2],
                        "risk_level": "High" if cycles >= min_cycles * 2 else "Medium",
                        "description": f"Detected {cycles} cycles of back-and-forth transactions between addresses",
                        "detection_time": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
                        "title": f"Wash Trading Pattern ({cycles} cycles)",
                        "evidence": pd.DataFrame(evidence) if evidence else None,
                        "chart": fig
                    })

        return wash_trades

    def detect_pump_and_dump(self,
                             transactions_df: pd.DataFrame,
                             price_data: Dict[str, Dict[str, Any]],
                             sensitivity: str = "Medium") -> List[Dict[str, Any]]:
        """
        Detect potential pump and dump schemes

        Args:
            transactions_df: DataFrame of transactions
            price_data: Dictionary of price impact data for each transaction
            sensitivity: Detection sensitivity ("Low", "Medium", "High")

        Returns:
            List of potential pump and dump incidents
        """
        if transactions_df.empty or not price_data:
            return []

        # Ensure timestamp column exists
        if 'Timestamp' in transactions_df.columns:
            timestamp_col = 'Timestamp'
        elif 'timeStamp' in transactions_df.columns:
            timestamp_col = 'timeStamp'
        else:
            raise ValueError("Timestamp column not found in transactions DataFrame")

        # Ensure from/to columns exist
        if 'From' in transactions_df.columns and 'To' in transactions_df.columns:
            from_col, to_col = 'From', 'To'
        elif 'from' in transactions_df.columns and 'to' in transactions_df.columns:
            from_col, to_col = 'from', 'to'
        else:
            raise ValueError("From/To columns not found in transactions DataFrame")

        # Ensure timestamp is datetime
        if not pd.api.types.is_datetime64_any_dtype(transactions_df[timestamp_col]):
            transactions_df[timestamp_col] = pd.to_datetime(transactions_df[timestamp_col])

        # Define sensitivity thresholds
        if sensitivity == "Low":
            accumulation_threshold = 5  # Number of buys to consider accumulation
            pump_threshold = 10.0       # % price increase to trigger pump
            dump_threshold = -8.0       # % price decrease to trigger dump
        elif sensitivity == "Medium":
            accumulation_threshold = 3
            pump_threshold = 7.0
            dump_threshold = -5.0
        else:  # High
            accumulation_threshold = 2
            pump_threshold = 5.0
            dump_threshold = -3.0

        # Combine price impact data with transactions
        txs_with_impact = []

        for idx, row in transactions_df.iterrows():
            tx_hash = row.get('Transaction Hash', row.get('hash', None))
            if not tx_hash or tx_hash not in price_data:
                continue

            tx_impact = price_data[tx_hash]

            if tx_impact['impact_pct'] is None:
                continue

            txs_with_impact.append({
                'transaction_hash': tx_hash,
                'timestamp': row[timestamp_col],
                'from': row[from_col],
                'to': row[to_col],
                'pre_price': tx_impact['pre_price'],
                'post_price': tx_impact['post_price'],
                'impact_pct': tx_impact['impact_pct']
            })

        if not txs_with_impact:
            return []

        impact_df = pd.DataFrame(txs_with_impact)
        impact_df = impact_df.sort_values(by='timestamp')

        # Look for accumulation phases followed by price pumps and then dumps
        pump_and_dumps = []

        # Group by address to analyze per wallet
        address_groups = {}

        for from_addr in impact_df['from'].unique():
            address_groups[from_addr] = impact_df[impact_df['from'] == from_addr]

        for to_addr in impact_df['to'].unique():
            if to_addr in address_groups:
                address_groups[to_addr] = pd.concat([
                    address_groups[to_addr],
                    impact_df[impact_df['to'] == to_addr]
                ])
            else:
                address_groups[to_addr] = impact_df[impact_df['to'] == to_addr]

        for address, addr_df in address_groups.items():
            # Skip if not enough transactions
            if len(addr_df) < accumulation_threshold + 2:
                continue

            # Look for continuous price increase followed by sharp drop
            window_size = min(len(addr_df), 10)
            for i in range(len(addr_df) - window_size + 1):
                window = addr_df.iloc[i:i+window_size]

                # Get cumulative price change in window
                if len(window) >= 2:
                    first_price = window.iloc[0]['pre_price']
                    last_price = window.iloc[-1]['post_price']

                    if first_price is None or last_price is None:
                        continue

                    cumulative_change = ((last_price - first_price) / first_price) * 100

                    # Check for pump phase
                    max_price = window['post_price'].max()
                    max_idx = window['post_price'].idxmax()

                    if max_idx < len(window) - 1:
                        max_to_end = ((window.iloc[-1]['post_price'] - max_price) / max_price) * 100

                        # If we have a pump followed by a dump
                        if (cumulative_change > pump_threshold or
                                any(window['impact_pct'] > pump_threshold)) and max_to_end < dump_threshold:

                            # Create chart
                            fig = go.Figure()

                            # Plot price line
                            times = [t.timestamp() for t in window['timestamp']]
                            prices = []
                            for _, row in window.iterrows():
                                prices.append(row['pre_price'])
                                prices.append(row['post_price'])

                            times_expanded = []
                            for t in times:
                                times_expanded.append(t - 60)  # 1 min before
                                times_expanded.append(t + 60)  # 1 min after

                            fig.add_trace(go.Scatter(
                                x=times_expanded,
                                y=prices,
                                mode='lines+markers',
                                name='Price',
                                line=dict(color='blue')
                            ))

                            # Highlight pump and dump phases
                            max_time_idx = window.index.get_loc(max_idx)
                            pump_x = times_expanded[:max_time_idx*2+2]
                            pump_y = prices[:max_time_idx*2+2]

                            dump_x = times_expanded[max_time_idx*2:]
                            dump_y = prices[max_time_idx*2:]

                            fig.add_trace(go.Scatter(
                                x=pump_x,
                                y=pump_y,
                                mode='lines',
                                line=dict(color='green', width=3),
                                name='Pump Phase'
                            ))

                            fig.add_trace(go.Scatter(
                                x=dump_x,
                                y=dump_y,
                                mode='lines',
                                line=dict(color='red', width=3),
                                name='Dump Phase'
                            ))

                            fig.update_layout(
                                title='Potential Pump and Dump Pattern',
                                xaxis_title='Time',
                                yaxis_title='Price',
                                hovermode='closest'
                            )

                            pump_and_dumps.append({
                                "type": "Pump and Dump",
                                "addresses": [address],
                                "risk_level": "High" if max_to_end < dump_threshold * 1.5 else "Medium",
                                "description": f"Price pumped {cumulative_change:.2f}% before dropping {max_to_end:.2f}%",
                                "detection_time": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
                                "title": f"Pump ({cumulative_change:.1f}%) and Dump ({max_to_end:.1f}%)",
                                "evidence": window,
                                "chart": fig
                            })

        return pump_and_dumps
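The back-and-forth cycle counting at the heart of `detect_wash_trading` can be sketched in isolation. This is a minimal, self-contained example with a made-up two-address toy ledger (addresses "A"/"B" and timestamps are illustrative only); it applies the same pairing rule as the method: an A→B transfer counts as a cycle if a B→A transfer follows within the time window.

```python
import pandas as pd

# Toy ledger: A sends to B twice; B returns funds once within the window.
txs = pd.DataFrame({
    "from": ["A", "B", "A", "B"],
    "to":   ["B", "A", "B", "A"],
    "timestamp": pd.to_datetime(
        ["2024-01-01 10:00", "2024-01-01 10:20",
         "2024-01-01 12:00", "2024-01-01 14:30"]
    ),
})

max_time_diff = 60  # minutes, matching the "Medium" sensitivity above

a1_to_a2 = txs[(txs["from"] == "A") & (txs["to"] == "B")]
a2_to_a1 = txs[(txs["from"] == "B") & (txs["to"] == "A")]

cycles = 0
for _, tx in a1_to_a2.iterrows():
    # A return transfer strictly after tx, but within the window, closes a cycle.
    returns = a2_to_a1[
        (a2_to_a1["timestamp"] > tx["timestamp"]) &
        (a2_to_a1["timestamp"] <= tx["timestamp"] + pd.Timedelta(minutes=max_time_diff))
    ]
    if not returns.empty:
        cycles += 1

print(cycles)  # 1: only the 10:00 -> 10:20 round trip lands inside 60 minutes
```

The 12:00 A→B transfer is not matched because the next B→A transfer arrives 2.5 hours later, outside the window.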
modules/detection.py
ADDED
@@ -0,0 +1,684 @@
import pandas as pd
import numpy as np
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Union, Any, Tuple
import plotly.graph_objects as go
import plotly.express as px


class ManipulationDetector:
    """
    Detect potential market manipulation patterns in whale transactions
    """

    def __init__(self):
        # Define known manipulation patterns
        self.patterns = {
            "pump_and_dump": {
                "description": "Rapid buys followed by coordinated sell-offs, causing price to first rise then crash",
                "risk_factor": 0.8
            },
            "wash_trading": {
                "description": "Self-trading across multiple addresses to create false impression of market activity",
                "risk_factor": 0.9
            },
            "spoofing": {
                "description": "Large orders placed then canceled before execution to manipulate price",
                "risk_factor": 0.7
            },
            "layering": {
                "description": "Multiple orders at different price levels to create false impression of market depth",
                "risk_factor": 0.6
            },
            "momentum_ignition": {
                "description": "Creating sharp price moves to trigger other participants' momentum-based trading",
                "risk_factor": 0.5
            }
        }

    def detect_wash_trading(self,
                            transactions_df: pd.DataFrame,
                            addresses: List[str],
                            sensitivity: str = "Medium",
                            lookback_hours: int = 24) -> List[Dict[str, Any]]:
        """
        Detect potential wash trading between addresses

        Args:
            transactions_df: DataFrame of transactions
            addresses: List of addresses to analyze
            sensitivity: Detection sensitivity ("Low", "Medium", "High")
            lookback_hours: Hours to look back for wash trading patterns

        Returns:
            List of potential wash trading alerts
        """
        if transactions_df.empty or not addresses:
            return []

        # Ensure from/to columns exist
        if 'From' in transactions_df.columns and 'To' in transactions_df.columns:
            from_col, to_col = 'From', 'To'
        elif 'from' in transactions_df.columns and 'to' in transactions_df.columns:
            from_col, to_col = 'from', 'to'
        else:
            raise ValueError("From/To columns not found in transactions DataFrame")

        # Ensure timestamp column exists
        if 'Timestamp' in transactions_df.columns:
            timestamp_col = 'Timestamp'
        elif 'timeStamp' in transactions_df.columns:
            timestamp_col = 'timeStamp'
        else:
            raise ValueError("Timestamp column not found in transactions DataFrame")

        # Ensure timestamp is datetime
        if not pd.api.types.is_datetime64_any_dtype(transactions_df[timestamp_col]):
            if isinstance(transactions_df[timestamp_col].iloc[0], (int, float)):
                transactions_df[timestamp_col] = pd.to_datetime(transactions_df[timestamp_col], unit='s')
            else:
                transactions_df[timestamp_col] = pd.to_datetime(transactions_df[timestamp_col])

        # Define sensitivity thresholds
        if sensitivity == "Low":
            min_cycles = 3       # Minimum number of back-and-forth transactions
            max_time_diff = 120  # Maximum minutes between transactions
        elif sensitivity == "Medium":
            min_cycles = 2
            max_time_diff = 60
        else:  # High
            min_cycles = 1
            max_time_diff = 30

        # Filter transactions by lookback period
        lookback_time = datetime.now() - timedelta(hours=lookback_hours)
        recent_txs = transactions_df[transactions_df[timestamp_col] >= lookback_time]
|
| 96 |
+
|
| 97 |
+
if recent_txs.empty:
|
| 98 |
+
return []
|
| 99 |
+
|
| 100 |
+
# Filter transactions involving the addresses
|
| 101 |
+
address_txs = recent_txs[
|
| 102 |
+
(recent_txs[from_col].isin(addresses)) |
|
| 103 |
+
(recent_txs[to_col].isin(addresses))
|
| 104 |
+
].copy()
|
| 105 |
+
|
| 106 |
+
if address_txs.empty:
|
| 107 |
+
return []
|
| 108 |
+
|
| 109 |
+
# Sort by timestamp
|
| 110 |
+
address_txs = address_txs.sort_values(by=timestamp_col)
|
| 111 |
+
|
| 112 |
+
# Detect cycles of transactions between same addresses
|
| 113 |
+
wash_trades = []
|
| 114 |
+
|
| 115 |
+
for addr1 in addresses:
|
| 116 |
+
for addr2 in addresses:
|
| 117 |
+
if addr1 == addr2:
|
| 118 |
+
continue
|
| 119 |
+
|
| 120 |
+
# Find transactions from addr1 to addr2
|
| 121 |
+
a1_to_a2 = address_txs[
|
| 122 |
+
(address_txs[from_col] == addr1) &
|
| 123 |
+
(address_txs[to_col] == addr2)
|
| 124 |
+
]
|
| 125 |
+
|
| 126 |
+
# Find transactions from addr2 to addr1
|
| 127 |
+
a2_to_a1 = address_txs[
|
| 128 |
+
(address_txs[from_col] == addr2) &
|
| 129 |
+
(address_txs[to_col] == addr1)
|
| 130 |
+
]
|
| 131 |
+
|
| 132 |
+
if a1_to_a2.empty or a2_to_a1.empty:
|
| 133 |
+
continue
|
| 134 |
+
|
| 135 |
+
# Check for back-and-forth patterns
|
| 136 |
+
cycles = 0
|
| 137 |
+
evidence = []
|
| 138 |
+
|
| 139 |
+
for _, tx1 in a1_to_a2.iterrows():
|
| 140 |
+
tx1_time = tx1[timestamp_col]
|
| 141 |
+
|
| 142 |
+
# Find return transactions within the time window
|
| 143 |
+
return_txs = a2_to_a1[
|
| 144 |
+
(a2_to_a1[timestamp_col] > tx1_time) &
|
| 145 |
+
(a2_to_a1[timestamp_col] <= tx1_time + pd.Timedelta(minutes=max_time_diff))
|
| 146 |
+
]
|
| 147 |
+
|
| 148 |
+
if not return_txs.empty:
|
| 149 |
+
cycles += 1
|
| 150 |
+
evidence.append(tx1)
|
| 151 |
+
evidence.append(return_txs.iloc[0])
|
| 152 |
+
|
| 153 |
+
if cycles >= min_cycles:
|
| 154 |
+
# Create visualization
|
| 155 |
+
if evidence:
|
| 156 |
+
evidence_df = pd.DataFrame(evidence)
|
| 157 |
+
|
| 158 |
+
# Get amount column
|
| 159 |
+
if 'Amount' in evidence_df.columns:
|
| 160 |
+
amount_col = 'Amount'
|
| 161 |
+
elif 'tokenAmount' in evidence_df.columns:
|
| 162 |
+
amount_col = 'tokenAmount'
|
| 163 |
+
elif 'value' in evidence_df.columns:
|
| 164 |
+
# Try to adjust for decimals if 'tokenDecimal' exists
|
| 165 |
+
if 'tokenDecimal' in evidence_df.columns:
|
| 166 |
+
evidence_df['adjustedValue'] = evidence_df['value'].astype(float) / (10 ** evidence_df['tokenDecimal'].astype(int))
|
| 167 |
+
amount_col = 'adjustedValue'
|
| 168 |
+
else:
|
| 169 |
+
amount_col = 'value'
|
| 170 |
+
else:
|
| 171 |
+
amount_col = None
|
| 172 |
+
|
| 173 |
+
# Create figure if amount column exists
|
| 174 |
+
if amount_col:
|
| 175 |
+
fig = px.scatter(
|
| 176 |
+
evidence_df,
|
| 177 |
+
x=timestamp_col,
|
| 178 |
+
y=amount_col,
|
| 179 |
+
color=from_col,
|
| 180 |
+
title=f"Potential Wash Trading Between {addr1[:8]}... and {addr2[:8]}..."
|
| 181 |
+
)
|
| 182 |
+
else:
|
| 183 |
+
fig = None
|
| 184 |
+
else:
|
| 185 |
+
fig = None
|
| 186 |
+
|
| 187 |
+
wash_trades.append({
|
| 188 |
+
"type": "Wash Trading",
|
| 189 |
+
"addresses": [addr1, addr2],
|
| 190 |
+
"risk_level": "High" if cycles >= min_cycles * 2 else "Medium",
|
| 191 |
+
"description": f"Detected {cycles} cycles of back-and-forth transactions between addresses",
|
| 192 |
+
"detection_time": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
|
| 193 |
+
"title": f"Wash Trading Pattern ({cycles} cycles)",
|
| 194 |
+
"evidence": pd.DataFrame(evidence) if evidence else None,
|
| 195 |
+
"chart": fig
|
| 196 |
+
})
|
| 197 |
+
|
| 198 |
+
return wash_trades
|
| 199 |
+
|
    def detect_pump_and_dump(self,
                             transactions_df: pd.DataFrame,
                             price_data: Dict[str, Dict[str, Any]],
                             sensitivity: str = "Medium") -> List[Dict[str, Any]]:
        """
        Detect potential pump and dump schemes

        Args:
            transactions_df: DataFrame of transactions
            price_data: Dictionary of price impact data for each transaction
            sensitivity: Detection sensitivity ("Low", "Medium", "High")

        Returns:
            List of potential pump and dump alerts
        """
        if transactions_df.empty or not price_data:
            return []

        # Ensure timestamp column exists
        if 'Timestamp' in transactions_df.columns:
            timestamp_col = 'Timestamp'
        elif 'timeStamp' in transactions_df.columns:
            timestamp_col = 'timeStamp'
        else:
            raise ValueError("Timestamp column not found in transactions DataFrame")

        # Ensure from/to columns exist
        if 'From' in transactions_df.columns and 'To' in transactions_df.columns:
            from_col, to_col = 'From', 'To'
        elif 'from' in transactions_df.columns and 'to' in transactions_df.columns:
            from_col, to_col = 'from', 'to'
        else:
            raise ValueError("From/To columns not found in transactions DataFrame")

        # Ensure timestamp is datetime
        if not pd.api.types.is_datetime64_any_dtype(transactions_df[timestamp_col]):
            if isinstance(transactions_df[timestamp_col].iloc[0], (int, float)):
                transactions_df[timestamp_col] = pd.to_datetime(transactions_df[timestamp_col], unit='s')
            else:
                transactions_df[timestamp_col] = pd.to_datetime(transactions_df[timestamp_col])

        # Define sensitivity thresholds
        if sensitivity == "Low":
            accumulation_threshold = 5   # Number of buys to consider accumulation
            pump_threshold = 10.0        # % price increase to trigger pump
            dump_threshold = -8.0        # % price decrease to trigger dump
        elif sensitivity == "Medium":
            accumulation_threshold = 3
            pump_threshold = 7.0
            dump_threshold = -5.0
        else:  # High
            accumulation_threshold = 2
            pump_threshold = 5.0
            dump_threshold = -3.0

        # Combine price impact data with transactions
        txs_with_impact = []

        for idx, row in transactions_df.iterrows():
            tx_hash = row.get('Transaction Hash', row.get('hash', None))
            if not tx_hash or tx_hash not in price_data:
                continue

            tx_impact = price_data[tx_hash]

            if tx_impact['impact_pct'] is None:
                continue

            txs_with_impact.append({
                'transaction_hash': tx_hash,
                'timestamp': row[timestamp_col],
                'from': row[from_col],
                'to': row[to_col],
                'pre_price': tx_impact['pre_price'],
                'post_price': tx_impact['post_price'],
                'impact_pct': tx_impact['impact_pct']
            })

        if not txs_with_impact:
            return []

        impact_df = pd.DataFrame(txs_with_impact)
        impact_df = impact_df.sort_values(by='timestamp')

        # Look for accumulation phases followed by price pumps and then dumps
        pump_and_dumps = []

        # Group by address to analyze per wallet
        address_groups = {}

        for from_addr in impact_df['from'].unique():
            address_groups[from_addr] = impact_df[impact_df['from'] == from_addr]

        for to_addr in impact_df['to'].unique():
            if to_addr in address_groups:
                address_groups[to_addr] = pd.concat([
                    address_groups[to_addr],
                    impact_df[impact_df['to'] == to_addr]
                ])
            else:
                address_groups[to_addr] = impact_df[impact_df['to'] == to_addr]

        for address, addr_df in address_groups.items():
            # Skip if not enough transactions
            if len(addr_df) < accumulation_threshold + 2:
                continue

            # Look for continuous price increase followed by sharp drop
            window_size = min(len(addr_df), 10)
            for i in range(len(addr_df) - window_size + 1):
                window = addr_df.iloc[i:i+window_size]

                # Get cumulative price change in window
                if len(window) >= 2:
                    first_price = window.iloc[0]['pre_price']
                    last_price = window.iloc[-1]['post_price']

                    if first_price is None or last_price is None:
                        continue

                    cumulative_change = ((last_price - first_price) / first_price) * 100

                    # Check for pump phase
                    max_price = window['post_price'].max()
                    max_idx = window['post_price'].idxmax()
                    # idxmax() returns an index label, not a position; convert it
                    # before comparing against positional bounds
                    max_pos = window.index.get_loc(max_idx)

                    if max_pos < len(window) - 1:
                        max_to_end = ((window.iloc[-1]['post_price'] - max_price) / max_price) * 100

                        # If we have a pump followed by a dump
                        if (cumulative_change > pump_threshold or
                                any(window['impact_pct'] > pump_threshold)) and max_to_end < dump_threshold:

                            # Create chart
                            fig = go.Figure()

                            # Plot price line
                            times = [t.timestamp() for t in window['timestamp']]
                            prices = []
                            for _, row in window.iterrows():
                                prices.append(row['pre_price'])
                                prices.append(row['post_price'])

                            times_expanded = []
                            for t in times:
                                times_expanded.append(t - 60)  # 1 min before
                                times_expanded.append(t + 60)  # 1 min after

                            fig.add_trace(go.Scatter(
                                x=times_expanded,
                                y=prices,
                                mode='lines+markers',
                                name='Price',
                                line=dict(color='blue')
                            ))

                            # Highlight pump and dump phases
                            pump_x = times_expanded[:max_pos*2+2]
                            pump_y = prices[:max_pos*2+2]

                            dump_x = times_expanded[max_pos*2:]
                            dump_y = prices[max_pos*2:]

                            fig.add_trace(go.Scatter(
                                x=pump_x,
                                y=pump_y,
                                mode='lines',
                                line=dict(color='green', width=3),
                                name='Pump Phase'
                            ))

                            fig.add_trace(go.Scatter(
                                x=dump_x,
                                y=dump_y,
                                mode='lines',
                                line=dict(color='red', width=3),
                                name='Dump Phase'
                            ))

                            fig.update_layout(
                                title='Potential Pump and Dump Pattern',
                                xaxis_title='Time',
                                yaxis_title='Price',
                                hovermode='closest'
                            )

                            pump_and_dumps.append({
                                "type": "Pump and Dump",
                                "addresses": [address],
                                "risk_level": "High" if max_to_end < dump_threshold * 1.5 else "Medium",
                                "description": f"Price pumped {cumulative_change:.2f}% before dropping {max_to_end:.2f}%",
                                "detection_time": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
                                "title": f"Pump ({cumulative_change:.1f}%) and Dump ({max_to_end:.1f}%)",
                                "evidence": window,
                                "chart": fig
                            })

        return pump_and_dumps
    def detect_spoofing(self,
                        transactions_df: pd.DataFrame,
                        order_book_data: Optional[pd.DataFrame] = None,
                        sensitivity: str = "Medium") -> List[Dict[str, Any]]:
        """
        Detect potential spoofing (placing and quickly canceling large orders)

        Args:
            transactions_df: DataFrame of transactions
            order_book_data: Optional DataFrame of order book data
            sensitivity: Detection sensitivity ("Low", "Medium", "High")

        Returns:
            List of potential spoofing alerts
        """
        # Note: This is a placeholder since we don't have direct order book data.
        # A real implementation would analyze order placement and cancellations.
        # For now, return an empty list as we can't detect spoofing without order book data.
        return []
    def detect_layering(self,
                        transactions_df: pd.DataFrame,
                        order_book_data: Optional[pd.DataFrame] = None,
                        sensitivity: str = "Medium") -> List[Dict[str, Any]]:
        """
        Detect potential layering (placing multiple orders at different price levels)

        Args:
            transactions_df: DataFrame of transactions
            order_book_data: Optional DataFrame of order book data
            sensitivity: Detection sensitivity ("Low", "Medium", "High")

        Returns:
            List of potential layering alerts
        """
        # Note: This is a placeholder since we don't have direct order book data.
        # A real implementation would analyze order book depth and patterns.
        # For now, return an empty list as we can't detect layering without order book data.
        return []
    def detect_momentum_ignition(self,
                                 transactions_df: pd.DataFrame,
                                 price_data: Dict[str, Dict[str, Any]],
                                 sensitivity: str = "Medium") -> List[Dict[str, Any]]:
        """
        Detect potential momentum ignition (creating sharp price moves)

        Args:
            transactions_df: DataFrame of transactions
            price_data: Dictionary of price impact data for each transaction
            sensitivity: Detection sensitivity ("Low", "Medium", "High")

        Returns:
            List of potential momentum ignition alerts
        """
        if transactions_df.empty or not price_data:
            return []

        # Ensure timestamp column exists
        if 'Timestamp' in transactions_df.columns:
            timestamp_col = 'Timestamp'
        elif 'timeStamp' in transactions_df.columns:
            timestamp_col = 'timeStamp'
        else:
            raise ValueError("Timestamp column not found in transactions DataFrame")

        # Ensure timestamp is datetime
        if not pd.api.types.is_datetime64_any_dtype(transactions_df[timestamp_col]):
            if isinstance(transactions_df[timestamp_col].iloc[0], (int, float)):
                transactions_df[timestamp_col] = pd.to_datetime(transactions_df[timestamp_col], unit='s')
            else:
                transactions_df[timestamp_col] = pd.to_datetime(transactions_df[timestamp_col])

        # Define sensitivity thresholds
        if sensitivity == "Low":
            impact_threshold = 15.0  # % price impact to trigger alert
            time_window_minutes = 5  # Time window to look for follow-up transactions
        elif sensitivity == "Medium":
            impact_threshold = 10.0
            time_window_minutes = 10
        else:  # High
            impact_threshold = 5.0
            time_window_minutes = 15

        # Combine price impact data with transactions
        txs_with_impact = []

        for idx, row in transactions_df.iterrows():
            tx_hash = row.get('Transaction Hash', row.get('hash', None))
            if not tx_hash or tx_hash not in price_data:
                continue

            tx_impact = price_data[tx_hash]

            if tx_impact['impact_pct'] is None:
                continue

            txs_with_impact.append({
                'transaction_hash': tx_hash,
                'timestamp': row[timestamp_col],
                'from': row.get('From', row.get('from', 'Unknown')),
                'to': row.get('To', row.get('to', 'Unknown')),
                'pre_price': tx_impact['pre_price'],
                'post_price': tx_impact['post_price'],
                'impact_pct': tx_impact['impact_pct']
            })

        if not txs_with_impact:
            return []

        impact_df = pd.DataFrame(txs_with_impact)
        impact_df = impact_df.sort_values(by='timestamp')

        # Look for large price impacts followed by increased trading activity
        momentum_alerts = []

        # Find high-impact transactions
        high_impact_txs = impact_df[abs(impact_df['impact_pct']) > impact_threshold]

        for idx, high_impact_tx in high_impact_txs.iterrows():
            tx_time = high_impact_tx['timestamp']

            # Look for increased trading activity after the high-impact transaction
            follow_up_window = impact_df[
                (impact_df['timestamp'] > tx_time) &
                (impact_df['timestamp'] <= tx_time + pd.Timedelta(minutes=time_window_minutes))
            ]

            # Compare activity to baseline (same time window before the transaction)
            baseline_window = impact_df[
                (impact_df['timestamp'] < tx_time) &
                (impact_df['timestamp'] >= tx_time - pd.Timedelta(minutes=time_window_minutes))
            ]

            if len(follow_up_window) > len(baseline_window) * 1.5 and len(follow_up_window) >= 3:
                # Create chart
                fig = go.Figure()

                # Plot price timeline
                all_relevant_txs = pd.concat([
                    pd.DataFrame([high_impact_tx]),
                    follow_up_window,
                    baseline_window
                ]).sort_values(by='timestamp')

                # Create time series for price
                timestamps = all_relevant_txs['timestamp']
                prices = []
                for _, row in all_relevant_txs.iterrows():
                    prices.append(row['pre_price'])
                    prices.append(row['post_price'])

                times_expanded = []
                for t in timestamps:
                    times_expanded.append(t - pd.Timedelta(seconds=30))
                    times_expanded.append(t + pd.Timedelta(seconds=30))

                # Plot price line
                fig.add_trace(go.Scatter(
                    x=times_expanded[:len(prices)],  # In case of any length mismatch
                    y=prices[:len(times_expanded)],
                    mode='lines',
                    name='Price'
                ))

                # Highlight the high-impact transaction
                fig.add_trace(go.Scatter(
                    x=[high_impact_tx['timestamp']],
                    y=[high_impact_tx['post_price']],
                    mode='markers',
                    marker=dict(
                        size=15,
                        color='red',
                        symbol='circle'
                    ),
                    name='Momentum Ignition'
                ))

                # Highlight the follow-up transactions
                if not follow_up_window.empty:
                    fig.add_trace(go.Scatter(
                        x=follow_up_window['timestamp'],
                        y=follow_up_window['post_price'],
                        mode='markers',
                        marker=dict(
                            size=10,
                            color='orange',
                            symbol='circle'
                        ),
                        name='Follow-up Activity'
                    ))

                fig.update_layout(
                    title='Potential Momentum Ignition Pattern',
                    xaxis_title='Time',
                    yaxis_title='Price',
                    hovermode='closest'
                )

                momentum_alerts.append({
                    "type": "Momentum Ignition",
                    "addresses": [high_impact_tx['from']],
                    "risk_level": "High" if abs(high_impact_tx['impact_pct']) > impact_threshold * 1.5 else "Medium",
                    "description": f"Large {high_impact_tx['impact_pct']:.2f}% price move followed by {len(follow_up_window)} transactions in {time_window_minutes} minutes (vs {len(baseline_window)} in baseline)",
                    "detection_time": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
                    "title": f"Momentum Ignition ({high_impact_tx['impact_pct']:.1f}% price move)",
                    "evidence": pd.concat([pd.DataFrame([high_impact_tx]), follow_up_window]),
                    "chart": fig
                })

        return momentum_alerts
    def run_all_detections(self,
                           transactions_df: pd.DataFrame,
                           addresses: List[str],
                           price_data: Dict[str, Dict[str, Any]] = None,
                           order_book_data: Optional[pd.DataFrame] = None,
                           sensitivity: str = "Medium") -> List[Dict[str, Any]]:
        """
        Run all manipulation detection algorithms

        Args:
            transactions_df: DataFrame of transactions
            addresses: List of addresses to analyze
            price_data: Optional dictionary of price impact data for each transaction
            order_book_data: Optional DataFrame of order book data
            sensitivity: Detection sensitivity ("Low", "Medium", "High")

        Returns:
            List of potential manipulation alerts
        """
        if transactions_df.empty:
            return []

        all_alerts = []

        # Detect wash trading
        wash_trading_alerts = self.detect_wash_trading(
            transactions_df=transactions_df,
            addresses=addresses,
            sensitivity=sensitivity
        )
        all_alerts.extend(wash_trading_alerts)

        # Detect pump and dump and momentum ignition (if price data available)
        if price_data:
            pump_and_dump_alerts = self.detect_pump_and_dump(
                transactions_df=transactions_df,
                price_data=price_data,
                sensitivity=sensitivity
            )
            all_alerts.extend(pump_and_dump_alerts)

            momentum_alerts = self.detect_momentum_ignition(
                transactions_df=transactions_df,
                price_data=price_data,
                sensitivity=sensitivity
            )
            all_alerts.extend(momentum_alerts)

        # Detect spoofing and layering (if order book data available)
        if order_book_data is not None:
            spoofing_alerts = self.detect_spoofing(
                transactions_df=transactions_df,
                order_book_data=order_book_data,
                sensitivity=sensitivity
            )
            all_alerts.extend(spoofing_alerts)

            layering_alerts = self.detect_layering(
                transactions_df=transactions_df,
                order_book_data=order_book_data,
                sensitivity=sensitivity
            )
            all_alerts.extend(layering_alerts)

        # Sort alerts by risk level
        risk_order = {"High": 0, "Medium": 1, "Low": 2}
        all_alerts.sort(key=lambda x: risk_order.get(x.get("risk_level", "Low"), 3))

        return all_alerts
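The final sort in `run_all_detections` orders alerts High, then Medium, then Low, with unknown labels last. The same idiom in isolation, on hand-built alert dicts:

```python
# Map risk labels to sort keys; anything unmapped falls to the end (key 3)
risk_order = {"High": 0, "Medium": 1, "Low": 2}

alerts = [
    {"type": "Layering", "risk_level": "Low"},
    {"type": "Wash Trading", "risk_level": "High"},
    {"type": "Pump and Dump", "risk_level": "Medium"},
    {"type": "Unnamed"},  # no risk_level: get() falls back to "Low"
]

# Stable sort: equal keys keep their original relative order
alerts.sort(key=lambda x: risk_order.get(x.get("risk_level", "Low"), 3))
print([a["type"] for a in alerts])
# → ['Wash Trading', 'Pump and Dump', 'Layering', 'Unnamed']
```

Because `list.sort` is stable, alerts with the same risk level stay in detection order, so wash-trading alerts (run first) surface ahead of later detectors at the same severity.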
modules/tools.py
ADDED
@@ -0,0 +1,373 @@
import json
import pandas as pd
from datetime import datetime
from typing import Dict, List, Optional, Union, Any, Tuple

from langchain.tools import tool
from modules.api_client import ArbiscanClient, GeminiClient
from modules.data_processor import DataProcessor


# Tools for Arbiscan API
class ArbiscanTools:
    def __init__(self, arbiscan_client: ArbiscanClient):
        self.client = arbiscan_client

    @tool("get_token_transfers")
    def get_token_transfers(self, address: str, contract_address: Optional[str] = None) -> str:
        """
        Get ERC-20 token transfers for a specific address

        Args:
            address: Wallet address
            contract_address: Optional token contract address to filter by

        Returns:
            List of token transfers as JSON string
        """
        transfers = self.client.get_token_transfers(
            address=address,
            contract_address=contract_address
        )
        return json.dumps(transfers)

    @tool("get_token_balance")
    def get_token_balance(self, address: str, contract_address: str) -> str:
        """
        Get the current balance of a specific token for an address

        Args:
            address: Wallet address
            contract_address: Token contract address

        Returns:
            Token balance
        """
        balance = self.client.get_token_balance(
            address=address,
            contract_address=contract_address
        )
        return balance

    @tool("get_normal_transactions")
    def get_normal_transactions(self, address: str) -> str:
        """
        Get normal transactions (ETH/ARB transfers) for a specific address

        Args:
            address: Wallet address

        Returns:
            List of normal transactions as JSON string
        """
        transactions = self.client.get_normal_transactions(address=address)
        return json.dumps(transactions)

    @tool("get_internal_transactions")
    def get_internal_transactions(self, address: str) -> str:
        """
        Get internal transactions for a specific address

        Args:
            address: Wallet address

        Returns:
            List of internal transactions as JSON string
        """
        transactions = self.client.get_internal_transactions(address=address)
        return json.dumps(transactions)

    @tool("fetch_whale_transactions")
    def fetch_whale_transactions(self,
                                 addresses: List[str],
                                 token_address: Optional[str] = None,
                                 min_token_amount: Optional[float] = None,
                                 min_usd_value: Optional[float] = None) -> str:
        """
        Fetch whale transactions for a list of addresses

        Args:
            addresses: List of wallet addresses
            token_address: Optional token contract address to filter by
            min_token_amount: Minimum token amount
            min_usd_value: Minimum USD value

        Returns:
            DataFrame of whale transactions as JSON string
        """
        transactions_df = self.client.fetch_whale_transactions(
            addresses=addresses,
            token_address=token_address,
            min_token_amount=min_token_amount,
            min_usd_value=min_usd_value
        )
        return transactions_df.to_json(orient="records")
# Tools for Gemini API
class GeminiTools:
    def __init__(self, gemini_client: GeminiClient):
        self.client = gemini_client

    @tool("get_current_price")
    def get_current_price(self, symbol: str) -> str:
        """
        Get the current price of a token

        Args:
            symbol: Token symbol (e.g., "ETHUSD")

        Returns:
            Current price
        """
        price = self.client.get_current_price(symbol=symbol)
        return str(price) if price is not None else "Price not found"

    @tool("get_historical_prices")
    def get_historical_prices(self,
                              symbol: str,
                              start_time: str,
                              end_time: str) -> str:
        """
        Get historical prices for a token within a time range

        Args:
            symbol: Token symbol (e.g., "ETHUSD")
            start_time: Start datetime in ISO format
            end_time: End datetime in ISO format

        Returns:
            DataFrame of historical prices as JSON string
        """
        # Parse datetime strings
        start_time_dt = datetime.fromisoformat(start_time.replace('Z', '+00:00'))
        end_time_dt = datetime.fromisoformat(end_time.replace('Z', '+00:00'))

        prices_df = self.client.get_historical_prices(
            symbol=symbol,
            start_time=start_time_dt,
            end_time=end_time_dt
        )

        if prices_df is not None:
            return prices_df.to_json(orient="records")
        else:
            return "[]"

    @tool("get_price_impact")
    def get_price_impact(self,
                         symbol: str,
                         transaction_time: str,
                         lookback_minutes: int = 5,
                         lookahead_minutes: int = 5) -> str:
        """
        Analyze the price impact before and after a transaction

        Args:
            symbol: Token symbol (e.g., "ETHUSD")
            transaction_time: Transaction datetime in ISO format
            lookback_minutes: Minutes to look back before the transaction
            lookahead_minutes: Minutes to look ahead after the transaction

        Returns:
            Price impact data as JSON string
        """
        # Parse datetime string
        transaction_time_dt = datetime.fromisoformat(transaction_time.replace('Z', '+00:00'))

        impact_data = self.client.get_price_impact(
            symbol=symbol,
|
| 179 |
+
transaction_time=transaction_time_dt,
|
| 180 |
+
lookback_minutes=lookback_minutes,
|
| 181 |
+
lookahead_minutes=lookahead_minutes
|
| 182 |
+
)
|
| 183 |
+
|
| 184 |
+
# Convert to JSON string
|
| 185 |
+
result = {
|
| 186 |
+
"pre_price": impact_data["pre_price"],
|
| 187 |
+
"post_price": impact_data["post_price"],
|
| 188 |
+
"impact_pct": impact_data["impact_pct"]
|
| 189 |
+
}
|
| 190 |
+
return json.dumps(result)
|
| 191 |
+
|
| 192 |
+
|
| 193 |
+
# Tools for Data Processor
|
| 194 |
+
class DataProcessorTools:
|
| 195 |
+
def __init__(self, data_processor: DataProcessor):
|
| 196 |
+
self.processor = data_processor
|
| 197 |
+
|
| 198 |
+
@tool("aggregate_transactions")
|
| 199 |
+
def aggregate_transactions(self,
|
| 200 |
+
transactions_json: str,
|
| 201 |
+
time_window: str = 'D') -> str:
|
| 202 |
+
"""
|
| 203 |
+
Aggregate transactions by time window
|
| 204 |
+
|
| 205 |
+
Args:
|
| 206 |
+
transactions_json: JSON string of transactions
|
| 207 |
+
time_window: Time window for aggregation (e.g., 'D' for day, 'H' for hour)
|
| 208 |
+
|
| 209 |
+
Returns:
|
| 210 |
+
Aggregated DataFrame as JSON string
|
| 211 |
+
"""
|
| 212 |
+
# Convert JSON to DataFrame
|
| 213 |
+
transactions_df = pd.read_json(transactions_json)
|
| 214 |
+
|
| 215 |
+
# Process data
|
| 216 |
+
agg_df = self.processor.aggregate_transactions(
|
| 217 |
+
transactions_df=transactions_df,
|
| 218 |
+
time_window=time_window
|
| 219 |
+
)
|
| 220 |
+
|
| 221 |
+
# Convert result to JSON
|
| 222 |
+
return agg_df.to_json(orient="records")
|
| 223 |
+
|
| 224 |
+
@tool("identify_patterns")
|
| 225 |
+
def identify_patterns(self,
|
| 226 |
+
transactions_json: str,
|
| 227 |
+
n_clusters: int = 3) -> str:
|
| 228 |
+
"""
|
| 229 |
+
Identify trading patterns using clustering
|
| 230 |
+
|
| 231 |
+
Args:
|
| 232 |
+
transactions_json: JSON string of transactions
|
| 233 |
+
n_clusters: Number of clusters for K-Means
|
| 234 |
+
|
| 235 |
+
Returns:
|
| 236 |
+
List of pattern dictionaries as JSON string
|
| 237 |
+
"""
|
| 238 |
+
# Convert JSON to DataFrame
|
| 239 |
+
transactions_df = pd.read_json(transactions_json)
|
| 240 |
+
|
| 241 |
+
# Process data
|
| 242 |
+
patterns = self.processor.identify_patterns(
|
| 243 |
+
transactions_df=transactions_df,
|
| 244 |
+
n_clusters=n_clusters
|
| 245 |
+
)
|
| 246 |
+
|
| 247 |
+
# Convert result to JSON
|
| 248 |
+
result = []
|
| 249 |
+
for pattern in patterns:
|
| 250 |
+
# Convert non-serializable objects to serializable format
|
| 251 |
+
pattern_json = {
|
| 252 |
+
"name": pattern["name"],
|
| 253 |
+
"description": pattern["description"],
|
| 254 |
+
"cluster_id": pattern["cluster_id"],
|
| 255 |
+
"occurrence_count": pattern["occurrence_count"],
|
| 256 |
+
"confidence": pattern["confidence"],
|
| 257 |
+
# Skip chart_data as it's not JSON serializable
|
| 258 |
+
"examples": pattern["examples"].to_json(orient="records") if isinstance(pattern["examples"], pd.DataFrame) else []
|
| 259 |
+
}
|
| 260 |
+
result.append(pattern_json)
|
| 261 |
+
|
| 262 |
+
return json.dumps(result)
|
| 263 |
+
|
| 264 |
+
@tool("detect_anomalous_transactions")
|
| 265 |
+
def detect_anomalous_transactions(self,
|
| 266 |
+
transactions_json: str,
|
| 267 |
+
sensitivity: str = "Medium") -> str:
|
| 268 |
+
"""
|
| 269 |
+
Detect anomalous transactions using statistical methods
|
| 270 |
+
|
| 271 |
+
Args:
|
| 272 |
+
transactions_json: JSON string of transactions
|
| 273 |
+
sensitivity: Detection sensitivity ("Low", "Medium", "High")
|
| 274 |
+
|
| 275 |
+
Returns:
|
| 276 |
+
DataFrame of anomalous transactions as JSON string
|
| 277 |
+
"""
|
| 278 |
+
# Convert JSON to DataFrame
|
| 279 |
+
transactions_df = pd.read_json(transactions_json)
|
| 280 |
+
|
| 281 |
+
# Process data
|
| 282 |
+
anomalies_df = self.processor.detect_anomalous_transactions(
|
| 283 |
+
transactions_df=transactions_df,
|
| 284 |
+
sensitivity=sensitivity
|
| 285 |
+
)
|
| 286 |
+
|
| 287 |
+
# Convert result to JSON
|
| 288 |
+
return anomalies_df.to_json(orient="records")
|
| 289 |
+
|
| 290 |
+
@tool("analyze_price_impact")
|
| 291 |
+
def analyze_price_impact(self,
|
| 292 |
+
transactions_json: str,
|
| 293 |
+
price_data_json: str) -> str:
|
| 294 |
+
"""
|
| 295 |
+
Analyze the price impact of transactions
|
| 296 |
+
|
| 297 |
+
Args:
|
| 298 |
+
transactions_json: JSON string of transactions
|
| 299 |
+
price_data_json: JSON string of price impact data
|
| 300 |
+
|
| 301 |
+
Returns:
|
| 302 |
+
Price impact analysis as JSON string
|
| 303 |
+
"""
|
| 304 |
+
# Convert JSON to DataFrame
|
| 305 |
+
transactions_df = pd.read_json(transactions_json)
|
| 306 |
+
|
| 307 |
+
# Convert price_data_json to dictionary
|
| 308 |
+
price_data = json.loads(price_data_json)
|
| 309 |
+
|
| 310 |
+
# Process data
|
| 311 |
+
impact_analysis = self.processor.analyze_price_impact(
|
| 312 |
+
transactions_df=transactions_df,
|
| 313 |
+
price_data=price_data
|
| 314 |
+
)
|
| 315 |
+
|
| 316 |
+
# Convert result to JSON (excluding non-serializable objects)
|
| 317 |
+
result = {
|
| 318 |
+
"avg_impact_pct": impact_analysis.get("avg_impact_pct"),
|
| 319 |
+
"max_impact_pct": impact_analysis.get("max_impact_pct"),
|
| 320 |
+
"min_impact_pct": impact_analysis.get("min_impact_pct"),
|
| 321 |
+
"significant_moves_count": impact_analysis.get("significant_moves_count"),
|
| 322 |
+
"total_transactions": impact_analysis.get("total_transactions"),
|
| 323 |
+
# Skip impact_chart as it's not JSON serializable
|
| 324 |
+
"transactions_with_impact": impact_analysis.get("transactions_with_impact").to_json(orient="records") if "transactions_with_impact" in impact_analysis else []
|
| 325 |
+
}
|
| 326 |
+
|
| 327 |
+
return json.dumps(result)
|
| 328 |
+
|
| 329 |
+
@tool("detect_wash_trading")
|
| 330 |
+
def detect_wash_trading(self,
|
| 331 |
+
transactions_json: str,
|
| 332 |
+
addresses_json: str,
|
| 333 |
+
sensitivity: str = "Medium") -> str:
|
| 334 |
+
"""
|
| 335 |
+
Detect potential wash trading between addresses
|
| 336 |
+
|
| 337 |
+
Args:
|
| 338 |
+
transactions_json: JSON string of transactions
|
| 339 |
+
addresses_json: JSON string of addresses to analyze
|
| 340 |
+
sensitivity: Detection sensitivity ("Low", "Medium", "High")
|
| 341 |
+
|
| 342 |
+
Returns:
|
| 343 |
+
List of potential wash trading incidents as JSON string
|
| 344 |
+
"""
|
| 345 |
+
# Convert JSON to DataFrame
|
| 346 |
+
transactions_df = pd.read_json(transactions_json)
|
| 347 |
+
|
| 348 |
+
# Convert addresses_json to list
|
| 349 |
+
addresses = json.loads(addresses_json)
|
| 350 |
+
|
| 351 |
+
# Process data
|
| 352 |
+
wash_trades = self.processor.detect_wash_trading(
|
| 353 |
+
transactions_df=transactions_df,
|
| 354 |
+
addresses=addresses,
|
| 355 |
+
sensitivity=sensitivity
|
| 356 |
+
)
|
| 357 |
+
|
| 358 |
+
# Convert result to JSON (excluding non-serializable objects)
|
| 359 |
+
result = []
|
| 360 |
+
for trade in wash_trades:
|
| 361 |
+
trade_json = {
|
| 362 |
+
"type": trade["type"],
|
| 363 |
+
"addresses": trade["addresses"],
|
| 364 |
+
"risk_level": trade["risk_level"],
|
| 365 |
+
"description": trade["description"],
|
| 366 |
+
"detection_time": trade["detection_time"],
|
| 367 |
+
"title": trade["title"],
|
| 368 |
+
"evidence": trade["evidence"].to_json(orient="records") if isinstance(trade["evidence"], pd.DataFrame) else []
|
| 369 |
+
# Skip chart as it's not JSON serializable
|
| 370 |
+
}
|
| 371 |
+
result.append(trade_json)
|
| 372 |
+
|
| 373 |
+
return json.dumps(result)
|
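The tool wrappers above shuttle DataFrames between agents as JSON strings: `to_json(orient="records")` on the way out, `pd.read_json` on the way back. A minimal standalone sketch of that round-trip (independent of the client and processor classes, which are not reproduced here):

```python
import io
import json
import pandas as pd


def df_to_json(df: pd.DataFrame) -> str:
    """Serialize a DataFrame to a records-oriented JSON string, as the tools do."""
    return df.to_json(orient="records")


def json_to_df(payload: str) -> pd.DataFrame:
    """Deserialize back to a DataFrame; StringIO avoids pandas' warning on literal JSON."""
    return pd.read_json(io.StringIO(payload))


if __name__ == "__main__":
    df = pd.DataFrame({"hash": ["0xabc", "0xdef"], "value": [1.5, 2.0]})
    payload = df_to_json(df)
    # The payload is plain JSON, so json.loads can inspect it between agents
    print(json.loads(payload))
    assert json_to_df(payload).equals(df)
```

Note that this round-trip preserves values but not all dtypes (e.g., large integers stored as strings), which is why the tools above re-adjust columns like `tokenDecimal` after deserializing.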
modules/visualizer.py
ADDED
|
@@ -0,0 +1,638 @@
import pandas as pd
import numpy as np
import plotly.graph_objects as go
import plotly.express as px
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Union, Any, Tuple
import io
import base64
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
from reportlab.lib.pagesizes import letter
from reportlab.pdfgen import canvas
from reportlab.lib import colors
from reportlab.platypus import SimpleDocTemplate, Table, TableStyle, Paragraph, Spacer
from reportlab.lib.styles import getSampleStyleSheet


class Visualizer:
    """
    Generate visualizations and reports for whale transaction data
    """

    def __init__(self):
        self.color_map = {
            "buy": "green",
            "sell": "red",
            "transfer": "blue",
            "other": "gray"
        }

    def create_transaction_timeline(self, transactions_df: pd.DataFrame) -> go.Figure:
        """
        Create a timeline visualization of transactions

        Args:
            transactions_df: DataFrame of transactions

        Returns:
            Plotly figure object
        """
        if transactions_df.empty:
            fig = go.Figure()
            fig.update_layout(
                title="No Transaction Data Available",
                xaxis_title="Date",
                yaxis_title="Action",
                height=400,
                template="plotly_white"
            )
            fig.add_annotation(
                text="No transaction data available for timeline",
                showarrow=False,
                font=dict(size=14)
            )
            return fig

        try:
            # Ensure timestamp column exists
            if 'Timestamp' in transactions_df.columns:
                timestamp_col = 'Timestamp'
            elif 'timeStamp' in transactions_df.columns:
                timestamp_col = 'timeStamp'
                # Convert timestamp to datetime if it's not already
                if not pd.api.types.is_datetime64_any_dtype(transactions_df[timestamp_col]):
                    try:
                        transactions_df[timestamp_col] = pd.to_datetime(transactions_df[timestamp_col].astype(float), unit='s')
                    except Exception as e:
                        print(f"Error converting timestamp: {str(e)}")
                        transactions_df[timestamp_col] = pd.date_range(start='2025-01-01', periods=len(transactions_df), freq='H')
            else:
                # Create a dummy timestamp if none exists
                transactions_df['dummy_timestamp'] = pd.date_range(start='2025-01-01', periods=len(transactions_df), freq='H')
                timestamp_col = 'dummy_timestamp'

            # Create figure
            fig = go.Figure()

            # Add transactions to timeline
            for idx, row in transactions_df.iterrows():
                # Determine transaction type
                if 'From' in transactions_df.columns and 'To' in transactions_df.columns:
                    from_col, to_col = 'From', 'To'
                else:
                    from_col, to_col = 'from', 'to'

                tx_type = "other"
                hover_text = ""

                if pd.isna(row[from_col]) or row[from_col] == '0x0000000000000000000000000000000000000000':
                    tx_type = "buy"
                    hover_text = f"Buy: {row[to_col]}"
                elif pd.isna(row[to_col]) or row[to_col] == '0x0000000000000000000000000000000000000000':
                    tx_type = "sell"
                    hover_text = f"Sell: {row[from_col]}"
                else:
                    tx_type = "transfer"
                    hover_text = f"Transfer: {row[from_col]} → {row[to_col]}"

                # Add amount to hover text if available
                if 'Amount' in row:
                    hover_text += f"<br>Amount: {row['Amount']}"
                elif 'value' in row:
                    hover_text += f"<br>Value: {row['value']}"

                # Add token info if available
                if 'tokenSymbol' in row:
                    hover_text += f"<br>Token: {row['tokenSymbol']}"

                # Add transaction to timeline
                fig.add_trace(go.Scatter(
                    x=[row[timestamp_col]],
                    y=[tx_type],
                    mode='markers',
                    marker=dict(
                        size=12,
                        color=self.color_map.get(tx_type, "gray"),
                        line=dict(width=1, color='black')
                    ),
                    name=tx_type,
                    text=hover_text,
                    hoverinfo='text'
                ))

            # Update layout
            fig.update_layout(
                title='Whale Transaction Timeline',
                xaxis_title='Time',
                yaxis_title='Transaction Type',
                height=400,
                template='plotly_white',
                showlegend=True,
                hovermode='closest'
            )

            return fig

        except Exception as e:
            # If any error occurs, return a figure with error information
            print(f"Error creating transaction timeline: {str(e)}")
            fig = go.Figure()
            fig.update_layout(
                title="Error in Transaction Timeline",
                xaxis_title="",
                yaxis_title="",
                height=400,
                template="plotly_white"
            )
            fig.add_annotation(
                text=f"Error generating timeline: {str(e)}",
                showarrow=False,
                font=dict(size=14, color="red")
            )
            return fig

    def create_volume_chart(self, transactions_df: pd.DataFrame, time_window: str = 'D') -> go.Figure:
        """
        Create a volume chart aggregated by time window

        Args:
            transactions_df: DataFrame of transactions
            time_window: Time window for aggregation (e.g., 'D' for day, 'H' for hour)

        Returns:
            Plotly figure object
        """
        # Create an empty figure with appropriate message if no data
        if transactions_df.empty:
            fig = go.Figure()
            fig.update_layout(
                title="No Transaction Data Available",
                xaxis_title="Date",
                yaxis_title="Volume",
                height=400,
                template="plotly_white"
            )
            fig.add_annotation(
                text="No transactions found for volume analysis",
                showarrow=False,
                font=dict(size=14)
            )
            return fig

        try:
            # Create a deep copy to avoid modifying the original
            df = transactions_df.copy()

            # Ensure timestamp column exists and convert to datetime
            if 'Timestamp' in df.columns:
                timestamp_col = 'Timestamp'
            elif 'timeStamp' in df.columns:
                timestamp_col = 'timeStamp'
            else:
                # Create a dummy timestamp if none exists
                df['dummy_timestamp'] = pd.date_range(start='2025-01-01', periods=len(df), freq='H')
                timestamp_col = 'dummy_timestamp'

            # Convert timestamp to datetime safely
            if not pd.api.types.is_datetime64_any_dtype(df[timestamp_col]):
                try:
                    df[timestamp_col] = pd.to_datetime(df[timestamp_col].astype(float), unit='s')
                except Exception as e:
                    print(f"Error converting timestamp: {str(e)}")
                    df[timestamp_col] = pd.date_range(start='2025-01-01', periods=len(df), freq='H')

            # Ensure amount column exists
            if 'Amount' in df.columns:
                amount_col = 'Amount'
            elif 'tokenAmount' in df.columns:
                amount_col = 'tokenAmount'
            elif 'value' in df.columns:
                # Try to adjust for decimals if 'tokenDecimal' exists
                if 'tokenDecimal' in df.columns:
                    df['adjustedValue'] = df['value'].astype(float) / (10 ** df['tokenDecimal'].astype(int))
                    amount_col = 'adjustedValue'
                else:
                    amount_col = 'value'
            else:
                # Create a dummy amount column if none exists
                df['dummy_amount'] = 1.0
                amount_col = 'dummy_amount'

            # Alternative approach: manually aggregate by date to avoid index issues
            df['date'] = df[timestamp_col].dt.date

            # Group by date
            volume_data = df.groupby('date').agg({
                amount_col: 'sum',
                timestamp_col: 'count'
            }).reset_index()

            volume_data.columns = ['Date', 'Volume', 'Count']

            # Create figure
            fig = go.Figure()

            # Add volume bars
            fig.add_trace(go.Bar(
                x=volume_data['Date'],
                y=volume_data['Volume'],
                name='Volume',
                marker_color='blue',
                opacity=0.7
            ))

            # Add transaction count line
            fig.add_trace(go.Scatter(
                x=volume_data['Date'],
                y=volume_data['Count'],
                name='Transaction Count',
                mode='lines+markers',
                marker=dict(color='red'),
                yaxis='y2'
            ))

            # Update layout
            fig.update_layout(
                title="Transaction Volume Over Time",
                xaxis_title="Date",
                yaxis_title="Volume",
                yaxis2=dict(
                    title="Transaction Count",
                    overlaying="y",
                    side="right"
                ),
                height=500,
                template="plotly_white",
                hovermode="x unified",
                legend=dict(
                    orientation="h",
                    yanchor="bottom",
                    y=1.02,
                    xanchor="right",
                    x=1
                )
            )

            return fig

        except Exception as e:
            # If any error occurs, return a figure with error information
            print(f"Error in create_volume_chart: {str(e)}")
            fig = go.Figure()
            fig.update_layout(
                title="Error in Volume Chart",
                xaxis_title="",
                yaxis_title="",
                height=400,
                template="plotly_white"
            )
            fig.add_annotation(
                text=f"Error generating volume chart: {str(e)}",
                showarrow=False,
                font=dict(size=14, color="red")
            )
            return fig

    def plot_volume_by_day(self, transactions_df: pd.DataFrame) -> go.Figure:
        """
        Create a volume chart aggregated by day with improved visualization

        Args:
            transactions_df: DataFrame of transactions

        Returns:
            Plotly figure object
        """
        # This is a wrapper around create_volume_chart that specifically uses day as the time window
        return self.create_volume_chart(transactions_df, time_window='D')

    def plot_transaction_flow(self, transactions_df: pd.DataFrame) -> go.Figure:
        """
        Create a network flow visualization of transactions between wallets

        Args:
            transactions_df: DataFrame of transactions

        Returns:
            Plotly figure object
        """
        if transactions_df.empty:
            # Return empty figure if no data
            fig = go.Figure()
            fig.update_layout(
                title="No Transaction Flow Data Available",
                xaxis_title="",
                yaxis_title="",
                height=400,
                template="plotly_white"
            )
            fig.add_annotation(
                text="No transactions found for flow analysis",
                showarrow=False,
                font=dict(size=14)
            )
            return fig

        try:
            # Ensure from/to columns exist
            if 'From' in transactions_df.columns and 'To' in transactions_df.columns:
                from_col, to_col = 'From', 'To'
            elif 'from' in transactions_df.columns and 'to' in transactions_df.columns:
                from_col, to_col = 'from', 'to'
            else:
                # Create an error visualization
                fig = go.Figure()
                fig.update_layout(
                    title="Transaction Flow Error",
                    xaxis_title="",
                    yaxis_title="",
                    height=400,
                    template="plotly_white"
                )
                fig.add_annotation(
                    text="From/To columns not found in transactions data",
                    showarrow=False,
                    font=dict(size=14, color="red")
                )
                return fig

            # Ensure amount column exists
            if 'Amount' in transactions_df.columns:
                amount_col = 'Amount'
            elif 'tokenAmount' in transactions_df.columns:
                amount_col = 'tokenAmount'
            elif 'value' in transactions_df.columns:
                # Try to adjust for decimals if 'tokenDecimal' exists
                if 'tokenDecimal' in transactions_df.columns:
                    transactions_df['adjustedValue'] = transactions_df['value'].astype(float) / (10 ** transactions_df['tokenDecimal'].astype(int))
                    amount_col = 'adjustedValue'
                else:
                    amount_col = 'value'
            else:
                # Create an error visualization
                fig = go.Figure()
                fig.update_layout(
                    title="Transaction Flow Error",
                    xaxis_title="",
                    yaxis_title="",
                    height=400,
                    template="plotly_white"
                )
                fig.add_annotation(
                    text="Amount column not found in transactions data",
                    showarrow=False,
                    font=dict(size=14, color="red")
                )
                return fig

            # Aggregate flows between wallets
            flow_df = transactions_df.groupby([from_col, to_col]).agg({
                amount_col: ['sum', 'count']
            }).reset_index()

            flow_df.columns = [from_col, to_col, 'Value', 'Count']

            # Limit to top 20 flows to keep visualization readable
            top_flows = flow_df.sort_values('Value', ascending=False).head(20)

            # Create Sankey diagram
            # First, create a mapping of unique addresses to indices
            all_addresses = pd.unique(top_flows[[from_col, to_col]].values.ravel('K'))
            address_to_idx = {addr: i for i, addr in enumerate(all_addresses)}

            # Create source, target, and value arrays for the Sankey diagram
            sources = [address_to_idx[addr] for addr in top_flows[from_col]]
            targets = [address_to_idx[addr] for addr in top_flows[to_col]]
            values = top_flows['Value'].tolist()

            # Create hover text
            hover_text = [f"From: {src}<br>To: {tgt}<br>Value: {val:.2f}<br>Count: {cnt}"
                          for src, tgt, val, cnt in zip(top_flows[from_col], top_flows[to_col],
                                                        top_flows['Value'], top_flows['Count'])]

            # Shorten addresses for node labels
            node_labels = [f"{addr[:6]}...{addr[-4:]}" if len(addr) > 12 else addr
                           for addr in all_addresses]

            # Create Sankey diagram figure
            fig = go.Figure(data=[go.Sankey(
                node=dict(
                    pad=15,
                    thickness=20,
                    line=dict(color="black", width=0.5),
                    label=node_labels,
                    color="blue"
                ),
                link=dict(
                    source=sources,
                    target=targets,
                    value=values,
                    label=hover_text,
                    hovertemplate='%{label}<extra></extra>'
                )
            )])

            fig.update_layout(
                title="Whale Transaction Flow",
                font_size=12,
                height=600,
                template="plotly_white"
            )

            return fig

        except Exception as e:
            # If any error occurs, return a figure with error information
            print(f"Error in plot_transaction_flow: {str(e)}")
            fig = go.Figure()
            fig.update_layout(
                title="Error in Transaction Flow",
                xaxis_title="",
                yaxis_title="",
                height=400,
                template="plotly_white"
            )
            fig.add_annotation(
                text=f"Error generating transaction flow: {str(e)}",
                showarrow=False,
                font=dict(size=14, color="red")
            )
            return fig

    def generate_pdf_report(self,
                            transactions_df: pd.DataFrame,
                            patterns: List[Dict[str, Any]] = None,
                            price_impact: Dict[str, Any] = None,
                            alerts: List[Dict[str, Any]] = None,
                            title: str = "Whale Analysis Report",
                            start_date: datetime = None,
                            end_date: datetime = None) -> bytes:
        """
        Generate a PDF report of whale activity

        Args:
            transactions_df: DataFrame of transactions
            patterns: List of pattern dictionaries
            price_impact: Dictionary of price impact analysis
            alerts: List of alert dictionaries
            title: Report title
            start_date: Start date for report period
            end_date: End date for report period

        Returns:
            PDF report as bytes
        """
        buffer = io.BytesIO()
        doc = SimpleDocTemplate(buffer, pagesize=letter)
        elements = []

        # Add title
        styles = getSampleStyleSheet()
        elements.append(Paragraph(title, styles['Title']))

        # Add date range
        if start_date and end_date:
            date_range = f"Period: {start_date.strftime('%Y-%m-%d')} to {end_date.strftime('%Y-%m-%d')}"
            elements.append(Paragraph(date_range, styles['Heading2']))

        elements.append(Spacer(1, 12))

        # Add transaction summary
        if not transactions_df.empty:
            elements.append(Paragraph("Transaction Summary", styles['Heading2']))
            summary_data = [
                ["Total Transactions", str(len(transactions_df))],
                ["Unique Addresses", str(len(pd.unique(transactions_df['from'].tolist() + transactions_df['to'].tolist())))]
            ]

            # Add token breakdown if available
            if 'tokenSymbol' in transactions_df.columns:
                token_counts = transactions_df['tokenSymbol'].value_counts()
                summary_data.append(["Most Common Token", f"{token_counts.index[0]} ({token_counts.iloc[0]} txns)"])

            summary_table = Table(summary_data)
            summary_table.setStyle(TableStyle([
                ('BACKGROUND', (0, 0), (0, -1), colors.lightgrey),
                ('GRID', (0, 0), (-1, -1), 1, colors.black),
                ('PADDING', (0, 0), (-1, -1), 6),
            ]))
            elements.append(summary_table)
            elements.append(Spacer(1, 12))

        # Add pattern analysis
        if patterns:
            elements.append(Paragraph("Trading Patterns Detected", styles['Heading2']))
            for i, pattern in enumerate(patterns):
                pattern_text = f"Pattern {i+1}: {pattern.get('name', 'Unnamed')}\n"
                pattern_text += f"Description: {pattern.get('description', 'No description')}\n"
                if 'risk_profile' in pattern:
                    pattern_text += f"Risk Profile: {pattern['risk_profile']}\n"
                if 'confidence' in pattern:
                    pattern_text += f"Confidence: {pattern['confidence']:.2f}\n"

                elements.append(Paragraph(pattern_text, styles['Normal']))
                elements.append(Spacer(1, 6))

            elements.append(Spacer(1, 12))

        # Add price impact analysis
        if price_impact:
            elements.append(Paragraph("Price Impact Analysis", styles['Heading2']))
            impact_text = ""
            if 'avg_impact' in price_impact:
                impact_text += f"Average Impact: {price_impact['avg_impact']:.2f}%\n"
|
| 545 |
+
if 'max_impact' in price_impact:
|
| 546 |
+
impact_text += f"Maximum Impact: {price_impact['max_impact']:.2f}%\n"
|
| 547 |
+
if 'insights' in price_impact:
|
| 548 |
+
impact_text += f"Insights: {price_impact['insights']}\n"
|
| 549 |
+
|
| 550 |
+
elements.append(Paragraph(impact_text, styles['Normal']))
|
| 551 |
+
elements.append(Spacer(1, 12))
|
| 552 |
+
|
| 553 |
+
# Add alerts
|
| 554 |
+
if alerts:
|
| 555 |
+
elements.append(Paragraph("Alerts", styles['Heading2']))
|
| 556 |
+
for alert in alerts:
|
| 557 |
+
alert_text = f"{alert.get('level', 'Info')}: {alert.get('message', 'No details')}"
|
| 558 |
+
elements.append(Paragraph(alert_text, styles['Normal']))
|
| 559 |
+
elements.append(Spacer(1, 6))
|
| 560 |
+
|
| 561 |
+
# Build the PDF
|
| 562 |
+
doc.build(elements)
|
| 563 |
+
buffer.seek(0)
|
| 564 |
+
return buffer.getvalue()
|
| 565 |
+
|
| 566 |
+
def generate_csv_report(self,
|
| 567 |
+
transactions_df: pd.DataFrame,
|
| 568 |
+
report_type: str = "Transaction Summary") -> str:
|
| 569 |
+
"""
|
| 570 |
+
Generate a CSV report of transaction data
|
| 571 |
+
|
| 572 |
+
Args:
|
| 573 |
+
transactions_df: DataFrame of transactions
|
| 574 |
+
report_type: Type of report to generate
|
| 575 |
+
|
| 576 |
+
Returns:
|
| 577 |
+
CSV data as string
|
| 578 |
+
"""
|
| 579 |
+
if transactions_df.empty:
|
| 580 |
+
return "No data available for report"
|
| 581 |
+
|
| 582 |
+
if report_type == "Transaction Summary":
|
| 583 |
+
# Return basic transaction summary
|
| 584 |
+
return transactions_df.to_csv(index=False)
|
| 585 |
+
elif report_type == "Daily Volume":
|
| 586 |
+
# Get timestamp column
|
| 587 |
+
if 'Timestamp' in transactions_df.columns:
|
| 588 |
+
timestamp_col = 'Timestamp'
|
| 589 |
+
elif 'timeStamp' in transactions_df.columns:
|
| 590 |
+
timestamp_col = 'timeStamp'
|
| 591 |
+
# Convert timestamp to datetime if needed
|
| 592 |
+
if not pd.api.types.is_datetime64_any_dtype(transactions_df[timestamp_col]):
|
| 593 |
+
try:
|
| 594 |
+
transactions_df[timestamp_col] = pd.to_datetime(transactions_df[timestamp_col].astype(float), unit='s')
|
| 595 |
+
except:
|
| 596 |
+
return "Error processing timestamp data"
|
| 597 |
+
else:
|
| 598 |
+
return "Timestamp column not found"
|
| 599 |
+
|
| 600 |
+
# Get amount column
|
| 601 |
+
if 'Amount' in transactions_df.columns:
|
| 602 |
+
amount_col = 'Amount'
|
| 603 |
+
elif 'tokenAmount' in transactions_df.columns:
|
| 604 |
+
amount_col = 'tokenAmount'
|
| 605 |
+
elif 'value' in transactions_df.columns:
|
| 606 |
+
amount_col = 'value'
|
| 607 |
+
else:
|
| 608 |
+
return "Amount column not found"
|
| 609 |
+
|
| 610 |
+
# Aggregate by day
|
| 611 |
+
transactions_df['date'] = transactions_df[timestamp_col].dt.date
|
| 612 |
+
daily_volume = transactions_df.groupby('date').agg({
|
| 613 |
+
amount_col: 'sum',
|
| 614 |
+
'hash': 'count' # Assuming 'hash' exists for all transactions
|
| 615 |
+
}).reset_index()
|
| 616 |
+
|
| 617 |
+
daily_volume.columns = ['Date', 'Volume', 'Transactions']
|
| 618 |
+
return daily_volume.to_csv(index=False)
|
| 619 |
+
else:
|
| 620 |
+
return "Unknown report type"
|
| 621 |
+
|
| 622 |
+
def generate_png_chart(self,
|
| 623 |
+
fig: go.Figure,
|
| 624 |
+
width: int = 1200,
|
| 625 |
+
height: int = 800) -> bytes:
|
| 626 |
+
"""
|
| 627 |
+
Convert a Plotly figure to PNG image data
|
| 628 |
+
|
| 629 |
+
Args:
|
| 630 |
+
fig: Plotly figure object
|
| 631 |
+
width: Image width in pixels
|
| 632 |
+
height: Image height in pixels
|
| 633 |
+
|
| 634 |
+
Returns:
|
| 635 |
+
PNG image as bytes
|
| 636 |
+
"""
|
| 637 |
+
img_bytes = fig.to_image(format="png", width=width, height=height)
|
| 638 |
+
return img_bytes
|
requirements.txt
ADDED
@@ -0,0 +1,12 @@
streamlit==1.30.0
pandas==2.1.1
numpy==1.26.0
matplotlib==3.8.0
plotly==5.18.0
python-dotenv==1.0.0
requests==2.31.0
scikit-learn==1.3.1
crewai>=0.28.0
langchain>=0.1.0,<0.2.0
reportlab==4.0.5
weasyprint==60.1
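Most of these pins are exact (`==`), while crewai and langchain use ranges. A rough helper for splitting a requirement line into package name and version specifier (illustrative only, not a full PEP 508 parser):

```python
import re

def parse_requirement(line):
    """Split a line like 'langchain>=0.1.0,<0.2.0' into (package, specifier).

    A rough split on the first non-name character; does not handle extras,
    environment markers, or other PEP 508 syntax.
    """
    m = re.match(r"^\s*([A-Za-z0-9_.-]+)\s*(.*)$", line)
    return (m.group(1), m.group(2).strip()) if m else (line.strip(), "")

print(parse_requirement("streamlit==1.30.0"))
print(parse_requirement("langchain>=0.1.0,<0.2.0"))
```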
test_api.py
ADDED
@@ -0,0 +1,205 @@
import os
import sys
import json
import urllib.request
import urllib.parse
import urllib.error
from urllib.error import URLError, HTTPError

# Simple dotenv implementation since the module may not be available
def load_dotenv():
    try:
        with open('.env', 'r') as file:
            for line in file:
                line = line.strip()
                if not line or line.startswith('#') or '=' not in line:
                    continue
                key, value = line.split('=', 1)
                os.environ[key] = value
    except Exception as e:
        print(f"Error loading .env file: {e}")
        return False
    return True

# Load environment variables
load_dotenv()

# Get API key from .env
ARBISCAN_API_KEY = os.getenv("ARBISCAN_API_KEY")
if not ARBISCAN_API_KEY:
    print("ERROR: ARBISCAN_API_KEY not found in .env file")
    sys.exit(1)

print(f"Using Arbiscan API Key: {ARBISCAN_API_KEY[:5]}...")

# Test addresses (known active ones)
TEST_ADDRESSES = [
    "0x5d8908afee1df9f7f0830105f8be828f97ce9e68",  # Arbitrum Treasury
    "0x2b1ad6184a6b0fac06bd225ed37c2abc04415ff4",  # Large holder
    "0xc47ff7f9efb3ef39c33a2c492a1372418d399ec2",  # Active trader
]

# User-provided addresses (from command line arguments)
if len(sys.argv) > 1:
    USER_ADDRESSES = sys.argv[1:]
    TEST_ADDRESSES.extend(USER_ADDRESSES)
    print(f"Added user-provided addresses: {USER_ADDRESSES}")

def test_api_key():
    """Test if the API key is valid"""
    base_url = "https://api.arbiscan.io/api"
    params = {
        "module": "stats",
        "action": "ethsupply",
        "apikey": ARBISCAN_API_KEY
    }

    try:
        print("\n===== TESTING API KEY =====")
        # Construct URL with parameters
        query_string = urllib.parse.urlencode(params)
        url = f"{base_url}?{query_string}"
        print(f"Making request to: {url}")

        # Make the request
        with urllib.request.urlopen(url) as response:
            response_data = response.read().decode('utf-8')
            data = json.loads(response_data)

            print(f"Response status code: {response.status}")
            print(f"Response JSON status: {data.get('status')}")
            print(f"Response message: {data.get('message', 'No message')}")

        if data.get("status") == "1":
            print("✅ API KEY IS VALID")
            return True
        else:
            print("❌ API KEY IS INVALID OR HAS ISSUES")
            if "API Key" in data.get("message", ""):
                print(f"Error message: {data.get('message')}")
                print("❌ You need to register for an API key at https://arbiscan.io/myapikey")
            return False

    except HTTPError as e:
        print(f"❌ HTTP Error: {e.code} - {e.reason}")
        return False
    except URLError as e:
        print(f"❌ URL Error: {e.reason}")
        return False
    except Exception as e:
        print(f"❌ Error testing API key: {str(e)}")
        return False

def test_address(address):
    """Test if an address has transactions on Arbitrum"""
    base_url = "https://api.arbiscan.io/api"

    # Test for token transfers
    params_token = {
        "module": "account",
        "action": "tokentx",
        "address": address,
        "startblock": "0",
        "endblock": "99999999",
        "page": "1",
        "offset": "10",  # Just get 10 for testing
        "sort": "desc",
        "apikey": ARBISCAN_API_KEY
    }

    # Test for normal transactions
    params_normal = {
        "module": "account",
        "action": "txlist",
        "address": address,
        "startblock": "0",
        "endblock": "99999999",
        "page": "1",
        "offset": "10",  # Just get 10 for testing
        "sort": "desc",
        "apikey": ARBISCAN_API_KEY
    }

    print(f"\n===== TESTING ADDRESS: {address} =====")

    # Check token transfers
    try:
        print("Testing token transfers...")
        # Construct URL with parameters
        query_string = urllib.parse.urlencode(params_token)
        url = f"{base_url}?{query_string}"

        # Make the request
        with urllib.request.urlopen(url) as response:
            response_data = response.read().decode('utf-8')
            data = json.loads(response_data)

        if data.get("status") == "1":
            transfers = data.get("result", [])
            print(f"✅ Found {len(transfers)} token transfers")
            if transfers:
                print(f"First transfer: {json.dumps(transfers[0], indent=2)[:200]}...")
        else:
            print(f"❌ No token transfers found: {data.get('message', 'Unknown error')}")

    except HTTPError as e:
        print(f"❌ HTTP Error: {e.code} - {e.reason}")
    except URLError as e:
        print(f"❌ URL Error: {e.reason}")
    except Exception as e:
        print(f"❌ Error testing token transfers: {str(e)}")

    # Check normal transactions
    try:
        print("\nTesting normal transactions...")
        # Construct URL with parameters
        query_string = urllib.parse.urlencode(params_normal)
        url = f"{base_url}?{query_string}"

        # Make the request
        with urllib.request.urlopen(url) as response:
            response_data = response.read().decode('utf-8')
            data = json.loads(response_data)

        if data.get("status") == "1":
            transactions = data.get("result", [])
            print(f"✅ Found {len(transactions)} normal transactions")
            if transactions:
                print(f"First transaction: {json.dumps(transactions[0], indent=2)[:200]}...")
        else:
            print(f"❌ No normal transactions found: {data.get('message', 'Unknown error')}")

    except HTTPError as e:
        print(f"❌ HTTP Error: {e.code} - {e.reason}")
    except URLError as e:
        print(f"❌ URL Error: {e.reason}")
    except Exception as e:
        print(f"❌ Error testing normal transactions: {str(e)}")

def main():
    """Main function to run tests"""
    print("=================================================")
    print("Arbitrum API Diagnostic Tool")
    print("=================================================")

    # Test the API key first
    api_valid = test_api_key()

    if not api_valid:
        print("\n⚠️ Please update your API key in the .env file")
        print("Register for an API key at https://arbiscan.io/myapikey")
        return

    # Test each address
    for address in TEST_ADDRESSES:
        test_address(address)

    print("\n=================================================")
    print("RECOMMENDATIONS:")
    print("1. If your API key is invalid, update it in the .env file")
    print("2. If test addresses work but yours don't, your addresses might not have activity on Arbitrum")
    print("3. Use one of the working test addresses in your app for testing")
    print("=================================================")

if __name__ == "__main__":
    main()
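Both probes in test_api.py assemble their request URLs the same way: a params dict run through urllib.parse.urlencode and appended to the base URL. That pattern can be factored into a small helper; a sketch (build_arbiscan_url is a hypothetical name, not part of the script, and DEMO_KEY is a placeholder, not a real key):

```python
import urllib.parse

ARBISCAN_BASE_URL = "https://api.arbiscan.io/api"

def build_arbiscan_url(module, action, api_key, **extra):
    """Assemble an Arbiscan API URL from module/action plus extra query params."""
    params = {"module": module, "action": action, **extra, "apikey": api_key}
    return f"{ARBISCAN_BASE_URL}?{urllib.parse.urlencode(params)}"

url = build_arbiscan_url("account", "tokentx", "DEMO_KEY",
                         address="0x5d8908afee1df9f7f0830105f8be828f97ce9e68",
                         offset="10", sort="desc")
print(url)
```

urlencode also percent-escapes any unsafe characters in the values, which manual string concatenation would miss.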