# Cancer@Home v2 - Architecture Diagram
## System Architecture
```
┌─────────────────────────────────────────────────────────────────────────┐
│                               WEB BROWSER                               │
│                          http://localhost:5000                          │
└────────────────────────────┬────────────────────────────────────────────┘
                             │
                             │ HTTP/WebSocket
                             ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                        FRONTEND (HTML5/CSS3/JS)                         │
│  ┌──────────┬──────────┬──────────┬──────────┬───────────────────────┐  │
│  │Dashboard │  Neo4j   │  BOINC   │   GDC    │       Pipeline        │  │
│  │  View    │   Viz    │  Tasks   │   Data   │         Tools         │  │
│  └──────────┴──────────┴──────────┴──────────┴───────────────────────┘  │
│                                                                         │
│  Technologies: D3.js, Chart.js, Vanilla JavaScript                      │
└────────────────────────────┬────────────────────────────────────────────┘
                             │
                             │ REST API + GraphQL
                             ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                       BACKEND (FastAPI + Python)                        │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │                             API Layer                             │  │
│  │  • REST Endpoints (/api/*)                                        │  │
│  │  • GraphQL Endpoint (/graphql)                                    │  │
│  │  • WebSocket Support                                              │  │
│  │  • Swagger Documentation (/docs)                                  │  │
│  └───────────────────────────────────────────────────────────────────┘  │
│                                    │                                    │
│                                    │ Python Modules                     │
│                                    ▼                                    │
│  ┌──────────┬──────────┬──────────┬──────────┬───────────────────────┐  │
│  │  BOINC   │   GDC    │  Neo4j   │ Pipeline │       Utilities       │  │
│  │  Client  │  Client  │    DB    │  Tools   │                       │  │
│  └──────────┴──────────┴──────────┴──────────┴───────────────────────┘  │
└───────┬──────────┬──────────┬──────────┬────────────────────────────────┘
        │          │          │          │
        │          │          │          │
        ▼          ▼          ▼          ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                          DATA & SERVICES LAYER                          │
│                                                                         │
│  ┌────────────────────┐ ┌────────────────────┐ ┌─────────────────────┐  │
│  │    Neo4j Graph     │ │    BOINC Server    │ │     GDC Portal      │  │
│  │      Database      │ │   (Distributed)    │ │     (External)      │  │
│  │                    │ │                    │ │                     │  │
│  │ Port: 7687 (Bolt)  │ │    Local/Remote    │ │ api.gdc.cancer.gov  │  │
│  │       7474 (HTTP)  │ │  Task Processing   │ │                     │  │
│  │                    │ │                    │ │                     │  │
│  │ • Genes            │ │ • Variant Calling  │ │ • TCGA Data         │  │
│  │ • Mutations        │ │ • BLAST Search     │ │ • TARGET Data       │  │
│  │ • Patients         │ │ • Alignment        │ │ • Clinical Data     │  │
│  │ • Cancer Types     │ │ • Annotation       │ │ • Genomic Files     │  │
│  └────────────────────┘ └────────────────────┘ └─────────────────────┘  │
│                                                                         │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │                   Bioinformatics Tools (Local)                    │  │
│  │                                                                   │  │
│  │   ┌──────────────┐  ┌──────────────┐  ┌──────────────────────┐    │  │
│  │   │    FASTQ     │  │    BLAST     │  │    Variant Caller    │    │  │
│  │   │  Processor   │  │    Runner    │  │                      │    │  │
│  │   │              │  │              │  │                      │    │  │
│  │   │ • QC         │  │ • BLASTN     │  │ • VCF Generation     │    │  │
│  │   │ • Filtering  │  │ • BLASTP     │  │ • Annotation         │    │  │
│  │   │ • Trimming   │  │ • Parsing    │  │ • TMB Calculation    │    │  │
│  │   └──────────────┘  └──────────────┘  └──────────────────────┘    │  │
│  └───────────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                              FILE STORAGE                               │
│                                                                         │
│  data/                                                                  │
│  ├── gdc/              # Downloaded GDC files                           │
│  ├── boinc/            # BOINC task data                                │
│  ├── processed/        # Analysis results                               │
│  │   ├── fastq/                                                         │
│  │   ├── blast/                                                         │
│  │   └── variants/                                                      │
│  └── cache/            # Temporary files                                │
│                                                                         │
│  logs/                 # Application logs                               │
└─────────────────────────────────────────────────────────────────────────┘
```
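
The directory tree in the FILE STORAGE box could be created at startup with `pathlib`. The following is a minimal sketch, not the project's actual code: the helper name `ensure_storage_layout` and the `root` argument are hypothetical, and only the directory names come from the diagram.

```python
from pathlib import Path

# Directory names taken from the FILE STORAGE box in the diagram.
STORAGE_DIRS = (
    "data/gdc",
    "data/boinc",
    "data/processed/fastq",
    "data/processed/blast",
    "data/processed/variants",
    "data/cache",
    "logs",
)


def ensure_storage_layout(root):
    """Create any missing storage directories under root (hypothetical helper)."""
    created = []
    for relative in STORAGE_DIRS:
        path = Path(root) / relative
        # parents=True builds intermediate directories such as data/processed/;
        # exist_ok=True makes repeated calls harmless.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling `ensure_storage_layout(".")` from the project root would leave existing directories untouched and add only the missing ones.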
## Data Flow Diagram
```
┌──────────────┐
│     User     │
│   Browser    │
└──────┬───────┘
       │ 1. Request
       ▼
┌──────────────────────────────────┐
│            Dashboard             │
│    (View Gene/Mutation Data)     │
└──────┬───────────────────────────┘
       │ 2. GraphQL Query
       ▼
┌──────────────────────────────────┐
│         FastAPI Backend          │
│  - Parse Query                   │
│  - Validate Request              │
└──────┬───────────────────────────┘
       │ 3. Cypher Query
       ▼
┌──────────────────────────────────┐
│          Neo4j Database          │
│  - Execute Graph Query           │
│  - Traverse Relationships        │
│  - Aggregate Results             │
└──────┬───────────────────────────┘
       │ 4. Graph Data
       ▼
┌──────────────────────────────────┐
│         GraphQL Resolver         │
│  - Transform Data                │
│  - Format Response               │
└──────┬───────────────────────────┘
       │ 5. JSON Response
       ▼
┌──────────────────────────────────┐
│      Frontend Visualization      │
│  - Render Graph                  │
│  - Display Charts                │
│  - Show Statistics               │
└──────────────────────────────────┘
```
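
Steps 3 to 5 of the flow can be illustrated with a small, dependency-free sketch. Everything named here is an assumption made for illustration: the Cypher text, the node labels `Gene` and `Mutation`, the function `build_response`, and the shape of the JSON payload. The real resolvers may look quite different.

```python
import json

# Hypothetical Cypher for step 3; labels and property names follow the
# graph schema diagram in this document but are not confirmed by the code base.
MUTATIONS_FOR_GENE = (
    "MATCH (g:Gene {symbol: $symbol})-[:AFFECTS]->(m:Mutation) "
    "RETURN g.symbol AS symbol, m.mut_id AS mut_id, "
    "m.chr AS chr, m.position AS position"
)


def build_response(rows):
    """Steps 4-5: group flat result rows by gene and serialise them as JSON.

    `rows` is a list of dicts, one per returned record (assumed input shape).
    """
    genes = {}
    for row in rows:
        gene = genes.setdefault(
            row["symbol"], {"symbol": row["symbol"], "mutations": []}
        )
        gene["mutations"].append(
            {"mutId": row["mut_id"], "chr": row["chr"], "position": row["position"]}
        )
    return json.dumps({"data": {"genes": list(genes.values())}})
```

Keeping the transformation separate from the database call makes it testable with plain dictionaries, without a running Neo4j instance.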
## BOINC Task Processing Flow
```
┌──────────────┐      ┌──────────────┐      ┌──────────────┐
│    Submit    │      │    Queue     │      │   Execute    │
│     Task     │─────▶│     Task     │─────▶│   Analysis   │
│              │      │              │      │              │
└──────────────┘      └──────────────┘      └──────┬───────┘
                                                   │
                                                   ▼
┌──────────────┐      ┌──────────────┐      ┌──────────────┐
│    Store     │      │  Import to   │      │   Generate   │
│   Results    │◀─────│    Neo4j     │◀─────│   Results    │
│              │      │              │      │              │
└──────────────┘      └──────────────┘      └──────────────┘
```
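
The six boxes above form a linear pipeline, which can be modelled as an ordered enumeration. This is only a sketch under that assumption: the names `TaskStage`, `next_stage`, and `run_to_completion` are hypothetical, and the actual BOINC client may track task state differently.

```python
from enum import Enum


class TaskStage(Enum):
    """Stages in the order shown in the task processing diagram."""
    SUBMITTED = "submitted"
    QUEUED = "queued"
    EXECUTING = "executing"
    RESULTS_GENERATED = "results_generated"
    IMPORTED_TO_NEO4J = "imported_to_neo4j"
    STORED = "stored"


# Enum members iterate in definition order, which gives the pipeline order.
_ORDER = list(TaskStage)


def next_stage(stage):
    """Return the stage that follows `stage`, or None once results are stored."""
    index = _ORDER.index(stage)
    return _ORDER[index + 1] if index + 1 < len(_ORDER) else None


def run_to_completion(stage=TaskStage.SUBMITTED):
    """Walk a task from `stage` to the end and return every stage visited."""
    visited = [stage]
    while (stage := next_stage(stage)) is not None:
        visited.append(stage)
    return visited
```

A real implementation would attach work to each transition (queueing, running the analysis, importing into Neo4j); the sketch only captures the ordering.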
## Neo4j Graph Schema
```
┌─────────────────────────────────────────────────────────────────┐
│                        Neo4j Graph Model                        │
│                                                                 │
│    ┌──────────┐                        ┌──────────┐             │
│    │   Gene   │                        │ Mutation │             │
│    ├──────────┤                        ├──────────┤             │
│    │ gene_id  │────────AFFECTS─────────│ mut_id   │             │
│    │ symbol   │                        │ chr      │             │
│    │ name     │                        │ position │             │
│    │ chr      │                        │ ref      │             │
│    └──────────┘                        │ alt      │             │
│                                        └────▲─────┘             │
│                                             │                   │
│                                             │ HAS_MUTATION      │
│                                             │                   │
│    ┌──────────┐                        ┌────┴─────┐             │
│    │  Cancer  │                        │ Patient  │             │
│    │   Type   │                        ├──────────┤             │
│    ├──────────┤                        │patient_id│             │
│    │cancer_id │                        │ age      │             │
│    │ name     │─────DIAGNOSED_WITH─────│ gender   │             │
│    │ tissue   │                        │ race     │             │
│    └──────────┘                        │ status   │             │
│                                        └──────────┘             │
│                                                                 │
│  Relationships:                                                 │
│  • Gene → AFFECTS → Mutation                                    │
│  • Patient → HAS_MUTATION → Mutation                            │
│  • Patient → DIAGNOSED_WITH → CancerType                        │
└─────────────────────────────────────────────────────────────────┘
```
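
Part of this schema can be mirrored in Python with dataclasses, plus a helper that renders a parameterised Cypher `MERGE` for a node. This is a sketch under stated assumptions: the property types, the use of the class name as the node label, and the helper `merge_node_cypher` are all hypothetical; only the property and relationship names come from the diagram.

```python
from dataclasses import dataclass, asdict


@dataclass
class Gene:
    gene_id: str
    symbol: str
    name: str
    chr: str


@dataclass
class Mutation:
    mut_id: str
    chr: str
    position: int  # assumed to be an integer coordinate
    ref: str
    alt: str


# (start label, relationship type, end label), as listed under "Relationships".
RELATIONSHIPS = (
    ("Gene", "AFFECTS", "Mutation"),
    ("Patient", "HAS_MUTATION", "Mutation"),
    ("Patient", "DIAGNOSED_WITH", "CancerType"),
)


def merge_node_cypher(node, key):
    """Build a parameterised MERGE statement keyed on one property of `node`.

    Returns the Cypher text and the parameter dict to send with it.
    """
    label = type(node).__name__
    props = asdict(node)
    statement = f"MERGE (n:{label} {{{key}: ${key}}}) SET n += $props"
    return statement, {key: props[key], "props": props}
```

Passing values as parameters rather than formatting them into the query string avoids quoting problems and lets the database reuse the query plan.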
## Technology Stack
```
┌─────────────────────────────────────────────────────────────────┐
│                        Technology Layers                        │
│                                                                 │
│  Frontend:                                                      │
│    • HTML5, CSS3, JavaScript (ES6+)                             │
│    • D3.js (Graph Visualization)                                │
│    • Chart.js (Charts & Analytics)                              │
│    • Responsive Design                                          │
│                                                                 │
│  Backend:                                                       │
│    • Python 3.8+                                                │
│    • FastAPI (Web Framework)                                    │
│    • Uvicorn (ASGI Server)                                      │
│    • Strawberry (GraphQL)                                       │
│                                                                 │
│  Database:                                                      │
│    • Neo4j 5.13 (Graph Database)                                │
│    • Bolt Protocol                                              │
│    • APOC & GDS Plugins                                         │
│                                                                 │
│  Data Processing:                                               │
│    • Biopython (Sequence Analysis)                              │
│    • NumPy & Pandas (Data Manipulation)                         │
│    • BLAST+ (Sequence Alignment)                                │
│                                                                 │
│  Infrastructure:                                                │
│    • Docker & Docker Compose                                    │
│    • YAML Configuration                                         │
│    • Python Virtual Environments                                │
│                                                                 │
│  External APIs:                                                 │
│    • GDC Portal API (Cancer Data)                               │
│    • BOINC RPC (Distributed Computing)                          │
└─────────────────────────────────────────────────────────────────┘
```
## Deployment Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│                        Local Development                        │
│                                                                 │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │                       Host Machine                        │  │
│  │                                                           │  │
│  │  ┌─────────────────┐         ┌──────────────────────┐     │  │
│  │  │   Python venv   │         │    Docker Desktop    │     │  │
│  │  │    Port 5000    │         │                      │     │  │
│  │  │                 │         │  ┌────────────────┐  │     │  │
│  │  │ • FastAPI       │         │  │     Neo4j      │  │     │  │
│  │  │ • Backend API   │────────▶│  │   Port 7474    │  │     │  │
│  │  │ • GraphQL       │         │  │   Port 7687    │  │     │  │
│  │  │ • WebSocket     │         │  └────────────────┘  │     │  │
│  │  │                 │         │                      │     │  │
│  │  └─────────────────┘         └──────────────────────┘     │  │
│  │                                                           │  │
│  └───────────────────────────────────────────────────────────┘  │
│                                                                 │
│  Access URLs:                                                   │
│  • http://localhost:5000         - Main Application             │
│  • http://localhost:5000/docs    - API Documentation            │
│  • http://localhost:5000/graphql - GraphQL Playground           │
│  • http://localhost:7474         - Neo4j Browser                │
└─────────────────────────────────────────────────────────────────┘
```
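
To confirm that the local services in this diagram are reachable, a plain TCP check against the listed ports is usually enough. The sketch below uses only the standard library. The service labels and the helpers `is_port_open` and `check_services` are hypothetical; the port numbers are the ones shown above.

```python
import socket

# Ports from the deployment diagram; the labels are for this sketch only.
SERVICES = {
    "FastAPI application": 5000,
    "Neo4j Browser (HTTP)": 7474,
    "Neo4j Bolt": 7687,
}


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_services(host="localhost"):
    """Map each service label to whether its port accepts connections."""
    return {name: is_port_open(host, port) for name, port in SERVICES.items()}
```

An open port only shows that something is listening; it does not prove the service behind it is healthy, so an HTTP request to `/docs` or a Bolt handshake would be a stronger check.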