---
title: ODIN - Operational Drilling Intelligence Network
emoji: 🛢️
colorFrom: gray
colorTo: green
sdk: gradio
sdk_version: 6.9.0
app_file: app.py
pinned: true
license: mit
---


# ODIN — Operational Drilling Intelligence Network

> Multi-agent AI system for subsurface and drilling engineering analysis
> Built on the public Equinor Volve Field dataset · SPE GCS 2026 ML Challenge

---

## Overview

ODIN is a CrewAI-powered multi-agent system that answers complex drilling engineering questions by reasoning over structured data (WITSML, EDM) and unstructured reports (Daily Drilling Reports, DDR). It combines real-time data retrieval, retrieval-augmented generation (RAG) over domain knowledge, and a Gradio chat interface with inline Plotly visualizations.

**Key capabilities:**
- Drill phase distribution & NPT breakdown analysis
- ROP / WOB / RPM performance profiling
- Cross-well KPI comparison
- BHA configuration review and handover summaries
- Stuck-pipe and wellbore stability root-cause analysis
- Evidence-cited answers with confidence levels

---

## Architecture

```
User Query
    │
    ▼
Orchestrator (orchestrator.py)
    │  Classifies query → lean or full crew
    │
    ├── LEAN (chart / compare queries, ~40s)
    │     Analyst  ──► Lead (Odin)
    │
    └── FULL (deep analysis, ~80s)
          Lead  ──► Analyst  ──► Historian  ──► Lead (Odin)
```
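The routing step in the diagram can be sketched as follows. This is a minimal illustration only: the keyword list, the `CrewPlan` type, and `classify_query` are hypothetical names, and the real classifier in `orchestrator.py` may work quite differently (for example, by asking the LLM).

```python
from dataclasses import dataclass

# Hypothetical keyword list; the actual routing rules are not documented here.
LEAN_KEYWORDS = ("chart", "plot", "compare", "comparison")


@dataclass(frozen=True)
class CrewPlan:
    """Which crew to run and the order in which agents take their turn."""
    mode: str
    agents: tuple


# Agent orders taken from the architecture diagram.
LEAN_PLAN = CrewPlan("lean", ("Analyst", "Lead"))
FULL_PLAN = CrewPlan("full", ("Lead", "Analyst", "Historian", "Lead"))


def classify_query(query: str) -> CrewPlan:
    """Route chart/compare style questions to the lean crew, the rest to the full crew."""
    text = query.lower()
    if any(word in text for word in LEAN_KEYWORDS):
        return LEAN_PLAN
    return FULL_PLAN


if __name__ == "__main__":
    for q in ("Compare ROP across wells", "Why did the pipe get stuck?"):
        plan = classify_query(q)
        print(f"{q!r} -> {plan.mode}: {' -> '.join(plan.agents)}")
```

A keyword match keeps routing free of extra LLM calls, which matters under a tight request quota, at the cost of misrouting questions that are phrased unusually.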

**Agents:**

| Agent | Role |
|---|---|
| **Odin (Lead)** | Synthesizes findings, grounds in Volve KB |
| **Data Analyst** | Runs DDR / WITSML / EDM queries & Python charts |
| **Historian** | Searches operational history, validates stats |

**Tools available to agents:**
- `DDR_Query` — Daily Drilling Report search
- `WITSML_Analyst` — Real-time drilling log analysis
- `EDM_Technical_Query` — Casing, BHA, formation data
- `CrossWell_Comparison` — Multi-well KPI comparison
- `VolveHistory_SearchTool` — RAG over Volve campaign history
- `python_interpreter` — Pandas + Plotly for custom charts

---

## Tech Stack

| Layer | Technology |
|---|---|
| LLM | Google Gemini 2.5 Flash (via `google-generativeai`) |
| Agent framework | CrewAI 1.10 |
| RAG / Vector store | ChromaDB + `sentence-transformers` |
| Data processing | Pandas, NumPy, PDFPlumber |
| Visualisation | Plotly (HTML) + Kaleido (PNG) |
| UI | Gradio 6 |

---

## Data

This project uses the **Equinor Volve Field open dataset** (released under the Volve Data Sharing Agreement).

> Download from: [https://www.equinor.com/energy/volve-data-sharing](https://www.equinor.com/energy/volve-data-sharing)

After downloading, extract to `data/raw/` and run the ETL pipeline:

```bash
python src/data_pipeline/run_pipeline.py
```

Then build the knowledge base:

```bash
python src/rag/build_volve_db.py
python src/rag/build_openviking_db.py
```

---

## Quickstart (judges)

```bash
# 1. Clone & install
git clone <repo-url>
cd odin
python -m venv venv
source venv/bin/activate      # Windows: venv\Scripts\activate
pip install -r requirements.txt

# 2. Download runtime data (~400 MB knowledge bases + processed CSVs)
python scripts/download_data.py

# 3. Add your Gemini API key
cp .env.example .env
# Edit .env: set GOOGLE_API_KEY=<your key>
# Free key at: https://aistudio.google.com/app/apikey

# 4. Run
python src/agents/app.py
```

Open `http://localhost:7860` in your browser.
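
If the app fails to start, a common cause is a missing or placeholder key in `.env`. The snippet below is a small standalone check, not part of the repository: `read_env_file` and `has_api_key` are hypothetical helpers, and the parsing assumes simple `KEY=value` lines.

```python
from pathlib import Path


def read_env_file(path) -> dict:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_api_key(path=".env") -> bool:
    """True if GOOGLE_API_KEY is set and is not the <your key> placeholder."""
    env_path = Path(path)
    if not env_path.exists():
        return False
    key = read_env_file(env_path).get("GOOGLE_API_KEY", "")
    return bool(key) and not key.startswith("<")


if __name__ == "__main__":
    print("GOOGLE_API_KEY found" if has_api_key() else "GOOGLE_API_KEY missing in .env")
```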

---

## Project Structure

```
odin/
├── src/
│   ├── agents/           # Main application
│   │   ├── app.py        # Gradio UI (entry point)
│   │   ├── orchestrator.py  # Query routing & streaming
│   │   ├── crew.py       # CrewAI agent definitions & tasks
│   │   ├── tools.py      # DDR / WITSML / EDM / RAG tools
│   │   └── data_tools.py # Python interpreter tool + data helpers
│   │
│   ├── data_pipeline/    # ETL: raw Volve data → processed CSV
│   │   ├── run_pipeline.py
│   │   ├── parse_witsml_logs.py
│   │   ├── parse_ddr_xml.py
│   │   └── parse_edm.py
│   │
│   └── rag/              # Knowledge base builders
│       ├── build_volve_db.py
│       └── build_openviking_db.py
│
├── tests/
│   └── prompts/          # Agent prompt test cases
│
├── data/                 # ← NOT in git (download separately)
│   ├── raw/              # Original Volve dataset
│   ├── processed/        # ETL output (CSV / Parquet)
│   └── knowledge_base/   # ChromaDB vector stores
│
├── outputs/              # ← NOT in git (generated at runtime)
│   └── figures/          # Plotly charts (HTML + PNG)
│
├── requirements.txt
├── .env.example
└── promptfooconfig.yaml  # Evaluation harness (PromptFoo)
```

---

## Rate Limits

The system is tuned for the Gemini free tier (15 requests per minute, RPM):

| Crew mode | LLM calls | Target time |
|---|---|---|
| Lean (chart / compare) | ~6 calls | ~40s |
| Full (deep analysis) | ~10 calls | ~80s |

Automatic retry on HTTP 429 (rate limit) responses is built in, with exponential back-off: the wait doubles from 10 s and is capped at 60 s (10 → 20 → 40 → 60 s).
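
The back-off schedule can be illustrated with a short sketch. It assumes the delays double from 10 s and are capped at 60 s, which matches the sequence quoted above; `RateLimitError`, `backoff_delays`, and `call_with_retry` are hypothetical names and not the project's actual retry code.

```python
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for whatever error the LLM client raises on HTTP 429."""


def backoff_delays(base=10.0, cap=60.0, retries=4):
    """Doubling delays capped at `cap`: 10, 20, 40, 60 with the defaults."""
    return [min(base * 2 ** i, cap) for i in range(retries)]


def call_with_retry(fn, delays=None, sleep=time.sleep):
    """Call fn(); on RateLimitError wait for the next delay and try again."""
    delays = backoff_delays() if delays is None else delays
    for delay in delays:
        try:
            return fn()
        except RateLimitError:
            sleep(delay)
    # Final attempt after the last wait; any error now propagates to the caller.
    return fn()


if __name__ == "__main__":
    print(backoff_delays())
```

Passing `sleep` as a parameter lets the retry loop be exercised in tests without actually waiting.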

---

## License

- Source code: MIT
- Volve dataset: [Volve Data Sharing Agreement](https://www.equinor.com/energy/volve-data-sharing) (not included in this repo)