---
license: apache-2.0
datasets:
- bengsoon/volve_alpaca
language:
- en
base_model:
- Meta/Meta-Llama-3-8B
pipeline_tag: summarization
tags:
- oil-and-gas
- energy
- drilling
---
# DriLLM Summarizer

## Background
This model is fine-tuned from [Meta/Meta-Llama-3-8B](https://huggingface.co/Meta/Meta-Llama-3-8B) on the [Volve DDR dataset](https://huggingface.co/datasets/bengsoon/volve_alpaca) with the Alpaca prompt template, using [Axolotl](https://github.com/axolotl-ai-cloud/axolotl).

The motivation was to fine-tune an LLM that understands the nuances of drilling operations and produces 24-hour summaries from the hourly activities recorded in Daily Drilling Reports (DDR).

## How to use
### Sample Colab
Here's a [Google Colab notebook](https://colab.research.google.com/drive/10Txp14M-yeJG3hRAB8U2ydPrWFE1bypW?usp=sharing) where you can get started with the model.

### Recommended template for DriLLM-Summarizer:
``` python
TEMPLATE = """<|begin_of_text|>Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}


### Input:
{input}


### Response:

"""
```
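
As a minimal sketch of how the template can be filled, the helpers below render a list of hourly activities in the `HH:MM - HH:MM: description` layout used in the example on this card and substitute them into the template. The names `format_events` and `build_prompt` are hypothetical and not part of the model or any library.

``` python
TEMPLATE = """<|begin_of_text|>Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}


### Input:
{input}


### Response:

"""


def format_events(events):
    """Render (start, end, description) tuples as one activity per line.

    Hypothetical helper: the line layout mirrors the sample input on this card.
    """
    return "".join(f"{start} - {end}: {desc}\n" for start, end, desc in events)


def build_prompt(instruction, hourly_events):
    """Fill the Alpaca-style template with the instruction and the activities."""
    return TEMPLATE.format(instruction=instruction, input=hourly_events)


events = [
    ("00:00", "11:00", "Packed equipment and prepared for backload."),
    ("11:00", "17:00", "Demobilized all personnel. End of operation"),
]
prompt = build_prompt("Prepare the 24-hour summary.", format_events(events))
print(prompt)
```

Keeping the prompt ending at the empty line after `### Response:` matters, because the model continues the text from that point.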

### Inferencing using Transformers Pipeline
The code below was tested on Google Colab (with the free T4 GPU).

``` python
import transformers
import torch

model_id = "bengsoon/DriLLM-Summarizer"

pipeline = transformers.pipeline(
    "text-generation", model=model_id, model_kwargs={"torch_dtype": torch.bfloat16}, device_map="auto"
)

TEMPLATE = """<|begin_of_text|>Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}


### Input:
{input}


### Response:

"""

INSTRUCTION = """You are a Rig Supervisor working at an oil and gas offshore drilling operation. \
Your company is currently on a drilling campaign and you are the on-site Drilling Engineer (DE). \
As a DE, one of your jobs is to oversee the operations at the drilling rigs. As such, you know the ins and outs of the operation, down to the hourly activities. \
Every day, activities are recorded either by the Driller, Mud Logger, MWD / LWD engineer or the Drilling Operations Coordinator throughout the day. \
As a DE representative for your company, you are required to prepare the 24-hour summary for the Daily Drilling Report (DDR) based on the hourly activities reported. \
You must always maintain the language of report along with the terminologies and mnemonics of the Drilling Engineer. \
Given the following activities for well XX, please prepare the 24-hour summary for the Daily Drilling Report (DDR). \
Only return the 24-hour summary, and nothing else.
"""

hourly_events = """00:00 - 11:00: Packed equipment and prepared for backload. Cleaned drillfloor and cantilever.
11:00 - 17:00: Performed are inspection with barge engineer. Cleaned and tidied offices and workspace. Demobilized all personell. End of operation
"""

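# Fill the template; `prompt` avoids shadowing the built-in `input`.
prompt = TEMPLATE.format(instruction=INSTRUCTION, input=hourly_events)

output = pipeline(prompt)

# The pipeline returns the prompt followed by the completion, so keep only the text after the response marker.
print("Response: ", output[0]["generated_text"].split("### Response:")[1].strip())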
# > Response:  Packed equipment and prepared for backload. Cleaned drillfloor and cantilever. Performed are inspection with barge engineer. Cleaned and tidyied offices and workspaces.
```
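
The `split("### Response:")[1]` step above raises an `IndexError` if the marker is ever missing from the generated text. A slightly more defensive version could look like the sketch below; `extract_summary` is a hypothetical helper name, and falling back to the full text is an assumption about what is most useful in that case.

``` python
RESPONSE_MARKER = "### Response:"


def extract_summary(generated_text):
    """Return the text between the first response marker and the next one (if any).

    Hypothetical helper: falls back to the whole text when the marker is absent.
    """
    parts = generated_text.split(RESPONSE_MARKER)
    if len(parts) < 2:
        # Marker not found: return everything rather than failing.
        return generated_text.strip()
    return parts[1].strip()


sample = "### Instruction:\n...\n\n### Response:\n\nPacked equipment and prepared for backload.\n"
print(extract_summary(sample))
```

With the pipeline output from the example above, this would be called as `extract_summary(output[0]["generated_text"])`.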

### Quantized model
If you are GPU-constrained, you can try loading the model with 8-bit quantization (this requires the `bitsandbytes` package):

``` python
import torch
import transformers
from transformers import BitsAndBytesConfig

model_id = "bengsoon/DriLLM-Summarizer"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={
        "torch_dtype": torch.bfloat16,
        # Load the weights in 8-bit; remove this entry to load in bfloat16 instead.
        "quantization_config": BitsAndBytesConfig(load_in_8bit=True),
    },
    device_map="auto",
)
```
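
To get a rough feel for why quantization helps, the sketch below estimates the memory needed for the weights alone. It assumes a nominal 8 billion parameters (taken from the base model's name) and ignores activations, the KV cache, and framework overhead, so real usage will be higher.

``` python
def weight_memory_gib(n_params, bytes_per_param):
    """Approximate memory for the model weights only, in GiB."""
    return n_params * bytes_per_param / 2**30


N_PARAMS = 8e9  # assumption: nominal parameter count of an "8B" model

bf16_gib = weight_memory_gib(N_PARAMS, 2)  # bfloat16 stores 2 bytes per parameter
int8_gib = weight_memory_gib(N_PARAMS, 1)  # 8-bit stores roughly 1 byte per parameter

print(f"bfloat16 weights: ~{bf16_gib:.1f} GiB")
print(f"8-bit weights:    ~{int8_gib:.1f} GiB")
```

Under these assumptions the weights take roughly 15 GiB in bfloat16 and about half of that in 8-bit.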