Commit 4d141ec (mjbuehler, parent 5e6271f): Update README.md
This repository also features code for the multi-modal mechanics language model, MeLM, applied to solve various nonlinear forward and inverse problems; the model can process a combination of instructions, numbers, and microstructure data. The framework is applied to several examples, including bio-inspired hierarchical honeycomb design, carbon nanotube mechanics, and protein unfolding. Despite the flexible nature of the model, which allows us to easily incorporate diverse materials, scales, and mechanical features, it performs well across disparate forward and inverse tasks. Built on an autoregressive attention model, MeLM effectively represents a large multi-particle system consisting of hundreds of millions of neurons, where the interaction potentials are discovered through graph-forming self-attention mechanisms; these are then used to identify relationships from emergent structures while taking advantage of synergies discovered in the training data. We show that the model can solve complex degenerate mechanics design problems and determine novel material architectures across a range of hierarchical levels, providing an avenue for materials discovery and analysis. To illustrate broader possibilities, we outline a human-machine interactive MechGPT model, here trained on a set of 1,103 Wikipedia articles related to mechanics, showing how the general framework can be used not only to solve forward and inverse problems but also for complex language tasks such as summarization, generation of new research concepts, and knowledge extraction. Looking beyond the demonstrations reported in the paper, we discuss other opportunities in applied mechanics and general considerations about the use of large language models in modeling, design, and analysis, spanning a broad spectrum of material properties from mechanical and thermal to optical and electronic.
 
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

# Base model and fine-tuned adapter
model_name = 'Open-Orca/OpenOrca-Platypus2-13B'
FT_model_name = 'MechGPT-13b_v106C'
peft_model_id = f'{FT_model_name}'

# 4-bit NF4 quantization so the 13B model fits on a single GPU
bnb_config4bit = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model_base = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    quantization_config=bnb_config4bit,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
model_base.config.use_cache = False

# Load the MechGPT adapter weights on top of the quantized base model
model = PeftModel.from_pretrained(model_base, peft_model_id)

tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"
```
Inference:
```python
device = 'cuda'

def generate_response(text_input="Mechanics is a powerful discipline with many applications, such as ",
                      num_return_sequences=1,
                      temperature=0.4,  # higher temperature yields more diverse, creative output
                      max_new_tokens=128,
                      num_beams=1,
                      top_k=50,
                      top_p=0.9,
                      repetition_penalty=1.0,
                      eos_token_id=2,
                      verbatim=False,
                      ):
    inputs = tokenizer.encode(text_input, add_special_tokens=False, return_tensors='pt')
    if verbatim:
        print("Length of input, tokenized: ", inputs.shape)
    with torch.no_grad():
        outputs = model.generate(input_ids=inputs.to(device),
                                 max_new_tokens=max_new_tokens,
                                 temperature=temperature,
                                 num_beams=num_beams,
                                 top_k=top_k,
                                 top_p=top_p,
                                 num_return_sequences=num_return_sequences,
                                 eos_token_id=eos_token_id,
                                 do_sample=True,
                                 repetition_penalty=repetition_penalty,
                                 )
    # Decode only the newly generated tokens, skipping the prompt
    return tokenizer.batch_decode(outputs[:, inputs.shape[1]:].detach().cpu().numpy(),
                                  skip_special_tokens=True)
```
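The `temperature` argument rescales the logits before sampling: lower values concentrate probability on the most likely tokens, higher values flatten the distribution. A minimal, self-contained illustration of that effect (plain NumPy, independent of the model above):

```python
import numpy as np

def softmax_with_temperature(logits, temperature):
    """Convert raw logits to next-token probabilities at a given sampling temperature."""
    scaled = np.array(logits, dtype=np.float64) / temperature
    scaled -= scaled.max()  # subtract max for numerical stability
    probs = np.exp(scaled)
    return probs / probs.sum()

logits = [2.0, 1.0, 0.1]
p_low = softmax_with_temperature(logits, 0.4)   # sharper: favors the top token
p_high = softmax_with_temperature(logits, 1.5)  # flatter: more diverse sampling
print(p_low[0] > p_high[0])  # prints True: the top token dominates more at low temperature
```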

Prompt template:
```
# Single-turn `OpenChat Llama2 V1`
tokenize("You are MechGPT.<|end_of_turn|>User: Hello<|end_of_turn|>Assistant:")

# Multi-turn `OpenChat Llama2 V1`
tokenize("You are MechGPT.<|end_of_turn|>User: Hello<|end_of_turn|>Assistant: Hi<|end_of_turn|>User: How are you today?<|end_of_turn|>Assistant:")
```
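The turn structure above can also be assembled programmatically. A small sketch (the helper name `build_prompt` is ours, not part of the repository):

```python
def build_prompt(system, turns):
    """Assemble an `OpenChat Llama2 V1` style prompt.

    `turns` is a list of (user, assistant) pairs; pass None as the last
    assistant reply to leave the prompt open for generation.
    """
    parts = [system]
    for user, assistant in turns:
        parts.append(f"User: {user}")
        if assistant is not None:
            parts.append(f"Assistant: {assistant}")
    return "<|end_of_turn|>".join(parts) + "<|end_of_turn|>Assistant:"

# Reproduces the multi-turn template above:
print(build_prompt("You are MechGPT.", [("Hello", "Hi"), ("How are you today?", None)]))
```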

```python
generate_response(text_input="You are MechGPT.<|end_of_turn|>User: How does hyperelastic softening affect crack speed in brittle materials?<|end_of_turn|>Assistant:",
                  max_new_tokens=128,
                  temperature=0.3,  # value used to modulate the next-token probabilities
                  num_beams=1,
                  top_k=50,
                  top_p=0.9,
                  num_return_sequences=1,
                  eos_token_id=[2, 32000],  # 2 is </s>; 32000 is likely the added end-of-turn token
                  )
```
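Depending on tokenizer settings, a decoded reply can still carry a trailing turn marker even when generation stops at one of the `eos_token_id` values. A small cleanup sketch (the helper name `clean_reply` is ours, not part of the repository):

```python
def clean_reply(text, marker="<|end_of_turn|>"):
    """Strip a trailing end-of-turn marker and surrounding whitespace from a decoded reply."""
    text = text.strip()
    if text.endswith(marker):
        text = text[: -len(marker)].rstrip()
    return text

print(clean_reply("Hyperelastic softening reduces crack speed.<|end_of_turn|>"))
# prints: Hyperelastic softening reduces crack speed.
```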