Commit 4d141ec (mjbuehler, parent 5e6271f): Update README.md
This repository also features code for the multi-modal mechanics language model, MeLM, applied to solve various nonlinear forward and inverse problems; the model can process a combination of instructions, numbers, and microstructure data. The framework is applied to several examples, including bio-inspired hierarchical honeycomb design, carbon nanotube mechanics, and protein unfolding. Despite the flexible nature of the model, which allows us to easily incorporate diverse materials, scales, and mechanical features, it performs well across disparate forward and inverse tasks. Built on an autoregressive attention model, MeLM effectively represents a large multi-particle system consisting of hundreds of millions of neurons, where the interaction potentials are discovered through graph-forming self-attention mechanisms; these are then used to identify relationships from emergent structures while taking advantage of synergies discovered in the training data. We show that the model can solve complex degenerate mechanics design problems and determine novel material architectures across a range of hierarchical levels, providing an avenue for materials discovery and analysis. To illustrate broader possibilities, we outline a human-machine interactive MechGPT model, here trained on a set of 1,103 Wikipedia articles related to mechanics, showing how the general framework can be used not only to solve forward and inverse problems but also for complex language tasks such as summarization, generation of new research concepts, and knowledge extraction. Looking beyond the demonstrations reported in the paper, we discuss other opportunities in applied mechanics and general considerations about the use of large language models in modeling, design, and analysis, spanning a broad spectrum of material properties from mechanical and thermal to optical and electronic.
 
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

# Base model and fine-tuned adapter
model_name = 'Open-Orca/OpenOrca-Platypus2-13B'
FT_model_name = 'MechGPT-13b_v106C'
peft_model_id = f'{FT_model_name}'

# 4-bit NF4 quantization so the 13B model fits on a single GPU
bnb_config4bit = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model_base = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    quantization_config=bnb_config4bit,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
model_base.config.use_cache = False

# Load the MechGPT adapter weights on top of the quantized base model
model = PeftModel.from_pretrained(model_base, peft_model_id)

tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"
```
Inference:
```python
device = 'cuda'

def generate_response(text_input="Mechanics is a powerful discipline with many applications, such as ",
                      num_return_sequences=1,
                      temperature=0.4,  # higher temperature yields more diverse, creative output
                      max_new_tokens=128,
                      num_beams=1,
                      top_k=50,
                      top_p=0.9,
                      repetition_penalty=1.0,
                      eos_token_id=2,
                      verbatim=False,
                      ):
    inputs = tokenizer.encode(text_input, add_special_tokens=False, return_tensors='pt')
    if verbatim:
        print("Length of input, tokenized: ", inputs.shape)
    with torch.no_grad():
        outputs = model.generate(input_ids=inputs.to(device),
                                 max_new_tokens=max_new_tokens,
                                 temperature=temperature,
                                 num_beams=num_beams,
                                 top_k=top_k,
                                 top_p=top_p,
                                 num_return_sequences=num_return_sequences,
                                 eos_token_id=eos_token_id,
                                 do_sample=True,
                                 repetition_penalty=repetition_penalty,
                                 )
    # Decode only the newly generated tokens, skipping the prompt
    return tokenizer.batch_decode(outputs[:, inputs.shape[1]:].detach().cpu().numpy(),
                                  skip_special_tokens=True)
```
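The `temperature` argument rescales the logits before sampling: lower values concentrate probability on the most likely tokens, higher values flatten the distribution. A minimal, self-contained illustration of that effect (plain NumPy, independent of the model above):

```python
import numpy as np

def softmax_with_temperature(logits, temperature):
    """Convert raw logits to next-token probabilities at a given sampling temperature."""
    scaled = np.array(logits, dtype=np.float64) / temperature
    scaled -= scaled.max()  # subtract max for numerical stability
    probs = np.exp(scaled)
    return probs / probs.sum()

logits = [2.0, 1.0, 0.1]
p_low = softmax_with_temperature(logits, 0.4)   # sharper: favors the top token
p_high = softmax_with_temperature(logits, 1.5)  # flatter: more diverse sampling
print(p_low[0] > p_high[0])  # prints True: the top token dominates more at low temperature
```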

Prompt template:
```
# Single-turn `OpenChat Llama2 V1`
tokenize("You are MechGPT.<|end_of_turn|>User: Hello<|end_of_turn|>Assistant:")

# Multi-turn `OpenChat Llama2 V1`
tokenize("You are MechGPT.<|end_of_turn|>User: Hello<|end_of_turn|>Assistant: Hi<|end_of_turn|>User: How are you today?<|end_of_turn|>Assistant:")
```
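The turn structure above can also be assembled programmatically. A small sketch (the helper name `build_prompt` is ours, not part of the repository):

```python
def build_prompt(system, turns):
    """Assemble an `OpenChat Llama2 V1` style prompt.

    `turns` is a list of (user, assistant) pairs; pass None as the last
    assistant reply to leave the prompt open for generation.
    """
    parts = [system]
    for user, assistant in turns:
        parts.append(f"User: {user}")
        if assistant is not None:
            parts.append(f"Assistant: {assistant}")
    return "<|end_of_turn|>".join(parts) + "<|end_of_turn|>Assistant:"

# Reproduces the multi-turn template above:
print(build_prompt("You are MechGPT.", [("Hello", "Hi"), ("How are you today?", None)]))
```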

```python
generate_response(text_input="You are MechGPT.<|end_of_turn|>User: How does hyperelastic softening affect crack speed in brittle materials?<|end_of_turn|>Assistant:",
                  max_new_tokens=128,
                  temperature=0.3,  # value used to modulate the next-token probabilities
                  num_beams=1,
                  top_k=50,
                  top_p=0.9,
                  num_return_sequences=1,
                  eos_token_id=[2, 32000],  # 2 is </s>; 32000 is likely the added end-of-turn token
                  )
```
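Depending on tokenizer settings, a decoded reply can still carry a trailing turn marker even when generation stops at one of the `eos_token_id` values. A small cleanup sketch (the helper name `clean_reply` is ours, not part of the repository):

```python
def clean_reply(text, marker="<|end_of_turn|>"):
    """Strip a trailing end-of-turn marker and surrounding whitespace from a decoded reply."""
    text = text.strip()
    if text.endswith(marker):
        text = text[: -len(marker)].rstrip()
    return text

print(clean_reply("Hyperelastic softening reduces crack speed.<|end_of_turn|>"))
# prints: Hyperelastic softening reduces crack speed.
```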