Commit 35a4ac0 by tmasis · verified · 1 Parent(s): 6e82b09

Update README.md

  1. README.md +24 -15
README.md CHANGED
@@ -15,29 +15,38 @@ For more details, please see the accompanying paper.
  ### Training data
  The model is trained on 13k examples from the training subset of the [GeoCoDe dataset](https://github.com/EgoLaparra/geocode-data), where the input is a complex location reference and the center coordinates of each mentioned location and the output is the location's corresponding bounding box.

- ### Intended uses and limitations
- Due to data limitations, this model has been trained and evaluated for our task only in Mainstream American English.


  ### Usage
- We have included sample code below to use the model. For the system prompt and example prompts, please see the appendices in the accompanying paper.

- ```
- from unsloth import FastLanguageModel
- import torch

- # Load model from Huggingface Hub
- model, tokenizer = FastLanguageModel.from_pretrained(
-     model_name = "tmasis/geocoding-complex-location-references",
-     max_seq_length = 2048,
-     load_in_4bit = True)
- FastLanguageModel.for_inference(model)

  messages = [{"role": "system", "content": <system_prompt>},
              {"role": "user", "content": <prompt>}]
- text = tokenizer.apply_chat_template(messages, tokenizer=False,
-     add_generation_prompt = True, enable_thinking = False)
- outputs = model.generate(**tokenizer(text, return_tensors="pt").to("cuda"),
      max_new_tokens=1024, temperature=0.7, top_p=0.8, top_k=20)
  response = tokenizer.batch_decode(outputs)[0]
  ```
 
  ### Training data
  The model is trained on 13k examples from the training subset of the [GeoCoDe dataset](https://github.com/EgoLaparra/geocode-data), where the input is a complex location reference and the center coordinates of each mentioned location and the output is the location's corresponding bounding box.

+ ### Limitations
+ Due to data limitations, this model has been trained and evaluated for our task only in Mainstream American English.


  ### Usage
+ The following code snippet illustrates how to use the model. For the system prompt we used and for example prompts, please see the appendices in the accompanying paper.

+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer

+ model_name = "tmasis/geocoding-complex-location-references"

+ # Load model and tokenizer from Hugging Face Hub
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,  # passed positionally: the parameter is pretrained_model_name_or_path
+     torch_dtype = "auto",
+     device_map = "auto"
+ )
+
+ # Prepare model input
  messages = [{"role": "system", "content": <system_prompt>},
              {"role": "user", "content": <prompt>}]
+ text = tokenizer.apply_chat_template(messages,
+     tokenize=False,
+     add_generation_prompt = True,
+     enable_thinking = False
+ )
+
+ # Conduct text generation
+ outputs = model.generate(**tokenizer(text, return_tensors="pt").to(model.device),
      max_new_tokens=1024, temperature=0.7, top_p=0.8, top_k=20)
  response = tokenizer.batch_decode(outputs)[0]
+ print(response)
  ```
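
The usage snippet leaves `<system_prompt>` and `<prompt>` as placeholders and decodes the full output sequence, prompt included. As a minimal sketch, assuming both prompts are plain strings and that the generated sequence starts with the prompt tokens (as causal language models in `transformers` return it), the two helpers below build the `messages` list and slice off the prompt tokens. The names `build_messages` and `strip_prompt_tokens`, the placeholder strings, and the toy token ids are all illustrative and not part of the model card.

```python
def build_messages(system_prompt: str, user_prompt: str) -> list[dict[str, str]]:
    """Return a message list in the shape the chat template expects (hypothetical helper)."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]


def strip_prompt_tokens(output_ids, prompt_length: int):
    """Drop the first `prompt_length` token ids, keeping only the generated ones."""
    return output_ids[prompt_length:]


# Placeholder strings; the actual prompts are in the paper's appendices.
messages = build_messages("SYSTEM PROMPT HERE", "USER PROMPT HERE")

# Toy token ids standing in for `outputs[0]`; the first three play the role of the prompt.
toy_output_ids = [101, 102, 103, 7, 8, 9]
new_token_ids = strip_prompt_tokens(toy_output_ids, prompt_length=3)
print(new_token_ids)
```

With the real model, `output_ids` would be `outputs[0]` and `prompt_length` the number of tokens in the tokenized prompt; the remaining ids could then be decoded with `tokenizer.decode(..., skip_special_tokens=True)` to obtain only the model's answer.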