---
base_model:
- unsloth/Qwen3-14B-unsloth-bnb-4bit
license: apache-2.0
language:
- en
tags:
- geocoding
- unsloth
---

This fine-tuned LLM is intended for the task of geocoding complex location references, and accompanies [Coordinates from Context: Using LLMs to Ground Complex Location References](https://arxiv.org/pdf/2510.08741) (Masis & O'Connor, EACL 2026). The model is referred to as "Geoparser-augmented FT Qwen 14B" in the paper.

### Model description

The base model is a quantized Qwen3-14B model (`unsloth/Qwen3-14B-unsloth-bnb-4bit`), which has been fine-tuned for geocoding, i.e. linking a location reference to an actual geographic location. The model was trained using parameter-efficient fine-tuning via low-rank adaptation.

It was trained for our 'Geoparser-augmented' approach, where a separate geoparsing tool augments the inputs with the center coordinates of mentioned locations; our fine-tuned model then uses both the original location reference and the mentioned locations' coordinates to generate the described location's bounding box. For more details, please see the accompanying paper.

### Training data

The model is trained on 13k examples from the training subset of the [GeoCoDe dataset](https://github.com/EgoLaparra/geocode-data), where the input is a complex location reference and the center coordinates of each mentioned location and the output is the location's corresponding bounding box.

### Limitations

Due to data limitations, this model has been trained and evaluated for our task only in Mainstream American English.

### Usage (unsloth)

The following code snippet illustrates how to use the model. For the system prompt we used and for example prompts, please see the appendices in the accompanying paper.

```python
from unsloth import FastLanguageModel

model_name = "tmasis/geocoding-complex-location-references"

# Load model and tokenizer from the Hugging Face Hub
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = model_name,
    max_seq_length = 2048,
    load_in_4bit = True,
)
FastLanguageModel.for_inference(model)

# Prepare model input
# Placeholders: fill in the system prompt and an example prompt from the paper's appendices
system_prompt = "..."
user_prompt = "..."
messages = [{"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt}]
text = tokenizer.apply_chat_template(messages,
                                     tokenize=False,
                                     add_generation_prompt = True,
                                     enable_thinking = False
                                     )

# Conduct text generation (do_sample=True so the sampling parameters take effect)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs,
                         max_new_tokens=1024,
                         do_sample=True, temperature=0.7, top_p=0.8, top_k=20)
response = tokenizer.batch_decode(outputs)[0]
print(response)
```

### Usage (Hugging Face transformers)

Alternatively, you can use the Hugging Face transformers library.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "tmasis/geocoding-complex-location-references"

# Load model and tokenizer from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,  # passed positionally; from_pretrained has no `model_name` keyword
    torch_dtype = "auto",
    device_map = "auto"
)

# Prepare model input
# Placeholders: fill in the system prompt and an example prompt from the paper's appendices
system_prompt = "..."
user_prompt = "..."
messages = [{"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt}]
text = tokenizer.apply_chat_template(messages,
                                     tokenize=False,
                                     add_generation_prompt = True,
                                     enable_thinking = False
                                     )

# Conduct text generation (do_sample=True so the sampling parameters take effect)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs,
                         max_new_tokens=1024,
                         do_sample=True, temperature=0.7, top_p=0.8, top_k=20)
response = tokenizer.batch_decode(outputs)[0]
print(response)
```
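
With either loading path, the Geoparser-augmented flow from the model description comes down to two steps around the model call: attach the geoparser's center coordinates to the location reference, then read a bounding box out of the generated text. The sketch below illustrates both steps under stated assumptions. The helper names `build_user_prompt` and `parse_bounding_box`, the prompt layout, and the `(min_lon, min_lat, max_lon, max_lat)` ordering are all hypothetical and are not taken from the paper; the exact prompt and output formats are given in the paper's appendices and should replace these.

```python
import re
from typing import Dict, Optional, Tuple

BoundingBox = Tuple[float, float, float, float]

# Matches signed integers and decimals such as -71.06 or 42
_NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def build_user_prompt(reference: str,
                      centers: Dict[str, Tuple[float, float]]) -> str:
    """Combine a location reference with geoparser center coordinates.

    ASSUMPTION: this layout is illustrative only; use the prompt format
    from the paper's appendices for real inputs.
    """
    lines = [f"Location reference: {reference}",
             "Mentioned locations (lat, lon):"]
    for name, (lat, lon) in centers.items():
        lines.append(f"- {name}: {lat:.4f}, {lon:.4f}")
    return "\n".join(lines)


def parse_bounding_box(response: str) -> Optional[BoundingBox]:
    """Read the last four numbers in `response` as a bounding box.

    ASSUMPTION: the box is ordered (min_lon, min_lat, max_lon, max_lat).
    Returns None when fewer than four numbers are present or when the
    values do not form a valid box.
    """
    values = [float(v) for v in _NUMBER.findall(response)]
    if len(values) < 4:
        return None
    min_lon, min_lat, max_lon, max_lat = values[-4:]
    in_range = (-180.0 <= min_lon <= 180.0 and -180.0 <= max_lon <= 180.0
                and -90.0 <= min_lat <= 90.0 and -90.0 <= max_lat <= 90.0)
    ordered = min_lon <= max_lon and min_lat <= max_lat
    if not (in_range and ordered):
        return None
    return (min_lon, min_lat, max_lon, max_lat)
```

In this sketch, the string returned by `build_user_prompt` would serve as the user message content, and `parse_bounding_box(response)` would be applied to the decoded `response`. Because the parser takes the last four numbers, it tolerates a decoded string that still contains the prompt, as long as the generated text ends with the box.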