gpt-oss-20b-coref-resolution-lora

This is a LoRA adapter fine-tuned from openai/gpt-oss-20b for coreference resolution.

Model Description

This model takes text containing pronouns and rewrites it with all pronouns replaced by the full names of the entities they refer to.

Example

Input:

John went to the store. He bought some milk. His wife Sarah was happy when he returned.

Output:

John went to the store. John bought some milk. John's wife Sarah was happy when John returned.
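The substitutions in the example above can be inspected programmatically. Below is a minimal illustrative helper (not part of this adapter; the function name is an assumption) that lists each (pronoun, replacement) pair. It relies on whitespace tokenization and only works when each pronoun is replaced by the same number of tokens, as in the example:

```python
def resolved_pairs(source: str, resolved: str) -> list[tuple[str, str]]:
    """List (original token, replacement token) pairs between the input
    text and its coreference-resolved rewrite.

    Assumes the two texts are token-aligned, i.e. every pronoun was
    replaced by exactly one token.
    """
    return [
        (src, out)
        for src, out in zip(source.split(), resolved.split())
        if src != out
    ]


source = "John went to the store. He bought some milk. His wife Sarah was happy when he returned."
resolved = "John went to the store. John bought some milk. John's wife Sarah was happy when John returned."

print(resolved_pairs(source, resolved))
# [('He', 'John'), ('His', "John's"), ('he', 'John')]
```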

Training Data

This adapter was fine-tuned on the wjbmattingly/synthetic-coref dataset.

Usage

With Transformers + PEFT (All Platforms)

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "openai/gpt-oss-20b",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "wjbmattingly/gpt-oss-20b-coref-resolution-lora")

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("openai/gpt-oss-20b")

system_prompt = """You are an expert at coreference resolution. Given a text containing pronouns and other referring expressions, rewrite the text replacing all pronouns with the full name of the entity they refer to.

Keep the text otherwise identical - only replace pronouns with the names they refer to."""

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "John went to the store. He bought milk."},
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=512, do_sample=False)
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
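Because the model is instructed to keep everything except pronouns identical, a lightweight sanity check on the generation is possible. The helper below is an illustrative sketch, not part of the released adapter: the pronoun list is deliberately partial, and the check assumes one-to-one token replacements (it will reject valid rewrites where a pronoun expands to multiple tokens):

```python
# A small set of English personal/possessive pronouns to check against.
PRONOUNS = {
    "he", "she", "it", "they", "him", "her", "them",
    "his", "hers", "its", "their", "theirs",
}


def only_pronouns_changed(source: str, resolved: str) -> bool:
    """Return True if the resolved text differs from the source only at
    pronoun positions (assuming one-to-one token replacements)."""
    src_tokens = source.split()
    out_tokens = resolved.split()
    if len(src_tokens) != len(out_tokens):
        return False
    for src, out in zip(src_tokens, out_tokens):
        # Strip trailing punctuation so "He" and "milk." compare cleanly.
        if src != out and src.strip(".,!?;:").lower() not in PRONOUNS:
            return False
    return True
```

A rewrite that alters non-pronoun content (or changes the token count) fails the check, which makes this useful as a cheap filter when post-processing batches of generations.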

Merging the Adapter (Optional)

If you want to merge the LoRA weights into the base model for faster inference:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "openai/gpt-oss-20b",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Load and merge LoRA adapter
model = PeftModel.from_pretrained(base_model, "wjbmattingly/gpt-oss-20b-coref-resolution-lora")
model = model.merge_and_unload()

# Now model is a regular transformers model with merged weights

With MLX on Mac

For MLX, first merge the adapter into the base model, then convert the merged checkpoint:

# Step 1: Merge and save (run this once)
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = AutoModelForCausalLM.from_pretrained(
    "openai/gpt-oss-20b",
    torch_dtype=torch.float16,
    device_map="cpu",  # Use CPU to avoid GPU memory issues
)
model = PeftModel.from_pretrained(base_model, "wjbmattingly/gpt-oss-20b-coref-resolution-lora")
model = model.merge_and_unload()

# Save merged model
model.save_pretrained("./gpt-oss-20b-coref-resolution-lora-merged", safe_serialization=True)
tokenizer = AutoTokenizer.from_pretrained("openai/gpt-oss-20b")
tokenizer.save_pretrained("./gpt-oss-20b-coref-resolution-lora-merged")

Then convert to MLX format:

pip install mlx-lm

# Convert to MLX (with quantization for smaller size)
mlx_lm.convert --hf-path ./gpt-oss-20b-coref-resolution-lora-merged --mlx-path ./gpt-oss-20b-coref-resolution-lora-mlx -q

# Generate text
mlx_lm.generate --model ./gpt-oss-20b-coref-resolution-lora-mlx --prompt "Your text here"

Or use directly in Python with MLX:

from mlx_lm import load, generate

model, tokenizer = load("./gpt-oss-20b-coref-resolution-lora-mlx")

system_prompt = """You are an expert at coreference resolution. Given a text containing pronouns, rewrite the text replacing all pronouns with the full name of the entity they refer to."""

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "John went to the store. He bought milk."},
]

# Use the tokenizer's own chat template rather than hand-written special
# tokens; gpt-oss does not use <|system|>/<|user|>-style markers.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

response = generate(model, tokenizer, prompt=prompt, max_tokens=512)
print(response)
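For documents longer than a single prompt can comfortably hold, one common approach (not specific to this adapter) is to process overlapping windows of whole sentences, so that a pronoun near a window boundary can still see its antecedent in the previous sentence. A minimal sketch; the function name, window size, and overlap are illustrative assumptions:

```python
import re


def sentence_windows(text: str, window: int = 4, overlap: int = 1) -> list[str]:
    """Split text into overlapping windows of whole sentences.

    Each window shares `overlap` trailing sentences with the previous
    one, so cross-sentence pronouns keep their antecedent in context.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    step = window - overlap
    windows = []
    for start in range(0, len(sentences), step):
        windows.append(" ".join(sentences[start:start + window]))
        if start + window >= len(sentences):
            break
    return windows
```

Each window can then be sent through the model separately, and the resolved windows stitched back together (dropping the overlapping sentences from all but the first window).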

Base Model

openai/gpt-oss-20b

Training Dataset

wjbmattingly/synthetic-coref

License

This adapter inherits its license from the base model, openai/gpt-oss-20b.
