reformulatee-reformulator-merged
Fine-tuned version of Qwen2.5-1.5B-Instruct for epistemic reformulation of research questions: it rewrites vague, philosophical questions as operationalizable, methodologically grounded ones.
Part of the ReformulatEE project (https://github.com/fmr34/ReformulatEE), which also offers a live demo.
Model Description
- Model type: Causal LM (merged LoRA adapter)
- Base model: Qwen/Qwen2.5-1.5B-Instruct
- Fine-tuning method: DPO (Direct Preference Optimization) via TRL + QLoRA (4-bit)
- Language: English (Portuguese supported via MarianMT translation layer)
- License: Apache 2.0
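"Merged LoRA adapter" means the low-rank update has been folded into the base weights, so the checkpoint loads as an ordinary causal LM without PEFT. The sketch below illustrates that arithmetic on a toy layer, assuming the standard LoRA formulation W' = W + (alpha / rank) · B·A. The matrices are made-up values, and `matmul` and `merge_lora` are hypothetical helpers, not code from the ReformulatEE project.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Return weight + (alpha / rank) * (lora_b @ lora_a)."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy 2x2 layer with a rank-1 adapter (illustrative values, not real weights).
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[0.5, -0.5]]   # shape (rank, in_features) = (1, 2)
B = [[1.0], [2.0]]  # shape (out_features, rank) = (2, 1)

# alpha / rank = 2.0 here, the same ratio as alpha=32, rank=16.
merged = merge_lora(W, A, B, alpha=2, rank=1)
print(merged)  # [[2.0, -1.0], [2.0, -1.0]]
```

In practice this step is usually done with PEFT's `merge_and_unload()`; the card does not state which tool produced the merged checkpoint.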
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "fmr34/reformulatee-reformulator-merged"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

messages = [
    {"role": "system", "content": (
        "You are an expert in philosophy of science. "
        "Reformulate the research question to make it more epistemically tractable: "
        "operationalizable, methodologically grounded, and answerable with existing tools. "
        "Respond with ONLY the reformulated question."
    )},
    {"role": "user", "content": "Original question: What is consciousness?"},
]

# Render the chat template to a prompt string, then tokenize it.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt")

# Sampling is enabled (do_sample=True), so the output varies between runs.
outputs = model.generate(**inputs, max_new_tokens=100, temperature=0.9, do_sample=True)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
# Example output: "What measurable neural correlates distinguish conscious from unconscious processing?"
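When reformulating several questions, it can help to build the message list in one place so the system prompt and the "Original question: " prefix stay identical to the usage example. A minimal sketch follows; `SYSTEM_PROMPT` and `build_messages` are hypothetical names introduced here, not part of the published model or pipeline.

```python
SYSTEM_PROMPT = (
    "You are an expert in philosophy of science. "
    "Reformulate the research question to make it more epistemically tractable: "
    "operationalizable, methodologically grounded, and answerable with existing tools. "
    "Respond with ONLY the reformulated question."
)


def build_messages(question: str) -> list[dict[str, str]]:
    """Build the chat messages for one research question (hypothetical helper)."""
    question = question.strip()
    if not question:
        raise ValueError("question must be a non-empty string")
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"Original question: {question}"},
    ]


messages = build_messages("  Does free will exist? ")
print(messages[1]["content"])  # Original question: Does free will exist?
```

The returned list can be passed directly to `tokenizer.apply_chat_template(...)` as in the usage example.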
Training
- Dataset: ~700 chosen/rejected pairs of research question reformulations
- Data sources: curated pairs across philosophy, biology, cognitive science, and physics
- Training: DPO + LoRA (rank=16, alpha=32, 4-bit QLoRA) on Google Colab T4 GPU (~45 min)
- Epochs: 3 | Batch size: 4 | Learning rate: 5e-5
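For readers unfamiliar with DPO, the objective applied to each chosen/rejected pair can be sketched in a few lines. This is the standard DPO loss for a single pair, written in pure Python for illustration. It is not the project's training code (which uses TRL), and `beta=0.1` is an assumption, since the card does not report the value used.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one pair: -log(sigmoid(beta * margin)).

    Each argument is the summed log-probability of a full response under
    the trained policy or the frozen reference model.
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    x = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(x)) = softplus(-x), computed in a numerically stable form.
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


# Policy identical to the reference: zero margin, loss = log(2).
print(dpo_loss(-10.0, -10.0, -10.0, -10.0))

# Policy favors the chosen reformulation more than the reference does: lower loss.
print(dpo_loss(-8.0, -12.0, -10.0, -10.0))
```

The loss falls as the policy raises the likelihood of the chosen reformulation relative to the rejected one, measured against the reference model.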
Evaluation
The model is evaluated via the Epistemic Effectiveness (EE) score:
EE(Q) = 0.05 · Answerability + 0.05 · Tractability + 0.90 · Non-triviality
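The weighted sum can be written as a small function. This is a sketch only: the argument names are English renderings of the three terms in the formula above, the assumption that each component lies in [0, 1] is not stated in the card, and the inputs below are hypothetical values rather than measured scores.

```python
EE_WEIGHTS = {"answerability": 0.05, "tractability": 0.05, "non_triviality": 0.90}


def ee_score(answerability, tractability, non_triviality):
    """Weighted sum of the three EE components (each assumed to lie in [0, 1])."""
    components = {
        "answerability": answerability,
        "tractability": tractability,
        "non_triviality": non_triviality,
    }
    for name, value in components.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return sum(EE_WEIGHTS[name] * value for name, value in components.items())


# Hypothetical component values, chosen only to show the effect of the weights.
print(round(ee_score(1.0, 1.0, 0.5), 3))  # 0.55
print(round(ee_score(1.0, 1.0, 0.0), 3))  # 0.1
```

Because non-triviality carries 90% of the weight, a fully answerable and tractable but trivial question can score at most 0.10.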
| Input | Output | EE (input → output) |
| --- | --- | --- |
| "What is consciousness?" | "What measurable neural correlates distinguish conscious from unconscious processing?" | 0.137 → 0.926 |
| "Does free will exist?" | "What neural mechanisms underlie the experience of voluntary action initiation?" | 0.201 → 0.883 |
Limitations
- Optimized for academic research questions; may underperform on highly domain-specific technical questions.
- Output quality depends on the input being a genuine research question (not a factual query).
- English only at the model level; Portuguese requires the MarianMT translation layer from the full pipeline.
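The Portuguese path can be pictured as a translate, reformulate, translate-back wrapper around the English-only model. The sketch below shows only that control flow, with the three stages injected as callables. `reformulate_portuguese` and the stub functions are hypothetical; the actual pipeline uses MarianMT for translation, and its interface is not described in this card.

```python
from typing import Callable


def reformulate_portuguese(
    question_pt: str,
    pt_to_en: Callable[[str], str],
    reformulate_en: Callable[[str], str],
    en_to_pt: Callable[[str], str],
) -> str:
    """Translate to English, reformulate with the model, translate back."""
    question_en = pt_to_en(question_pt)
    reformulated_en = reformulate_en(question_en)
    return en_to_pt(reformulated_en)


# Stub stages standing in for the translation models and the reformulator.
PT_EN = {"O que é a consciência?": "What is consciousness?"}
REFORMULATED = {
    "What is consciousness?": (
        "What measurable neural correlates distinguish conscious "
        "from unconscious processing?"
    )
}
EN_PT = {
    REFORMULATED["What is consciousness?"]: (
        "Quais correlatos neurais mensuráveis distinguem o processamento "
        "consciente do inconsciente?"
    )
}

result = reformulate_portuguese(
    "O que é a consciência?", PT_EN.__getitem__, REFORMULATED.__getitem__, EN_PT.__getitem__
)
print(result)
```

Keeping the stages as plain callables lets the translation layer be swapped or tested without loading any model.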
Citation
@software{reformulatee_2025,
  title   = {ReformulatEE: Epistemic Effectiveness Reformulation},
  author  = {fmr34},
  year    = {2025},
  url     = {https://github.com/fmr34/ReformulatEE},
  license = {Apache-2.0}
}