---
license: apache-2.0
base_model: Qwen/Qwen2-7B
library_name: peft
datasets:
- vmal/ConfinityChatMLv1
tags:
- logical-reasoning
- chain-of-thought
- lora
- peft
- conversational
---

## Overview

An autoregressive language model fine-tuned on ConfinityChatMLv1 for enhanced chain-of-thought and logical reasoning in conversational settings. Built on Qwen2-7B using PEFT/LoRA.

---

## Model Details

- **Base model:** Qwen/Qwen2-7B
- **Library:** PEFT (LoRA)
- **Model type:** Causal autoregressive transformer (decoder-only)
- **Languages:** English (primary)
- **License:** Apache-2.0 (inherits the Qwen2-7B license)
- **Finetuned from:** Qwen/Qwen2-7B
- **Repository:** https://huggingface.co/vmal/qwen2-7b-logical-reasoning
- **Dataset:** ConfinityChatMLv1 (~140K reasoning dialogues)
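
For a rough sense of the adapter's footprint: a LoRA adapter on a weight matrix of shape (d_out, d_in) adds r·(d_in + d_out) trainable parameters (the low-rank factors A and B). A back-of-envelope sketch, assuming an illustrative rank of 16 (the actual rank is not stated in this card) and Qwen2-7B's hidden size of 3584:

```python
# LoRA replaces a frozen weight update with two low-rank factors:
# B (d_out x r) and A (r x d_in), so the trainable parameter count
# per adapted matrix is r * (d_in + d_out).
def lora_params(d_in: int, d_out: int, r: int) -> int:
    return r * (d_in + d_out)

# Illustrative rank-16 adapter on a square 3584x3584 projection
# (3584 is Qwen2-7B's hidden size; the rank here is an assumption).
per_matrix = lora_params(3584, 3584, 16)  # 114688 parameters
```

Summed over all adapted projections, this is typically well under 1% of the 7B base parameters, which is why the adapter checkpoint stays small.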

---

## Uses

### Direct Use

- Provide step-by-step solutions to logic puzzles & math word problems
- Assist with structured reasoning in chatbots & virtual tutors
- Generate chain-of-thought-style explanations alongside answers
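The dataset name suggests ChatML-formatted dialogues, so conversational prompts may work best wrapped in ChatML markup. A minimal sketch, assuming that format (`to_chatml` is an illustrative helper, not part of this repository):

```python
# Hypothetical helper: wrap a user question in ChatML markup, the
# format the ConfinityChatMLv1 dataset name suggests was used in
# fine-tuning. The system message is a placeholder.
def to_chatml(question: str, system: str = "You are a careful logical reasoner.") -> str:
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{question}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = to_chatml("If all bloops are razzies, are all bloops lazzies?")
```

The string ends at the opening of the assistant turn, so generation continues as the model's reply.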
### Downstream Use

- Automated grading & feedback on student solutions
- Knowledge-graph population via inference chains
- Hybrid QA systems requiring explanation traces
### Out-of-Scope Use

- Creative/open-ended story generation
- Highly domain-specific expert systems without further fine-tuning
- Low-latency real-time deployment on edge devices

---

## Bias, Risks & Limitations

- **Inherited biases:** May reproduce cultural and gender stereotypes from the pretraining corpus
- **Hallucinations:** May produce unsupported or incorrect facts outside its training scope
- **Overconfidence:** Can present flawed reasoning as fact, especially on adversarial or out-of-distribution (OOD) tasks

### Recommendations

1. **Benchmark** on your specific tasks before production use.
2. **Human-in-the-loop** review for high-stakes decisions.
3. **Ground outputs** with retrieval systems for verifiable sources.
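
Recommendation 1 can start as small as an exact-match harness over a held-out set of your own tasks. A minimal sketch, where `generate_answer` is a stand-in for an actual model call:

```python
# Tiny evaluation harness (sketch): exact-match accuracy over
# (question, reference_answer) pairs. In real use, generate_answer
# would wrap model.generate(...) plus answer extraction.
def exact_match_accuracy(examples, generate_answer):
    hits = sum(generate_answer(q).strip() == a.strip() for q, a in examples)
    return hits / len(examples)

# Toy stand-in model that always answers "4": right once, wrong once.
accuracy = exact_match_accuracy(
    [("2+2?", "4"), ("3+3?", "6")],
    lambda q: "4",
)  # 0.5
```

Exact match is a deliberately strict metric; swap in a task-appropriate scorer before drawing conclusions.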

---

## Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the tokenizer (shipped with the adapter repo) and the base model
tokenizer = AutoTokenizer.from_pretrained(
    "vmal/qwen2-7b-logical-reasoning",
    trust_remote_code=True,
)
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2-7B",
    trust_remote_code=True,
    device_map="auto",
)

# Attach the LoRA adapters on top of the frozen base model
model = PeftModel.from_pretrained(base, "vmal/qwen2-7b-logical-reasoning")

# Inference example
prompt = (
    "Solve step by step: If all bloops are razzies, and some razzies are lazzies, "
    "are all bloops lazzies?"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
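
If you prompt the model to finish with a line such as `Answer: ...`, the chain-of-thought trace can be post-processed to isolate the final answer for downstream checks. A minimal sketch (the `Answer:` convention is an assumption, not something the model enforces):

```python
import re

# Hypothetical post-processing: pull the final answer out of a
# chain-of-thought trace, assuming the reply ends with "Answer: ...".
# Falls back to the last non-empty line if no marker is found.
def extract_final_answer(text: str) -> str:
    match = re.search(r"Answer:\s*(.+)", text)
    if match:
        return match.group(1).strip()
    lines = [line for line in text.strip().splitlines() if line.strip()]
    return lines[-1].strip() if lines else ""

trace = (
    "All bloops are razzies, and some razzies are lazzies.\n"
    "That does not tell us which razzies are lazzies.\n"
    "Answer: No, not necessarily."
)
extract_final_answer(trace)  # "No, not necessarily."
```

Keeping the full trace alongside the extracted answer makes the human-in-the-loop review recommended above much easier.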