---
base_model: openGPT-X/Teuken-7B-instruct-research-v0.4
license: mit
---

# Teuken7B QLoRA – Grounding Act Classification

This model is a fine-tuned version of [openGPT-X/Teuken-7B-instruct-research-v0.4](https://huggingface.co/openGPT-X/Teuken-7B-instruct-research-v0.4), trained with QLoRA for binary classification of German dialogue utterances into two classes:

- **advance**: Contributions that move the dialogue forward (e.g. confirmations, follow-ups, elaborations)
- **non_advance**: All other utterances (e.g. vague responses, misunderstandings, irrelevant comments)
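
The two classes can be made explicit in code as an index-to-label mapping. The following is a minimal sketch, not the author's implementation: the index order in `ID2LABEL` is an assumption (this card does not state it) and should be checked against `model.config.id2label` on the loaded model. `decode` is a hypothetical helper that turns a pair of raw logits into a label and a softmax probability.

```python
import math

# ASSUMPTION: index order is not documented in this card.
# Verify against model.config.id2label before relying on it.
ID2LABEL = {0: "non_advance", 1: "advance"}


def decode(logits):
    """Map a list of raw logits to (label, probability) using a stable softmax."""
    shift = max(logits)  # subtract the max to avoid overflow in exp()
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    idx = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[idx], probs[idx]


label, prob = decode([-1.0, 2.0])
print(label, round(prob, 3))
```

Returning the probability alongside the label makes it possible to flag low-confidence predictions for manual review.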

---

## Use Cases

- Dialogue system analysis
- Teacher-student interaction classification
- Grounding in institutional advising or classroom discourse
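
For analysis-style use cases like these, per-utterance predictions are usually aggregated over a whole dialogue. A minimal sketch, assuming a `classify` callable that maps an utterance to `"advance"` or `"non_advance"`: `summarize_dialogue` and the word-count-based `toy_classify` stand-in are hypothetical names used only for illustration and are not part of this model.

```python
from collections import Counter
from typing import Callable, Dict, Iterable


def summarize_dialogue(
    utterances: Iterable[str], classify: Callable[[str], str]
) -> Dict[str, object]:
    """Count predicted labels and report the share of 'advance' utterances."""
    counts = Counter(classify(u) for u in utterances)
    total = sum(counts.values())
    ratio = counts["advance"] / total if total else 0.0
    return {"counts": dict(counts), "advance_ratio": ratio}


def toy_classify(utterance: str) -> str:
    """Hypothetical stand-in for the real model: very short replies count as non_advance."""
    return "advance" if len(utterance.split()) > 2 else "non_advance"


dialogue = [
    "Ich habe die Aufgabe verstanden und fange jetzt an.",
    "Hm.",
    "Können Sie das bitte noch einmal erklären?",
]
summary = summarize_dialogue(dialogue, toy_classify)
print(summary)
```

In practice, `toy_classify` would be replaced by a wrapper around the fine-tuned classifier that returns the label name for each utterance.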

---

## How to Use

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Load the tokenizer from the base model and the fine-tuned classifier.
# Note: the base model's custom tokenizer may require trust_remote_code=True.
tokenizer = AutoTokenizer.from_pretrained("openGPT-X/Teuken-7B-instruct-research-v0.4")
model = AutoModelForSequenceClassification.from_pretrained("MB55/teuken7b-advance-classifier")
model.eval()


def predict(text):
    """Return the predicted class index for a single utterance."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
    # Drop token_type_ids if the tokenizer emits them, since the model's
    # forward pass may not accept that argument.
    if "token_type_ids" in inputs:
        del inputs["token_type_ids"]
    with torch.no_grad():
        outputs = model(**inputs)
    logits = outputs.logits
    predicted_class = logits.argmax(dim=-1).item()
    return predicted_class


text = "Ich bin da."
prediction = predict(text)
print(f"Predicted class: {prediction}")
```