---
library_name: transformers
tags:
- custom_generate
- sampling
---
# DeepCONF Custom Generation Strategy

This repository implements the DeepCONF (Deep Confidence-based Early Stopping) generation strategy for Hugging Face Transformers models, based on the paper [Deep Think with Confidence](https://huggingface.co/papers/2508.15260) ([project page](https://jiaweizzhao.github.io/deepconf/)).

## Overview

DeepCONF monitors the confidence of generated tokens over a sliding window and stops generation early when that confidence falls below a threshold, avoiding wasted compute on low-confidence continuations.
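As an illustrative sketch only (not necessarily the exact formula this repository uses), a decoding step's confidence can be scored as the negative mean log-probability of its top-k candidate tokens, so a peaked next-token distribution scores higher than a flat one:

```python
import math

def token_confidence(top_logprobs):
    """Negative mean log-probability of the top-k candidate tokens at one
    decoding step (higher = more peaked = more confident).
    Illustrative sketch only; the repo's exact formula may differ."""
    return -sum(top_logprobs) / len(top_logprobs)

# A peaked distribution scores higher than a near-uniform one:
peaked = [math.log(p) for p in (0.90, 0.05, 0.05)]
flat = [math.log(p) for p in (0.34, 0.33, 0.33)]
```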

## Parameters

- `enable_conf` (bool): Whether to enable the DeepCONF strategy. Defaults to `False`.
- `window_size` (int): Size of the sliding window for the confidence calculation. Defaults to `2048`.
- `threshold` (float): Confidence threshold for early stopping. Defaults to `17.0`.
- `output_confidences` (bool): If `True` and `return_dict_in_generate=True`, returns a per-step confidence tensor alongside the generated sequences for debugging/visualization.

## Usage

To use this custom generation strategy, pass the Hub repository to the `generate` method via the `custom_generate` argument:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("your-model")
tokenizer = AutoTokenizer.from_pretrained("your-model")

inputs = tokenizer("Hello, world!", return_tensors="pt")

# Generate with DeepCONF (loaded from the Hugging Face Hub)
outputs = model.generate(
    **inputs,
    enable_conf=True,
    window_size=2048,
    threshold=17.0,
    output_confidences=True,       # request per-step confidences
    return_dict_in_generate=True,  # required to get the confidence tensor
    max_new_tokens=100,
    custom_generate="kashif/DeepConf",  # Hugging Face Hub repo
    trust_remote_code=True,
)
```

## Requirements

- PyTorch >= 1.13.0
- Transformers >= 4.35.0