---
license: cc
datasets:
- clarin-pl/poquad
language:
- pl
base_model:
- radlab/polish-qa-v2
pipeline_tag: question-answering
library_name: transformers
tags:
- qa
- poquad
- quant
- bitsandbytes
---

### Model Overview

- **Model name**: `radlab/polish-qa-v2-bnb`
- **Developer**: [radlab.dev](https://radlab.dev)
- **Model type**: Extractive Question-Answering (QA)
- **Base model**: `radlab/polish-qa-v2` (`sdadas/polish-roberta-large-v2` fine-tuned for QA)
- **Quantization**: 8-bit inference-only quantization via **bitsandbytes** (`load_in_8bit=True`, with the `qa_outputs` head excluded from quantization)
- **Maximum context size**: 512 tokens

### Intended Use

This model is designed for **extractive QA** on Polish text. Given a question and a context passage,
it returns the most relevant span of the context as the answer.
It is a bitsandbytes-quantized version of the `radlab/polish-qa-v2` model.

### Limitations

- The model works best with contexts up to 512 tokens; longer passages should be truncated or split.
- 8-bit quantization reduces memory usage and inference latency but may introduce a slight drop in accuracy
  compared with the full-precision model.
- Suitable for inference only; the model cannot be further fine-tuned while kept in 8-bit mode.

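As a rough illustration of the splitting suggested above, the helper below windows a passage into overlapping word chunks. This is a word-level approximation of token-based windowing; `chunk_words` and its parameter values are illustrative, not part of the model's API.

```python
def chunk_words(words, max_words=300, stride=50):
    """Split a list of words into overlapping windows.

    A crude, word-level stand-in for token-based windowing:
    consecutive windows share `stride` words of overlap.
    """
    chunks, start = [], 0
    while True:
        chunks.append(words[start:start + max_words])
        if start + max_words >= len(words):
            break
        start += max_words - stride
    return chunks

passage = "słowo " * 1000            # stand-in for a long Polish passage
chunks = chunk_words(passage.split())
print(len(chunks))                   # 4 overlapping windows for 1000 words
```

Each chunk can then be scored separately, e.g. `qa(question=question, context=" ".join(chunk))`, keeping the result with the highest `score`.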
### How to Use

```python
from transformers import pipeline

model_path = "radlab/polish-qa-v2-bnb"

qa = pipeline(
    "question-answering",
    model=model_path,
)

question = "Co będzie w budowanym obiekcie?"
context = """Pozwolenie na budowę zostało wydane w marcu. Pierwsze prace przygotowawcze
na terenie przy ul. Wojska Polskiego już się rozpoczęły.
Działkę ogrodzono, pojawił się również monitoring, a także kontenery
dla pracowników budowy. Na ten moment nie jest znana lista sklepów,
które pojawią się w nowym pasażu handlowym."""

result = qa(
    question=question,
    context=context.replace("\n", " ")
)

print(result)
```

**Sample output**

```json
{
  "score": 0.32568359375,
  "start": 259,
  "end": 268,
  "answer": "sklepów,"
}
```
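The `start` and `end` fields are character offsets into the exact string passed as `context` (here, the newline-flattened text), so the answer can always be recovered by slicing. A toy illustration with made-up offsets:

```python
# Hypothetical result dict, mimicking the pipeline's output format.
flat_context = "Pozwolenie na budowę zostało wydane w marcu."
result = {"score": 0.9, "start": 38, "end": 43, "answer": "marcu"}

# The answer is exactly the [start:end] slice of the input string.
print(flat_context[result["start"]:result["end"]])  # marcu
```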

### Technical Details

- **Quantization strategy**: `BitsAndBytesConfig` with 8-bit weights and the `qa_outputs` head excluded from quantization (`llm_int8_skip_modules`).
- **Loading code (for reference)**

```python
from transformers import AutoConfig, BitsAndBytesConfig, AutoModelForQuestionAnswering

# Path of the full-precision model being quantized.
original_path = "radlab/polish-qa-v2"

config = AutoConfig.from_pretrained(original_path)
bnb_cfg = BitsAndBytesConfig(
    load_in_8bit=True,
    llm_int8_skip_modules=["qa_outputs"],
)

model = AutoModelForQuestionAnswering.from_pretrained(
    original_path,
    config=config,
    quantization_config=bnb_cfg,
    device_map="auto",
)
```
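
For intuition on the memory saving from 8-bit weights, here is a back-of-the-envelope estimate. It assumes roughly 355M parameters (typical for a RoBERTa-large-class model; the exact count is not stated here) and ignores activations and other runtime overhead.

```python
# Rough weight-memory estimate; 355M parameters is an assumption for a
# RoBERTa-large-class model, and non-weight overhead is ignored.
n_params = 355_000_000
fp32_gib = n_params * 4 / 1024**3  # 4 bytes per fp32 weight
int8_gib = n_params * 1 / 1024**3  # 1 byte per int8 weight
print(f"fp32: {fp32_gib:.2f} GiB, int8: {int8_gib:.2f} GiB")
```

In practice `model.get_memory_footprint()` reports the actual figure for the loaded model.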