---
license: cc
datasets:
- clarin-pl/poquad
language:
- pl
base_model:
- radlab/polish-qa-v2
pipeline_tag: question-answering
library_name: transformers
tags:
- qa
- poquad
- quant
- bitsandbytes
---

## Model Overview
- Model name: radlab/polish-qa-v2-bnb
- Developer: radlab.dev
- Model type: Extractive Question-Answering (QA)
- Base model: radlab/polish-qa-v2 (sdadas/polish-roberta-large-v2 fine-tuned for QA)
- Quantization: 8-bit inference-only quantization via bitsandbytes (`load_in_8bit=True`, double quantization enabled, `qa_outputs` excluded from quantization)
- Maximum context size: 512 tokens
## Intended Use
This model is designed for extractive QA on Polish text. Given a question and a context passage,
it returns the most relevant span of the context as the answer.
This model is the bitsandbytes-quantized version of the radlab/polish-qa-v2 model.
## Limitations
- The model works best with contexts of up to 512 tokens. Longer passages should be truncated or split (a length check is sketched below; sliding-window handling is shown after the usage example).
- 8-bit quantization reduces memory usage and inference latency but may introduce a slight drop in accuracy compared with the full-precision model.
- The model is only suitable for inference; it cannot be further fine-tuned while kept in 8-bit mode.
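
A quick way to tell whether a passage fits in the 512-token window is to count tokens with the model's tokenizer before calling the pipeline. A minimal sketch, assuming the tokenizer shipped with this repository (the base radlab/polish-qa-v2 tokenizer behaves the same way):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("radlab/polish-qa-v2-bnb")

def fits_in_window(question: str, context: str, max_tokens: int = 512) -> bool:
    # The QA pipeline encodes question and context together, so count them jointly.
    encoded = tokenizer(question, context)
    return len(encoded["input_ids"]) <= max_tokens
```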
## How to Use
```python
from transformers import pipeline

model_path = "radlab/polish-qa-v2-bnb"

qa = pipeline(
    "question-answering",
    model=model_path,
)

question = "Co będzie w budowanym obiekcie?"
context = """Pozwolenie na budowę zostało wydane w marcu. Pierwsze prace przygotowawcze
na terenie przy ul. Wojska Polskiego już się rozpoczęły.
Działkę ogrodzono, pojawił się również monitoring, a także kontenery
dla pracowników budowy. Na ten moment nie jest znana lista sklepów,
które pojawią się w nowym pasażu handlowym."""

result = qa(
    question=question,
    context=context.replace("\n", " ")
)

print(result)
```
Sample output:

```json
{
  "score": 0.32568359375,
  "start": 259,
  "end": 268,
  "answer": "sklepów,"
}
```
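
For passages longer than the 512-token window, one option is the question-answering pipeline's built-in sliding-window handling via `max_seq_len` and `doc_stride`. A sketch reusing the `qa` pipeline from above; `long_context` is a hypothetical overlong passage and the window sizes are illustrative, not tuned:

```python
# Split an overlong context into overlapping windows and keep the best span.
result = qa(
    question=question,
    context=long_context,   # hypothetical passage longer than 512 tokens
    max_seq_len=384,        # tokens per window (question + context chunk)
    doc_stride=128,         # overlap between consecutive windows
)
print(result)
```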
## Technical Details
- Quantization strategy: BitsAndBytesStrategy (8-bit, double-quant, `qa_outputs` excluded).
- Loading code (for reference):
```python
from transformers import AutoConfig, BitsAndBytesConfig, AutoModelForQuestionAnswering

original_path = "radlab/polish-qa-v2"  # full-precision base model

config = AutoConfig.from_pretrained(original_path)

bnb_cfg = BitsAndBytesConfig(
    load_in_8bit=True,
    # Keep the QA head in full precision; llm_int8_skip_modules is the
    # bitsandbytes option that excludes modules from int8 quantization.
    llm_int8_skip_modules=["qa_outputs"],
    # Note: double quantization (bnb_4bit_use_double_quant) applies only to
    # 4-bit loading and has no 8-bit counterpart.
)

model = AutoModelForQuestionAnswering.from_pretrained(
    original_path,
    config=config,
    quantization_config=bnb_cfg,
    device_map="auto",
)
```
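
As a sanity check on the quantization, you can inspect which layers ended up in int8 and what the loaded model's memory footprint is. A small sketch, assuming bitsandbytes and a CUDA device are available and that the QA head is exposed as `qa_outputs` (as in RoBERTa-style QA models):

```python
import bitsandbytes as bnb

# Count linear layers that were converted to 8-bit.
int8_layers = [
    name for name, module in model.named_modules()
    if isinstance(module, bnb.nn.Linear8bitLt)
]
print(f"{len(int8_layers)} modules quantized to int8")

# The excluded QA head should remain a regular torch.nn.Linear.
print(type(model.qa_outputs), model.qa_outputs.weight.dtype)

# Approximate memory footprint of the loaded model, in MB.
print(f"{model.get_memory_footprint() / 1024**2:.1f} MB")
```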