How to use enelpol/czywiesz-context with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("feature-extraction", model="enelpol/czywiesz-context")

# Load model directly
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("enelpol/czywiesz-context")
model = AutoModel.from_pretrained("enelpol/czywiesz-context")

The model was created for selective question answering in Polish, that is, for finding passages that contain the answer to a given question.
This model encodes the contexts (passages) in the DPR bi-encoder architecture. The architecture requires two separate models: questions must be encoded with the corresponding question encoder, enelpol/czywiesz-question.
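To make the bi-encoder idea concrete, here is a minimal sketch of the scoring step only. It assumes, as in the original DPR setup, that relevance is the dot product between the question embedding and each passage embedding. The 3-dimensional vectors and the `dot` and `rank_passages` helpers are hypothetical stand-ins for illustration; real embeddings would come from the question and context encoders.

```python
from typing import List, Sequence, Tuple


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return sum(a * b for a, b in zip(u, v))


def rank_passages(
    question_vec: Sequence[float],
    passage_vecs: Sequence[Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[int, float]]:
    """Return (passage_index, score) pairs, highest score first."""
    scored = [(i, dot(question_vec, p)) for i, p in enumerate(passage_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy embeddings; real ones are produced by the two encoders.
question = [0.9, 0.1, 0.0]
passages = [
    [0.0, 1.0, 0.0],  # unrelated passage
    [1.0, 0.0, 0.0],  # closely related passage
    [0.5, 0.5, 0.0],  # partially related passage
]

ranking = rank_passages(question, passages, top_k=2)
print(ranking)
```

Because passages are embedded independently of any question, their vectors can be computed once and stored in an index, and only the question has to be encoded at query time.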
The model was created by fine-tuning HerBERT base cased on the "Czywiesz" dataset, which contains questions and articles extracted from the Polish Wikipedia.
The easiest way to use the model is with the Haystack framework:
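# Import paths as in Haystack 1.x
from haystack.document_stores import FAISSDocumentStore
from haystack.nodes import DensePassageRetriever

# "Flat" selects an exact (brute-force) FAISS index
document_store = FAISSDocumentStore(faiss_index_factory_str="Flat")

# The question and context encoders are used as a pair
retriever = DensePassageRetriever(
    document_store=document_store,
    query_embedding_model="enelpol/czywiesz-question",
    passage_embedding_model="enelpol/czywiesz-context",
)

# `documents` is assumed to be an iterable of Haystack documents prepared beforehand
for document in documents:
    document_store.write_documents([document])

# Embed the stored passages with the context encoder, then persist the index
document_store.update_embeddings(retriever)
document_store.save("contexts.faiss")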