update readme
README.md
CHANGED
|
@@ -1,3 +1,14 @@
| 1 |
# doc2query/all-with_prefix-t5-base-v1
|
| 2 |
|
| 3 |
|
|
@@ -9,8 +20,10 @@ model_name = 'doc2query/all-with_prefix-t5-base-v1'
|
|
| 9 |
tokenizer = T5Tokenizer.from_pretrained(model_name)
|
| 10 |
model = T5ForConditionalGeneration.from_pretrained(model_name)
|
| 11 |
|
| 12 |
-
prefix = "answer2question
|
| 13 |
-
text =
|
| 14 |
|
| 15 |
input_ids = tokenizer.encode(text, max_length=384, truncation=True, return_tensors='pt')
|
| 16 |
outputs = model.generate(
|
|
@@ -49,6 +62,7 @@ The datasets include besides others:
|
|
| 49 |
This model was trained **with prefixes**: you start the text with a specific prefix that defines what type of output text you would like to receive. Depending on the prefix, the output is different.
|
| 50 |
|
| 51 |
E.g. the above text about Python produces the following output:
|
| 52 |
| Prefix | Output |
|
| 53 |
| --- | --- |
|
| 54 |
| answer2question | Why should I use python in my business? ; What is the difference between Python and.NET? ; what is the python design philosophy? |
|
|
@@ -66,5 +80,7 @@ These are all available pre-fixes:
|
|
| 66 |
- text2query
|
| 67 |
- question2question
|
| 68 |
|
| 69 |
For the datasets and weights for the different prefixes see `data_config.json` in this repository.
|
| 70 |
|
|
|
|
| 1 |
+
---
|
| 2 |
+
language: en
|
| 3 |
+
datasets:
|
| 4 |
+
- sentence-transformers/reddit-title-body
|
| 5 |
+
- sentence-transformers/embedding-training-data
|
| 6 |
+
widget:
|
| 7 |
+
- text: "answer2question: Python is an interpreted, high-level and general-purpose programming language. Python's design philosophy emphasizes code readability with its notable use of significant whitespace. Its language constructs and object-oriented approach aim to help programmers write clear, logical code for small and large-scale projects."
|
| 8 |
+
|
| 9 |
+
license: apache-2.0
|
| 10 |
+
---
|
| 11 |
+
|
| 12 |
# doc2query/all-with_prefix-t5-base-v1
|
| 13 |
|
| 14 |
|
|
|
|
| 20 |
tokenizer = T5Tokenizer.from_pretrained(model_name)
|
| 21 |
model = T5ForConditionalGeneration.from_pretrained(model_name)
|
| 22 |
|
| 23 |
+
prefix = "answer2question"
|
| 24 |
+
text = "Python is an interpreted, high-level and general-purpose programming language. Python's design philosophy emphasizes code readability with its notable use of significant whitespace. Its language constructs and object-oriented approach aim to help programmers write clear, logical code for small and large-scale projects."
|
| 25 |
+
|
| 26 |
+
text = prefix+": "+text
|
| 27 |
|
| 28 |
input_ids = tokenizer.encode(text, max_length=384, truncation=True, return_tensors='pt')
|
| 29 |
outputs = model.generate(
|
|
|
|
| 62 |
This model was trained **with prefixes**: you start the text with a specific prefix that defines what type of output text you would like to receive. Depending on the prefix, the output is different.
|
| 63 |
|
| 64 |
E.g. the above text about Python produces the following output:
|
| 65 |
+
|
| 66 |
| Prefix | Output |
|
| 67 |
| --- | --- |
|
| 68 |
| answer2question | Why should I use python in my business? ; What is the difference between Python and.NET? ; what is the python design philosophy? |
|
|
|
|
| 80 |
- text2query
|
| 81 |
- question2question
|
| 82 |
|
| 83 |
+
|
| 84 |
+
|
| 85 |
For the datasets and weights for the different prefixes see `data_config.json` in this repository.
|
| 86 |
|
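The README snippet builds the model input by joining the prefix and the passage as `prefix+": "+text`. A minimal sketch of that step as a reusable helper is shown here. The function name `build_prefixed_input` is hypothetical and not part of the repository, and the whitespace and trailing-colon handling is an added assumption rather than something the README specifies.

```python
def build_prefixed_input(prefix: str, text: str) -> str:
    """Return '<prefix>: <text>', mirroring prefix+": "+text from the README.

    Hypothetical helper: trimming whitespace and a trailing colon on the
    prefix is an assumption made here for convenience.
    """
    cleaned_prefix = prefix.strip().rstrip(":").strip()
    if not cleaned_prefix:
        raise ValueError("prefix must be non-empty")
    return cleaned_prefix + ": " + text.strip()


if __name__ == "__main__":
    passage = (
        "Python is an interpreted, high-level and general-purpose "
        "programming language."
    )
    # Prefixes visible in this diff: answer2question, text2query,
    # question2question. The full list is in the README itself.
    for prefix in ("answer2question", "text2query", "question2question"):
        print(build_prefixed_input(prefix, passage))
```

The returned string can then be passed to `tokenizer.encode(...)` in place of `text` in the snippet above.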