Text Generation
Transformers
PyTorch
English
bart
text2text-generation
rag
question answering
retrieval augmented generation
Create README.md
README.md
ADDED
@@ -0,0 +1,91 @@
---
license: apache-2.0
language:
- en
pipeline_tag: question-answering
---
# Model Card for task-llm

This model supports abstractive QA tasks. Given a set of passages and a question, it tries to generate a comprehensive answer by reading the passages.

## Model Details

This model was intended to be a T5-style multi-task model, trained with BART to take advantage of its larger context length and better performance.
At the moment, the only task it supports is abstractive QA.

### Model Description

- **Developed by:** Ambika Sukla, Nlmatics Corp.
- **Model type:** Generative Language Model, Abstractive QA, QASum
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Finetuned from model:** BART

## Uses

This model supports abstractive QA tasks. Given a set of passages and a question, it tries to generate a comprehensive answer by reading the passages.

## Bias, Risks, and Limitations

This model was trained on a very simple dataset and will need further fine-tuning for your use cases.

### Recommendations

Fine-tune the model with your own data.

## How to Get Started with the Model

Use the following prompt:

`prompt = f"###Task: abstractive_qa \n###Question: {question} \n###Passages:{passage}"`

where **question** is your query
and **passage** is the concatenation of the passages to consider when answering the question.

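As a minimal sketch, the prompt could be assembled from several passages as shown below. The helper name `build_prompt` and the choice to join passages with a single space are assumptions for illustration; the model card only says the passages are concatenated.

```python
from typing import Sequence


def build_prompt(question: str, passages: Sequence[str]) -> str:
    """Fill the model card's prompt template with a question and its passages."""
    # Assumption: passages are joined with a single space; the card does not
    # specify a separator.
    passage = " ".join(p.strip() for p in passages)
    return f"###Task: abstractive_qa \n###Question: {question} \n###Passages:{passage}"


example = build_prompt(
    "what are the adverse reactions of Dimethylsulfoxide",
    ["Dimethylsulfoxide Adverse reactions: garlic taste in mouth, dry skin."],
)
print(example)
```
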
Use the code below to get started with the model:

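One way to run the checkpoint locally is with Hugging Face Transformers. The sketch below assumes the `ansukla/task-llm` repository ID and that the checkpoint loads as a seq2seq (BART) model through `AutoModelForSeq2SeqLM`. The helper names `load_model` and `generate_answer` and the `max_new_tokens` value are illustrative assumptions, not settings documented by the author.

```python
def load_model(model_name: str = "ansukla/task-llm"):
    """Load the tokenizer and seq2seq model (downloads the checkpoint on first use)."""
    # Imported lazily so the rest of this sketch can be read without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    return model, tokenizer


def generate_answer(model, tokenizer, prompt: str, max_new_tokens: int = 128) -> str:
    """Generate an answer for a prompt already formatted with the template above."""
    # Truncate inputs that exceed the tokenizer's maximum length.
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
    # max_new_tokens=128 is an assumed default; tune it for your answers.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example usage (requires `pip install transformers torch`):
# model, tokenizer = load_model()
# question = "what are the adverse reactions of Dimethylsulfoxide"
# passage = "Dimethylsulfoxide Adverse reactions Garlic taste in mouth, dry skin."
# prompt = f"###Task: abstractive_qa \n###Question: {question} \n###Passages:{passage}"
# print(generate_answer(model, tokenizer, prompt))
```
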
To run the model with nlm-model-service, install nlm-utils and call the service as follows:
```
pip install nlm-utils
```
```
# Import path is an assumption; check the nlm-utils documentation.
from nlm_utils.model_client import ClassificationClient

# Placeholder: URL of your running nlm-model-service instance.
v100Url = "<your nlm-model-service URL>"

qa_sum_client_bart = ClassificationClient(
    model="bart",
    task="qa_sum",
    url=v100Url,
    retry=1,
)
# nlm-model-service supports batch invocation, so you can send multiple question/passage pairs at a time.
questions = ["what are the adverse reactions of Dimethylsulfoxide"]
sentences = ["Dimethylsulfoxide Adverse reactions Garlic taste in mouth, dry skin, erythema and pruritis (2), urine discoloration, halitosis, agitation, hypotension, sedation and dizziness (13) have been reported following use of DMSO. Dimethylsulfoxide Adverse reactions: malaria and loose motion."]
qa_sum_client_bart(questions, sentences)
```

## Training Details

### Training Data

The base training data was taken from the MS MARCO Question Answering dataset, with more data added for certain usage scenarios:
https://github.com/microsoft/MSMARCO-Question-Answering

### Training Procedure

Coming soon.

#### Hardware

A T4, V100, or A100 GPU is recommended.

## Citation

MS MARCO: A Human Generated MAchine Reading COmprehension Dataset
https://arxiv.org/abs/1611.09268

BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
https://arxiv.org/abs/1910.13461

## Model Card Authors

Ambika Sukla

## Model Card Contact

ambika.sukla@nlmatics.com