Instructions to use Mohammed-Altaf/Medical-ChatBot with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use Mohammed-Altaf/Medical-ChatBot with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Mohammed-Altaf/Medical-ChatBot")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Mohammed-Altaf/Medical-ChatBot")
model = AutoModelForCausalLM.from_pretrained("Mohammed-Altaf/Medical-ChatBot")


vLLM

How to use Mohammed-Altaf/Medical-ChatBot with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Mohammed-Altaf/Medical-ChatBot"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Mohammed-Altaf/Medical-ChatBot",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
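
The same request can also be sent from Python. The following is a minimal sketch using only the standard library, assuming the server started above is listening on `http://localhost:8000` and returns the usual OpenAI-style completions response (`choices[0].text`). The helper names `build_request` and `complete` are hypothetical and not part of vLLM.

```python
import json
import urllib.request


def build_request(prompt: str, base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build the same POST request as the curl command above."""
    payload = {
        "model": "Mohammed-Altaf/Medical-ChatBot",
        "prompt": prompt,
        "max_tokens": 512,
        "temperature": 0.5,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt: str, base_url: str = "http://localhost:8000") -> str:
    """Send the request and return the text of the first completion.

    Assumption: the response follows the OpenAI completions schema.
    """
    with urllib.request.urlopen(build_request(prompt, base_url)) as response:
        body = json.load(response)
    return body["choices"][0]["text"]


# Example usage (requires the server to be running):
# print(complete("Once upon a time,"))
```

Passing a different `base_url`, such as `http://localhost:30000`, should point the same helper at a server listening on another port.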

Use Docker

docker model run hf.co/Mohammed-Altaf/Medical-ChatBot

SGLang

How to use Mohammed-Altaf/Medical-ChatBot with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Mohammed-Altaf/Medical-ChatBot" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Mohammed-Altaf/Medical-ChatBot",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Mohammed-Altaf/Medical-ChatBot" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Mohammed-Altaf/Medical-ChatBot",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use Mohammed-Altaf/Medical-ChatBot with Docker Model Runner:
```
docker model run hf.co/Mohammed-Altaf/Medical-ChatBot
```

Doubts

by visionop19 - opened Dec 25, 2023

Discussion

visionop19

Dec 25, 2023

Hey Altaf, I am Chandan Cr. I need to use your model in a hackathon and would like to connect with you. How can I reach you?

Mohammed-Altaf

Owner Dec 26, 2023

You can find the usage details on the model card itself. Use the quantized version of the model if you lack hardware resources.

visionop19

Dec 26, 2023

Sorry for the trouble. I am Chandan Cr from Bangalore, a second-year AI and ML student. I am new to this field and this is my first hackathon. Can you help me with how to use the quantized model?

Mohammed-Altaf

Owner Dec 26, 2023

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Quantized (8-bit) checkpoint; run on the GPU when one is available.
path = "Mohammed-Altaf/medical_chatbot-8bit"
device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = GPT2Tokenizer.from_pretrained(path)
model = GPT2LMHeadModel.from_pretrained(path).to(device)

# Prompt template; {input} is replaced by the user's question.
prompt_input = (
    "The conversation between human and AI assistant.\n"
    "[|Human|] {input}\n"
    "[|AI|]"
)
sentence = prompt_input.format_map({'input': "what is parkinson's disease?"})
inputs = tokenizer(sentence, return_tensors="pt").to(device)

# Generate with beam search, without tracking gradients, then decode the best beam.
with torch.no_grad():
    beam_output = model.generate(**inputs,
                                min_new_tokens=1,
                                max_length=512,
                                num_beams=3,
                                repetition_penalty=1.2,
                                early_stopping=True,
                                eos_token_id=198  # newline token in the standard GPT-2 vocabulary
                                )
    print(tokenizer.decode(beam_output[0], skip_special_tokens=True))

With the code above you can use the quantized model. To change the input, replace "what is parkinson's disease?" in the sentence variable with anything you want, or take the input from the user and pass it in there.
Convert the code into a function that returns the decoded value from the tokenizer rather than printing it;
that should solve your problem.
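
A minimal sketch of that suggestion, assuming the same checkpoint, prompt template, and generation settings as the snippet above. The function names `build_prompt`, `extract_reply`, `load_model`, and `ask` are hypothetical, and stripping the echoed prompt from the decoded text is an added assumption rather than something the post specifies.

```python
PROMPT_TEMPLATE = (
    "The conversation between human and AI assistant.\n"
    "[|Human|] {input}\n"
    "[|AI|]"
)
AI_TAG = "[|AI|]"


def build_prompt(question: str) -> str:
    """Fill the prompt template with the user's question."""
    return PROMPT_TEMPLATE.format_map({"input": question})


def extract_reply(decoded: str) -> str:
    """Return the text after the last [|AI|] tag.

    Assumption: the decoded output echoes the prompt before the answer.
    """
    return decoded.rsplit(AI_TAG, 1)[-1].strip()


def load_model(path: str = "Mohammed-Altaf/medical_chatbot-8bit"):
    """Load the tokenizer and model once so they can be reused across calls."""
    # Heavy imports stay local so the helpers above work without them.
    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    device = "cuda" if torch.cuda.is_available() else "cpu"
    tokenizer = GPT2Tokenizer.from_pretrained(path)
    model = GPT2LMHeadModel.from_pretrained(path).to(device)
    return tokenizer, model, device


def ask(question: str, tokenizer, model, device: str) -> str:
    """Generate an answer and return it instead of printing it."""
    import torch

    inputs = tokenizer(build_prompt(question), return_tensors="pt").to(device)
    with torch.no_grad():
        beam_output = model.generate(
            **inputs,
            min_new_tokens=1,
            max_length=512,
            num_beams=3,
            repetition_penalty=1.2,
            early_stopping=True,
            eos_token_id=198,
        )
    decoded = tokenizer.decode(beam_output[0], skip_special_tokens=True)
    return extract_reply(decoded)


# Example usage (downloads the model on first run):
# tokenizer, model, device = load_model()
# print(ask("what is parkinson's disease?", tokenizer, model, device))
```

Loading the model in a separate function keeps it out of the per-question path, so `ask` can be called repeatedly with user input without reloading the weights.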

visionop19

Jan 1, 2024

Is there any social media where I can contact you?

visionop19

Jan 3, 2024

I am getting an error while reading your JSON file. It fails on line 1 itself with the message "expecting a eof ','".
