Instructions for using BEncoderRT/medical_inference with libraries, inference providers, notebooks, and local apps. Use the sections below to get started.
- Libraries
- llama-cpp-python
How to use BEncoderRT/medical_inference with llama-cpp-python:
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="BEncoderRT/medical_inference",
    filename="unsloth.Q8_0.gguf",
)

llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use BEncoderRT/medical_inference with llama.cpp:
Install with Homebrew (macOS/Linux)

brew install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf BEncoderRT/medical_inference:Q8_0

# Run inference directly in the terminal:
llama-cli -hf BEncoderRT/medical_inference:Q8_0
Install with WinGet (Windows)

winget install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf BEncoderRT/medical_inference:Q8_0

# Run inference directly in the terminal:
llama-cli -hf BEncoderRT/medical_inference:Q8_0
Use pre-built binary
# Download a pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases

# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf BEncoderRT/medical_inference:Q8_0

# Run inference directly in the terminal:
./llama-cli -hf BEncoderRT/medical_inference:Q8_0
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli

# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf BEncoderRT/medical_inference:Q8_0

# Run inference directly in the terminal:
./build/bin/llama-cli -hf BEncoderRT/medical_inference:Q8_0
Use Docker
docker model run hf.co/BEncoderRT/medical_inference:Q8_0
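Once llama-server is running, any OpenAI-compatible client can talk to it. A minimal Python sketch, assuming the server is on its default port 8080 and the openai package is installed; the api_key value is a placeholder, since llama-server does not require one by default:

# pip install openai
from openai import OpenAI

# Point the client at the local llama-server (default address: http://localhost:8080).
client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="BEncoderRT/medical_inference",  # informational for a single-model server
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(response.choices[0].message.content)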
- LM Studio
- Jan
- vLLM
How to use BEncoderRT/medical_inference with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "BEncoderRT/medical_inference"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "BEncoderRT/medical_inference",
    "messages": [
      { "role": "user", "content": "What is the capital of France?" }
    ]
  }'

Use Docker
docker model run hf.co/BEncoderRT/medical_inference:Q8_0
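The curl call above can also be issued from Python. A short sketch mirroring the same request, assuming the vLLM server from the previous step is running on its default port 8000 and the requests package is installed:

# pip install requests
import requests

# vLLM exposes an OpenAI-compatible endpoint on port 8000 by default.
resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "BEncoderRT/medical_inference",
        "messages": [{"role": "user", "content": "What is the capital of France?"}],
    },
)
print(resp.json()["choices"][0]["message"]["content"])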
- Ollama
How to use BEncoderRT/medical_inference with Ollama:
ollama run hf.co/BEncoderRT/medical_inference:Q8_0
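After `ollama run` has pulled the model, it can also be queried programmatically through Ollama's REST API. A minimal sketch, assuming the Ollama daemon is listening on its default port 11434:

# pip install requests
import requests

# Ollama's chat endpoint; "stream": False returns a single JSON object instead of a stream.
resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "hf.co/BEncoderRT/medical_inference:Q8_0",
        "messages": [{"role": "user", "content": "What is the capital of France?"}],
        "stream": False,
    },
)
print(resp.json()["message"]["content"])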
- Unsloth Studio
How to use BEncoderRT/medical_inference with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh

# Run Unsloth Studio:
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser and
# search for BEncoderRT/medical_inference to start chatting.
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex

# Run Unsloth Studio:
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser and
# search for BEncoderRT/medical_inference to start chatting.
Use Hugging Face Spaces for Unsloth Studio

# No setup required.
# Open https://huggingface.co/spaces/unsloth/studio in your browser and
# search for BEncoderRT/medical_inference to start chatting.
- Docker Model Runner
How to use BEncoderRT/medical_inference with Docker Model Runner:
docker model run hf.co/BEncoderRT/medical_inference:Q8_0
- Lemonade
How to use BEncoderRT/medical_inference with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/
lemonade pull BEncoderRT/medical_inference:Q8_0
Run and chat with the model
lemonade run user.medical_inference-Q8_0
List all available models
lemonade list
Update README.md

README.md (the example section following def generate(model, prompt, max_new_tokens=200):): the previously empty example prompt is filled in, and a sample model response is appended.

# Example prompt
prompt = """### Instruction:
"A patient with acute appendicitis has been ill for five days. The abdominal pain has eased slightly, but the patient is still febrile, and physical examination finds a tender mass in the right lower abdomen. How should this be managed?"
### Response:
"""

print(generate(model, prompt))

Sample output:

<think>
This patient has had acute appendicitis for five days now. Although the abdominal pain has eased somewhat, he is still running a fever, which worries me a little. Looking at the exam findings, there is a tender mass in the right lower abdomen, which gives me even more reason for concern. Hmm, with a mass present and the fever persisting, this should be an acute manifestation of the appendicitis.

With that in mind, I think this needs to be dealt with right away. The best course is an immediate abdominal ultrasound to see exactly what the mass is. Ultrasound can show us the mass's location, size, and shape, and the state of the surrounding tissue, all of which matters greatly for treatment.

Oh, and if the mass is fluid, surgery may not be needed at all. Ultrasound can settle that point, sparing the patient unnecessary surgical trauma.

But if the mass is solid, especially if it is compressing the surrounding tissue, surgery becomes necessary. Whether to operate has to be decided from the specifics of the mass.

The mass's shape and location will also influence the choice of operation; the concrete surgical plan should be drawn up around these factors.

So, to sum up: an abdominal ultrasound must come first, to establish the type and details of the mass. Surgery should be considered only once the mass is confirmed to be solid and compressing the surrounding tissue.

Good, this line of reasoning seems sound. Now to turn it into concrete management steps.
</think>
In this situation, an abdominal ultrasound is recommended first, to determine the type and details of the mass. Based on the findings:

1. **If the mass is fluid**: surgery is usually unnecessary. Symptoms can be managed with medication, followed by observation of the patient's recovery.

2. **If the mass is solid and compressing the surrounding tissue**: surgery becomes necessary. Excision of the mass is usually required to relieve the compression and the symptoms.

Please arrange an abdominal ultrasound as soon as possible and plan the most appropriate surgical approach based on the findings.<|end▁of▁sentence|>
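The diff is anchored at a generate(model, prompt, max_new_tokens=200) helper defined earlier in the README, whose body is not part of this change. A minimal sketch of what such a helper typically looks like with transformers, offered only as an assumption; the loading code and tokenizer handling here are illustrative, not the README's actual implementation:

# Hypothetical reconstruction; the README's real generate() is not shown in this diff.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("BEncoderRT/medical_inference")
model = AutoModelForCausalLM.from_pretrained("BEncoderRT/medical_inference")

def generate(model, prompt, max_new_tokens=200):
    # Encode the "### Instruction: ... ### Response:" prompt.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens.
    return tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)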