Vickstester committed (verified) · Commit 2f97eee · Parent: 983c1f2

Upload README.md

Files changed (1): README.md (+158 −38)
---
language:
- en
license: cc-by-nc-4.0
tags:
- pharmacovigilance
- medical
- mistral
- qlora
- faers
- drug-safety
- adverse-events
base_model: mistralai/Mistral-7B-Instruct-v0.3
---

# pv-biomistral-7b

A pharmacovigilance-specialised language model fine-tuned from
[Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3)
on 100,000 FAERS-derived training examples across five structured PV tasks.

This is the community testing release. It contains only the Q4_K_M quantized
GGUF for local inference via Ollama or llama-cpp-python.

---

## ⚠️ Important Disclaimer

This model is a **research prototype** intended for pharmacovigilance
professionals to evaluate and provide feedback on. It is **not a validated
system** and must not be used for:

- Autonomous pharmacovigilance decision-making
- Generating or contributing to regulatory submissions
- Replacing qualified pharmacovigilance assessor judgment
- Clinical or safety-critical decisions of any kind

All model outputs require review by a qualified pharmacovigilance professional.
This tool is for exploratory and research purposes only.

---

## Model Details

| Property | Value |
|---|---|
| Base model | mistralai/Mistral-7B-Instruct-v0.3 |
| Fine-tuning method | QLoRA (4-bit NF4, LoRA r=16) |
| Training records | 100,000 |
| Training epochs | 3 |
| Data source | FAERS public database (FDA) |
| Quantization | Q4_K_M (GGUF) |
| Model size | 4.37 GB |
| Context window | 8192 tokens |
| Framework | TRL 1.0.0, Transformers, PEFT |

## Setup — Ollama (Recommended)

### Requirements

- [Ollama](https://ollama.com/download) installed
- ~5 GB free disk space
- 8 GB RAM minimum, 16 GB recommended
- GPU optional but recommended for faster inference

### Installation

**Step 1 — Download both files from this repository:**

- `pv-biomistral-7b-Q4_K_M.gguf` (4.37 GB)
- `Modelfile`

Place both in the same folder.

**Step 2 — Create the Ollama model**

```bash
cd /path/to/downloaded/files
ollama create pv-mistral-v2 -f Modelfile
```

**Step 3 — Run**

```bash
ollama run pv-mistral-v2
```

**Windows users:** Use the full path, e.g. `cd C:\Users\YourName\Downloads\pv-model\`
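If you have only the GGUF file, a Modelfile for a Mistral-instruct model can also be written by hand. The sketch below is illustrative only — the parameter values and system prompt are assumptions and are not guaranteed to match the `Modelfile` shipped in this repository, so prefer the downloaded copy when available:

```
FROM ./pv-biomistral-7b-Q4_K_M.gguf
PARAMETER temperature 0.1
PARAMETER num_ctx 8192
SYSTEM """You are a pharmacovigilance assistant. All outputs are decision-support only and require review by a qualified PV professional."""
```

`FROM` points at the local GGUF, and `ollama create` packages it together with the parameters and system prompt into a named local model.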
 
---

## Setup — llama-cpp-python (Alternative)

```bash
pip install "llama-cpp-python[server]"

python -m llama_cpp.server \
  --model pv-biomistral-7b-Q4_K_M.gguf \
  --chat_format mistral-instruct \
  --n_gpu_layers -1 \
  --n_ctx 8192
```

Then open `http://localhost:8000/docs` for the Swagger UI.
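The server exposes an OpenAI-compatible `/v1/chat/completions` route, so any OpenAI-style client works. A minimal stdlib-only sketch — the example case text, temperature, and token budget are illustrative assumptions, not values from this repository:

```python
import json
import urllib.request

def build_chat_request(case_text: str, max_tokens: int = 512) -> dict:
    """Build an OpenAI-style chat payload for the local llama-cpp-python server."""
    return {
        "messages": [
            {"role": "user", "content": case_text},
        ],
        "temperature": 0.1,  # low temperature for more deterministic structured output
        "max_tokens": max_tokens,
    }

payload = build_chat_request(
    "71F on amoxicillin developed a rash that resolved on dechallenge. Assess causality."
)

# Uncomment to send the request once the server is running:
# req = urllib.request.Request(
#     "http://localhost:8000/v1/chat/completions",
#     data=json.dumps(payload).encode("utf-8"),
#     headers={"Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["choices"][0]["message"]["content"])
```

As with every output of this model, the returned assessment is a decision-support signal to be reviewed by a qualified PV professional.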
 
---

## Setup — Jan App (Windows/Mac)

1. Download [Jan](https://jan.ai)
2. Import Model → select the GGUF file
3. Set temperature to 0.1 in chat settings
4. Add the system prompt from the Modelfile `SYSTEM` field

---

## Expected Performance by Hardware

| Hardware | Speed | Response Time |
|---|---|---|
| Mac Mini M4 / Apple Silicon | 25-35 tokens/sec | 2-5 sec/case |
| Windows + NVIDIA GPU (8GB+ VRAM) | 25-40 tokens/sec | 2-4 sec/case |
| Snapdragon X Elite (16GB) | 8-15 tokens/sec | 5-12 sec/case |
| Windows CPU only (16-24GB RAM) | 3-6 tokens/sec | 15-30 sec/case |

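The response-time column is essentially generated tokens divided by decode speed. A quick sanity-check sketch, where the ~100-token answer length is an assumption for a typical structured single-case output:

```python
def response_time_sec(tokens: int, tokens_per_sec: float) -> float:
    """Rough per-case latency estimate: generated tokens / decode speed."""
    return tokens / tokens_per_sec

# ~100-token answer on Apple Silicon at ~30 tok/s vs CPU-only at ~4 tok/s
apple = response_time_sec(100, 30.0)  # ~3.3 s
cpu = response_time_sec(100, 4.0)     # 25.0 s
print(f"{apple:.1f}s vs {cpu:.1f}s")
```

Both figures fall inside the ranges in the table above; prompt-processing time (reading the input case) adds a little on top, especially on CPU.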
---

## Known Limitations

- **Probable causality underrepresented:** Training data contained only 70 Probable
  causality examples out of 100,000 records, reflecting real-world FAERS spontaneous
  reporting patterns. The model may default to Possible even for cases with confirmed
  positive dechallenge and no confounders.

- **Spontaneous reports only:** Trained exclusively on FAERS spontaneous adverse
  event reports. Performance on clinical trial safety data, EHR-derived cases,
  or non-English source material is untested.

- **Not formally validated:** The model has not been validated against any regulatory
  standard, including ICH E2D, ICH E2A, or WHO-UMC guidelines.

- **Short-context optimised:** Designed for single-case inputs under 512 tokens.

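To stay inside the ~512-token sweet spot, you can pre-check case length before sending it. This is a crude heuristic — the ~1.3 tokens-per-word ratio is a rule of thumb for English text, and the actual Mistral tokenizer will count somewhat differently:

```python
def approx_tokens(text: str) -> int:
    """Very rough token estimate: whitespace word count x ~1.3 (English heuristic)."""
    return int(len(text.split()) * 1.3)

case = "Patient experienced severe rash after starting drug X. " * 20
fits = approx_tokens(case) < 512
print(approx_tokens(case), fits)
```

For an exact count, tokenize with the base model's tokenizer instead of this heuristic.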
---

## CIOMS WG XIV Alignment

This model is designed to operate within a Human-in-the-Loop (HITL) framework
consistent with CIOMS Working Group XIV recommendations for AI in drug safety.
All outputs are decision-support signals requiring human adjudication by a
qualified pharmacovigilance professional.

---

## Feedback

This is a community testing release. Please evaluate the model on real cases
from your practice area and share your findings. We are particularly interested in:

- Causality outputs where you would classify Probable
- Cases with unusual drug combinations or rare reactions
- Narrative quality from a safety-database entry perspective
- Therapeutic areas where performance appears weaker

---

## Training Data

Trained on 100,000 cases from the FDA Adverse Event Reporting System (FAERS),
accessed via public database export. No proprietary, confidential, or
patient-identifiable data beyond what is publicly available in FAERS was used.

---

## License

- Base model (Mistral-7B-Instruct-v0.3): Apache 2.0
- Fine-tuned weights: CC BY-NC 4.0 (non-commercial research use only)

By downloading this model you agree to use it for research purposes only
and not for any commercial application or regulatory submission.