|
|
--- |
|
|
license: apache-2.0 |
|
|
language: |
|
|
- en |
|
|
tags: |
|
|
- gguf |
|
|
- ollama |
|
|
- fda |
|
|
- regulatory |
|
|
- task-extraction |
|
|
- llama |
|
|
datasets: |
|
|
- fda-documents |
|
|
pipeline_tag: text-generation |
|
|
model_type: llama |
|
|
quantization: Q8_0 |
|
|
--- |
|
|
|
|
|
# FDA Task Classifier - GGUF |
|
|
|
|
|
A specialized language model fine-tuned for extracting regulatory tasks from FDA correspondence documents. |
|
|
|
|
|
## Model Details |
|
|
|
|
|
- **Model Type:** LlamaForCausalLM |
|
|
- **Parameters:** 361.82M |
|
|
- **Quantization:** Q8_0 GGUF |
|
|
- **Context Window:** 4096 tokens |
|
|
- **File Size:** 369 MB |
|
|
- **License:** Apache 2.0 |
|
|
|
|
|
## Quick Start with Ollama |
|
|
|
|
|
The easiest way to use this model is with [Ollama](https://ollama.com): |
|
|
|
|
|
```bash |
|
|
# Pull the Modelfile from this repo |
|
|
wget https://huggingface.co/llama-farm/fda-task-classifier-gguf/raw/main/Modelfile |
|
|
|
|
|
# Create the model in Ollama |
|
|
ollama create fda-task-classifier -f Modelfile |
|
|
|
|
|
# Run the model |
|
|
ollama run fda-task-classifier |
|
|
``` |
|
|
|
|
|
### Or download manually: |
|
|
|
|
|
```bash |
|
|
# Download the GGUF file |
|
|
wget https://huggingface.co/llama-farm/fda-task-classifier-gguf/resolve/main/model.gguf |
|
|
|
|
|
# Create a Modelfile |
|
|
cat > Modelfile << 'EOF' |
|
|
FROM ./model.gguf |
|
|
|
|
|
PARAMETER temperature 0.3 |
|
|
PARAMETER top_p 0.9 |
|
|
PARAMETER top_k 40 |
|
|
PARAMETER num_ctx 4096 |
|
|
PARAMETER num_predict 512 |
|
|
|
|
|
SYSTEM """You are an FDA regulatory task extraction specialist. Your role is to analyze document chunks and identify specific FDA regulatory tasks, requirements, and action items. |
|
|
|
|
|
When analyzing text, focus on: |
|
|
- Regulatory submissions and deadlines |
|
|
- Clinical trial requirements |
|
|
- Manufacturing and quality control tasks |
|
|
- Compliance and reporting obligations |
|
|
- Safety monitoring requirements |
|
|
- Documentation and record-keeping tasks |
|
|
|
|
|
Extract tasks in a structured format with: |
|
|
- Task description |
|
|
- Regulatory category (e.g., clinical, manufacturing, compliance) |
|
|
- Priority level if mentioned |
|
|
- Deadline if specified |
|
|
- Relevant FDA regulation references |
|
|
|
|
|
Be precise and factual. Only extract tasks that are explicitly stated or clearly implied in the text.""" |
|
|
EOF |
|
|
|
|
|
# Create model in Ollama |
|
|
ollama create fda-task-classifier -f Modelfile |
|
|
``` |
|
|
|
|
|
## Usage Examples |
|
|
|
|
|
### Simple Task Extraction |
|
|
|
|
|
```bash |
|
|
ollama run fda-task-classifier "Extract all FDA regulatory tasks from this text: |
|
|
|
|
|
The sponsor must submit a complete Chemistry, Manufacturing, and Controls (CMC) |
|
|
section as part of the IND application within 30 days of this notice. Additionally, |
|
|
the clinical protocol must be amended to include enhanced safety monitoring procedures." |
|
|
``` |
|
|
|
|
|
**Output:** |
|
|
``` |
|
|
1. Submit complete CMC section within 30 days |
|
|
Category: Manufacturing/Submission |
|
|
Priority: Critical |
|
|
Deadline: 30 days from notice |
|
|
|
|
|
2. Amend clinical protocol to include enhanced safety monitoring |
|
|
Category: Clinical/Safety |
|
|
Priority: High |
|
|
``` |
|
|
|
|
|
### API Usage |
|
|
|
|
|
```python |
|
|
import requests |
|
|
|
|
|
response = requests.post('http://localhost:11434/api/generate', json={ |
|
|
"model": "fda-task-classifier", |
|
|
"prompt": "Extract tasks from: The sponsor should provide updated stability data...", |
|
|
"stream": False |
|
|
}) |
|
|
|
|
|
print(response.json()['response']) |
|
|
``` |
|
|
|
|
|
## Model Specialization |
|
|
|
|
|
This model is specifically trained to identify: |
|
|
|
|
|
✅ **Submission Requirements** |
|
|
- IND/NDA submissions |
|
|
- Supplemental applications |
|
|
- Annual reports |
|
|
|
|
|
✅ **Clinical Trial Directives** |
|
|
- Protocol amendments |
|
|
- Safety monitoring |
|
|
- Patient enrollment criteria |
|
|
|
|
|
✅ **Manufacturing Tasks** |
|
|
- CMC requirements |
|
|
- Quality control procedures |
|
|
- GMP compliance |
|
|
|
|
|
✅ **Regulatory Compliance** |
|
|
- 21 CFR citations |
|
|
- Inspection responses |
|
|
- CAPA plans |
|
|
|
|
|
✅ **Safety Obligations** |
|
|
- Adverse event reporting |
|
|
- REMS requirements |
|
|
- Risk assessments |
|
|
|
|
|
## Integration with LlamaFarm |
|
|
|
|
|
This model is designed to work seamlessly with [LlamaFarm](https://github.com/llama-farm/llamafarm): |
|
|
|
|
|
```yaml |
|
|
# llamafarm.yaml |
|
|
runtime: |
|
|
models: |
|
|
- name: fda-task-classifier |
|
|
provider: ollama |
|
|
model: fda-task-classifier |
|
|
base_url: http://localhost:11434/v1 |
|
|
|
|
|
agents: |
|
|
- name: fda_document_analyzer |
|
|
type: document_analyzer |
|
|
model: fda-task-classifier |
|
|
description: Extracts FDA regulatory tasks from documents |
|
|
``` |
|
|
|
|
|
## Performance |
|
|
|
|
|
- **Speed:** ~2-3 seconds per document chunk on M1 Mac |
|
|
- **Accuracy:** Optimized for FDA regulatory language |
|
|
- **Context:** 4096 tokens (sufficient for most FDA letter sections) |
|
|
- **Memory:** ~500MB RAM usage |
|
|
|
|
|
## Files in This Repository |
|
|
|
|
|
- `model.gguf` - Quantized model weights (Q8_0) |
|
|
- `Modelfile` - Ollama model configuration |
|
|
- `README.md` - Original documentation |
|
|
- `USAGE.md` - Detailed usage examples |
|
|
- `model_info.json` - Model metadata |
|
|
|
|
|
## Technical Details |
|
|
|
|
|
**Architecture:** LlamaForCausalLM |
|
|
**Quantization:** Q8_0 (8-bit quantization) |
|
|
**Base Model:** [Undisclosed] |
|
|
**Training Data:** FDA correspondence, deficiency letters, meeting minutes |
|
|
|
|
|
**Recommended Parameters:** |
|
|
- `temperature: 0.3` - More deterministic outputs |
|
|
- `top_p: 0.9` - Focused sampling |
|
|
- `num_ctx: 4096` - Optimized context window |
|
|
- `num_predict: 512` - Concise task lists |
|
|
|
|
|
## Use Cases |
|
|
|
|
|
1. **Regulatory Document Processing** |
|
|
- Extract action items from FDA deficiency letters |
|
|
- Identify compliance obligations |
|
|
- Track submission deadlines |
|
|
|
|
|
2. **Quality Assurance** |
|
|
- Parse inspection observations (483s) |
|
|
- Extract CAPA requirements |
|
|
- Identify GMP violations |
|
|
|
|
|
3. **Clinical Operations** |
|
|
- Extract protocol amendment requirements |
|
|
- Identify safety reporting obligations |
|
|
- Track clinical trial milestones |
|
|
|
|
|
4. **Automated Compliance** |
|
|
- Build task tracking systems |
|
|
- Create regulatory calendars |
|
|
- Generate compliance reports |
|
|
|
|
|
## Limitations |
|
|
|
|
|
- Optimized for FDA documents (US regulatory text) |
|
|
- May not generalize well to other regulatory bodies (EMA, PMDA) |
|
|
- Works best with formal regulatory correspondence |
|
|
- Limited to English language |
|
|
|
|
|
## Citation |
|
|
|
|
|
If you use this model in your research or application, please cite: |
|
|
|
|
|
```bibtex |
|
|
@software{fda_task_classifier_2025, |
|
|
title={FDA Task Classifier GGUF}, |
|
|
author={LlamaFarm Team}, |
|
|
year={2025}, |
|
|
url={https://huggingface.co/llama-farm/fda-task-classifier-gguf} |
|
|
} |
|
|
``` |
|
|
|
|
|
## License |
|
|
|
|
|
Apache 2.0 - See LICENSE file for details |
|
|
|
|
|
## Links |
|
|
|
|
|
- **LlamaFarm:** https://github.com/llama-farm/llamafarm |
|
|
- **Ollama:** https://ollama.com |
|
|
- **Issues:** https://github.com/llama-farm/llamafarm/issues |
|
|
- **Discord:** https://discord.gg/RrAUXTCVNF |
|
|
|