Named Entity Recognition (NER) Agents Guide
Overview
The Pub/Sub Multi-Agent System now includes specialized NER (Named Entity Recognition) agents powered by HuggingFace Transformers. These agents use pre-trained BERT models to extract medical entities from text and work differently from regular LLM agents.
Technical Implementation
NER agents use the HuggingFace transformers library:
```python
from transformers import pipeline

# Load the clinical NER model (downloaded from HuggingFace on first use)
ner_pipeline = pipeline(
    "ner",
    model="samrawal/bert-base-uncased_clinical-ner",
    aggregation_strategy="simple",
)

# Process text
entities = ner_pipeline("Patient has diabetes")
```
Key differences from LLM agents:
- Use transformers pipelines, not Ollama
- Models are downloaded on first use from HuggingFace
- Processing is deterministic (no temperature/sampling)
- Faster inference than LLM-based extraction
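Note that with `aggregation_strategy="simple"`, the raw transformers output uses the keys `entity_group` and `word`, while the JSON shown later in this guide uses `text` and `entity_type` — suggesting the system remaps the fields. A minimal sketch of that remapping (the sample entity values are illustrative, and the exact remapping logic is an assumption):

```python
# Raw output shape from a transformers NER pipeline with
# aggregation_strategy="simple" (values here are illustrative)
raw = [
    {"entity_group": "PROBLEM", "score": 0.9987,
     "word": "diabetes", "start": 12, "end": 20}
]

# Remap to the JSON schema used in this guide ("text"/"entity_type")
remapped = [
    {"text": e["word"], "entity_type": e["entity_group"],
     "start": e["start"], "end": e["end"], "score": e["score"]}
    for e in raw
]
```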
Available NER Models
1. Clinical NER Model
Model: samrawal/bert-base-uncased_clinical-ner
Purpose: Extract clinical entities from medical text
Recognized Entity Types:
- PROBLEM: Diseases, conditions, symptoms
- TREATMENT: Medications, procedures, therapies
- TEST: Diagnostic tests, lab results
- OCCURRENCE: Medical events, admissions
Best for:
- Clinical notes
- Patient reports
- Medical records
- Symptom descriptions
2. Anatomy Detection Model
Model: OpenMed/OpenMed-NER-AnatomyDetect-BioPatient-108M
Purpose: Detect anatomical structures and patient information
Recognized Entity Types:
- ANATOMY: Body parts, organs, anatomical structures
- PATIENT: Patient demographics, identifiers
- BIOMARKER: Biological markers
- CLINICAL_FINDING: Clinical observations
Best for:
- Anatomical descriptions
- Radiology reports
- Surgical notes
- Physical examination records
How NER Agents Work
Processing Flow
NER agents work the same way as LLM agents regarding prompt processing, but differ in what they do with the rendered prompt:
1. Prompt Rendering (Same as LLM agents):
Agent Prompt: "Patient information: {PatientNote}"
Rendered: "Patient information: Patient has diabetes and takes metformin"
2. Processing Difference:
LLM Agents:
- Rendered prompt → Send to Ollama → Generate response
NER Agents:
- Rendered prompt → IS the text to analyze → Extract entities
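The shared rendering step can be sketched as a simple placeholder substitution (the function name and substitution scheme are illustrative assumptions, not the system's actual implementation):

```python
def render_prompt(template: str, sources: dict) -> str:
    # Replace each {Label} placeholder with its data source text
    for label, text in sources.items():
        template = template.replace("{" + label + "}", text)
    return template

rendered = render_prompt(
    "Patient information: {PatientNote}",
    {"PatientNote": "Patient has diabetes and takes metformin"},
)

# An LLM agent would send `rendered` to Ollama for generation;
# an NER agent instead runs the pipeline directly on it:
# entities = ner_pipeline(rendered)
print(rendered)
# → Patient information: Patient has diabetes and takes metformin
```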
Example
Agent Configuration:
Title: Clinical Entity Extractor
Model: samrawal/bert-base-uncased_clinical-ner
Prompt: Clinical note:
{PatientNote}
Extract entities from the note above.
What happens:
1. System renders the prompt → replaces `{PatientNote}` with the actual text
2. Rendered text: "Clinical note:\nPatient has diabetes\n\nExtract entities from the note above."
3. NER processes the entire rendered text (not just the data source)
4. Entities found: "diabetes" as PROBLEM
Key Point: The prompt template itself becomes part of the analyzed text!
Special Behavior
- Unified Prompt System: NER and LLM agents use the same prompt rendering
- Text Analysis: The rendered prompt is the text NER analyzes
- Dual Output:
- JSON result (for chaining to other agents)
- Formatted display (for human reading)
- Dedicated Display: NER Result box shows entities inline with text
Design Philosophy
Why NER agents use rendered prompts as analysis text:
- Consistency: All agents render prompts the same way
- Flexibility: Can combine multiple data sources in prompt
- Context: Can add instructions or context around the text
- Composability: Text from previous agents can be analyzed
Example Use Cases:
Use Case 1: Direct data source
Prompt: {PatientNote}
→ Analyzes just the patient note
Use Case 2: With context
Prompt: Medical History: {History}
Current Symptoms: {Symptoms}
→ Analyzes both sections with labels
Use Case 3: From previous agent
Prompt: {input}
→ Analyzes output from previous agent
Use Case 4: Combined
Prompt: Patient: {question}
Previous Analysis: {input}
→ Analyzes both user question and previous results
Using NER Agents
Basic Setup
Agent Configuration:
Title: Clinical Entity Extractor
Model: samrawal/bert-base-uncased_clinical-ner
Prompt: {PatientNote}
Subscribe Topic: TEXT_TO_ANALYZE
Publish Topic: ENTITIES_FOUND
☑ Show result in Final Result box
What happens:
1. Agent receives a message from the `TEXT_TO_ANALYZE` topic
2. Renders the prompt: `{PatientNote}` → the actual patient note text
3. Runs the NER pipeline on the rendered text
4. Extracts entities automatically
5. Publishes JSON to the `ENTITIES_FOUND` topic
6. Shows JSON in the Final Result box
7. Shows formatted text in the NER Result box
Important: The entire rendered prompt is analyzed, not just individual placeholders.
Output Format
JSON Output (in Final Result box):
```json
[
  {
    "text": "diabetes",
    "entity_type": "PROBLEM",
    "start": 45,
    "end": 53,
    "score": 0.9987
  },
  {
    "text": "metformin",
    "entity_type": "TREATMENT",
    "start": 78,
    "end": 87,
    "score": 0.9923
  }
]
```
Formatted Output (in NER Result box):
Patient reports history of [diabetes:PROBLEM] and is taking [metformin:TREATMENT].
Note: The score field (0.0-1.0) indicates the model's confidence in the entity classification.
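The formatted display can be reconstructed from the JSON entity list using the start/end offsets. A sketch of such a helper (the function name is hypothetical; the system's actual display code may differ):

```python
def format_inline(text: str, entities: list) -> str:
    # Rebuild the text with [entity:TYPE] markup inserted at each span
    out, pos = [], 0
    for ent in sorted(entities, key=lambda e: e["start"]):
        out.append(text[pos:ent["start"]])
        out.append("[" + ent["text"] + ":" + ent["entity_type"] + "]")
        pos = ent["end"]
    out.append(text[pos:])
    return "".join(out)

text = "Patient has diabetes and takes metformin."
ents = [
    {"text": "diabetes", "entity_type": "PROBLEM", "start": 12, "end": 20},
    {"text": "metformin", "entity_type": "TREATMENT", "start": 31, "end": 40},
]
print(format_inline(text, ents))
# → Patient has [diabetes:PROBLEM] and takes [metformin:TREATMENT].
```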
Example Workflows
Example 1: Clinical Note Analysis
Data Source:
- Label: `ClinicalNote`
- Content:
Patient presents with chest pain and shortness of breath.
History of hypertension and diabetes mellitus type 2.
Currently taking lisinopril 10mg daily and metformin 500mg twice daily.
ECG shows ST elevation. Troponin levels elevated at 0.5 ng/mL.
Agents:
Agent 1: Clinical NER
- Title: `Extract Clinical Entities`
- Model: `samrawal/bert-base-uncased_clinical-ner`
- Subscribe: `START`
- Publish: `CLINICAL_ENTITIES`
- Prompt: `{ClinicalNote}` (text to analyze)
- ☑ Show result
Agent 2: Entity Summarizer
- Title: `Summarize Findings`
- Model: `phi4-mini`
- Subscribe: `CLINICAL_ENTITIES`
- Publish: (empty)
- Prompt:
Based on these extracted entities:
{input}
Summarize the key clinical findings:
1. Problems identified
2. Treatments mentioned
3. Tests performed
- ☑ Show result
Expected Results:
NER Result box:
Patient presents with [chest pain:PROBLEM] and [shortness of breath:PROBLEM].
History of [hypertension:PROBLEM] and [diabetes mellitus type 2:PROBLEM].
Currently taking [lisinopril:TREATMENT] 10mg daily and [metformin:TREATMENT] 500mg twice daily.
[ECG:TEST] shows ST elevation. [Troponin:TEST] levels elevated at 0.5 ng/mL.
Final Result box:
--- Extract Clinical Entities ---
[{"text": "chest pain", "entity_type": "PROBLEM", ...}, ...]
--- Summarize Findings ---
Key Clinical Findings:
1. Problems: chest pain, shortness of breath, hypertension, diabetes
2. Treatments: lisinopril, metformin
3. Tests: ECG, Troponin
Example 2: Anatomy Detection in Radiology Report
User Question: "Analyze this radiology report"
Data Source:
- Label: `RadiologyReport`
- Content:
CT scan of the chest reveals mass in right upper lobe measuring 3.2 cm.
No evidence of mediastinal lymphadenopathy.
Heart size is normal. Lungs are clear bilaterally.
Liver and spleen appear unremarkable.
Agent Configuration:
Agent 1: Anatomy Detector
- Title: `Detect Anatomical Structures`
- Model: `OpenMed/OpenMed-NER-AnatomyDetect-BioPatient-108M`
- Subscribe: `START`
- Publish: `ANATOMY_FOUND`
- Prompt: `{RadiologyReport}`
- ☑ Show result
Expected NER Result:
CT scan of the [chest:ANATOMY] reveals mass in [right upper lobe:ANATOMY] measuring 3.2 cm.
No evidence of [mediastinal:ANATOMY] lymphadenopathy.
[Heart:ANATOMY] size is normal. [Lungs:ANATOMY] are clear bilaterally.
[Liver:ANATOMY] and [spleen:ANATOMY] appear unremarkable.
Example 3: Multi-Stage Medical Analysis
Workflow: Extract entities → Categorize → Generate report
Agent 1: Entity Extraction
- Model: `samrawal/bert-base-uncased_clinical-ner`
- Subscribe: `START`
- Publish: `ENTITIES`
- ☑ Show result
Agent 2: Entity Categorization
- Model: `phi4-mini`
- Subscribe: `ENTITIES`
- Publish: `CATEGORIZED`
- Prompt:
Categorize these medical entities by type:
{input}
Group by: Problems, Treatments, Tests
- ☑ Show result
Agent 3: Report Generator
- Model: `MedAIBase/MedGemma1.5:4b`
- Subscribe: `CATEGORIZED`
- Publish: (empty)
- Prompt:
Generate a structured clinical summary based on:
{input}
Include assessment and plan.
- ☑ Show result
NER Result Display Features
Inline Entity Markup
Entities are displayed inline with brackets and labels:
[entity text:ENTITY_TYPE]
Color Coding (Future Enhancement)
Different entity types could be color-coded:
- Problems: Red
- Treatments: Blue
- Tests: Green
- Anatomy: Purple
Entity Statistics (Future Enhancement)
Could show count of each entity type found.
Best Practices
1. Choosing the Right NER Model
Use Clinical NER for:
- General clinical text
- Patient complaints
- Medical history
- Treatment plans
Use Anatomy NER for:
- Radiology reports
- Surgical notes
- Physical examination
- Anatomical descriptions
2. Crafting Effective Prompts
Simple (Direct Analysis):
Prompt: {PatientNote}
Analyzes just the patient note.
With Structure:
Prompt: Chief Complaint: {Complaint}
Medical History: {History}
Current Medications: {Medications}
Analyzes all sections with clear labels.
With Context:
Prompt: Patient Case Summary:
{input}
The above text contains medical information.
Adds context (though NER will analyze the entire text).
From Previous Agent:
Prompt: {input}
Analyzes output from previous agent in chain.
3. Combining NER with Other Agents
Pattern: Extract → Analyze → Report
NER Agent → Regular LLM → Medical LLM
Example:
- NER extracts entities from clinical note
- phi4-mini categorizes entities by type
- MedGemma generates clinical assessment
4. Understanding What Gets Analyzed
Remember: The ENTIRE rendered prompt is analyzed.
Example:
Prompt: Patient has {condition} and takes {medication}.
If {condition} = "diabetes" and {medication} = "metformin":
Rendered text analyzed:
Patient has diabetes and takes metformin.
Entities found: "diabetes" (PROBLEM), "metformin" (TREATMENT)
The sentence structure matters! NER sees the full context.
Limitations
Current Limitations
- No Instruction Following: NER agents render prompt templates, but the result is treated purely as text to analyze — instructions in the prompt are not interpreted
- Fixed Entity Types: Each model has predefined entity types
- English Only: Models trained on English medical text
- Context Window: BERT-based models accept limited input (typically 512 tokens)
Workarounds
For Long Texts:
- Split into chunks
- Process separately
- Combine results
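The split-process-combine workaround can be sketched as follows (the chunk size is an illustrative assumption; a real implementation would split on sentence boundaries rather than fixed character counts):

```python
def chunk_text(text: str, max_chars: int = 1500):
    # Yield (offset, chunk) pairs covering the full text
    for start in range(0, len(text), max_chars):
        yield start, text[start:start + max_chars]

def ner_long_text(text, ner_pipeline, max_chars: int = 1500):
    entities = []
    for offset, chunk in chunk_text(text, max_chars):
        for ent in ner_pipeline(chunk):
            # Map chunk-relative offsets back into the original text
            ent["start"] += offset
            ent["end"] += offset
            entities.append(ent)
    return entities
```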
For Custom Entities:
- Use regular LLM with custom prompt
- Post-process NER output with another agent
Troubleshooting
Issue: No entities detected
Causes:
- Text doesn't contain medical terms
- Wrong NER model for the content type
- Text too short or too long
Solutions:
- Verify text contains medical content
- Try different NER model
- Check text length
Issue: Entities in wrong category
Cause: Model misclassification
Solution: Use post-processing agent to reclassify
Issue: NER Result box empty
Causes:
- "Show result" not checked
- Agent failed to execute
- No entities found
Solutions:
- Check "Show result" checkbox
- Review Execution Log for errors
- Verify input text
Advanced Usage
Combining Multiple NER Models
Run both NER models on same text:
Agent 1: Clinical NER
Subscribe: START
Publish: CLINICAL_ENTITIES
Agent 2: Anatomy NER
Subscribe: START
Publish: ANATOMY_ENTITIES
Agent 3: Merge Results
Subscribe: CLINICAL_ENTITIES, ANATOMY_ENTITIES
Combine both outputs
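The merge agent's combination step could be sketched like this (the overlap policy — keeping the higher-scoring entity when spans collide — is an illustrative choice, not the system's documented behavior):

```python
def merge_entities(a: list, b: list) -> list:
    merged = []
    # Sort by position; among entities starting at the same offset,
    # higher score comes first so it survives the overlap check below
    for ent in sorted(a + b, key=lambda e: (e["start"], -e.get("score", 0.0))):
        if merged and ent["start"] < merged[-1]["end"]:
            continue  # overlaps an already-kept entity; drop it
        merged.append(ent)
    return merged
```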
Entity Validation
Add validation agent after NER:
Agent 1: NER Extraction
Model: Clinical NER
Publish: RAW_ENTITIES
Agent 2: Entity Validator
Model: MedGemma
Subscribe: RAW_ENTITIES
Validate medical accuracy
Publish: VALIDATED_ENTITIES
Future Enhancements
Planned features:
- Color-coded entity display
- Entity statistics dashboard
- Confidence score display in the formatted output (scores already appear in the JSON)
- Custom entity types
- Multi-language support
- Entity linking (to medical ontologies)
- Batch processing