datasets:
  - leduckhai/VietMed-Sum
language:
  - vi
library_name: transformers
pipeline_tag: audio-text-to-text
license: mit

Real-time Speech Summarization for Medical Conversations

Interspeech 2024 (Oral)

Khai Le-Duc*, Khai-Nguyen Nguyen*, Long Vo-Dang, Truong-Son Hy

*Equal contribution

Description

This model, presented in the paper Real-time Speech Summarization for Medical Conversations, summarizes Vietnamese medical dialogues. It can be paired with an ASR system to summarize a conversation in real time.

In doctor-patient conversations, identifying medically relevant information is crucial, which motivates conversation summarization. The paper proposes a deployable real-time speech summarization system for industry applications: it generates a local summary after every N speech utterances within a conversation and a global summary once the conversation ends. According to the paper, this design improves user experience and reduces computational cost.

The work also introduces VietMed-Sum, a speech summarization dataset for medical conversations, describes a methodology for creating gold-standard and synthetic summaries using LLMs and human annotators, and reports baseline results of state-of-the-art models on VietMed-Sum.

All code, data (English-translated and Vietnamese), and models are available at: https://github.com/leduckhai/MultiMed/tree/master/VietMed-Sum
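The local/global schedule described above can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the class name `RollingSummarizer`, its methods, and the choice to build the global summary from the full transcript are all assumptions, and the `summarize` callable stands in for whatever summarization model is used.

```python
from typing import Callable, List, Optional


class RollingSummarizer:
    """Hypothetical helper: local summary every n utterances, global at the end."""

    def __init__(self, summarize: Callable[[str], str], n: int = 3):
        if n < 1:
            raise ValueError("n must be at least 1")
        self.summarize = summarize          # any text -> summary function
        self.n = n                          # utterances per local summary
        self.utterances: List[str] = []
        self.local_summaries: List[str] = []

    def add_utterance(self, text: str) -> Optional[str]:
        """Store one ASR utterance; return a local summary on every n-th call."""
        self.utterances.append(text)
        if len(self.utterances) % self.n == 0:
            # Summarize only the most recent window of n utterances.
            window = " ".join(self.utterances[-self.n:])
            local = self.summarize(window)
            self.local_summaries.append(local)
            return local
        return None

    def finish(self) -> str:
        """Return a global summary once the conversation has ended.

        Assumption: the global summary is computed over the full transcript.
        """
        return self.summarize(" ".join(self.utterances))
```

In practice, `summarize` could wrap the model from this card, and `add_utterance` would be called each time the ASR system emits a finished utterance.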

Model Details

Model Description

This model summarizes Vietnamese medical dialogues. It can be paired with an ASR system to produce real-time dialogue summaries.

Developed by: Khai-Nguyen Nguyen
Language(s) (NLP): Vietnamese
Finetuned from model: ViT5

How to Get Started with the Model

Install the prerequisite Python package.

pip install transformers

Use the code below to get started with the model.

from transformers import pipeline

# Initialize the pipeline with the ViT5 model; device='cuda' requires a CUDA-capable GPU (use device='cpu' otherwise)
pipe = pipeline("text2text-generation", model="monishsystem/medisum_vit5", device='cuda')

# Example text in Vietnamese describing a traditional medicine product
example = "Loại thuốc này chứa các thành phần đông y đặc biệt tốt cho sức khoẻ, giúp tăng cường sinh lý và bổ thận tráng dương, đặc biệt tốt cho người cao tuổi và người có bệnh lý nền"

# Generate a summary of the input text, limited to 50 newly generated tokens
summary = pipe(example, max_new_tokens=50)

# The pipeline returns a list with one dict per input; print the generated text
print(summary[0]["generated_text"])
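The example above hard-codes a CUDA device, which fails on machines without a GPU. A minimal sketch of a fallback follows; the helper name `pick_device` is hypothetical, and it assumes PyTorch is the backend in use.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # assumption: PyTorch is the installed backend
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)
```

The result can be passed to the pipeline call as `device=device` in place of the fixed `'cuda'` string.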