Commit History

fix app.py to remove redundant generate_kwargs in transcription calls and enable demo sharing during launch
3e64dd3

michaeltangz committed on

fix app.py to enable demo sharing during launch
dd7a42e

michaeltangz committed on

refactor app.py to update demo launch logic; ensure proper execution within main guard
a3d95dd

michaeltangz committed on

fix app.py to correct dtype parameter usage in model initialization and pipeline; remove redundant torch_dtype argument
574825e

michaeltangz committed on

fix app.py to correct variable name from dtype to torch_dtype in model and pipeline initialization
6cd2d8c

michaeltangz committed on

fix app.py to correct variable name for dtype in model and pipeline initialization
1a404ef

michaeltangz committed on

refactor app.py to remove flash attention installation logic and simplify attention implementation; enhance error handling in transcription functions
721ab04

michaeltangz committed on

refactor requirements.txt to remove build isolation for flash attention installation
20eeccd

michaeltangz committed on

refactor requirements.txt to ensure flash attention installation without build isolation
44864ba

michaeltangz committed on

refactor requirements.txt to add gradio and numpy dependencies for enhanced functionality
28675ec

michaeltangz committed on

refactor app.py to move flash attention installation into a try-except block; ensure proper fallback to sdpa if installation fails
078e579

michaeltangz committed on

refactor app.py to streamline flash attention installation and model loading; remove fallback mechanisms and enhance transcription parameters
8f2a46b

michaeltangz committed on

refactor app.py to remove condition_on_previous_text parameter from transcription functions; enhance output consistency
f8af19e

michaeltangz committed on

refactor app.py to enhance pipeline configuration; adjust chunk length for better context and simplify demo interface
31502e6

michaeltangz committed on

refactor app.py to correct parameter name for model loading; streamline microphone input handling
d64755c

michaeltangz committed on

refactor app.py to improve flash attention installation and model loading; add fallback mechanism for model loading
47751fa

michaeltangz committed on

refactor app.py to streamline flash attention installation and model loading; enhance voice activity detection and transcription accuracy
62fccb4

michaeltangz committed on

refactor app.py to enhance transcription pipeline by adding condition_on_previous_text parameter; improve transcription accuracy
ae149f3

michaeltangz committed on

refactor app.py to specify language parameter in transcription pipeline; enhance accuracy for English transcriptions
aebccdb

michaeltangz committed on

refactor app.py to update model loading parameters for flash attention; improve code clarity and maintainability
26bcc10

michaeltangz committed on

refactor app.py to remove concurrency limit from audio stream; improve streaming performance
a6b63fd

michaeltangz committed on

refactor app.py to remove unnecessary time limit from audio stream; enhance streaming functionality
35686e6

michaeltangz committed on

refactor app.py to improve flash-attn installation and model loading with error handling; update demo launch condition
fce6c08

michaeltangz committed on

refactor app.py to streamline flash-attn installation and model loading; update requirements.txt to remove unnecessary dependencies
6fdae11

michaeltangz committed on

refactor app.py to improve model loading with flash attention and enhance streaming transcription; add packages.txt for dependencies
7b34cad

michaeltangz committed on

refactor requirements.txt to remove duplicate entries and ensure proper package versions
aa1d99c

michaeltangz committed on

add flash-attn to requirements for improved attention mechanisms
d12338a

michaeltangz committed on

install Flash Attention 2 and optimize Whisper model loading; enhance streaming transcription with pipeline approach and latency tracking
a9a8aec

michaeltangz committed on

update demo launch settings for Spaces to share=False by default
03bd1f9

michaeltangz committed on

adjust max_new_tokens in transcription functions to leave room for special tokens
e542369

michaeltangz committed on

enhance audio transcription accuracy with librosa resampling and voice activity detection; increase buffer duration for better context retention
2afefc3

michaeltangz committed on

refactor streaming transcription to only emit complete sentences and clear buffer after capture, removing incremental text accumulation logic
e5caa82

mic3333 committed on

add incremental text accumulation to prevent duplicate transcription of buffered audio
846c0d7

mic3333 committed on

simplify streaming transcription by removing VAD, diarization, and complex buffering logic
55d67f9

mic3333 committed on

add speaker diarization with pyannote and optimize streaming pipeline with static caching and deque-based buffering
1f8fa97

mic3333 committed on

increase buffer duration and adjust transcription parameters to prioritize accuracy over latency
52f6b26

mic3333 committed on

reduce buffer duration and adjust transcription parameters for lower latency
8493d7f

mic3333 committed on

improve sentence splitting to avoid breaking on common abbreviations like Mr., Ms., Dr., etc.
649d6e6

mic3333 committed on

remove live subtitle display and only show full transcript in the UI
41e2ea2

mic3333 committed on

improve near-duplicate detection to compare against all existing sentences instead of only the last one
5db4d4b

mic3333 committed on

add near-duplicate detection to suppress streaming variants of the same sentence
67ce003

mic3333 committed on

update live subtitle to persist last sentence when no incomplete audio
8d1f680

mic3333 committed on

Merge branch 'asr-1'
56ee70e

mic3333 committed on

refactor to use per-session state instead of global variables
8a63ea7

mic3333 committed on

update the instruction (copy) for ui
bd1f625

mic3333 committed on

update the instruction (copy) for ui
7b46207

mic3333 committed on

edit the instruction for ui
af9867e

mic3333 committed on

update the new words modification
f091709

mic3333 committed on

optimize the words for better display on ui
c98c665

mic3333 committed on

change to small whisper model
4b1ceed

mic3333 committed on