asr / app.py

Commit History

fix app.py to remove redundant generate_kwargs in transcription calls and enable demo sharing during launch
3e64dd3

michaeltangz committed on

fix app.py to enable demo sharing during launch
dd7a42e

michaeltangz committed on

refactor app.py to update demo launch logic; ensure proper execution within main guard
a3d95dd

michaeltangz committed on

fix app.py to correct dtype parameter usage in model initialization and pipeline; remove redundant torch_dtype argument
574825e

michaeltangz committed on

fix app.py to correct variable name from dtype to torch_dtype in model and pipeline initialization
6cd2d8c

michaeltangz committed on

fix app.py to correct variable name for dtype in model and pipeline initialization
1a404ef

michaeltangz committed on

refactor app.py to remove flash attention installation logic and simplify attention implementation; enhance error handling in transcription functions
721ab04

michaeltangz committed on

refactor app.py to move flash attention installation into a try-except block; ensure proper fallback to sdpa if installation fails
078e579

michaeltangz committed on

refactor app.py to streamline flash attention installation and model loading; remove fallback mechanisms and enhance transcription parameters
8f2a46b

michaeltangz committed on

refactor app.py to remove condition_on_previous_text parameter from transcription functions; enhance output consistency
f8af19e

michaeltangz committed on

refactor app.py to enhance pipeline configuration; adjust chunk length for better context and simplify demo interface
31502e6

michaeltangz committed on

refactor app.py to correct parameter name for model loading; streamline microphone input handling
d64755c

michaeltangz committed on

refactor app.py to improve flash attention installation and model loading; add fallback mechanism for model loading
47751fa

michaeltangz committed on

refactor app.py to streamline flash attention installation and model loading; enhance voice activity detection and transcription accuracy
62fccb4

michaeltangz committed on

refactor app.py to enhance transcription pipeline by adding condition_on_previous_text parameter; improve transcription accuracy
ae149f3

michaeltangz committed on

refactor app.py to specify language parameter in transcription pipeline; enhance accuracy for English transcriptions
aebccdb

michaeltangz committed on

refactor app.py to update model loading parameters for flash attention; improve code clarity and maintainability
26bcc10

michaeltangz committed on

refactor app.py to remove concurrency limit from audio stream; improve streaming performance
a6b63fd

michaeltangz committed on

refactor app.py to remove unnecessary time limit from audio stream; enhance streaming functionality
35686e6

michaeltangz committed on

refactor app.py to improve flash-attn installation and model loading with error handling; update demo launch condition
fce6c08

michaeltangz committed on

refactor app.py to streamline flash-attn installation and model loading; update requirements.txt to remove unnecessary dependencies
6fdae11

michaeltangz committed on

refactor app.py to improve model loading with flash attention and enhance streaming transcription; add packages.txt for dependencies
7b34cad

michaeltangz committed on

install Flash Attention 2 and optimize Whisper model loading; enhance streaming transcription with pipeline approach and latency tracking
a9a8aec

michaeltangz committed on

update demo launch settings for Spaces to share=False by default
03bd1f9

michaeltangz committed on

adjust max_new_tokens in transcription functions to leave room for special tokens
e542369

michaeltangz committed on

enhance audio transcription accuracy with librosa resampling and voice activity detection; increase buffer duration for better context retention
2afefc3

michaeltangz committed on

refactor streaming transcription to only emit complete sentences and clear buffer after capture, removing incremental text accumulation logic
e5caa82

mic3333 committed on

add incremental text accumulation to prevent duplicate transcription of buffered audio
846c0d7

mic3333 committed on

simplify streaming transcription by removing VAD, diarization, and complex buffering logic
55d67f9

mic3333 committed on

add speaker diarization with pyannote and optimize streaming pipeline with static caching and deque-based buffering
1f8fa97

mic3333 committed on

increase buffer duration and adjust transcription parameters to prioritize accuracy over latency
52f6b26

mic3333 committed on

reduce buffer duration and adjust transcription parameters for lower latency
8493d7f

mic3333 committed on

improve sentence splitting to avoid breaking on common abbreviations like Mr., Ms., Dr., etc.
649d6e6

mic3333 committed on

remove live subtitle display and only show full transcript in the UI
41e2ea2

mic3333 committed on

improve near-duplicate detection to compare against all existing sentences instead of only the last one
5db4d4b

mic3333 committed on

add near-duplicate detection to suppress streaming variants of the same sentence
67ce003

mic3333 committed on

update live subtitle to persist last sentence when no incomplete audio
8d1f680

mic3333 committed on

refactor to use per-session state instead of global variables
8a63ea7

mic3333 committed on

update the instruction (copy) for the UI
bd1f625

mic3333 committed on

fix the transcription-focused config
bb22cc6

mic3333 committed on

fix duplication issue - 2
8dbd3ce

mic3333 committed on

fix duplication issue
17e0529

mic3333 committed on

fix further bugs
d6a23f4

mic3333 committed on

fix bugs
cd9ed00

mic3333 committed on

fix buffer_duration issue
5348723

mic3333 committed on

full transcription
e038230

mic3333 committed on

simpler version
87ebad7

mic3333 committed on

update to word by word version
47b3415

mic3333 committed on

optimize the repetition
c5cb0f3

mic3333 committed on

update the title
1b06df3

mic3333 committed on