Commit History

fix(tts): remove tokenizer_tts from app.state
431435a

CrazyMonkey0 committed on

feat(tts): switch TTS model from mms-tts-eng to Kokoro-82M
d6cde26

CrazyMonkey0 committed on

feat(nlp): change nlp model to Qwen/Qwen2.5-1.5B-Instruct
4d18a16

CrazyMonkey0 committed on

feat(nlp): change nlp model to microsoft/Phi-3.5-mini-instruct
f854f33

CrazyMonkey0 committed on

feat(asr): add local ASR endpoint using faster-whisper with async lock
0ee88ba

CrazyMonkey0 committed on

fix(asr): fix loading model
0d30253

CrazyMonkey0 committed on

feat(asr): replace Whisper HF with faster-whisper for CPU-friendly transcription
c84acef

CrazyMonkey0 committed on

feat(api): add sending generated audio to external backend
20a7446

CrazyMonkey0 committed on

test(tts): checking whether the tts model is working correctly
65952f6

CrazyMonkey0 committed on

feat(nlp): switch NLP model to Qwen2.5-0.5B-Instruct
df63d34

CrazyMonkey0 committed on

fix(nlp): remove option n_threads=os.cpu_count() from load_model_nlp
0bdf4f1

CrazyMonkey0 committed on

feat(nlp): change version to qwen2.5-1.5b-instruct-q3_k_m.gguf
dc74289

CrazyMonkey0 committed on

test(nlp): comment part of tts in nlp.py
018ff86

CrazyMonkey0 committed on

fix: resolve TypeError in TTS audio generation and optimize model performance
e288dcc

CrazyMonkey0 committed on

fix(tts): swapping the model and tokenizer in the return function of load_models_tts
4784a54

CrazyMonkey0 committed on

fix: optimize FastAPI + Qwen2.5-1.5B for CPU, reduce max_tokens, increase timeout
88284a4

CrazyMonkey0 committed on

feat(nlp): switch Qwen2.5 model to 1.5B GGUF q6_k version
5d51d0f

CrazyMonkey0 committed on

fix(asr): load audio from in-memory buffer instead of disk
9ea2744

CrazyMonkey0 committed on

fix(chat): use llm() directly instead of create_chat_completion
3ad9eac

CrazyMonkey0 committed on

feat(chat): return NLP response with in-memory TTS audio
245cf59

CrazyMonkey0 committed on

feat(tts): migrate Kokoro TTS to Hugging Face facebook/mms-tts-eng with in-memory optimization
2a3f624

CrazyMonkey0 committed on

fix(nlp): Adding chat_handler for handling the Qwen2.5-3B-Instruct-GGUF model in llama-cpp-python
5f3ceca

CrazyMonkey0 committed on

fix(nlp): add ',' to fix an error in response generation
2d6bfd5

CrazyMonkey0 committed on

perf: implement lazy loading to fix startup timeouts
8f110eb

CrazyMonkey0 committed on

fix: resolve model loading and state management issues
bf1dc5f

CrazyMonkey0 committed on

refactor(chat): migrate from transformers to llama-cpp-python using Qwen 3B
6151d5f

CrazyMonkey0 committed on

feat(nlp): optimize NLP model for CPU
d5d8ff1

CrazyMonkey0 committed on

feat(nlp): Optimize CPU usage for Hugging Face Spaces Free Tier
75451ba

CrazyMonkey0 committed on

feat(nlp): reintroduce Qwen2.5-1.5B-Instruct model and migrate back to Transformers
94cf754

CrazyMonkey0 committed on

feat(llama): another attempt to integrate llama-cpp with the Qwen3-8B-Q4_K_M.gguf model
9bb78b3

CrazyMonkey0 committed on

test(models): downloading models from transformers
fe8b413

CrazyMonkey0 committed on

fix(nlp): update Llama loading to use from_pretrained()
f7ec4f4

CrazyMonkey0 committed on

fix(nlp): fix NLP model download and build
89865a6

CrazyMonkey0 committed on

feat(nlp): add llama.cpp support for Qwen3-8B-Q5_K_M.gguf and download models
b2565e9

CrazyMonkey0 committed on

Initial APP
7eb3110

CrazyMonkey0 committed on