Refactor text generation in chatbot application to utilize max_new_tokens for improved clarity and added truncation support. Removed unnecessary print statement for generated answer length.
Refactor response handling in chatbot application to support optional system messages and improve message processing. Added checks for empty responses to enhance user experience.
Update response generation in chatbot application to set default values for max_tokens, temperature, and top_p parameters. This enhancement ensures smoother operation when these parameters are not explicitly provided.
Streamline model loading and response generation in chatbot application by utilizing a text generation pipeline. Removed legacy loading methods and improved response handling for enhanced performance and clarity.
Refactor model loading process in chatbot application to prioritize local path loading, with enhanced error handling and fallback mechanisms for HuggingFace models and PEFT adapters.
Enhance model loading logic in chatbot application to support direct loading and PEFT adapter fallback. Updated model and tokenizer initialization for improved error handling and device management.
Refactor model loading and input handling in chatbot application. Updated model and tokenizer initialization, improved device management for inputs, and removed unused sliders from the Gradio interface.