Enhance model loading process in EndpointHandler

- Added detailed logging for Python and PyTorch versions, and CUDA availability.
- Implemented multiple approaches for loading the model, including AutoModel, LlavaForConditionalGeneration, and a fallback to pipeline and manual loading with tokenizer.
- Improved error handling and status reporting for model loading failures.
- Updated generation logic to handle cases where no model components are available.
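
For reviewers, the fallback chain described in this commit can be pictured roughly as below. This is a minimal sketch, not the handler's actual code: the names `log_environment`, `load_with_fallbacks`, and `generate`, the logger name, and the shape of the returned dictionaries are all hypothetical, and the loaders are passed in as plain callables so the sketch runs without any model weights.

```python
import logging
import sys

logger = logging.getLogger("endpoint_handler")  # hypothetical logger name


def log_environment():
    """Log the Python version and, if PyTorch imports, its version and CUDA status."""
    logger.info("Python version: %s", sys.version)
    try:
        import torch
    except ImportError:
        logger.warning("PyTorch is not installed")
        return
    logger.info("PyTorch version: %s", torch.__version__)
    logger.info("CUDA available: %s", torch.cuda.is_available())


def load_with_fallbacks(loaders):
    """Try each (name, loader) pair in order.

    Returns (name, model, errors): the name and result of the first loader
    that succeeds, plus a dict of error messages from the ones that failed.
    If every loader fails, name and model are both None.
    """
    errors = {}
    for name, loader in loaders:
        try:
            model = loader()
        except Exception as exc:  # record the failure and move to the next approach
            errors[name] = f"{type(exc).__name__}: {exc}"
            logger.warning("Loading via %s failed: %s", name, exc)
            continue
        logger.info("Loaded model via %s", name)
        return name, model, errors
    return None, None, errors


def generate(model, prompt):
    """Return generated text, or an error payload when no model was loaded."""
    if model is None:
        return {"error": "No model components are available"}
    return {"generated_text": model(prompt)}
```

In the handler itself, the loader callables would presumably wrap calls such as `AutoModel.from_pretrained(path)`, `LlavaForConditionalGeneration.from_pretrained(path)`, and `pipeline(...)` from `transformers`, in that order. The collected `errors` dict is one way to surface the per-approach failure status this commit mentions.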