DocUA
feat: Implement CUDA BF16 error handling with automatic fallback to CPU for model inference and generation.
4f43939
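
The fallback named in the commit message can be illustrated with a minimal sketch. This is not the code from the commit: `run_with_cpu_fallback`, `is_cuda_bf16_error`, and the marker strings are hypothetical names chosen for illustration, and the sketch assumes the failure surfaces as a `RuntimeError` whose message mentions CUDA or BF16. The `step` callable stands in for whatever loads the model or runs generation on a given device.

```python
from typing import Callable, TypeVar

T = TypeVar("T")

# Assumed substrings for spotting CUDA/BF16 failures; real messages
# vary by driver and library version, so treat this list as a guess.
CUDA_ERROR_MARKERS = ("cuda", "bfloat16", "bf16")


def is_cuda_bf16_error(exc: BaseException) -> bool:
    """Return True if the exception message looks like a CUDA or BF16 failure."""
    message = str(exc).lower()
    return any(marker in message for marker in CUDA_ERROR_MARKERS)


def run_with_cpu_fallback(step: Callable[[str], T], primary_device: str = "cuda") -> T:
    """Run step(primary_device); on a CUDA/BF16 RuntimeError, retry once on CPU."""
    try:
        return step(primary_device)
    except RuntimeError as exc:
        # Unrelated errors, or errors already on CPU, are re-raised unchanged.
        if primary_device == "cpu" or not is_cuda_bf16_error(exc):
            raise
        return step("cpu")


def fake_generate(device: str) -> str:
    """Stand-in for inference: fails on 'cuda', succeeds on 'cpu'."""
    if device == "cuda":
        raise RuntimeError("CUDA error: operation not supported for BFloat16")
    return f"generated on {device}"


if __name__ == "__main__":
    print(run_with_cpu_fallback(fake_generate))
```

In a real PyTorch setup, `step` would typically move the model and inputs to the given device and, on CPU, cast them to a dtype such as float32 before running. Retrying only once and re-raising unrelated errors keeps the fallback from hiding genuine bugs.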