feat: Implement CUDA BF16 error handling with automatic fallback to CPU for model inference and generation. 4f43939 DocUA committed on Jan 31
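A GPU-to-CPU fallback like the one this commit describes is usually a try/except around the generation call. A minimal sketch (the function names and the simulated failure are hypothetical, not taken from the repo):

```python
import warnings


def generate_with_fallback(gpu_generate, cpu_generate):
    """Try the CUDA/BF16 path first; on a runtime failure (e.g. an
    unsupported-dtype error), retry the same generation on CPU."""
    try:
        return gpu_generate()
    except RuntimeError as exc:
        warnings.warn(f"CUDA/BF16 generation failed ({exc}); falling back to CPU")
        return cpu_generate()


# Stand-in callables simulating a BF16 failure on the GPU path:
def fake_gpu_generate():
    raise RuntimeError("BF16 not supported")  # simulated CUDA error


print(generate_with_fallback(fake_gpu_generate, lambda: "cpu result"))  # → cpu result
```

In the real app the two callables would both invoke `model.generate(...)`, once with the model on `cuda` in `torch.bfloat16` and once after moving it to `cpu`.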
feat: Automatically set generation config's `pad_token_id` and `eos_token_id` from the tokenizer and suppress Hugging Face logging warnings. d54528e DocUA committed on Jan 30
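The pad/eos wiring this commit describes typically copies ids from the tokenizer into the generation config, reusing `eos_token_id` when no pad token exists. A sketch with stand-in objects (the helper name and the `SimpleNamespace` stand-ins are hypothetical, not the app's actual code):

```python
from types import SimpleNamespace


def sync_generation_config(gen_config, tokenizer):
    """Copy pad/eos token ids from the tokenizer into the generation
    config so generate() neither warns nor pads with the wrong id."""
    if gen_config.pad_token_id is None:
        # Many decoder-only tokenizers define no pad token; reuse eos.
        gen_config.pad_token_id = (
            tokenizer.pad_token_id
            if tokenizer.pad_token_id is not None
            else tokenizer.eos_token_id
        )
    if gen_config.eos_token_id is None:
        gen_config.eos_token_id = tokenizer.eos_token_id
    return gen_config


# Stand-in mirroring a tokenizer without a pad token:
tok = SimpleNamespace(pad_token_id=None, eos_token_id=2)
cfg = sync_generation_config(
    SimpleNamespace(pad_token_id=None, eos_token_id=None), tok
)
print(cfg.pad_token_id, cfg.eos_token_id)  # → 2 2
```

With real `transformers` objects the same function works on `model.generation_config` and the loaded tokenizer.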
fix: Ensure proper `pad_token_id` configuration and `attention_mask` generation for the DeepSeek OCR model. 4505c9a DocUA committed on Jan 30
feat: Improve Hugging Face cache management and enable mixed-precision inference for GPU models. 9efb9c8 DocUA committed on Jan 30
fix: Add a fallback definition for `is_torch_fx_available` in `transformers.utils.import_utils`. 3537ca8 DocUA committed on Jan 30
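Such a fallback definition is typically installed only when the attribute is missing, so newer `transformers` versions that still export the symbol are left untouched. A hedged sketch of the pattern, demonstrated on a synthetic module rather than on `transformers.utils.import_utils` itself (the helper name is hypothetical):

```python
import types


def ensure_fallback(module, name, fallback):
    """Install `fallback` on `module` only when the attribute is
    missing, so every transformers version exposes the same symbol."""
    if not hasattr(module, name):
        setattr(module, name, fallback)
    return getattr(module, name)


# Synthetic stand-in for transformers.utils.import_utils:
import_utils = types.ModuleType("import_utils")
fn = ensure_fallback(import_utils, "is_torch_fx_available", lambda: False)
print(fn())  # → False
```

Against the real library the first argument would be `transformers.utils.import_utils`, and the fallback would simply report that torch.fx support is unavailable.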
feat: Add LlamaFlashAttention2 compatibility alias and eager attention implementation for model loading. e0b7657 DocUA committed on Jan 30
refactor: Reorder imports and the `spaces` dummy decorator definition to the top of the file. 6379065 DocUA committed on Jan 30
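A dummy `spaces` decorator of the kind this commit moves to the top of the file usually looks like the sketch below: on Hugging Face ZeroGPU hardware the real `spaces` package is imported, and locally a no-op stand-in keeps `@spaces.GPU` working. The stand-in class body here is an assumption about the app's code, not a copy of it:

```python
try:
    import spaces  # real package on Hugging Face ZeroGPU Spaces
except ImportError:
    class spaces:  # local no-op stand-in so @spaces.GPU still parses
        @staticmethod
        def GPU(func=None, **kwargs):
            # Support both @spaces.GPU and @spaces.GPU(duration=60).
            if func is None:
                return lambda f: f
            return func


@spaces.GPU
def run_inference(x):
    return x + 1


print(run_inference(1))  # → 2
```

Defining the stand-in before any decorated function is why the commit hoists it to the top of the file: Python evaluates decorators at definition time, so `spaces` must already exist.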
refactor: Simplify library imports and remove verbose version checks in `app_hf.py`. cd23458 DocUA committed on Jan 30
refactor: Harden dependency imports with try-except blocks and add version logging for Gradio and Hugging Face Hub. 439d893 DocUA committed on Jan 30
fix: Address MPS compatibility issues, set an explicit model dtype, and improve Gradio file input handling. 092c902 DocUA committed on Jan 30
Implement warning suppression, ensure pad token ID for generation, enable deterministic sampling, refine Gradio UI CSS and clear functionality, and add `.env` to .gitignore. c3371d2 DocUA committed on Jan 28
Initial commit: DeepSeek-OCR-2 & MedGemma-1.5 multimodal analysis app with ZeroGPU support b752d16 DocUA committed on Jan 28