feat: Implement CUDA BF16 error handling with automatic fallback to CPU for model inference and generation. 4f43939 DocUA committed 2 days ago
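The fallback behavior this commit describes can be sketched as a try/except wrapper around generation: attempt the GPU path first and retry on CPU only for CUDA/bfloat16 runtime failures. This is an illustrative pattern, not the app's actual code; `generate_fn` and its `device` keyword are hypothetical stand-ins.

```python
def generate_with_fallback(generate_fn, *args, **kwargs):
    """Try GPU inference first; on a CUDA/bf16 runtime error, retry on CPU.

    `generate_fn` is a hypothetical callable accepting a `device` keyword.
    """
    try:
        return generate_fn(*args, device="cuda", **kwargs)
    except RuntimeError as err:
        # Only fall back for CUDA/bfloat16 failures; re-raise anything else.
        if "CUDA" in str(err) or "BFloat16" in str(err):
            return generate_fn(*args, device="cpu", **kwargs)
        raise

# Usage with a stand-in generator that fails on the GPU path:
def fake_generate(prompt, device):
    if device == "cuda":
        raise RuntimeError("CUDA error: no kernel image is available")
    return f"[{device}] {prompt}"

generate_with_fallback(fake_generate, "hello")  # -> "[cpu] hello"
```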
feat: Automatically set generation config's `pad_token_id` and `eos_token_id` from the tokenizer and suppress Hugging Face logging warnings. d54528e DocUA committed 3 days ago
fix: Ensure proper `pad_token_id` configuration and `attention_mask` generation for DeepSeek OCR model. 4505c9a DocUA committed 3 days ago
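The two token-handling fixes above boil down to a common Hugging Face pattern: default `pad_token_id` to the tokenizer's EOS token when no pad token exists, and mark padding positions with 0 in the attention mask. A minimal sketch using plain values and lists in place of the real tokenizer and tensors (function names are illustrative):

```python
def resolve_pad_token_id(pad_token_id, eos_token_id):
    # Many tokenizers ship without a pad token; the common HF
    # convention is to reuse the EOS token id for padding.
    return pad_token_id if pad_token_id is not None else eos_token_id

def build_attention_mask(input_ids, pad_token_id):
    # 1 = attend to this position, 0 = padding to ignore.
    return [0 if tok == pad_token_id else 1 for tok in input_ids]

pad_id = resolve_pad_token_id(None, eos_token_id=2)  # -> 2
build_attention_mask([15, 27, 2, 2], pad_id)         # -> [1, 1, 0, 0]
```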
feat: Improve Hugging Face cache management and enable mixed-precision inference for GPU models. 9efb9c8 DocUA committed 3 days ago
fix: Add a fallback definition for `is_torch_fx_available` in `transformers.utils.import_utils`. 3537ca8 DocUA committed 3 days ago
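The compatibility shim this commit describes follows a general pattern: patch a missing attribute onto a module before other code tries to import it. The sketch below demonstrates the pattern on a stand-in module; in the app the target would be `transformers.utils.import_utils` and the attribute `is_torch_fx_available` (the helper name is hypothetical).

```python
import types

def ensure_module_attr(module, name, fallback):
    """Install `fallback` as an attribute of `module` only if it is missing.

    This is the classic monkey-patch compat shim: newer code expects the
    attribute, an older library version lacks it, so we supply a safe default.
    """
    if not hasattr(module, name):
        setattr(module, name, fallback)
    return getattr(module, name)

# Stand-in for transformers.utils.import_utils:
stub = types.ModuleType("stub_import_utils")
fn = ensure_module_attr(stub, "is_torch_fx_available", lambda: False)
fn()  # -> False (conservative default: report the feature as unavailable)
```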
feat: Add LlamaFlashAttention2 compatibility alias and eager attention implementation for model loading. e0b7657 DocUA committed 3 days ago
refactor: Reorder imports and the `spaces` dummy decorator definition to the top of the file. 6379065 DocUA committed 3 days ago
refactor: Simplify library imports and remove verbose version checks in `app_hf.py`. cd23458 DocUA committed 3 days ago
refactor: Harden dependency imports with try-except blocks and add version logging for Gradio and HuggingFace Hub. 439d893 DocUA committed 3 days ago
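The try-except import hardening described here is commonly factored into a small helper so the app can degrade gracefully when an optional dependency is absent. A sketch under that assumption (`optional_import` is an illustrative name, not the app's API):

```python
import importlib

def optional_import(name):
    """Return the named module if importable, else None."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None

json_mod = optional_import("json")              # stdlib: available
missing = optional_import("no_such_package_x")  # absent: returns None

# Version logging with a safe default for modules that lack __version__:
version = getattr(json_mod, "__version__", "unknown")
```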
fix: Address MPS compatibility issues, ensure explicit model dtype, and improve Gradio file input handling. 092c902 DocUA committed 3 days ago
Implement warning suppression, ensure pad token ID for generation, enable deterministic sampling, refine Gradio UI CSS and clear functionality, and add `.env` to .gitignore. c3371d2 DocUA committed 5 days ago
Initial commit: DeepSeek-OCR-2 & MedGemma-1.5 multimodal analysis app with ZeroGPU support b752d16 DocUA committed 5 days ago