alex4cip Claude committed
Commit a0f4ab8 · 1 Parent(s): fc890b6

feat: Add comprehensive hardware environment detection system


Major Changes:
- Implement detect_hardware_environment() function
- Support 6 environment types with auto-detection
- Dynamic UI based on detected hardware

Environment Types Supported:
1. HF Spaces - ZeroGPU (NVIDIA H200)
2. HF Spaces - CPU Upgrade (8 vCPU, 32GB RAM)
3. HF Spaces - CPU Basic (2-4 vCPU, 16GB RAM)
4. Local - GPU (CUDA/MPS detection)
5. Local - Apple Silicon (MPS backend)
6. Local - CPU (fallback)

Detection Logic:
- SPACE_ID environment variable → HF Spaces
- `import spaces` succeeds → ZeroGPU
- CPU count >= 8 → CPU Upgrade
- CPU count < 8 → CPU Basic
- torch.cuda.is_available() → Local CUDA GPU
- torch.backends.mps.is_available() → Apple Silicon GPU
- Fallback → Local CPU
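In outline, the precedence above can be sketched as a pure decision function (the name `classify_environment` and its parameters are illustrative, not the app's actual API):

```python
def classify_environment(space_id, spaces_importable, cpu_count,
                         cuda_available, mps_available):
    """Illustrative restatement of the detection precedence above."""
    if space_id is not None:          # SPACE_ID set -> running on HF Spaces
        if spaces_importable:         # `import spaces` succeeded -> ZeroGPU
            return "zerogpu"
        # CPU tier is inferred from the vCPU count
        return "cpu_upgrade" if cpu_count >= 8 else "cpu_basic"
    if cuda_available or mps_available:   # local CUDA GPU or Apple Silicon MPS
        return "local_gpu"
    return "local_cpu"                # fallback
```

Note that the HF Spaces check takes priority: a Space with a GPU-capable `torch` never reaches the local branches.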

Environment Info Displayed:
- Platform (HF Spaces / Local)
- Hardware type
- GPU availability and name
- CPU core count
- Operating system
- Detailed description

UI Improvements:
- Dynamic header with environment-specific info
- Hardware-specific features and recommendations
- Response time estimates per environment
- Cost information (HF Spaces tiers)
- Model recommendations (lightweight for CPU)

Benefits:
- Clear visibility of current environment
- Appropriate expectations for performance
- Better user experience with accurate info
- Easier debugging and troubleshooting
- Support for all deployment scenarios

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Files changed (1)
  1. app.py +200 -70
app.py CHANGED
@@ -1,23 +1,106 @@
  """
- Flexible version: Works on both ZeroGPU and CPU Upgrade hardware
- Automatically detects hardware and adjusts accordingly
  """

- # Try to import spaces for ZeroGPU support
- try:
-     import spaces
-     ZEROGPU_AVAILABLE = True
-     print("✅ ZeroGPU support enabled")
- except ImportError:
-     ZEROGPU_AVAILABLE = False
-     print("ℹ️ ZeroGPU not available, using standard mode")
-
  import os
  import gradio as gr
  from transformers import AutoModelForCausalLM, AutoTokenizer
  from huggingface_hub import snapshot_download
  import torch

  # Load environment variables from .env file
  try:
      from dotenv import load_dotenv
@@ -338,37 +421,62 @@ def chat_wrapper(message, history):
      return response_history


- # Determine hardware info for UI
- hardware_info = "NVIDIA H200 (ZeroGPU)" if ZEROGPU_AVAILABLE else "CPU Upgrade (32GB RAM)"
- print(f"✅ App initialized - Hardware: {hardware_info}")

  # Create Gradio interface
  with gr.Blocks(title="🤖 Multi-Model Chatbot") as demo:
-     # Dynamic header based on hardware
-     if ZEROGPU_AVAILABLE:
-         header = """
-         # 🤖 Multi-Model Chatbot (ZeroGPU)
-
-         **Hardware**: NVIDIA H200 (ZeroGPU - auto-allocated)
-
-         **Features**:
-         - ⚡ Fast responses with GPU acceleration (3-5 s)
-         - 🎯 Choice of {TOTAL_MODEL_COUNT} Korean-optimized models ({PUBLIC_MODEL_COUNT} Public + {GATED_MODEL_COUNT} Gated)
-         - 🔄 Automatic reload on model switch
-         - 💰 25 free minutes per day with a PRO subscription
-         """
-     else:
-         header = """
-         # 🤖 Multi-Model Chatbot (CPU Upgrade)
-
-         **Hardware**: CPU Upgrade (8 vCPU / 32 GB RAM)
-
-         **Features**:
-         - 🎯 Choice of {TOTAL_MODEL_COUNT} Korean-optimized models ({PUBLIC_MODEL_COUNT} Public + {GATED_MODEL_COUNT} Gated)
-         - 🔄 Automatic reload on model switch
-         - ⏳ Responses are somewhat slow in a CPU environment (30 s to 1 min)
-         - 💰 $0.03 per hour (about $22 per month)
-         """

      gr.Markdown(header)

@@ -438,37 +546,59 @@ with gr.Blocks(title="πŸ€– Multi-Model Chatbot") as demo:
      msg.submit(submit, [msg, chatbot], [chatbot, msg])
      clear.click(lambda: [], outputs=chatbot)

-     # Dynamic footer based on hardware
-     if ZEROGPU_AVAILABLE:
-         footer = f"""
-         ---
-         **Notes (ZeroGPU mode)**:
-         - 🤖 Choose from {TOTAL_MODEL_COUNT} models (via the dropdown)
-         - ⚡ ZeroGPU allocates a GPU automatically per request
-         - 💰 PRO subscribers get 25 free minutes per day
-         - 🔄 Chat history resets when the model changes
-         - ⏱️ The first response includes model load time (~10-15 s)
-
-         **Test examples**:
-         - "Hello"
-         - "Explain artificial intelligence"
-         - "What is the capital of Korea?"
-         """
-     else:
-         footer = f"""
-         ---
-         **Notes (CPU Upgrade mode)**:
-         - 🤖 Choose from {TOTAL_MODEL_COUNT} models (via the dropdown)
-         - 🔄 Chat history resets when the model changes
-         - ⏳ Responses are slow in a CPU environment (30 s to 1 min)
-         - ⏱️ The first response includes model load time (~1-2 min)
-         - 💰 Unlimited 24-hour use ($0.03 per hour)
-
-         **Test examples**:
-         - "Hello"
-         - "Explain artificial intelligence"
-         - "What is the capital of Korea?"
-         """

      gr.Markdown(footer)

 
  """
+ Multi-environment chatbot: Detects and adapts to different hardware environments
+ Supports: Local (Mac/Linux/Windows), HF Spaces (CPU Basic/Upgrade, ZeroGPU)
  """

  import os
+ import platform
  import gradio as gr
  from transformers import AutoModelForCausalLM, AutoTokenizer
  from huggingface_hub import snapshot_download
  import torch

+ # ============================================================================
+ # Hardware Environment Detection
+ # ============================================================================
+
+ def detect_hardware_environment():
+     """
+     Comprehensive hardware environment detection
+
+     Returns:
+         dict: {
+             'platform': 'hf_spaces' | 'local',
+             'hardware': 'zerogpu' | 'cpu_upgrade' | 'cpu_basic' | 'local_gpu' | 'local_cpu',
+             'gpu_available': bool,
+             'gpu_name': str or None,
+             'cpu_count': int,
+             'os': 'Darwin' | 'Linux' | 'Windows',
+             'description': str
+         }
+     """
+     env_info = {
+         'platform': 'local',
+         'hardware': 'local_cpu',
+         'gpu_available': False,
+         'gpu_name': None,
+         'cpu_count': os.cpu_count() or 1,
+         'os': platform.system(),
+         'description': ''
+     }
+
+     # Check if running on HF Spaces
+     is_hf_spaces = os.environ.get('SPACE_ID') is not None
+
+     if is_hf_spaces:
+         env_info['platform'] = 'hf_spaces'
+         space_id = os.environ.get('SPACE_ID', 'unknown')
+
+         # Check for ZeroGPU
+         try:
+             import spaces
+             env_info['hardware'] = 'zerogpu'
+             env_info['gpu_available'] = True
+             env_info['gpu_name'] = 'NVIDIA H200 (ZeroGPU)'
+             env_info['description'] = f"🚀 HF Spaces - ZeroGPU ({space_id})"
+         except ImportError:
+             # Infer the CPU tier from the vCPU count
+             cpu_count = env_info['cpu_count']
+             if cpu_count >= 8:
+                 env_info['hardware'] = 'cpu_upgrade'
+                 env_info['description'] = f"⚙️ HF Spaces - CPU Upgrade ({cpu_count} vCPU, 32GB RAM)"
+             else:
+                 env_info['hardware'] = 'cpu_basic'
+                 env_info['description'] = f"💻 HF Spaces - CPU Basic ({cpu_count} vCPU, 16GB RAM)"
+     else:
+         # Local environment detection
+         if torch.cuda.is_available():
+             env_info['hardware'] = 'local_gpu'
+             env_info['gpu_available'] = True
+             try:
+                 env_info['gpu_name'] = torch.cuda.get_device_name(0)
+             except Exception:
+                 env_info['gpu_name'] = 'CUDA GPU'
+             env_info['description'] = f"🖥️ Local - GPU ({env_info['gpu_name']})"
+         elif torch.backends.mps.is_available():
+             env_info['hardware'] = 'local_gpu'
+             env_info['gpu_available'] = True
+             env_info['gpu_name'] = 'Apple Silicon GPU (MPS)'
+             env_info['description'] = "🍎 Local - Apple Silicon GPU"
+         else:
+             env_info['hardware'] = 'local_cpu'
+             env_info['description'] = f"💻 Local - CPU ({env_info['os']}, {env_info['cpu_count']} cores)"
+
+     return env_info
+
+ # Detect hardware environment
+ HW_ENV = detect_hardware_environment()
+ ZEROGPU_AVAILABLE = HW_ENV['hardware'] == 'zerogpu'
+
+ # Print environment info
+ print("=" * 60)
+ print("Hardware Environment Detection")
+ print("=" * 60)
+ print(f"Platform: {HW_ENV['platform']}")
+ print(f"Hardware: {HW_ENV['hardware']}")
+ print(f"GPU Available: {HW_ENV['gpu_available']}")
+ if HW_ENV['gpu_name']:
+     print(f"GPU Name: {HW_ENV['gpu_name']}")
+ print(f"CPU Cores: {HW_ENV['cpu_count']}")
+ print(f"OS: {HW_ENV['os']}")
+ print(f"Description: {HW_ENV['description']}")
+ print("=" * 60)
+
  # Load environment variables from .env file
  try:
      from dotenv import load_dotenv
 
      return response_history


+ print(f"✅ App initialized - {HW_ENV['description']}")

  # Create Gradio interface
  with gr.Blocks(title="🤖 Multi-Model Chatbot") as demo:
+     # Dynamic header based on hardware environment
+     header = f"""
+     # 🤖 Multi-Model Chatbot {HW_ENV['description']}
+
+     **Environment info**:
+     - **Platform**: {HW_ENV['platform'].upper().replace('_', ' ')}
+     - **Hardware**: {HW_ENV['hardware'].upper().replace('_', ' ')}
+     - **GPU**: {'✅ ' + HW_ENV['gpu_name'] if HW_ENV['gpu_available'] else '❌ CPU only'}
+     - **CPU cores**: {HW_ENV['cpu_count']}
+     - **OS**: {HW_ENV['os']}
+
+     **Model selection**:
+     - 🎯 {TOTAL_MODEL_COUNT} Korean-optimized models ({PUBLIC_MODEL_COUNT} Public + {GATED_MODEL_COUNT} Gated)
+     - 🔄 Automatic reload on model switch (chat history resets)
+     """
+
+     # Add hardware-specific features
+     if HW_ENV['hardware'] == 'zerogpu':
+         header += """
+     **ZeroGPU features**:
+     - ⚡ Ultra-fast responses (3-5 s, GPU-accelerated)
+     - 🚀 NVIDIA H200 allocated automatically
+     - 💰 25 free minutes per day with a PRO subscription
+     """
+     elif HW_ENV['hardware'] == 'cpu_upgrade':
+         header += """
+     **CPU Upgrade features**:
+     - ⏰ Unlimited usage time
+     - ⏳ CPU environment (responses 30 s to 1 min)
+     - 💰 $0.03 per hour (about $22 per month)
+     """
+     elif HW_ENV['hardware'] == 'cpu_basic':
+         header += """
+     **CPU Basic features**:
+     - 💡 Free tier
+     - ⏳ CPU environment (responses 1-2 min)
+     - 🔒 Lightweight models recommended (EXAONE 2.4B, Mistral 7B)
+     """
+     elif HW_ENV['hardware'] == 'local_gpu':
+         header += f"""
+     **Local GPU features**:
+     - 🖥️ Personal GPU: {HW_ENV['gpu_name']}
+     - ⚡ Fast responses (GPU-accelerated)
+     - 🔓 Unlimited use
+     """
+     else:  # local_cpu
+         header += """
+     **Local CPU features**:
+     - 💻 Local development environment
+     - ⏳ CPU environment (slow responses)
+     - 🔒 Lightweight models recommended
+     """

      gr.Markdown(header)

      msg.submit(submit, [msg, chatbot], [chatbot, msg])
      clear.click(lambda: [], outputs=chatbot)

+     # Dynamic footer based on hardware environment
+     footer = f"""
+     ---
+     **Current environment**: {HW_ENV['description']}
+
+     **Notes**:
+     - 🤖 Choose from {TOTAL_MODEL_COUNT} models
+     - 🔄 Chat history resets when the model changes
+     - ⏱️ The first response includes model load time
+     """
+
+     # Add environment-specific notes
+     if HW_ENV['hardware'] == 'zerogpu':
+         footer += """
+     - ⚡ ZeroGPU auto-allocates a GPU (3-5 s responses)
+     - 💰 PRO subscribers get 25 free minutes per day
+     - ⏱️ First load: ~10-15 s
+     """
+     elif HW_ENV['hardware'] == 'cpu_upgrade':
+         footer += """
+     - ⏰ Unlimited 24-hour use
+     - ⏳ CPU environment (30 s to 1 min responses)
+     - 💰 $0.03 per hour
+     - ⏱️ First load: ~1-2 min
+     """
+     elif HW_ENV['hardware'] == 'cpu_basic':
+         footer += """
+     - 💡 Free tier (limited)
+     - ⏳ CPU environment (1-2 min responses)
+     - 🔒 Lightweight models recommended
+     - ⏱️ First load: ~2-3 min
+     """
+     elif HW_ENV['hardware'] == 'local_gpu':
+         footer += f"""
+     - 🖥️ GPU acceleration: {HW_ENV['gpu_name']}
+     - ⚡ Fast responses (5-10 s)
+     - 🔓 Unlimited use
+     - ⏱️ First load: ~10-20 s
+     """
+     else:  # local_cpu
+         footer += f"""
+     - 💻 Local CPU ({HW_ENV['cpu_count']} cores)
+     - ⏳ Slow responses (1-3 min)
+     - 🔒 Lightweight models recommended
+     - ⏱️ First load: ~2-5 min
+     """
+
+     footer += """
+     **Test examples**:
+     - "Hello"
+     - "Explain artificial intelligence"
+     - "What is the capital of Korea?"
+     """

      gr.Markdown(footer)