George Yang Claude Sonnet 4.5 committed on
Commit e9c64c8 · 1 Parent(s): 9085c78

Feat: Sync all features from main repository


Major feature updates and bug fixes:

**New Features:**
- Add HuggingFace Hub integration for fetching model metadata
- Add SGLang inference engine support
- Share preset models across all tabs (training, inference, multi-node)
- Add batch size optimizer API endpoint
- Add inference, multi-node, and exporter features to web UI
- Add framework config export (Accelerate, Lightning, Axolotl)

**Bug Fixes:**
- Fix critical calculation bugs and improve defensive programming
- Fix black formatting in formulas.py
- Handle null values in inference results display
- Restore two-column layout for tabbed interface
- Resolve CI failures in web/app.py

**Improvements:**
- Update dates to 2026 across codebase
- Improve error handling and validation
- Update web UI styling and accessibility

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Files changed (45)
  1. pyproject.toml +1 -0
  2. src/gpu_mem_calculator/__pycache__/__init__.cpython-312.pyc +0 -0
  3. src/gpu_mem_calculator/config/__pycache__/__init__.cpython-312.pyc +0 -0
  4. src/gpu_mem_calculator/config/__pycache__/parser.cpython-312.pyc +0 -0
  5. src/gpu_mem_calculator/config/__pycache__/presets.cpython-312.pyc +0 -0
  6. src/gpu_mem_calculator/core/__pycache__/__init__.cpython-312.pyc +0 -0
  7. src/gpu_mem_calculator/core/__pycache__/calculator.cpython-312.pyc +0 -0
  8. src/gpu_mem_calculator/core/__pycache__/formulas.cpython-312.pyc +0 -0
  9. src/gpu_mem_calculator/core/__pycache__/models.cpython-312.pyc +0 -0
  10. src/gpu_mem_calculator/core/__pycache__/multinode.cpython-312.pyc +0 -0
  11. src/gpu_mem_calculator/engines/__pycache__/__init__.cpython-312.pyc +0 -0
  12. src/gpu_mem_calculator/engines/__pycache__/base.cpython-312.pyc +0 -0
  13. src/gpu_mem_calculator/engines/__pycache__/deepspeed.cpython-312.pyc +0 -0
  14. src/gpu_mem_calculator/engines/__pycache__/fsdp.cpython-312.pyc +0 -0
  15. src/gpu_mem_calculator/engines/__pycache__/megatron.cpython-312.pyc +0 -0
  16. src/gpu_mem_calculator/engines/__pycache__/pytorch.cpython-312.pyc +0 -0
  17. src/gpu_mem_calculator/exporters/__pycache__/__init__.cpython-312.pyc +0 -0
  18. src/gpu_mem_calculator/exporters/__pycache__/accelerate.cpython-312.pyc +0 -0
  19. src/gpu_mem_calculator/exporters/__pycache__/axolotl.cpython-312.pyc +0 -0
  20. src/gpu_mem_calculator/exporters/__pycache__/lightning.cpython-312.pyc +0 -0
  21. src/gpu_mem_calculator/exporters/__pycache__/manager.cpython-312.pyc +0 -0
  22. src/gpu_mem_calculator/huggingface/__init__.py +19 -0
  23. src/gpu_mem_calculator/huggingface/__pycache__/__init__.cpython-312.pyc +0 -0
  24. src/gpu_mem_calculator/huggingface/__pycache__/client.cpython-312.pyc +0 -0
  25. src/gpu_mem_calculator/huggingface/__pycache__/exceptions.cpython-312.pyc +0 -0
  26. src/gpu_mem_calculator/huggingface/__pycache__/mapper.cpython-312.pyc +0 -0
  27. src/gpu_mem_calculator/huggingface/client.py +139 -0
  28. src/gpu_mem_calculator/huggingface/exceptions.py +25 -0
  29. src/gpu_mem_calculator/huggingface/mapper.py +167 -0
  30. src/gpu_mem_calculator/inference/__pycache__/__init__.cpython-312.pyc +0 -0
  31. src/gpu_mem_calculator/inference/__pycache__/base.cpython-312.pyc +0 -0
  32. src/gpu_mem_calculator/inference/__pycache__/calculator.cpython-312.pyc +0 -0
  33. src/gpu_mem_calculator/inference/__pycache__/huggingface.cpython-312.pyc +0 -0
  34. src/gpu_mem_calculator/inference/__pycache__/sglang.cpython-312.pyc +0 -0
  35. src/gpu_mem_calculator/inference/__pycache__/tensorrt_llm.cpython-312.pyc +0 -0
  36. src/gpu_mem_calculator/inference/__pycache__/tgi.cpython-312.pyc +0 -0
  37. src/gpu_mem_calculator/inference/__pycache__/vllm.cpython-312.pyc +0 -0
  38. src/gpu_mem_calculator/utils/__pycache__/__init__.cpython-312.pyc +0 -0
  39. src/gpu_mem_calculator/utils/__pycache__/precision.cpython-312.pyc +0 -0
  40. web/__pycache__/__init__.cpython-312.pyc +0 -0
  41. web/__pycache__/app.cpython-312.pyc +0 -0
  42. web/app.py +99 -1
  43. web/static/css/styles.css +76 -0
  44. web/static/js/app.js +178 -0
  45. web/templates/index.html +45 -17
pyproject.toml CHANGED
@@ -45,6 +45,7 @@ dependencies = [
      "click>=8.1.0",
      "pydantic-settings>=2.0.0",
      "rich>=13.0.0",
+     "httpx>=0.15.0",
  ]

  [project.optional-dependencies]
src/gpu_mem_calculator/__pycache__/__init__.cpython-312.pyc DELETED
Binary file (257 Bytes)
 
src/gpu_mem_calculator/config/__pycache__/__init__.cpython-312.pyc DELETED
Binary file (375 Bytes)
 
src/gpu_mem_calculator/config/__pycache__/parser.cpython-312.pyc DELETED
Binary file (14.2 kB)
 
src/gpu_mem_calculator/config/__pycache__/presets.cpython-312.pyc DELETED
Binary file (3.35 kB)
 
src/gpu_mem_calculator/core/__pycache__/__init__.cpython-312.pyc DELETED
Binary file (562 Bytes)
 
src/gpu_mem_calculator/core/__pycache__/calculator.cpython-312.pyc DELETED
Binary file (6.51 kB)
 
src/gpu_mem_calculator/core/__pycache__/formulas.cpython-312.pyc DELETED
Binary file (7.29 kB)
 
src/gpu_mem_calculator/core/__pycache__/models.cpython-312.pyc DELETED
Binary file (24.4 kB)
 
src/gpu_mem_calculator/core/__pycache__/multinode.cpython-312.pyc DELETED
Binary file (10.8 kB)
 
src/gpu_mem_calculator/engines/__pycache__/__init__.cpython-312.pyc DELETED
Binary file (688 Bytes)
 
src/gpu_mem_calculator/engines/__pycache__/base.cpython-312.pyc DELETED
Binary file (8.07 kB)
 
src/gpu_mem_calculator/engines/__pycache__/deepspeed.cpython-312.pyc DELETED
Binary file (11.2 kB)
 
src/gpu_mem_calculator/engines/__pycache__/fsdp.cpython-312.pyc DELETED
Binary file (8.07 kB)
 
src/gpu_mem_calculator/engines/__pycache__/megatron.cpython-312.pyc DELETED
Binary file (8.5 kB)
 
src/gpu_mem_calculator/engines/__pycache__/pytorch.cpython-312.pyc DELETED
Binary file (3.73 kB)
 
src/gpu_mem_calculator/exporters/__pycache__/__init__.cpython-312.pyc DELETED
Binary file (628 Bytes)
 
src/gpu_mem_calculator/exporters/__pycache__/accelerate.cpython-312.pyc DELETED
Binary file (7.81 kB)
 
src/gpu_mem_calculator/exporters/__pycache__/axolotl.cpython-312.pyc DELETED
Binary file (9.07 kB)
 
src/gpu_mem_calculator/exporters/__pycache__/lightning.cpython-312.pyc DELETED
Binary file (9.41 kB)
 
src/gpu_mem_calculator/exporters/__pycache__/manager.cpython-312.pyc DELETED
Binary file (9.29 kB)
 
src/gpu_mem_calculator/huggingface/__init__.py ADDED
@@ -0,0 +1,19 @@
+ """Hugging Face Hub integration for fetching model metadata."""
+
+ from gpu_mem_calculator.huggingface.client import HuggingFaceClient
+ from gpu_mem_calculator.huggingface.exceptions import (
+     HuggingFaceError,
+     InvalidConfigError,
+     ModelNotFoundError,
+     PrivateModelAccessError,
+ )
+ from gpu_mem_calculator.huggingface.mapper import HuggingFaceConfigMapper
+
+ __all__ = [
+     "HuggingFaceClient",
+     "HuggingFaceConfigMapper",
+     "HuggingFaceError",
+     "InvalidConfigError",
+     "ModelNotFoundError",
+     "PrivateModelAccessError",
+ ]
src/gpu_mem_calculator/huggingface/__pycache__/__init__.cpython-312.pyc ADDED
Binary file (670 Bytes)
 
src/gpu_mem_calculator/huggingface/__pycache__/client.cpython-312.pyc ADDED
Binary file (6.87 kB)
 
src/gpu_mem_calculator/huggingface/__pycache__/exceptions.cpython-312.pyc ADDED
Binary file (1.18 kB)
 
src/gpu_mem_calculator/huggingface/__pycache__/mapper.cpython-312.pyc ADDED
Binary file (5.23 kB)
 
src/gpu_mem_calculator/huggingface/client.py ADDED
@@ -0,0 +1,139 @@
+ """HuggingFace Hub client for fetching model metadata."""
+
+ from typing import Any, cast
+
+ import httpx
+
+ from gpu_mem_calculator.huggingface.exceptions import (
+     HuggingFaceError,
+     InvalidConfigError,
+     ModelNotFoundError,
+     PrivateModelAccessError,
+ )
+
+
+ class HuggingFaceClient:
+     """Client for interacting with HuggingFace Hub API."""
+
+     def __init__(self, token: str | None = None, timeout: int = 30):
+         """Initialize HF Hub client.
+
+         Args:
+             token: HF API token for private models (optional)
+             timeout: HTTP timeout in seconds
+         """
+         self.token = token
+         self.timeout = timeout
+         self.api_base = "https://huggingface.co/api"
+         self.raw_base = "https://huggingface.co"
+
+     def _get_headers(self) -> dict[str, str]:
+         """Get HTTP headers with optional authentication."""
+         headers = {
+             "User-Agent": "GPU-Mem-Calculator/0.1.0",
+             "Accept": "application/json",
+         }
+         if self.token:
+             headers["Authorization"] = f"Bearer {self.token}"
+         return headers
+
+     async def get_model_info(self, model_id: str) -> dict[str, Any]:
+         """Get model metadata from HF Hub.
+
+         Args:
+             model_id: Model identifier (e.g., "meta-llama/Llama-2-7b-hf")
+
+         Returns:
+             Model metadata dict
+
+         Raises:
+             ModelNotFoundError: If model doesn't exist
+             PrivateModelAccessError: If authentication required
+             HuggingFaceError: For network issues
+         """
+         model_id = model_id.strip()
+         if not model_id:
+             raise ValueError("Model ID cannot be empty")
+
+         # Sanitize model ID
+         model_id = model_id.strip("/")
+
+         url = f"{self.api_base}/models/{model_id}"
+
+         async with httpx.AsyncClient(timeout=self.timeout, follow_redirects=True) as client:
+             response = await client.get(url, headers=self._get_headers())
+
+             if response.status_code == 401:
+                 raise PrivateModelAccessError(
+                     f"Authentication required for model '{model_id}'. "
+                     "Please provide a HuggingFace token."
+                 )
+             elif response.status_code == 404:
+                 raise ModelNotFoundError(f"Model '{model_id}' not found on HuggingFace Hub")
+             elif response.status_code != 200:
+                 raise HuggingFaceError(f"Failed to fetch model info: HTTP {response.status_code}")
+
+             return cast(dict[str, Any], response.json())
+
+     async def get_model_config(self, model_id: str) -> dict[str, Any]:
+         """Get model config.json from HF Hub.
+
+         Args:
+             model_id: Model identifier
+
+         Returns:
+             Model configuration dict
+
+         Raises:
+             ModelNotFoundError: If model doesn't exist
+             PrivateModelAccessError: If authentication required
+             InvalidConfigError: If config.json not found
+             HuggingFaceError: For network issues
+         """
+         model_id = model_id.strip().strip("/")
+
+         # Try to fetch config.json from the repository
+         url = f"{self.raw_base}/{model_id}/raw/main/config.json"
+
+         async with httpx.AsyncClient(timeout=self.timeout, follow_redirects=True) as client:
+             response = await client.get(url, headers=self._get_headers())
+
+             if response.status_code == 404:
+                 # Try alternative branches
+                 for branch in ["base", "research"]:
+                     url = f"{self.raw_base}/{model_id}/raw/{branch}/config.json"
+                     response = await client.get(url, headers=self._get_headers())
+                     if response.status_code == 200:
+                         break
+
+             if response.status_code == 404:
+                 raise InvalidConfigError(f"config.json not found for model '{model_id}'")
+             elif response.status_code == 401:
+                 raise PrivateModelAccessError(f"Authentication required for model '{model_id}'")
+             elif response.status_code != 200:
+                 raise HuggingFaceError(f"Failed to fetch model config: HTTP {response.status_code}")
+
+             return cast(dict[str, Any], response.json())
+
+     async def fetch_model_metadata(self, model_id: str) -> dict[str, Any]:
+         """Fetch complete model metadata including info and config.
+
+         Args:
+             model_id: Model identifier
+
+         Returns:
+             Dictionary with 'model_info' and 'config' keys
+
+         Raises:
+             ModelNotFoundError: If model doesn't exist
+             PrivateModelAccessError: If authentication required
+             InvalidConfigError: If config.json not found
+             HuggingFaceError: For other errors
+         """
+         model_info = await self.get_model_info(model_id)
+         model_config = await self.get_model_config(model_id)
+
+         return {
+             "model_info": model_info,
+             "config": model_config,
+         }
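The client above maps Hub HTTP status codes onto the package's exception types. As a quick offline illustration of that dispatch (a standalone sketch — the exception classes are re-declared here so the snippet runs without the package, and `raise_for_hub_status` is a hypothetical helper name, not part of the commit):

```python
class HuggingFaceError(Exception):
    """Base error, mirroring gpu_mem_calculator.huggingface.exceptions."""

class ModelNotFoundError(HuggingFaceError):
    pass

class PrivateModelAccessError(HuggingFaceError):
    pass

def raise_for_hub_status(status_code: int, model_id: str) -> None:
    """Replicates get_model_info's status handling, minus the HTTP call."""
    if status_code == 401:
        raise PrivateModelAccessError(
            f"Authentication required for model '{model_id}'. "
            "Please provide a HuggingFace token."
        )
    if status_code == 404:
        raise ModelNotFoundError(f"Model '{model_id}' not found on HuggingFace Hub")
    if status_code != 200:
        raise HuggingFaceError(f"Failed to fetch model info: HTTP {status_code}")

raise_for_hub_status(200, "meta-llama/Llama-2-7b-hf")  # 200 passes silently
```

Note the ordering: 401 and 404 are checked before the generic non-200 branch, so specific failures never degrade into the catch-all `HuggingFaceError`.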
src/gpu_mem_calculator/huggingface/exceptions.py ADDED
@@ -0,0 +1,25 @@
+ """Custom exceptions for HuggingFace Hub integration."""
+
+
+ class HuggingFaceError(Exception):
+     """Base exception for HuggingFace-related errors."""
+
+     pass
+
+
+ class ModelNotFoundError(HuggingFaceError):
+     """Raised when a model is not found on HuggingFace Hub."""
+
+     pass
+
+
+ class PrivateModelAccessError(HuggingFaceError):
+     """Raised when authentication is required for a private model."""
+
+     pass
+
+
+ class InvalidConfigError(HuggingFaceError):
+     """Raised when model config is invalid or missing required fields."""
+
+     pass
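Because each error derives from `HuggingFaceError`, callers can catch the base class to handle all Hub failures uniformly, or branch on the subclasses — which is how web/app.py maps them to distinct HTTP statuses. A minimal sketch of that pattern (classes re-declared so it runs standalone; `STATUS_FOR_ERROR` and `http_status_for` are hypothetical names, not part of the commit):

```python
class HuggingFaceError(Exception):
    pass

class ModelNotFoundError(HuggingFaceError):
    pass

class PrivateModelAccessError(HuggingFaceError):
    pass

class InvalidConfigError(HuggingFaceError):
    pass

# Subclasses listed before the base class so the most specific match wins
# (relies on dict insertion order).
STATUS_FOR_ERROR = {
    PrivateModelAccessError: 401,
    ModelNotFoundError: 404,
    InvalidConfigError: 422,
    HuggingFaceError: 500,
}

def http_status_for(exc: Exception) -> int:
    """Return the HTTP status for an exception; unknown types fall back to 500."""
    for exc_type, status in STATUS_FOR_ERROR.items():
        if isinstance(exc, exc_type):
            return status
    return 500
```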
src/gpu_mem_calculator/huggingface/mapper.py ADDED
@@ -0,0 +1,167 @@
+ """Map HuggingFace model configs to GPU Memory Calculator ModelConfig."""
+
+ from typing import Any
+
+
+ class HuggingFaceConfigMapper:
+     """Map HF config.json to ModelConfig."""
+
+     # Mapping of HF config field names to ModelConfig fields
+     FIELD_MAPPINGS = {
+         # Direct mappings
+         "hidden_size": "hidden_size",
+         "num_hidden_layers": "num_layers",
+         "num_attention_heads": "num_attention_heads",
+         "vocab_size": "vocab_size",
+         "max_position_embeddings": "max_seq_len",
+         # Common alternatives
+         "n_layer": "num_layers",
+         "n_head": "num_attention_heads",
+         "n_embd": "hidden_size",
+         "n_positions": "max_seq_len",
+     }
+
+     def map_to_model_config(
+         self, hf_config: dict[str, Any], model_info: dict[str, Any] | None = None
+     ) -> dict[str, Any]:
+         """Map HF config to ModelConfig-compatible dictionary.
+
+         Args:
+             hf_config: HuggingFace config.json dict
+             model_info: Optional model metadata from HF API
+
+         Returns:
+             Dictionary with keys:
+                 - 'config': ModelConfig-compatible dict
+                 - 'missing_fields': List of required fields not found
+                 - 'found_fields': List of fields that were mapped
+         """
+         model_config = {}
+         missing_fields = []
+         found_fields = []
+
+         # Extract model name
+         if model_info:
+             model_config["name"] = model_info.get("modelId", "custom").replace("/", "-")
+         else:
+             model_config["name"] = "custom-hf-model"
+
+         # Map simple fields
+         for hf_field, our_field in self.FIELD_MAPPINGS.items():
+             if hf_field in hf_config:
+                 value = hf_config[hf_field]
+                 # Ensure type is int
+                 if isinstance(value, (int, float)):
+                     model_config[our_field] = int(value)
+                     found_fields.append(our_field)
+                 elif isinstance(value, str):
+                     # Handle special cases like "32B"
+                     if our_field == "num_parameters":
+                         model_config[our_field] = value
+                         found_fields.append(our_field)
+
+         # Extract MoE-specific fields
+         moe_config = self._extract_moe_config(hf_config)
+         if moe_config:
+             model_config.update(moe_config)
+             found_fields.extend(moe_config.keys())
+
+         # Handle num_parameters - compute if not provided
+         if "num_parameters" not in model_config:
+             # Try to compute from architecture
+             computed_params = self._estimate_num_parameters(hf_config, model_config)
+             if computed_params:
+                 model_config["num_parameters"] = computed_params
+                 found_fields.append("num_parameters")
+
+         # Identify missing fields
+         required_fields = [
+             "num_parameters",
+             "num_layers",
+             "hidden_size",
+             "num_attention_heads",
+             "vocab_size",
+             "max_seq_len",
+         ]
+
+         for field in required_fields:
+             if field not in model_config:
+                 missing_fields.append(field)
+
+         return {
+             "config": model_config,
+             "missing_fields": missing_fields,
+             "found_fields": found_fields,
+         }
+
+     def _extract_moe_config(self, hf_config: dict[str, Any]) -> dict[str, Any]:
+         """Extract MoE-specific configuration from HF config.
+
+         Args:
+             hf_config: HuggingFace config
+
+         Returns:
+             Dict with moe_enabled, num_experts, top_k if MoE detected
+         """
+         moe_config: dict[str, Any] = {}
+
+         # Common HF MoE field names
+         num_experts_val = hf_config.get(
+             "num_local_experts",
+             hf_config.get("num_experts", hf_config.get("n_expert")),
+         )
+         top_k_val = hf_config.get(
+             "expert_capacity",
+             hf_config.get("num_experts_per_tok", hf_config.get("top_k")),
+         )
+
+         # Type narrowing for MoE fields
+         if isinstance(num_experts_val, (int, float)) and num_experts_val > 1:
+             moe_config["moe_enabled"] = True
+             moe_config["num_experts"] = int(num_experts_val)
+
+             if isinstance(top_k_val, (int, float)):
+                 moe_config["top_k"] = int(top_k_val)
+             else:
+                 moe_config["top_k"] = 2  # Default
+
+         return moe_config
+
+     def _estimate_num_parameters(
+         self, hf_config: dict[str, Any], partial_config: dict[str, Any]
+     ) -> int | None:
+         """Estimate number of parameters if not provided.
+
+         Args:
+             hf_config: Full HF config
+             partial_config: Partially built config
+
+         Returns:
+             Estimated parameter count or None
+         """
+         # Check if HF provides the count directly
+         if "num_parameters" in hf_config:
+             return int(hf_config["num_parameters"])
+
+         # Try to compute from model architecture
+         hidden_size = partial_config.get("hidden_size")
+         num_layers = partial_config.get("num_layers")
+         vocab_size = partial_config.get("vocab_size")
+
+         # Type narrowing for calculation
+         if (
+             isinstance(hidden_size, int)
+             and isinstance(num_layers, int)
+             and isinstance(vocab_size, int)
+         ):
+             # Rough estimate for transformer models
+             # Based on: embeddings + transformer layers
+             embedding_params = vocab_size * hidden_size
+             layer_params = 4 * hidden_size * hidden_size * num_layers  # FFN + attention
+             total = embedding_params + layer_params
+
+             # Apply scaling factor for real-world variance
+             # (accounting for biases, layernorm, etc.)
+             return int(total * 1.2)
+
+         return None
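To see what the mapper produces, here is its field-renaming step and parameter-estimation arithmetic applied to Llama-2-7B-style dimensions. This is a standalone sketch that re-implements only those two steps (the real class also handles MoE fields and model names); `map_fields` is a hypothetical helper name. It also shows how rough the estimator is: for dimensions whose real model has about 6.7B parameters, the formula lands near 2.7B, which is why a config-supplied `num_parameters` is preferred when available.

```python
# Subset of HuggingFaceConfigMapper.FIELD_MAPPINGS (HF name -> calculator name)
FIELD_MAPPINGS = {
    "hidden_size": "hidden_size",
    "num_hidden_layers": "num_layers",
    "num_attention_heads": "num_attention_heads",
    "vocab_size": "vocab_size",
    "max_position_embeddings": "max_seq_len",
}

def map_fields(hf_config: dict) -> dict:
    """Rename recognized HF config keys to calculator field names."""
    return {
        ours: int(hf_config[theirs])
        for theirs, ours in FIELD_MAPPINGS.items()
        if theirs in hf_config
    }

def estimate_num_parameters(hidden_size: int, num_layers: int, vocab_size: int) -> int:
    """Same arithmetic as _estimate_num_parameters above."""
    embedding_params = vocab_size * hidden_size
    layer_params = 4 * hidden_size * hidden_size * num_layers  # FFN + attention
    return int((embedding_params + layer_params) * 1.2)  # fudge for biases, layernorm

# Llama-2-7B-style config.json values
llama_cfg = {
    "hidden_size": 4096,
    "num_hidden_layers": 32,
    "num_attention_heads": 32,
    "vocab_size": 32000,
    "max_position_embeddings": 4096,
}
mapped = map_fields(llama_cfg)
params = estimate_num_parameters(
    mapped["hidden_size"], mapped["num_layers"], mapped["vocab_size"]
)
# params comes out around 2.7e9 — well short of the real ~6.7e9
```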
src/gpu_mem_calculator/inference/__pycache__/__init__.cpython-312.pyc DELETED
Binary file (527 Bytes)
 
src/gpu_mem_calculator/inference/__pycache__/base.cpython-312.pyc DELETED
Binary file (6.65 kB)
 
src/gpu_mem_calculator/inference/__pycache__/calculator.cpython-312.pyc DELETED
Binary file (4.26 kB)
 
src/gpu_mem_calculator/inference/__pycache__/huggingface.cpython-312.pyc DELETED
Binary file (3.56 kB)
 
src/gpu_mem_calculator/inference/__pycache__/sglang.cpython-312.pyc DELETED
Binary file (6.34 kB)
 
src/gpu_mem_calculator/inference/__pycache__/tensorrt_llm.cpython-312.pyc DELETED
Binary file (3.83 kB)
 
src/gpu_mem_calculator/inference/__pycache__/tgi.cpython-312.pyc DELETED
Binary file (3.91 kB)
 
src/gpu_mem_calculator/inference/__pycache__/vllm.cpython-312.pyc DELETED
Binary file (5.02 kB)
 
src/gpu_mem_calculator/utils/__pycache__/__init__.cpython-312.pyc DELETED
Binary file (360 Bytes)
 
src/gpu_mem_calculator/utils/__pycache__/precision.cpython-312.pyc DELETED
Binary file (3.01 kB)
 
web/__pycache__/__init__.cpython-312.pyc DELETED
Binary file (212 Bytes)
 
web/__pycache__/app.cpython-312.pyc DELETED
Binary file (43.8 kB)
 
web/app.py CHANGED
@@ -10,7 +10,7 @@ from fastapi import FastAPI, HTTPException
  from fastapi.middleware.cors import CORSMiddleware
  from fastapi.staticfiles import StaticFiles
  from fastapi.templating import Jinja2Templates
- from pydantic import BaseModel, Field, field_validator, model_validator
+ from pydantic import BaseModel, ConfigDict, Field, field_validator, model_validator
  from starlette.requests import Request

  from gpu_mem_calculator.config.presets import load_presets
@@ -29,6 +29,14 @@ from gpu_mem_calculator.core.models import (
  )
  from gpu_mem_calculator.core.multinode import MultiNodeCalculator
  from gpu_mem_calculator.exporters.manager import ExportFormat, ExportManager
+ from gpu_mem_calculator.huggingface import (
+     HuggingFaceClient,
+     HuggingFaceConfigMapper,
+     HuggingFaceError,
+     InvalidConfigError,
+     ModelNotFoundError,
+     PrivateModelAccessError,
+ )
  from gpu_mem_calculator.inference.calculator import InferenceMemoryCalculator

  # Configure logging
@@ -154,6 +162,15 @@ class PresetInfo(BaseModel):
      config: dict[str, Any]


+ class HuggingFaceRequest(BaseModel):
+     """Request for fetching HuggingFace model metadata."""
+
+     model_config = ConfigDict(protected_namespaces=())
+
+     model_id: str = Field(description="HuggingFace model ID (e.g., meta-llama/Llama-2-7b-hf)")
+     token: str | None = Field(default=None, description="HF token for private models")
+
+
  # Simple in-memory cache for calculation results
  # In production, use Redis or similar
  _calculation_cache: dict[str, tuple[MemoryResult, float]] = {}  # key -> (result, timestamp)
@@ -278,6 +295,87 @@ async def get_preset(preset_name: str) -> dict[str, Any]:
      return PRESETS[preset_name].config


+ @app.post("/api/hf/fetch")
+ async def fetch_huggingface_model(request: HuggingFaceRequest) -> dict[str, Any]:
+     """Fetch model metadata from HuggingFace Hub.
+
+     Args:
+         request: Request with model_id and optional token
+
+     Returns:
+         Model config with fields filled from HF, plus list of missing fields
+
+     Raises:
+         HTTPException: If model not found, access denied, or invalid config
+     """
+     try:
+         # Initialize HF client
+         client = HuggingFaceClient(token=request.token)
+
+         # Fetch metadata
+         metadata = await client.fetch_model_metadata(request.model_id)
+
+         # Map to ModelConfig
+         mapper = HuggingFaceConfigMapper()
+         result = mapper.map_to_model_config(metadata["config"], metadata.get("model_info"))
+
+         return {
+             "model_id": request.model_id,
+             "config": result["config"],
+             "missing_fields": result["missing_fields"],
+             "found_fields": result["found_fields"],
+             "warnings": [],
+         }
+
+     except PrivateModelAccessError as e:
+         raise HTTPException(
+             status_code=401,
+             detail={
+                 "error": "Authentication required",
+                 "message": str(e),
+                 "type": "auth_error",
+             },
+         ) from e
+     except ModelNotFoundError as e:
+         raise HTTPException(
+             status_code=404,
+             detail={
+                 "error": "Model not found",
+                 "message": str(e),
+                 "type": "not_found",
+             },
+         ) from e
+     except InvalidConfigError as e:
+         raise HTTPException(
+             status_code=422,
+             detail={
+                 "error": "Invalid model configuration",
+                 "message": str(e),
+                 "type": "invalid_config",
+             },
+         ) from e
+     except HuggingFaceError as e:
+         logger.error(f"HuggingFace error: {str(e)}")
+         raise HTTPException(
+             status_code=500,
+             detail={
+                 "error": "HuggingFace API error",
+                 "message": str(e),
+                 "type": "api_error",
+             },
+         ) from e
+     except Exception as e:
+         logger.error(f"Unexpected error fetching HF model: {str(e)}", exc_info=True)
+         raise HTTPException(
+             status_code=500,
+             detail={
+                 "error": "Internal server error",
+                 "message": "An unexpected error occurred",
+                 "type": "server_error",
+             },
+         ) from e
+
+
  @app.post("/api/calculate")
  async def calculate_memory(request: CalculateRequest) -> MemoryResult:
      """Calculate GPU memory requirements.
web/static/css/styles.css CHANGED
@@ -530,3 +530,79 @@ header h1 {
  .formula-references a:hover {
      text-decoration: underline;
  }
+
+ /* Hugging Face Integration */
+ .preset-row {
+     display: flex;
+     gap: 10px;
+     align-items: center;
+ }
+
+ .preset-row select {
+     flex: 1;
+ }
+
+ .btn-tertiary {
+     background-color: #ffd700;
+     color: #1e293b;
+     border: none;
+     padding: 10px 16px;
+     border-radius: 6px;
+     font-size: 0.9rem;
+     font-weight: 600;
+     cursor: pointer;
+     transition: all 0.2s;
+     white-space: nowrap;
+ }
+
+ .btn-tertiary:hover {
+     background-color: #f0c000;
+ }
+
+ .btn-tertiary:active {
+     transform: scale(0.98);
+ }
+
+ .hf-fetch-panel {
+     background: #f0f9ff;
+     border: 1px solid #2563eb;
+     border-radius: 8px;
+     padding: 20px;
+     margin-top: 15px;
+ }
+
+ .hf-fetch-panel .form-group {
+     margin-bottom: 15px;
+ }
+
+ .hf-fetch-panel .help-text {
+     display: block;
+     font-size: 0.85rem;
+     color: var(--text-secondary);
+     margin-top: 5px;
+ }
+
+ .loading-message {
+     color: var(--primary-color);
+     font-style: italic;
+     padding: 10px;
+     text-align: center;
+ }
+
+ .error-message {
+     background-color: #fee;
+     color: #c00;
+     padding: 12px;
+     border-radius: 6px;
+     border: 1px solid #fcc;
+     margin-top: 10px;
+ }
+
+ .success-message {
+     background-color: #efe;
+     color: #0a0;
+     padding: 12px;
+     border-radius: 6px;
+     border: 1px solid #cfc;
+     margin-top: 10px;
+ }
web/static/js/app.js CHANGED
@@ -32,6 +32,19 @@ class GPUMemCalculator {
32
  }
33
  });
34
 
 
 
 
 
 
 
 
 
 
 
 
 
 
35
  // Batch size slider sync
36
  const batchSizeInput = document.getElementById('batch-size');
37
  const batchSizeSlider = document.getElementById('batch-size-slider');
@@ -475,6 +488,171 @@ class GPUMemCalculator {
475
  }
476
  }
477
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
478
  applyConfig(config) {
479
  // Set flag to prevent auto-calculation during config load
480
  this.isApplyingConfig = true;
 
32
  }
33
  });
34
 
35
+ // Hugging Face fetch functionality
36
+ document.getElementById('fetch-hf-btn').addEventListener('click', () => {
37
+ this.showHFPFetchPanel();
38
+ });
39
+
40
+ document.getElementById('hf-fetch-submit').addEventListener('click', () => {
41
+ this.fetchFromHuggingFace();
42
+ });
43
+
44
+ document.getElementById('hf-fetch-cancel').addEventListener('click', () => {
45
+ this.hideHFFetchPanel();
46
+ });
47
+
48
  // Batch size slider sync
49
  const batchSizeInput = document.getElementById('batch-size');
50
  const batchSizeSlider = document.getElementById('batch-size-slider');
 
488
  }
489
  }
490
 
491
+ showHFPFetchPanel() {
492
+ const panel = document.getElementById('hf-fetch-panel');
493
+ panel.style.display = 'block';
494
+
495
+ // Auto-focus model ID input
496
+ document.getElementById('hf-model-id').focus();
497
+
498
+ // Clear previous messages
499
+ document.getElementById('hf-error').style.display = 'none';
500
+ document.getElementById('hf-success').style.display = 'none';
501
+ }
502
+
503
+ hideHFFetchPanel() {
504
+ const panel = document.getElementById('hf-fetch-panel');
505
+ panel.style.display = 'none';
506
+
507
+ // Clear inputs
508
+ document.getElementById('hf-model-id').value = '';
509
+ document.getElementById('hf-token').value = '';
510
+
511
+ // Clear messages
512
+ document.getElementById('hf-loading').style.display = 'none';
513
+ document.getElementById('hf-error').style.display = 'none';
514
+ document.getElementById('hf-success').style.display = 'none';
515
+ }
516
+
517
+ async fetchFromHuggingFace() {
518
+ const modelId = document.getElementById('hf-model-id').value.trim();
519
+ const token = document.getElementById('hf-token').value.trim();
520
+
521
+ if (!modelId) {
522
+ document.getElementById('hf-error').textContent = 'Please enter a model ID';
523
+ document.getElementById('hf-error').style.display = 'block';
524
+ return;
525
+ }
526
+
527
+ const loadingEl = document.getElementById('hf-loading');
528
+ const errorEl = document.getElementById('hf-error');
529
+ const successEl = document.getElementById('hf-success');
530
+ const submitBtn = document.getElementById('hf-fetch-submit');
531
+
532
+ // Show loading state
533
+ loadingEl.style.display = 'block';
534
+ errorEl.style.display = 'none';
535
+ successEl.style.display = 'none';
536
+ submitBtn.disabled = true;
537
+
538
+ try {
539
+ const response = await fetch(`${this.apiBase}/hf/fetch`, {
540
+ method: 'POST',
541
+ headers: {
542
+ 'Content-Type': 'application/json',
543
+ },
544
+ body: JSON.stringify({
545
+ model_id: modelId,
546
+ token: token || null,
547
+ }),
548
+ });
549
+
550
+ const result = await response.json();
551
+
552
+ if (!response.ok) {
553
+ throw new Error(result.detail?.message || result.detail || 'Failed to fetch model');
554
+ }
555
+
556
+ // Apply fetched config
557
+ this.applyHuggingFaceConfig(result.config);
558
+
559
+ // Show success message
560
+ let successMsg = `Successfully fetched ${modelId}`;
561
+ if (result.missing_fields.length > 0) {
562
+ const missingList = result.missing_fields.join(', ');
+ successMsg += `. Please provide manually: ${missingList}`;
+ // Highlight missing fields
+ result.missing_fields.forEach(field => {
+ const input = document.getElementById(this.getFieldIdFromConfigField(field));
+ if (input) {
+ input.style.borderColor = '#f59e0b';
+ input.style.borderWidth = '2px';
+ }
+ });
+ } else {
+ successMsg += '. All fields populated!';
+ }
+ successEl.textContent = successMsg;
+ successEl.style.display = 'block';
+
+ // Hide panel after 3 seconds
+ setTimeout(() => {
+ this.hideHFFetchPanel();
+ }, 3000);
+
+ } catch (error) {
+ errorEl.textContent = `Error: ${error.message}`;
+ errorEl.style.display = 'block';
+ } finally {
+ loadingEl.style.display = 'none';
+ submitBtn.disabled = false;
+ }
+ }
+
+ applyHuggingFaceConfig(config) {
+ // Set flag to prevent auto-calculation
+ this.isApplyingConfig = true;
+
+ // Apply model fields
+ if (config.name) {
+ document.getElementById('model-name').value = config.name;
+ }
+ if (config.num_parameters) {
+ document.getElementById('num-params').value = config.num_parameters;
+ }
+ if (config.num_layers) {
+ document.getElementById('num-layers').value = config.num_layers;
+ }
+ if (config.hidden_size) {
+ document.getElementById('hidden-size').value = config.hidden_size;
+ }
+ if (config.num_attention_heads) {
+ document.getElementById('num-heads').value = config.num_attention_heads;
+ }
+ if (config.vocab_size) {
+ document.getElementById('vocab-size').value = config.vocab_size;
+ }
+ if (config.max_seq_len) {
+ document.getElementById('seq-len').value = config.max_seq_len;
+ }
+
+ // Apply MoE configuration
+ if (config.moe_enabled) {
+ document.getElementById('moe-enabled').checked = true;
+ this.toggleMoEFields(true);
+
+ if (config.num_experts) {
+ document.getElementById('num-experts').value = config.num_experts;
+ }
+ if (config.top_k) {
+ document.getElementById('top-k').value = config.top_k;
+ }
+ this.updateMoEDisplay();
+ } else {
+ document.getElementById('moe-enabled').checked = false;
+ this.toggleMoEFields(false);
+ }
+
+ // Re-enable auto-calculation and trigger calculation
+ setTimeout(() => {
+ this.isApplyingConfig = false;
+ this.calculateMemory();
+ }, 100);
+ }
+
+ getFieldIdFromConfigField(fieldName) {
+ // Map config field names to input element IDs
+ const fieldMap = {
+ 'num_parameters': 'num-params',
+ 'num_layers': 'num-layers',
+ 'hidden_size': 'hidden-size',
+ 'num_attention_heads': 'num-heads',
+ 'vocab_size': 'vocab-size',
+ 'max_seq_len': 'seq-len',
+ };
+ return fieldMap[fieldName] || null;
+ }
+
  applyConfig(config) {
  // Set flag to prevent auto-calculation during config load
  this.isApplyingConfig = true;
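The success-message branch in `fetchFromHuggingFace()` above can be read as a pure function over the `/hf/fetch` response, which the handler assumes has the shape `{ config, missing_fields }`. A minimal sketch for reference (the function name `buildFetchSuccessMessage` is hypothetical and not part of this diff):

```javascript
// Hypothetical refactor of the success-message logic in fetchFromHuggingFace().
// Assumes /hf/fetch responds with { config: {...}, missing_fields: [...] }.
function buildFetchSuccessMessage(modelId, missingFields) {
  let msg = `Successfully fetched ${modelId}`;
  if (missingFields && missingFields.length > 0) {
    // Some config fields could not be read from the Hub; ask the user for them.
    msg += `. Please provide manually: ${missingFields.join(', ')}`;
  } else {
    msg += '. All fields populated!';
  }
  return msg;
}
```

Factoring the message out this way would make the branch unit-testable without a DOM, while the handler stays responsible for display state and field highlighting.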
web/templates/index.html CHANGED
@@ -32,23 +32,51 @@
  <h3>Model Settings</h3>
  <div class="form-group">
  <label for="preset-select">Preset Model:</label>
- <select id="preset-select">
- <option value="custom">Custom</option>
- <optgroup label="Dense Models">
- <option value="llama2-7b">LLaMA 2 7B</option>
- <option value="llama2-13b">LLaMA 2 13B</option>
- <option value="llama2-70b">LLaMA 2 70B</option>
- <option value="gpt3-175b">GPT-3 175B</option>
- </optgroup>
- <optgroup label="MoE (Mixture of Experts) Models">
- <option value="glm-4.7-355b">GLM-4.7 355B (MoE) Latest</option>
- <option value="glm-4.5-air-106b">GLM-4.5 Air 106B (MoE) ⭐ Air</option>
- <option value="glm-4-9b">GLM-4 9B (MoE)</option>
- <option value="mixtral-8x7b">Mixtral 8x7B (MoE)</option>
- <option value="qwen1.5-moe-a2.7b">Qwen1.5-MoE-A2.7B</option>
- <option value="deepseek-moe-16b">DeepSeek-MoE 16B</option>
- </optgroup>
- </select>
+ <div class="preset-row">
+ <select id="preset-select">
+ <option value="custom">Custom</option>
+ <optgroup label="Dense Models">
+ <option value="llama2-7b">LLaMA 2 7B</option>
+ <option value="llama2-13b">LLaMA 2 13B</option>
+ <option value="llama2-70b">LLaMA 2 70B</option>
+ <option value="gpt3-175b">GPT-3 175B</option>
+ </optgroup>
+ <optgroup label="MoE (Mixture of Experts) Models">
+ <option value="glm-4.7-355b">GLM-4.7 355B (MoE) ⭐ Latest</option>
+ <option value="glm-4.5-air-106b">GLM-4.5 Air 106B (MoE) ⭐ Air</option>
+ <option value="glm-4-9b">GLM-4 9B (MoE)</option>
+ <option value="mixtral-8x7b">Mixtral 8x7B (MoE)</option>
+ <option value="qwen1.5-moe-a2.7b">Qwen1.5-MoE-A2.7B</option>
+ <option value="deepseek-moe-16b">DeepSeek-MoE 16B</option>
+ </optgroup>
+ </select>
+ <button id="fetch-hf-btn" class="btn-tertiary" title="Fetch from HuggingFace Hub" type="button">
+ <span>🤗 Fetch from HF</span>
+ </button>
+ </div>
+ </div>
+
+ <!-- HF Fetch Panel (hidden by default) -->
+ <div id="hf-fetch-panel" style="display: none;" class="hf-fetch-panel">
+ <div class="form-group">
+ <label for="hf-model-id">HuggingFace Model ID:</label>
+ <input type="text" id="hf-model-id" placeholder="e.g., meta-llama/Llama-2-7b-hf" aria-describedby="hf-model-help">
+ <span id="hf-model-help" class="help-text">Enter the HuggingFace model repository ID (e.g., meta-llama/Llama-2-7b-hf)</span>
+ </div>
+ <div class="form-group">
+ <label for="hf-token">HF Token (optional, for private models):</label>
+ <input type="password" id="hf-token" placeholder="hf_xxxxxxxxxxxx" aria-describedby="hf-token-help">
+ <span id="hf-token-help" class="help-text">Leave empty for public models, provide token for gated/private models</span>
+ </div>
+ <div class="button-group">
+ <button id="hf-fetch-submit" class="btn-primary" type="button">Fetch Model</button>
+ <button id="hf-fetch-cancel" class="btn-secondary" type="button">Cancel</button>
+ </div>
+ <div id="hf-loading" style="display: none;" class="loading-message">
+ <p>Fetching model from HuggingFace Hub...</p>
+ </div>
+ <div id="hf-error" style="display: none;" class="error-message" aria-live="polite"></div>
+ <div id="hf-success" style="display: none;" class="success-message" aria-live="polite"></div>
  </div>

  <div class="form-grid">