Spaces:

visamram02
/

VisamIntelli-Flash

Sleeping

visamram02 commited on Mar 15

Commit

47a7f51

verified ·

1 Parent(s): 69c40e4

Upload folder using huggingface_hub

Files changed (2) hide show

Dockerfile ADDED Viewed

+FROM python:3.10-slim
+# Install system dependencies
+RUN apt-get update && apt-get install -y \
+    build-essential \
+    wget \
+    && rm -rf /var/lib/apt/lists/*
+# Install llama-cpp-python with server extra
+RUN pip install --no-cache-dir llama-cpp-python[server]
+# Download the model (Qwen 2.5 7B Instruct Quantized)
+RUN wget https://huggingface.co/Qwen/Qwen2.5-7B-Instruct-GGUF/resolve/main/qwen2.5-7b-instruct-q4_k_m.gguf -O model.gguf
+# Expose the HF Space port
+EXPOSE 7860
+# Run the OpenAI-compatible server
+CMD ["python3", "-m", "llama_cpp.server", "--model", "model.gguf", "--host", "0.0.0.0", "--port", "7860", "--n_ctx", "4096"]

README.md CHANGED Viewed

@@ -1,10 +1,10 @@
 ---
-title: VisamIntelli Flash
-emoji: 🦀
-colorFrom: red
-colorTo: indigo
 sdk: docker
 pinned: false
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

 ---
+title: VisamIntelli-Flash
+emoji: ⚡
+colorFrom: blue
+colorTo: green
 sdk: docker
 pinned: false
 ---
+# VisamIntelli-Flash
+Custom efficient inference server for VisamIntelliAI using Qwen 2.5 7B (Quantized).