manishksharma98 committed
Commit 66a7f89 · verified · 1 Parent(s): 0a3bf97

Viona: HF Space — InferenceClient, Dockerfile PORT, README metadata

Files changed (6)
  1. Dockerfile +9 -12
  2. README.md +23 -24
  3. ai/__init__.py +5 -0
  4. ai/llm.py +88 -0
  5. app.py +63 -98
  6. requirements.txt +4 -4
Dockerfile CHANGED
@@ -1,20 +1,17 @@
- FROM python:3.10-slim
-
- # Install system dependencies including zstd
- RUN apt-get update && apt-get install -y curl zstd \
-     && rm -rf /var/lib/apt/lists/*
+ FROM python:3.11-slim
 
  WORKDIR /app
 
- # Install Python dependencies
+ ENV PYTHONUNBUFFERED=1 \
+     PIP_NO_CACHE_DIR=1
+
  COPY requirements.txt .
  RUN pip install --no-cache-dir -r requirements.txt
 
- # Install Ollama
- RUN curl -fsSL https://ollama.com/install.sh | sh
+ COPY app.py .
+ COPY ai ./ai
 
- COPY . .
- RUN chmod +x start.sh
+ EXPOSE 7860
 
- # start.sh: starts ollama serve, pulls tinyllama at runtime, then runs FastAPI
- CMD ["./start.sh"]
+ # Hugging Face Spaces sets PORT; default to 7860 for local Docker tests
+ CMD ["sh", "-c", "exec streamlit run app.py --server.port=${PORT:-7860} --server.address=0.0.0.0 --server.headless=true --browser.gatherUsageStats=false"]
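For the local Docker test mentioned in the CMD comment, a build-and-run along these lines should work from the repo root (a sketch, not part of the commit; the image tag and secret values are placeholders):

```sh
docker build -t viona-space .
# PORT is unset locally, so the CMD falls back to 7860.
docker run --rm -p 7860:7860 \
  -e HF_MODEL_ID="mistralai/Mistral-7B-Instruct-v0.3" \
  -e HF_API_TOKEN="hf_xxx" \
  viona-space
# Then open http://localhost:7860 in a browser.
```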
README.md CHANGED
@@ -1,39 +1,38 @@
  ---
- title: Personal AI
- emoji: 🦙
- colorFrom: purple
- colorTo: blue
+ title: CyberWatch / Viona
+ emoji: 🛡️
+ colorFrom: blue
+ colorTo: indigo
  sdk: docker
  pinned: false
  license: mit
+ short_description: Viona cyber assistant via Hugging Face Inference
  ---
 
- # 🚀 Personal AI with Ollama on Hugging Face Spaces
+ # CyberWatch / Viona (Hugging Face Space)
 
- This Space runs **Ollama** inside a Docker container on Hugging Face's free tier.
- It uses **FastAPI** to expose endpoints on port 7860, which proxy requests to Ollama's internal API (`127.0.0.1:11434`).
+ Streamlit UI that calls your model through **Hugging Face Inference** using `huggingface_hub.InferenceClient` (the current router, not the legacy `api-inference` URL).
 
- ## Updated Dockerfile
+ ## Secrets / variables (Space Settings)
 
- ```dockerfile
- FROM python:3.10-slim
+ | Name | Required | Notes |
+ |------|----------|-------|
+ | `HF_MODEL_ID` | **Yes** | e.g. `mistralai/Mistral-7B-Instruct-v0.3` — must be enabled for **Inference** on your account |
+ | `HF_API_TOKEN` | Recommended | User token with **Inference** scope; use it if the default Space token is not enough |
+ | `HF_TOKEN` | Auto | Hugging Face injects this in Spaces; used when `HF_API_TOKEN` is unset |
+ | `APP_NAME` | No | Default: `CyberWatch` |
+ | `LOG_LEVEL` | No | Default: `INFO` |
 
- # Install system dependencies including zstd (needed for Ollama)
- RUN apt-get update && apt-get install -y curl zstd \
-     && rm -rf /var/lib/apt/lists/*
+ Add secrets under **Settings → Secrets and variables → New secret**. Do not commit tokens.
 
- WORKDIR /app
+ ## Deploy
 
- # Python dependencies
- COPY requirements.txt .
- RUN pip install --no-cache-dir -r requirements.txt
+ 1. Create a **Docker** Space and push this folder as the repo root (or symlink these files into your Space repository).
+ 2. Set **`HF_MODEL_ID`** (and optionally **`HF_API_TOKEN`**) in the Space settings.
+ 3. Rebuild the Space.
 
- # Install Ollama
- RUN curl -fsSL https://ollama.com/install.sh | sh
+ Local multipage tools (training history, MongoDB) stay in the main Viona repo; this image stays small for fast cold starts.
 
- COPY . .
- RUN chmod +x start.sh
+ ### Streamlit SDK instead of Docker
 
- # start.sh starts ollama serve, pulls tinyllama at runtime, then runs FastAPI on 7860
- CMD ["./start.sh"]
- ```
+ To use the hosted **Streamlit** builder (no `Dockerfile`), set the README frontmatter to `sdk: streamlit` and `app_file: app.py`, keep `requirements.txt` and `app.py` at the repo root, and delete or ignore `Dockerfile`. Secrets behave the same.
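For that Streamlit-SDK alternative, the README frontmatter would look roughly like the sketch below (illustrative only; this commit keeps `sdk: docker`, and every field other than `sdk` and `app_file` simply mirrors the metadata added above):

```yaml
---
title: CyberWatch / Viona
emoji: 🛡️
colorFrom: blue
colorTo: indigo
sdk: streamlit        # hosted Streamlit builder instead of the Dockerfile
app_file: app.py      # entry point picked up by the builder
pinned: false
license: mit
short_description: Viona cyber assistant via Hugging Face Inference
---
```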
 
 
ai/__init__.py ADDED
@@ -0,0 +1,5 @@
+ """Viona LLM integration (Hugging Face Inference API)."""
+ 
+ from ai.llm import viona_complete
+ 
+ __all__ = ["viona_complete"]
ai/llm.py ADDED
@@ -0,0 +1,88 @@
+ """Viona on Hugging Face Spaces — Inference API only (no local/Ollama paths).
+ 
+ Spaces expose HF_TOKEN automatically; optional HF_API_TOKEN overrides.
+ """
+ 
+ from __future__ import annotations
+ 
+ import logging
+ import os
+ 
+ from dotenv import load_dotenv
+ from huggingface_hub import InferenceClient
+ from huggingface_hub.utils import HfHubHTTPError
+ 
+ load_dotenv()
+ logger = logging.getLogger(__name__)
+ 
+ VIONA_SYSTEM = """You are Viona, an expert cybersecurity research assistant.
+ Your name is Viona. If asked who you are, say you are Viona, a cybersecurity research assistant.
+ Do not reveal underlying model or provider names."""
+ 
+ 
+ def _hf_token() -> str:
+     return (
+         os.getenv("HF_API_TOKEN", "").strip()
+         or os.getenv("HF_TOKEN", "").strip()
+         or os.getenv("HUGGING_FACE_HUB_TOKEN", "").strip()
+     )
+ 
+ 
+ def _hf_model_id() -> str:
+     return os.getenv("HF_MODEL_ID", "").strip()
+ 
+ 
+ def inference_backend_label() -> str:
+     return "hf"
+ 
+ 
+ def viona_complete(user_prompt: str, timeout: int = 120) -> str:
+     """Call Hugging Face Inference (router) via InferenceClient."""
+     token = _hf_token()
+     model_id = _hf_model_id()
+     if not token:
+         return (
+             "Viona: No Hugging Face token found. In this Space go to **Settings → Secrets and variables "
+             "→ Secrets** and add `HF_API_TOKEN` (recommended), or rely on the default `HF_TOKEN` "
+             "if your Space has repository access."
+         )
+     if not model_id:
+         return (
+             "Viona: Set **HF_MODEL_ID** in Space **Settings → Secrets** (e.g. "
+             "`mistralai/Mistral-7B-Instruct-v0.3`). The model must be allowed for inference on your account."
+         )
+     try:
+         client = InferenceClient(
+             model=model_id,
+             token=token,
+             timeout=float(timeout),
+         )
+         out = client.chat_completion(
+             messages=[
+                 {"role": "system", "content": VIONA_SYSTEM},
+                 {"role": "user", "content": user_prompt},
+             ],
+             max_tokens=512,
+         )
+         if getattr(out, "choices", None):
+             msg = out.choices[0].message
+             content = getattr(msg, "content", None)
+             if content:
+                 return str(content).strip()
+         logger.warning("Unexpected HF chat_completion response shape")
+         return "Viona: Unexpected response from inference. Try again or pick another HF_MODEL_ID."
+     except HfHubHTTPError as e:
+         code = e.response.status_code if getattr(e, "response", None) else "?"
+         logger.error("HF inference HTTP error: %s", code)
+         return (
+             "Viona: Inference returned an HTTP error. Check **HF_MODEL_ID**, token permissions, "
+             "and whether the model supports chat on the Inference API."
+         )
+     except Exception as e:
+         logger.error("HF inference failed: %s", type(e).__name__)
+         return "Viona: Could not reach Hugging Face Inference from this Space. Retry shortly."
+ 
+ 
+ def viona_complete_openai_compat(user_prompt: str, timeout: int = 120) -> str:
+     """Spaces build uses InferenceClient only; kept for API compatibility."""
+     return viona_complete(user_prompt, timeout=timeout)
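A minimal smoke test of the new module outside the Space, assuming the repo root is importable and a real token is substituted (the model ID is just the example from the README; `smoke_test_llm.py` is a hypothetical helper, not part of the commit):

```python
# smoke_test_llm.py: illustrative check of ai/llm.py (not part of this commit)
import os

# Same variables the Space reads from Settings -> Secrets; values are placeholders.
os.environ.setdefault("HF_MODEL_ID", "mistralai/Mistral-7B-Instruct-v0.3")
os.environ.setdefault("HF_API_TOKEN", "hf_xxx")  # replace with a real token

from ai.llm import inference_backend_label, viona_complete

print("backend:", inference_backend_label())  # prints "hf"
print(viona_complete("Summarize CVE triage in one sentence.", timeout=60))
```

Because `viona_complete` returns user-facing strings instead of raising, a missing token or model ID shows up as a "Viona: ..." message rather than an exception.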
app.py CHANGED
@@ -1,110 +1,75 @@
- """
- Gradio UI + FastAPI /api/generate so the Space serves both the web UI and the API.
- Your local app can call: GET/POST .../api/generate
- """
- import os
- import requests
- import gradio as gr
- from fastapi import FastAPI, Query, Body
- from fastapi.responses import JSONResponse
- from fastapi.middleware.cors import CORSMiddleware
- from pydantic import BaseModel
- 
- OLLAMA_HOST = os.environ.get("OLLAMA_HOST", "http://127.0.0.1:11434")
- OLLAMA_MODEL = os.environ.get("OLLAMA_MODEL", "tinyllama")
- API_URL = f"{OLLAMA_HOST}/api/generate"
- 
- 
- def generate_text(prompt):
-     if not (prompt or "").strip():
-         return ""
-     try:
-         r = requests.post(
-             API_URL,
-             json={"model": OLLAMA_MODEL, "prompt": prompt, "stream": False},
-             timeout=120,
-         )
-         r.raise_for_status()
-         return r.json().get("response", "")
-     except requests.RequestException as e:
-         return f"Error: {e}"
- 
- 
- # Gradio app (mounted at /)
- demo = gr.Interface(
-     fn=generate_text,
-     inputs="text",
-     outputs="text",
-     title="Ollama on HF",
-     description="Generate text with Ollama. Call /api/generate from your local app.",
- )
- 
- # FastAPI app: mount Gradio at / and expose /api/generate
- app = FastAPI(title="Ollama on HF")
- app.add_middleware(CORSMiddleware, allow_origins=["*"], allow_methods=["*"], allow_headers=["*"])
- 
- 
- class GenerateBody(BaseModel):
-     prompt: str = ""
-     model: str = OLLAMA_MODEL
- 
- 
- @app.get("/api/generate")
- def api_generate_get(
-     prompt: str = Query(..., description="Text prompt"),
-     model: str = Query(OLLAMA_MODEL, description="Ollama model"),
- ):
-     """GET /api/generate?prompt=your+prompt"""
-     return _do_generate((prompt or "").strip(), model)
- 
- 
- @app.post("/api/generate")
- def api_generate_post(body: GenerateBody = Body(...)):
-     """POST /api/generate with JSON {"prompt": "...", "model": "tinyllama"}"""
-     return _do_generate((body.prompt or "").strip(), body.model or OLLAMA_MODEL)
- 
- 
- def _do_generate(prompt: str, model: str):
-     """Shared logic for GET and POST /api/generate."""
-     if not prompt:
-         return JSONResponse(
-             status_code=400,
-             content={"error": "prompt is required and cannot be empty"},
-         )
-     payload = {"model": model, "prompt": prompt, "stream": False}
-     try:
-         r = requests.post(API_URL, json=payload, timeout=120)
-         r.raise_for_status()
-         data = r.json()
-         return {
-             "response": data.get("response", ""),
-             "model": data.get("model", model),
-             "done": data.get("done", True),
-         }
-     except requests.ConnectionError:
-         return JSONResponse(
-             status_code=503,
-             content={"error": "Ollama is not ready. Wait for the Space to finish loading."},
-         )
-     except requests.HTTPError as e:
-         return JSONResponse(
-             status_code=e.response.status_code,
-             content={"error": e.response.text or str(e)},
-         )
- 
- 
- @app.get("/api/status")
- def api_status():
-     """Check if Ollama is up and list models."""
-     try:
-         r = requests.get(f"{OLLAMA_HOST}/api/tags", timeout=5)
-         r.raise_for_status()
-         data = r.json()
-         models = [m.get("name", "") for m in data.get("models", [])]
-         return {"status": "ok", "ollama": "up", "models": models}
-     except Exception as e:
-         return {"status": "error", "ollama": "down", "models": [], "detail": str(e)}
- 
- 
- # Mount Gradio at root (must be last so /api/* are matched first)
- app = gr.mount_gradio_app(app, demo, path="/")
+ """CyberWatch Streamlit web UI — Viona chat and security tooling entry."""
+ 
+ from __future__ import annotations
+ 
+ import logging
+ import os
+ 
+ import streamlit as st
+ from dotenv import load_dotenv
+ 
+ from ai.llm import inference_backend_label, viona_complete
+ 
+ load_dotenv()
+ logging.basicConfig(level=os.getenv("LOG_LEVEL", "INFO"))
+ 
+ APP_NAME = os.getenv("APP_NAME", "CyberWatch")
+ 
+ THEME_CSS = """
+ @import url('https://fonts.googleapis.com/css2?family=Exo+2:wght@400;600&family=Share+Tech+Mono&display=swap');
+ html, body, [class*="css"] { font-family: 'Exo 2', sans-serif; }
+ h1, h2, h3 { font-family: 'Share Tech Mono', monospace; color: #4fc3f7 !important; }
+ section.main > div { background-color: #0a0e1a; color: #e8eaf0; }
+ .metric-container { background: #121826; border: 1px solid #1e2740; border-radius: 8px; padding: 0.5rem; }
+ """
+ 
+ 
+ def main() -> None:
+     st.set_page_config(page_title=f"{APP_NAME} | Viona", layout="wide")
+     st.markdown(f"<style>{THEME_CSS}</style>", unsafe_allow_html=True)
+ 
+     if "disclaimer_ok" not in st.session_state:
+         st.session_state.disclaimer_ok = False
+ 
+     if not st.session_state.disclaimer_ok:
+         st.title(APP_NAME)
+         st.error(
+             "**Authorized use only.** CyberWatch and Viona are for research and "
+             "defensive security on systems and networks you own or have **explicit written "
+             "permission** to test. Misuse may be illegal."
+         )
+         if st.button("I understand, continue"):
+             st.session_state.disclaimer_ok = True
+             st.rerun()
+         st.stop()
+ 
+     st.title(f"{APP_NAME} / Viona")
+     st.caption("Cybersecurity research assistant")
+ 
+     c1, c2, c3 = st.columns(3)
+     with c1:
+         st.metric("Assistant", "Viona")
+     with c2:
+         st.metric("Inference", inference_backend_label().upper())
+     with c3:
+         st.metric("Log level", os.getenv("LOG_LEVEL", "INFO"))
+ 
+     if "messages" not in st.session_state:
+         st.session_state.messages = []
+ 
+     for m in st.session_state.messages:
+         with st.chat_message(m["role"], avatar="🛡️" if m["role"] == "assistant" else None):
+             label = "Viona" if m["role"] == "assistant" else "You"
+             st.caption(label)
+             st.markdown(m["content"])
+ 
+     prompt = st.chat_input("Message Viona…")
+     if prompt:
+         st.session_state.messages.append({"role": "user", "content": prompt})
+         reply = viona_complete(prompt)
+         st.session_state.messages.append({"role": "assistant", "content": reply})
+         st.rerun()
+ 
+ 
+ if __name__ == "__main__":
+     main()
 
 
requirements.txt CHANGED
@@ -1,4 +1,4 @@
- fastapi>=0.104.0
- uvicorn[standard]>=0.24.0
- gradio>=4.0.0
- requests>=2.28.0
+ streamlit>=1.28.0
+ python-dotenv>=1.0.0
+ requests>=2.31.0
+ huggingface_hub>=0.26.0