Create app.py
app.py
ADDED
@@ -0,0 +1,37 @@
+import gradio as gr
+from huggingface_hub import hf_hub_download
+from llama_cpp import Llama
+import time
+
+# DOWNLOAD the correct 6.7B model
+MODEL_NAME = "deepseek-coder-6.7b-instruct.Q4_K_M.gguf"
+model_path = hf_hub_download(
+    repo_id="TheBloke/DeepSeek-Coder-6.7B-Instruct-GGUF",
+    filename=MODEL_NAME,
+    local_dir="./models"
+)
+
+# LOAD model (once on startup)
+print("Loading model...")
+llm = Llama(
+    model_path=model_path,
+    n_ctx=2048,
+    n_threads=2,
+    n_gpu_layers=0,
+    verbose=False
+)
+print("Model loaded. Ready.")
+
+# GENERATION function
+def generate(prompt, max_tokens=512):
+    response = llm(
+        f"### Instruction:\n{prompt}\n\n### Response:\n",
+        max_tokens=max_tokens,
+        stop=["###", "</s>"],
+        echo=False
+    )
+    return response['choices'][0]['text']
+
+# SIMPLE Gradio UI (also provides API endpoint)
+iface = gr.Interface(fn=generate, inputs="textbox", outputs="text")
+iface.launch()
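
Since gr.Interface exposes the wrapped function as an API endpoint (as the final comment in app.py notes), the Space can also be queried programmatically once it is running. A minimal client sketch, assuming the official gradio_client package and a hypothetical Space ID user/space-name (substitute the actual Space):

from gradio_client import Client

# Connect to the running Space (hypothetical ID; replace with the real one).
client = Client("user/space-name")

# gr.Interface registers its function under /predict by default;
# the single positional argument maps to the textbox input of generate().
result = client.predict(
    "Write a Python function that reverses a string.",
    api_name="/predict",
)
print(result)

Note that with n_gpu_layers=0 and n_threads=2 the model runs entirely on CPU, so a 6.7B Q4_K_M model may take a while per request; expect the Space to respond slowly under the free hardware tier.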