Spaces: GhostScientist/smollm2-360m-function-calling-chat
GhostScientist committed on Dec 17, 2025

Commit 913da61 (verified) · 1 Parent(s): 856e997

Upload folder using huggingface_hub

Files changed (3)

README.md +25 -5
app.py +85 -0
requirements.txt +5 -0

README.md CHANGED

@@ -1,12 +1,32 @@
 ---
-title: Smollm2 360m Function Calling Chat
-emoji: 🐠
 colorFrom: blue
-colorTo: red
 sdk: gradio
-sdk_version: 6.1.0
 app_file: app.py
 pinned: false
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

 ---
+title: SmolLM2 360M Function Calling
+emoji: 🔧
 colorFrom: blue
+colorTo: green
 sdk: gradio
+sdk_version: 5.9.1
 app_file: app.py
 pinned: false
+license: apache-2.0
+suggested_hardware: zero-a10g
 ---
+# SmolLM2 360M Function Calling Chat
+A chat interface for [GhostScientist/smollm2-360m-function-calling-sft](https://huggingface.co/GhostScientist/smollm2-360m-function-calling-sft), a fine-tuned version of SmolLM2-360M-Instruct for function calling tasks.
+## About the Model
+This model was fine-tuned using SFT (Supervised Fine-Tuning) with TRL on the SmolLM2-360M-Instruct base model. It's designed to handle function calling scenarios.
+## Usage
+Simply type your message in the chat interface. You can adjust:
+- **System message**: Customize the assistant's behavior
+- **Max tokens**: Control response length
+- **Temperature**: Adjust creativity (higher = more creative)
+- **Top-p**: Control response diversity
+## Hardware
+This Space runs on ZeroGPU for free on-demand GPU access.
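
Reviewer note on the README's Usage section: the Temperature and Top-p controls correspond to two standard sampling steps. The pure-Python sketch below is only an illustration of those steps, not the code path that `transformers` runs internally. It shows how temperature rescales logits before the softmax and how top-p keeps the smallest set of most likely tokens whose cumulative probability reaches the threshold. The function names and the toy logits are hypothetical.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing them by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative mass reaches top_p,
    then renormalize the survivors. Returns {token_index: probability}."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}

# Hypothetical logits for a four-token vocabulary.
toy_logits = [2.0, 1.0, 0.5, -1.0]
sharp = softmax_with_temperature(toy_logits, 0.1)   # low temperature: peaked
flat = softmax_with_temperature(toy_logits, 1.5)    # high temperature: flatter
nucleus = top_p_filter(softmax_with_temperature(toy_logits, 0.7), 0.95)
```

Lower temperatures concentrate probability on the top token, while a lower top-p shrinks the candidate set that sampling draws from.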

app.py ADDED

	@@ -0,0 +1,85 @@

+import gradio as gr
+import spaces
+import torch
+from transformers import AutoModelForCausalLM, AutoTokenizer
+MODEL_ID = "GhostScientist/smollm2-360m-function-calling-sft"
+# Load tokenizer at startup
+tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
+# Global model - loaded lazily on first GPU call for faster Space startup
+model = None
+def load_model():
+    global model
+    if model is None:
+        model = AutoModelForCausalLM.from_pretrained(
+            MODEL_ID,
+            torch_dtype=torch.float16,
+            device_map="auto",
+        )
+    return model
+@spaces.GPU(duration=120)
+def generate_response(message, history, system_message, max_tokens, temperature, top_p):
+    model = load_model()
+    messages = [{"role": "system", "content": system_message}]
+    for item in history:
+        if isinstance(item, (list, tuple)) and len(item) == 2:
+            user_msg, assistant_msg = item
+            if user_msg:
+                messages.append({"role": "user", "content": user_msg})
+            if assistant_msg:
+                messages.append({"role": "assistant", "content": assistant_msg})
+    messages.append({"role": "user", "content": message})
+    text = tokenizer.apply_chat_template(
+        messages,
+        tokenize=False,
+        add_generation_prompt=True
+    )
+    inputs = tokenizer([text], return_tensors="pt").to(model.device)
+    with torch.no_grad():
+        outputs = model.generate(
+            **inputs,
+            max_new_tokens=int(max_tokens),
+            temperature=float(temperature),
+            top_p=float(top_p),
+            do_sample=True,
+            pad_token_id=tokenizer.eos_token_id,
+        )
+    response = tokenizer.decode(
+        outputs[0][inputs['input_ids'].shape[1]:],
+        skip_special_tokens=True
+    )
+    return response
+demo = gr.ChatInterface(
+    generate_response,
+    title="SmolLM2 360M Function Calling",
+    description="A fine-tuned SmolLM2-360M model for function calling, powered by ZeroGPU (free!)",
+    additional_inputs=[
+        gr.Textbox(
+            value="You are a helpful assistant that can call functions when needed.",
+            label="System message",
+            lines=2
+        ),
+        gr.Slider(minimum=64, maximum=2048, value=512, step=64, label="Max tokens"),
+        gr.Slider(minimum=0.1, maximum=1.5, value=0.7, step=0.1, label="Temperature"),
+        gr.Slider(minimum=0.1, maximum=1.0, value=0.95, step=0.05, label="Top-p"),
+    ],
+    examples=[
+        ["Hello! What can you help me with?"],
+        ["What's the weather like in San Francisco?"],
+        ["Can you search for the latest news about AI?"],
+    ],
+)
+if __name__ == "__main__":
+    demo.launch()
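
Reviewer note on the history loop in `generate_response`: it accepts only two-element list or tuple entries, so history items in any other shape are skipped silently. If the `ChatInterface` were ever configured to pass history as role/content dictionaries (an assumption, since this commit does not set that option), earlier turns would be dropped from the prompt. A minimal sketch of a helper that accepts both shapes follows. `build_messages` is a hypothetical name and not part of the commit.

```python
def build_messages(message, history, system_message):
    """Assemble a chat-template message list from Gradio-style history.

    Accepts pair-style entries (user, assistant), as app.py does, and,
    as an assumption, dict entries with "role" and "content" keys.
    """
    messages = [{"role": "system", "content": system_message}]
    for item in history:
        if isinstance(item, dict):
            # Assumed dict-style entry: keep only non-empty user/assistant turns.
            role, content = item.get("role"), item.get("content")
            if role in ("user", "assistant") and content:
                messages.append({"role": role, "content": content})
        elif isinstance(item, (list, tuple)) and len(item) == 2:
            # Pair-style entry, matching the loop in app.py.
            user_msg, assistant_msg = item
            if user_msg:
                messages.append({"role": "user", "content": user_msg})
            if assistant_msg:
                messages.append({"role": "assistant", "content": assistant_msg})
    messages.append({"role": "user", "content": message})
    return messages

pair_history = [("Hi", "Hello!"), ("Ping", None)]
dict_history = [
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello!"},
]
from_pairs = build_messages("What's next?", pair_history, "Be brief.")
from_dicts = build_messages("What's next?", dict_history, "Be brief.")
```

Keeping this logic in a plain function also makes it testable without loading the model or requesting a GPU.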

requirements.txt ADDED

	@@ -0,0 +1,5 @@

+gradio>=5.0.0
+torch
+transformers
+accelerate
+spaces