GhostScientist committed on
Commit 913da61 · verified · 1 Parent(s): 856e997

Upload folder using huggingface_hub

Files changed (3):
  1. README.md +25 -5
  2. app.py +85 -0
  3. requirements.txt +5 -0
README.md CHANGED
@@ -1,12 +1,32 @@
  ---
- title: Smollm2 360m Function Calling Chat
+ title: SmolLM2 360M Function Calling
- emoji: 🐠
+ emoji: 🔧
  colorFrom: blue
- colorTo: red
+ colorTo: green
  sdk: gradio
- sdk_version: 6.1.0
+ sdk_version: 5.9.1
  app_file: app.py
  pinned: false
+ license: apache-2.0
+ suggested_hardware: zero-a10g
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # SmolLM2 360M Function Calling Chat
+
+ A chat interface for [GhostScientist/smollm2-360m-function-calling-sft](https://huggingface.co/GhostScientist/smollm2-360m-function-calling-sft), a fine-tuned version of SmolLM2-360M-Instruct for function calling tasks.
+
+ ## About the Model
+
+ This model was fine-tuned using SFT (Supervised Fine-Tuning) with TRL on the SmolLM2-360M-Instruct base model. It is designed to handle function calling scenarios.
+
+ ## Usage
+
+ Type your message in the chat interface. You can adjust:
+ - **System message**: customize the assistant's behavior
+ - **Max tokens**: control response length
+ - **Temperature**: adjust creativity (higher = more creative)
+ - **Top-p**: control response diversity
+
+ ## Hardware
+
+ This Space runs on ZeroGPU for free on-demand GPU access.
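The Temperature and Top-p knobs listed in the README map to standard sampling transforms on the model's next-token distribution. A minimal pure-Python sketch of what each one does (illustrative only, not code from this Space; the function names are made up here):

```python
import math

def apply_temperature(logits, temperature):
    # Scale logits by 1/temperature, then softmax.
    # Higher temperature flattens the distribution ("more creative");
    # lower temperature sharpens it toward the top token.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p):
    # Nucleus sampling: keep the smallest set of tokens whose cumulative
    # probability reaches top_p, then renormalize over that set.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}
```

At very high temperature the probabilities approach uniform, while a low `top_p` restricts sampling to only the most likely tokens, which is why the two settings trade off creativity against focus.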
app.py ADDED
@@ -0,0 +1,85 @@
+ import gradio as gr
+ import spaces
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ MODEL_ID = "GhostScientist/smollm2-360m-function-calling-sft"
+
+ # Load tokenizer at startup
+ tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
+
+ # Global model - loaded lazily on first GPU call for faster Space startup
+ model = None
+
+ def load_model():
+     global model
+     if model is None:
+         model = AutoModelForCausalLM.from_pretrained(
+             MODEL_ID,
+             torch_dtype=torch.float16,
+             device_map="auto",
+         )
+     return model
+
+ @spaces.GPU(duration=120)
+ def generate_response(message, history, system_message, max_tokens, temperature, top_p):
+     model = load_model()
+
+     messages = [{"role": "system", "content": system_message}]
+
+     for item in history:
+         if isinstance(item, (list, tuple)) and len(item) == 2:
+             user_msg, assistant_msg = item
+             if user_msg:
+                 messages.append({"role": "user", "content": user_msg})
+             if assistant_msg:
+                 messages.append({"role": "assistant", "content": assistant_msg})
+
+     messages.append({"role": "user", "content": message})
+
+     text = tokenizer.apply_chat_template(
+         messages,
+         tokenize=False,
+         add_generation_prompt=True
+     )
+     inputs = tokenizer([text], return_tensors="pt").to(model.device)
+
+     with torch.no_grad():
+         outputs = model.generate(
+             **inputs,
+             max_new_tokens=int(max_tokens),
+             temperature=float(temperature),
+             top_p=float(top_p),
+             do_sample=True,
+             pad_token_id=tokenizer.eos_token_id,
+         )
+
+     response = tokenizer.decode(
+         outputs[0][inputs['input_ids'].shape[1]:],
+         skip_special_tokens=True
+     )
+     return response
+
+ demo = gr.ChatInterface(
+     generate_response,
+     title="SmolLM2 360M Function Calling",
+     description="A fine-tuned SmolLM2-360M model for function calling, powered by ZeroGPU (free!)",
+     additional_inputs=[
+         gr.Textbox(
+             value="You are a helpful assistant that can call functions when needed.",
+             label="System message",
+             lines=2
+         ),
+         gr.Slider(minimum=64, maximum=2048, value=512, step=64, label="Max tokens"),
+         gr.Slider(minimum=0.1, maximum=1.5, value=0.7, step=0.1, label="Temperature"),
+         gr.Slider(minimum=0.1, maximum=1.0, value=0.95, step=0.05, label="Top-p"),
+     ],
+     examples=[
+         ["Hello! What can you help me with?"],
+         ["What's the weather like in San Francisco?"],
+         ["Can you search for the latest news about AI?"],
+     ],
+ )
+
+ if __name__ == "__main__":
+     demo.launch()
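The history-flattening loop in `generate_response` only handles `(user, assistant)` tuple pairs; newer Gradio `ChatInterface` versions can instead pass openai-style `{"role", "content"}` dicts. A standalone sketch that handles both (the name `build_messages` and the dict branch are additions here for illustration, not part of the committed app):

```python
def build_messages(system_message, history, message):
    """Flatten Gradio chat history into an OpenAI-style message list."""
    messages = [{"role": "system", "content": system_message}]
    for item in history:
        if isinstance(item, dict):
            # Newer Gradio ChatInterface (type="messages") passes dicts.
            messages.append({"role": item["role"], "content": item["content"]})
        elif isinstance(item, (list, tuple)) and len(item) == 2:
            # Older tuple-style history: (user_msg, assistant_msg).
            user_msg, assistant_msg = item
            if user_msg:
                messages.append({"role": "user", "content": user_msg})
            if assistant_msg:
                messages.append({"role": "assistant", "content": assistant_msg})
    messages.append({"role": "user", "content": message})
    return messages
```

Factoring this out would also make the prompt-construction step unit-testable without loading the model.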
requirements.txt ADDED
@@ -0,0 +1,5 @@
+ gradio>=5.0.0
+ torch
+ transformers
+ accelerate
+ spaces