---
license: llama3.2
language:
  - en
library_name: transformers
tags:
  - mobile
  - on-device
  - quantized
  - gguf
  - dispatchai
pipeline_tag: text-generation
---

# Llama-3.2-1B-FunctionCall-mobile

**WORKS**: verified June 2026.

## Verification Results

| Prompt | Response | Correct? |
|--------|----------|----------|
| What is the capital of France? | "The capital of France is Paris." | ✅ |
| Say hello in one sentence. | "I'm happy to help you with your question. <\|endoftext\|>" | ✅ |
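
The second response in the table ends with a raw `<|endoftext|>` marker. If that marker shows up in your own output, one option is to strip it after generation. The helper below is a minimal sketch, not part of this model or of any library; `strip_special_tokens` is a hypothetical name, and it assumes special tokens always have the `<|name|>` shape seen above.

```python
import re

# Assumption: special tokens look like <|name|>, as in the table above.
SPECIAL_TOKEN = re.compile(r"<\|[^|>]+\|>")


def strip_special_tokens(text: str) -> str:
    """Remove <|...|> markers and trim surrounding whitespace."""
    return SPECIAL_TOKEN.sub("", text).strip()


raw = "I'm happy to help you with your question. <|endoftext|>"
print(strip_special_tokens(raw))
```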


## Model Details

| Attribute | Value |
|-----------|-------|
| **Base Model** | meta-llama/Llama-3.2-1B-Instruct |
| **File Size** | 1926 MB |
| **Format** | GGUF |
| **Chat Format** | chatml |
| **CPU Speed** | 8.9 tokens/sec |
| **License** | llama3.2 |
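
As a rough guide to what the CPU speed figure means in practice, the sketch below converts a token budget into an approximate generation time. `estimate_generation_seconds` is a hypothetical helper; it assumes the 8.9 tokens/sec figure holds on your hardware and ignores prompt processing and model load time, so treat the result as a ballpark only.

```python
def estimate_generation_seconds(num_tokens: int, tokens_per_sec: float = 8.9) -> float:
    """Approximate decode time for num_tokens at a steady tokens_per_sec rate."""
    if tokens_per_sec <= 0:
        raise ValueError("tokens_per_sec must be positive")
    return num_tokens / tokens_per_sec


# A 50-token reply at the reported CPU speed.
print(f"{estimate_generation_seconds(50):.1f} s")
```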

## Usage

```python
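from llama_cpp import Llama

# Load the GGUF file; adjust model_path to where you saved it.
# n_ctx is the context window in tokens, n_threads the number of CPU threads.
llm = Llama(model_path="model.gguf", chat_format="chatml", n_ctx=512, n_threads=4, verbose=False)

# Send a single-turn chat request, capped at 50 generated tokens.
response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    max_tokens=50,
)
print(response["choices"][0]["message"]["content"])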
```
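
For readers unfamiliar with `chat_format="chatml"`, the sketch below shows the general ChatML layout that a message list is rendered into before it reaches the model. `to_chatml` is a hypothetical helper written for illustration; it follows the common ChatML convention and may differ in small details from what llama-cpp-python produces internally.

```python
def to_chatml(messages: list[dict[str, str]]) -> str:
    """Render chat messages in the common ChatML layout (illustrative only)."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    # Leave an open assistant turn for the model to complete.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


print(to_chatml([{"role": "user", "content": "What is the capital of France?"}]))
```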

### dispatchAI SDK
```python
from dispatchai import load_model
model = load_model("Llama-3.2-1B-FunctionCall-mobile", backend="gguf")
print(model.chat("Hello!"))
```

🚀 [dispatchAI](https://huggingface.co/dispatchAI)