---
language:
- en
license: apache-2.0
tags:
- llm
- tool-calling
- lightweight
- agentic-tasks
- react
- mlx
model-index:
- name: NanoAgent
  results: []
datasets:
- microsoft/orca-agentinstruct-1M-v1
- microsoft/orca-math-word-problems-200k
- allenai/tulu-3-sft-personas-instruction-following
- xingyaoww/code-act
- m-a-p/Code-Feedback
- weijie210/gsm8k_decomposed
- Locutusque/function-calling-chatml
- HuggingFaceTB/smoltalk
base_model:
- HuggingFaceTB/SmolLM2-135M-Instruct
pipeline_tag: text-generation
---
# POC

# FORKED FROM
# 🧠 NanoAgent: 135M Parameter Agentic LLM

NanoAgent is a compact 135M-parameter, 8k-context-length language model trained to **perform tool calls** and **generate responses based on tool outputs**.
Despite its small size (~135 MB in 8-bit precision), it's optimized for agentic use cases and runs easily on personal devices.

**GitHub:** [NanoAgent](https://github.com/QuwsarOhi/NanoAgent)

**Inference notebook:** [link](https://github.com/QuwsarOhi/NanoAgent/blob/main/notebooks/inference.ipynb)

---
## ✨ Features

- 🧰 **Tool Calling** – emits structured tool calls and incorporates tool outputs into its responses.
- 🧭 **Instruction Following** – strong instruction-following abilities.
- 🧠 **Basic Reasoning** – handles lightweight reasoning and ReAct-style interactions.
- ⚡ **Lightweight** – runs on local hardware with minimal resources.

---
## 🧪 Training Overview

**Base model:** [`SmolLM2-135M-Instruct`](https://huggingface.co/HuggingFaceTB/SmolLM2-135M-Instruct)
**Fine-tuning method:** [Dynamic Fine-Tuning (DFT)](https://github.com/yongliang-wu/DFT/tree/master)
**Hardware:** Apple Mac M1 (16 GB unified memory) using MLX.

### 📊 Datasets Used
- `microsoft/orca-agentinstruct-1M-v1` – agentic tasks, RAG answers, classification
- `microsoft/orca-math-word-problems-200k` – lightweight reasoning
- `allenai/tulu-3-sft-personas-instruction-following` – instruction following
- `xingyaoww/code-act` – ReAct-style reasoning and action
- `m-a-p/Code-Feedback` – alignment via feedback
- `HuggingFaceTB/smoltalk` + `/apigen` – tool-calling stabilization
- `weijie210/gsm8k_decomposed` – question decomposition
- `Locutusque/function-calling-chatml` – tool-call response structure

---
## ⚠️ Disclaimer

This is a **beta model**.
- It may produce **incorrect** or **incomplete** outputs.
- Tool call execution is **basic** and can fail in some cases.
- Intended for **research and experimentation** only – not for production use.

---
## 🚧 Roadmap

- ✅ Initial release with DFT fine-tuning
- 🧪 Benchmarking on agentic tasks
- ~~🔬 Experimenting with GRPO for tool calling (failed)~~
- 🚧 Weight merging experiments for improved performance
- Add more tool-calling datasets

---
## 📥 Model Size

- 135M parameters
- ~135 MB in 8-bit precision
- 8k context length

---
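The ~135 MB figure follows directly from the parameter count: at 8-bit precision each parameter occupies one byte. A quick back-of-the-envelope sketch (weights only; activations and KV cache add overhead on top):

```python
def model_size_mb(n_params: int, bits_per_param: int) -> float:
    """Rough weight-memory estimate: parameters * bytes per parameter."""
    return n_params * (bits_per_param / 8) / 1e6

print(model_size_mb(135_000_000, 8))   # → 135.0 (8-bit, ~135 MB)
print(model_size_mb(135_000_000, 16))  # → 270.0 (fp16, ~270 MB)
```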
## ⚡ Example Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "quwsarohi/NanoAgent-135M"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def inference(messages, max_new_tokens=256, temperature=0.3, min_p=0.15, **kwargs):
    # Render the chat history with the model's chat template.
    input_text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer.encode(input_text, return_tensors="pt")
    outputs = model.generate(
        inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        min_p=min_p,
        temperature=temperature,
        **kwargs,
    )
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True)

messages = [{"role": "user", "content": "Hi! Do you have a name?"}]
print(inference(messages))
```

Use the following template for tool calling:
```python
TOOL_TEMPLATE = """You are a helpful AI assistant. You have a set of possible functions/tools inside <tools></tools> tags.
Based on question, you may need to make one or more function/tool calls to answer user.

You have access to the following tools/functions:
<tools>{tools}</tools>

For each function call, return a JSON list object with function name and arguments within <tool_call></tool_call> tags."""
```
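To use the template, serialize the tool definitions to JSON and pass the filled-in string as the system message. A minimal sketch, assuming the usual system/user message layout (the `web_search` definition here is an abbreviated version of the sample definition in this card; verify details against the inference notebook):

```python
import json

TOOL_TEMPLATE = """You are a helpful AI assistant. You have a set of possible functions/tools inside <tools></tools> tags.
Based on question, you may need to make one or more function/tool calls to answer user.

You have access to the following tools/functions:
<tools>{tools}</tools>

For each function call, return a JSON list object with function name and arguments within <tool_call></tool_call> tags."""

# Abbreviated tool definition for illustration.
tools = [
    {
        "name": "web_search",
        "description": "Performs a web search for a query.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "The search query to perform."}
            },
            "required": ["query"],
        },
    }
]

# Fill the template and use it as the system message.
messages = [
    {"role": "system", "content": TOOL_TEMPLATE.format(tools=json.dumps(tools))},
    {"role": "user", "content": "What is NanoAgent?"},
]
```

The resulting `messages` list can be passed straight to the `inference` helper shown above.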

Sample tool call definition:
```json
{
  "name": "web_search",
  "description": "Performs a web search for a query and returns a string of the top search results formatted as markdown with titles, links, and descriptions.",
  "parameters": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "The search query to perform."
      }
    },
    "required": ["query"]
  }
}
```