DMindAI/DMind-3-nano

- Use AutoProcessor instead of AutoTokenizer
- Add tools parameter to apply_chat_template
- Include complete tool schemas in example
- Match official Google Gemma documentation pattern

Files changed (1)

README.md +105 -13

@@ -270,29 +270,121 @@ For optimal performance, use the following developer/system prompt when initiali
 **Usage Example (Python/Transformers):**
 ```python
-from transformers import AutoModelForCausalLM, AutoTokenizer
 model_path = "DMindAI/DMind-3-nano"
-model = AutoModelForCausalLM.from_pretrained(model_path)
-tokenizer = AutoTokenizer.from_pretrained(model_path)
-# Prepare messages with developer prompt
-messages = [
     {
-        "role": "developer",
-        "content": "You are a model that can do function calling with the following functions. You are an on-chain trading assistant... [full prompt as above]"
     },
     {
-        "role": "user",
-        "content": "在base查BTC地址"
     }
 ]
-# Generate
-input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
-inputs = tokenizer(input_text, return_tensors="pt")
 outputs = model.generate(**inputs, max_new_tokens=256)
-print(tokenizer.decode(outputs[0]))
 ```
 ## License & Governance

 **Usage Example (Python/Transformers):**
 ```python
+from transformers import AutoModelForCausalLM, AutoProcessor
 model_path = "DMindAI/DMind-3-nano"
+# Load model and processor (processor combines tokenizer and tool handling)
+model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")
+processor = AutoProcessor.from_pretrained(model_path, device_map="auto")
+# Define tool schemas (must match training format)
+tools = [
     {
+        "name": "SEARCH_TOKEN",
+        "description": "Search for a cryptocurrency token on-chain to retrieve its metadata or address.",
+        "parameters": {
+            "type": "object",
+            "properties": {
+                "symbol": {"type": "string", "description": "The ticker symbol of the token (e.g., 'SOL', 'USDC')."},
+                "address": {"type": "string", "description": "The specific contract address (CA) of the token, if known."},
+                "chain": {"type": "string", "enum": ["solana", "ethereum", "bsc", "base"], "description": "The target blockchain network."},
+                "keyword": {"type": "string", "description": "General search keywords (e.g., project name) if symbol/address are unclear."}
+            },
+            "required": []
+        }
     },
     {
+        "name": "EXECUTE_SWAP",
+        "description": "Propose a token swap transaction.",
+        "parameters": {
+            "type": "object",
+            "properties": {
+                "inputTokenSymbol": {"type": "string", "description": "Symbol of the token being sold (e.g., 'SOL')."},
+                "inputTokenCA": {"type": "string", "description": "Contract address of the token being sold."},
+                "outputTokenCA": {"type": "string", "description": "Contract address of the token being bought."},
+                "inputTokenAmount": {"type": "number", "description": "Absolute amount of input token to swap."},
+                "inputTokenPercentage": {"type": "number", "description": "Percentage of balance to swap (0.0 to 1.0)."},
+                "outputTokenAmount": {"type": "number", "description": "Minimum amount of output token expected."}
+            },
+            "required": ["inputTokenSymbol"]
+        }
     }
 ]
+# Prepare messages with developer prompt (CRITICAL: must be first message)
+developer_prompt = """You are a model that can do function calling with the following functions.
+You are an on-chain trading assistant.
+You may use only two tools: SEARCH_TOKEN and EXECUTE_SWAP.
+Core policy:
+- Use a tool only when needed.
+- If required fields are missing or ambiguous, ask one concise clarification question first.
+- If the user is just chatting, reply naturally without calling tools.
+- Never fabricate addresses, amounts, balances, prices, or execution results.
+- Never resolve token symbols to contract addresses from memory or static snapshots.
+- Treat ticker symbols as potentially ambiguous and contract addresses as dynamic (can migrate/upgrade).
+- Supported chains are: solana, ethereum, bsc, base.
+  If the user asks for an unsupported chain (for example polygon), explain the limitation and ask for a supported chain.
+Tool-call format (must match exactly):
+<start_function_call>call:TOOL_NAME{\"key\":\"value\",\"amount\":1.23}</end_function_call>
+Do not output XML-style tags such as <function_calls>, <invoke>, or <parameter>.
+Strict schema:
+SEARCH_TOKEN params
+{
+  \"symbol\": \"string, optional\",
+  \"address\": \"string, optional\",
+  \"keyword\": \"string, optional\",
+  \"chain\": \"solana | ethereum | bsc | base, optional\"
+}
+Rules:
+- At least one of symbol/address/keyword is required.
+- If the user gives only an address, do address-only lookup (do not guess chain).
+- If user explicitly gives chain, include chain.
+- For symbol/keyword based requests, call SEARCH_TOKEN first before producing a swap call.
+- If lookup may return multiple candidates (same ticker/name), ask the user to confirm the exact token (address or more context).
+EXECUTE_SWAP params
+{
+  \"inputTokenSymbol\": \"string, required\",
+  \"inputTokenCA\": \"string, optional\",
+  \"outputTokenCA\": \"string, optional\",
+  \"inputTokenAmount\": \"number, optional\",
+  \"inputTokenPercentage\": \"number in [0,1], optional\",
+  \"outputTokenAmount\": \"number, optional\"
+}
+Rules:
+- inputTokenAmount and inputTokenPercentage are mutually exclusive.
+- Convert 30% to inputTokenPercentage=0.3.
+- If both amount and percentage are provided, ask the user to choose one.
+- If outputTokenCA is unknown, call SEARCH_TOKEN first and use the returned result.
+- If user already provides output token address explicitly, you may call EXECUTE_SWAP directly.
+- If lookup returns multiple candidates or low-confidence candidates, ask a clarification question; do not guess.
+Language:
+- Support both Chinese and English.
+- Reply in the same language as the user unless they ask otherwise."""
+messages = [
+    {"role": "developer", "content": developer_prompt},
+    {"role": "user", "content": "在base查BTC地址"}  # "Look up the BTC address on Base"
+]
+# Generate with processor (handles tools automatically)
+inputs = processor.apply_chat_template(
+    messages,
+    tools=tools,
+    add_generation_prompt=True,
+    return_dict=True,
+    return_tensors="pt"
+).to(model.device)
 outputs = model.generate(**inputs, max_new_tokens=256)
+response = processor.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)  # decode only the new tokens
+print(response)
 ```
 ## License & Governance
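
The developer prompt in the new example fixes a tool-call output format and a few argument rules for `EXECUTE_SWAP`, but the example stops at printing the raw response. A minimal sketch of how a caller might parse and sanity-check that output follows. The helper names `parse_tool_calls` and `check_swap_args` are hypothetical and not part of the repository, the sample string is hand-written rather than real model output, and the regex assumes the arguments are valid JSON as the prompt requests. It accepts the closing tag both with and without the leading slash, since the exact form the model emits is not confirmed here.

```python
import json
import re

# Matches <start_function_call>call:TOOL_NAME{...}</end_function_call>.
# The "/?" tolerates a closing tag written without the slash (assumption).
CALL_RE = re.compile(
    r"<start_function_call>call:(\w+)(\{.*?\})</?end_function_call>",
    re.DOTALL,
)


def parse_tool_calls(text):
    """Return a list of (tool_name, arguments) pairs found in the text."""
    calls = []
    for name, raw_args in CALL_RE.findall(text):
        # Arguments are assumed to be a JSON object, per the developer prompt.
        calls.append((name, json.loads(raw_args)))
    return calls


def check_swap_args(args):
    """Return a list of rule violations for EXECUTE_SWAP arguments."""
    problems = []
    if "inputTokenSymbol" not in args:
        problems.append("inputTokenSymbol is required")
    if "inputTokenAmount" in args and "inputTokenPercentage" in args:
        problems.append(
            "inputTokenAmount and inputTokenPercentage are mutually exclusive"
        )
    pct = args.get("inputTokenPercentage")
    if pct is not None and not 0 <= pct <= 1:
        problems.append("inputTokenPercentage must be in [0, 1]")
    return problems


# Hand-written sample in the format the developer prompt describes.
sample = (
    '<start_function_call>call:SEARCH_TOKEN'
    '{"symbol":"BTC","chain":"base"}</end_function_call>'
)
print(parse_tool_calls(sample))
```

In practice the string passed to `parse_tool_calls` would be the decoded `response` from the example above. An empty result means the model replied in plain text, for instance with a clarification question.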