mistralai/Leanstral-2603

patrickvonplaten committed on Mar 16

Commit bf804cc (verified) · 1 Parent(s): 981c021

Update README.md

README.md +45 -270

@@ -16,7 +16,7 @@ Leanstral consists of the following architectural choices:
 - MoE: 128 experts and 4 active.
 - 119B with 6.5B activated parameters per token.
-- 200k Context Length.
 - Multimodal Input: Accepts both text and image input, with text output.
 Leanstral offers the following capabilities:
@@ -27,7 +27,15 @@ Leanstral offers the following capabilities:
 - **System Prompt**: Maintains strong adherence and support for system prompts.
 - **Speed-Optimized**: Delivers best-in-class performance and speed.
 - **Apache 2.0 License**: Open-source license allowing usage and modification for both commercial and non-commercial purposes.
-- **Large Context Window**: Supports a 256k context window.
 ## Usage
@@ -45,9 +53,6 @@ The model can also be deployed with the following libraries, we advise everyone
 #### vLLM (recommended)
-<details>
-<summary>Expand</summary
 We recommend using this model with the [vLLM library](https://github.com/vllm-project/vllm)
 to implement production-ready inference pipelines.
@@ -144,284 +149,54 @@ data = {"model": model, "messages": messages, "temperature": 1.0, "reasoning_eff
 # data = {"model": model, "messages": messages, "temperature": 0.15, "tools": tools} # Pass tools to payload.
 response = requests.post(url, headers=headers, data=json.dumps(data))
-import ipdb; ipdb.set_trace()
-print(response.json()["choices"][0]["message"]["content"])
-```
-</details>
-#### SGLang
-<details>
-<summary>Expand</summary>
-To use this model with [SGLang](https://github.com/sgl-project/sglang) to implement a production-ready inference pipelines (OpenAI-compatible API server),
-see the following sections.
-**_Installation_**
-Install SGLang from source (track latest `main` locally):
-```
-git clone https://github.com/sgl-project/sglang.git
-cd sglang
-uv pip install -e python
-uv pip install transformers==5.0.0rc # required
 ```
-**_Launch server_**
-We recommend that you use Devstral in a server/client setting.
-1. Spin up a server:
-```
-python -m sglang.launch_server --model-path mistralai/Devstral-2-123B-Instruct-2512 --host 0.0.0.0 --port 30000 --tp 8 --tool-call-parser mistral
 ```
-2. To ping the client you can use a simple Python snippet.
-```py
-import requests
-import json
-from huggingface_hub import hf_hub_download
-url = "http://<your-server-url>:30000/v1/chat/completions"
-headers = {"Content-Type": "application/json", "Authorization": "Bearer token"}
-model = "mistralai/Devstral-2-123B-Instruct-2512"
-def load_system_prompt(repo_id: str, filename: str) -> str:
-    file_path = hf_hub_download(repo_id=repo_id, filename=filename)
-    with open(file_path, "r") as file:
-        system_prompt = file.read()
-    return system_prompt
-SYSTEM_PROMPT = load_system_prompt(model, "CHAT_SYSTEM_PROMPT.txt")
-messages = [
-    {"role": "system", "content": SYSTEM_PROMPT},
-    {
-        "role": "user",
-        "content": [
-            {
-                "type": "text",
-                "text": "<your-command>",
-            },
-        ],
-    },
-]
-data = {"model": model, "messages": messages, "temperature": 0.15}
-# Devstral 2 supports tool calling. If you want to use tools, follow this:
-# tools = [  # Define tools (OpenAI-compatible)
-#     {
-#         "type": "function",
-#         "function": {
-#             "name": "git_clone",
-#             "description": "Clone a git repository",
-#             "parameters": {
-#                 "type": "object",
-#                 "properties": {
-#                     "url": {
-#                         "type": "string",
-#                         "description": "The url of the git repository",
-#                     },
-#                 },
-#                 "required": ["url"],
-#             },
-#         },
-#     }
-# ]
-# data = {"model": model, "messages": messages, "temperature": 0.15, "tools": tools} # Pass tools to payload.
-response = requests.post(url, headers=headers, data=json.dumps(data))
-print(response.json()["choices"][0]["message"]["content"])
 ```
-</details>
-#### Transformers
-<details>
-<summary>Expand</summary
-Make sure to install from main:
-```sh
-uv pip install git+https://github.com/huggingface/transformers
-```
-And run the following code snippet:
-```python
-from transformers import (
-    MistralForCausalLM,
-    MistralCommonBackend,
-)
-model_id = "mistralai/Devstral-2-123B-Instruct-2512"
-tokenizer = MistralCommonBackend.from_pretrained(model_id)
-model = MistralForCausalLM.from_pretrained(model_id, device_map="auto")
-SP = """You are operating as and within Mistral Vibe, a CLI coding-agent built by Mistral AI and powered by default by the Devstral family of models. It wraps Mistral's Devstral models to enable natural language interaction with a local codebase. Use the available tools when helpful.
-You can:
-- Receive user prompts, project context, and files.
-- Send responses and emit function calls (e.g., shell commands, code edits).
-- Apply patches, run commands, based on user approvals.
-Answer the user's request using the relevant tool(s), if they are available. Check that all the required parameters for each tool call are provided or can reasonably be inferred from context. IF there are no relevant tools or there are missing values for required parameters, ask the user to supply these values; otherwise proceed with the tool calls. If the user provides a specific value for a parameter (for example provided in quotes), make sure to use that value EXACTLY. DO NOT make up values for or ask about optional parameters. Carefully analyze descriptive terms in the request as they may indicate required parameter values that should be included even if not explicitly quoted.
-Always try your hardest to use the tools to answer the user's request. If you can't use the tools, explain why and ask the user for more information.
-Act as an agentic assistant, if a user asks for a long task, break it down and do it step by step.
-When you want to commit changes, you will always use the 'git commit' bash command. It will always be suffixed with a line telling it was generated by Mistral Vibe with the appropriate co-authoring information. The format you will always use is the following heredoc.
-```bash
-git commit -m "<Commit message here>
-Generated by Mistral Vibe.
-Co-Authored-By: Mistral Vibe <vibe@mistral.ai>"
-```"""
-input = {
-    "messages": [
-        {
-            "role": "system",
-            "content": SP,
-        },
-        {
-            "role": "user",
-            "content": [
-                {
-                    "type": "text",
-                    "text": "Can you implement in Python a method to compute the fibonnaci sequence at the `n`th element with `n` a parameter passed to the function ? You should start the sequence from 1, previous values are invalid.\nThen run the Python code for the function for n=5 and give the answer.",
-                }
-            ],
-        },
-    ],
-    "tools": [
-        {
-            "type": "function",
-            "function": {
-                "name": "add_number",
-                "description": "Add two numbers.",
-                "parameters": {
-                    "type": "object",
-                    "properties": {
-                        "a": {"type": "string", "description": "The first number."},
-                        "b": {"type": "string", "description": "The second number."},
-                    },
-                    "required": ["a", "b"],
-                },
-            },
-        },
-        {
-            "type": "function",
-            "function": {
-                "name": "multiply_number",
-                "description": "Multiply two numbers.",
-                "parameters": {
-                    "type": "object",
-                    "properties": {
-                        "a": {"type": "string", "description": "The first number."},
-                        "b": {"type": "string", "description": "The second number."},
-                    },
-                    "required": ["a", "b"],
-                },
-            },
-        },
-        {
-            "type": "function",
-            "function": {
-                "name": "substract_number",
-                "description": "Substract two numbers.",
-                "parameters": {
-                    "type": "object",
-                    "properties": {
-                        "a": {"type": "string", "description": "The first number."},
-                        "b": {"type": "string", "description": "The second number."},
-                    },
-                    "required": ["a", "b"],
-                },
-            },
-        },
-        {
-            "type": "function",
-            "function": {
-                "name": "write_a_story",
-                "description": "Write a story about science fiction and people with badass laser sabers.",
-                "parameters": {},
-            },
-        },
-        {
-            "type": "function",
-            "function": {
-                "name": "terminal",
-                "description": "Perform operations from the terminal.",
-                "parameters": {
-                    "type": "object",
-                    "properties": {
-                        "command": {
-                            "type": "string",
-                            "description": "The command you wish to launch, e.g `ls`, `rm`, ...",
-                        },
-                        "args": {
-                            "type": "string",
-                            "description": "The arguments to pass to the command.",
-                        },
-                    },
-                    "required": ["command"],
-                },
-            },
-        },
-        {
-            "type": "function",
-            "function": {
-                "name": "python",
-                "description": "Call a Python interpreter with some Python code that will be ran.",
-                "parameters": {
-                    "type": "object",
-                    "properties": {
-                        "code": {
-                            "type": "string",
-                            "description": "The Python code to run",
-                        },
-                        "result_variable": {
-                            "type": "string",
-                            "description": "Variable containing the result you'd like to retrieve from the execution.",
-                        },
-                    },
-                    "required": ["code", "result_variable"],
-                },
-            },
-        },
-    ],
-}
-tokenized = tokenizer.apply_chat_template(
-    conversation=input["messages"],
-    tools=input["tools"],
-    return_tensors="pt",
-    return_dict=True,
-)
-input_ids = tokenized["input_ids"].to(device="cuda")
-output = model.generate(
-    input_ids,
-    max_new_tokens=200,
-    do_sample=True,
-    temperature=0.15,
-)[0]
-decoded_output = tokenizer.decode(output[len(tokenized["input_ids"][0]) :])
-print(decoded_output)
 ```
-</details>

 - MoE: 128 experts and 4 active.
 - 119B with 6.5B activated parameters per token.
+- 256k Context Length.
 - Multimodal Input: Accepts both text and image input, with text output.
 Leanstral offers the following capabilities:
 - **System Prompt**: Maintains strong adherence and support for system prompts.
 - **Speed-Optimized**: Delivers best-in-class performance and speed.
 - **Apache 2.0 License**: Open-source license allowing usage and modification for both commercial and non-commercial purposes.
+- **Large Context Window**: Supports a 200k context window.
+## Recommended Settings
+- **Temperature**: 1.0
+- **Reasoning Effort**: Choose between:
+  - 'none' => Do not use reasoning
+  - 'high' => Use reasoning
+  We recommend reasoning_effort="high" for more complex prompts
+- **Context Length**: We recommend staying <= 200k for optimal results
 ## Usage
 #### vLLM (recommended)
 We recommend using this model with the [vLLM library](https://github.com/vllm-project/vllm)
 to implement production-ready inference pipelines.
 # data = {"model": model, "messages": messages, "temperature": 0.15, "tools": tools} # Pass tools to payload.
 response = requests.post(url, headers=headers, data=json.dumps(data))
+output = response.json()["choices"][0]["message"]
+print("Answer:")
+print(output['content'])
+print("Thinking:")
+print(output['reasoning'])
 ```
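The recommended settings and the answer/thinking split in the snippet above can be sketched as two small helpers. This is a minimal illustration, not part of the README: the helper names `build_payload`, `within_recommended_context`, and `split_message` are hypothetical, the `reasoning_effort` payload key is taken from the recommended settings, and reading "200k" as 200,000 tokens is an assumption.

```python
# Hypothetical helpers; names are illustrative and not part of the README.
RECOMMENDED_TEMPERATURE = 1.0
MAX_RECOMMENDED_CONTEXT_TOKENS = 200_000  # assumption: "200k" read as 200,000 tokens
ALLOWED_REASONING_EFFORTS = ("none", "high")


def build_payload(model, messages, reasoning_effort="high",
                  temperature=RECOMMENDED_TEMPERATURE):
    """Build a chat-completions payload using the recommended settings."""
    if reasoning_effort not in ALLOWED_REASONING_EFFORTS:
        raise ValueError(
            f"reasoning_effort must be one of {ALLOWED_REASONING_EFFORTS}, "
            f"got {reasoning_effort!r}"
        )
    return {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "reasoning_effort": reasoning_effort,
    }


def within_recommended_context(num_tokens):
    """Return True if a prompt stays within the recommended context length."""
    return num_tokens <= MAX_RECOMMENDED_CONTEXT_TOKENS


def split_message(message):
    """Return (answer, reasoning) from a response message dict.

    The reasoning field may be absent, for example with reasoning_effort="none",
    in which case None is returned for it.
    """
    return message.get("content"), message.get("reasoning")


payload = build_payload(
    "mistralai/Leanstral-2603",
    [{"role": "user", "content": "<your-command>"}],
)
```

The resulting `payload` can be passed to `requests.post(url, headers=headers, data=json.dumps(payload))` exactly as `data` is in the snippet above.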
+**Example Output**:
 ```
+To define transaction rules as an inductive proposition in Lean, you can create an inductive type that captures the valid transitions between states. This approach is particularly useful for modeling state machines and proving properties about transitions, such as invariants.
+Here is an example of how to define an inductive proposition for valid transitions in a simple counter machine:
+```lean
+-- Example: Counter that can only increment by 1 (no decrement, no shortcuts)
+inductive ValidTransition : Nat → Nat → Prop
+  | step1 : ∀ s, ValidTransition s (s + 1)   -- Core transition: s → s+1
+  | step2 : ∀ s, ValidTransition s (2 * s)    -- Bonus: s → 2s (optional)
+  | zero_step : ValidTransition 0 0          -- Additional special case
+lemma reach_any_positive : ∀ n > 0, ValidTransition 0 n := by
+  induction n with
+  | zero => contradiction
+  | succ n ih =>
+    cases' Nat.eq_zero_or_pos n with h h
+    · simp [h]
+    · have : ValidTransition 0 (n + 1) := by
+        cases' h with _ ih    -- Wait, this is messy...
 ```
+In this example, `ValidTransition` is defined as an inductive proposition with constructors `step1`, `step2`, and `zero_step`. The `step1` constructor allows for any state to transition to the next state by incrementing by 1, while `step2` allows for doubling the state. The `zero_step` constructor is a special case for the initial state 0.
+The lemma `reach_any_positive` aims to prove that any positive natural number is reachable from the initial state 0 through valid transitions. The proof uses induction on `n`, but due to simplification issues, the current approach may not be the most effective.
+A more concise and effective way to define and prove properties about transitions is to use a simpler model, such as the following:
+```lean
+-- Simpler model for counter transitions (suffices for proofs)
+inductive CounterTrans : Nat → Nat → Prop
+  | valid : ∀ n, CounterTrans n (n + 1)
+lemma reach_positive : ∀ n > 0, CounterTrans 0 n
+  -- Proof is now trivial (by induction)
 ```
+In this simpler model, `CounterTrans` is defined with a single constructor `valid`, which allows for any state to transition to the next state by incrementing by 1. The lemma `reach_positive` is straightforward to prove by induction, leveraging the simplicity of the transition rules.
+This approach ensures that the proof remains concise and effective, avoiding unnecessary complexity in the transition rules. By using inductive propositions, we can effectively reason about state transitions and prove properties about the system.
+```
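A caveat on the simpler model in the example output: with `valid` as its only constructor, `CounterTrans 0 n` holds only for `n = 1`, so `reach_positive` as stated cannot be proved for every `n > 0`. Reachability over several steps needs a closure of the one-step relation. A minimal sketch in plain Lean 4 follows; the `Reachable` relation and the `reach_any` theorem are hypothetical names introduced here for illustration, and `theorem` is used instead of `lemma` so that Mathlib is not required.

```lean
-- One-step transition, as in the example output.
inductive CounterTrans : Nat → Nat → Prop
  | valid : ∀ n, CounterTrans n (n + 1)

-- Hypothetical reflexive-transitive closure of CounterTrans.
inductive Reachable : Nat → Nat → Prop
  | refl : ∀ n, Reachable n n
  | step : ∀ {a b c : Nat}, Reachable a b → CounterTrans b c → Reachable a c

-- Every natural number is reachable from 0 by repeated increments.
theorem reach_any : ∀ n, Reachable 0 n := by
  intro n
  induction n with
  | zero => exact Reachable.refl 0
  | succ k ih => exact Reachable.step ih (CounterTrans.valid k)
```

Separating the one-step relation from its closure keeps the transition rules minimal while still allowing statements about multi-step reachability.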