---
language:
- en
base_model:
- katanemo/Arch-Function-3B
- cognitivecomputations/Dolphin3.0-Qwen2.5-3b
pipeline_tag: text-generation
tags:
- uncensored
- function_calling
- tool_use
---
# Ramius

This is Ramius, an uncensored function-calling model.

### Model Description

I needed an LLM for Home Assistant that is small and performant, and I wanted one with some personality.
Qwen2.5-3B is small, fast, and calls functions pretty well, but it's [REDACTED] and doesn't like to roleplay.
Arch-Function-3B is fantastic at calling functions, and absolutely nothing else.
Dolphin3.0-Qwen2.5-3b is great at roleplay and refuses to refuse anything, but it sucks at calling functions.

So I created Ramius with MergeKit to try to get the best of both.
Plus, I'm GPU poor and can't train. (Intel Arc cards come with buyer's remorse at no extra charge!)

The result is... mediocre. It calls functions correctly most of the time, but it tends to hallucinate function responses instead of actually calling the function.
It does stay in character, though. YMMV.

The name comes from Marko Ramius, the fictional submarine commander who defects to the United States in Tom Clancy's *The Hunt for Red October*.
He's a former communist and the name sounded cool.

I've included the F16 and Q4_0 weights.

- **Developed by:** Other people's hard work.
- **Funded by [optional]:** Also other people's hard work.
- **Shared by [optional]:** Me.
- **Model type:** Autoregressive transformer.
- **Language(s) (NLP):** English, and others, probably.
- **License:** [More Information Needed]

### Model Sources [optional]

Created with MergeKit, using the following SCE merge configuration:

```yaml
models:
  - model: katanemo/Arch-Function-3B
    lambda: 1.0
    select_topk: 0.4
    weight: 0.7
  - model: cognitivecomputations/Dolphin3.0-Qwen2.5-3b
    density: 1.0
    lambda: 1.0
    select_topk: 0.6
    weight: 0.3
merge_method: sce
base_model: katanemo/Arch-Function-3B
parameters:
  int8_mask: true
  normalize: true
dtype: bfloat16
```

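If it helps, one way to reproduce the merge is to save the config above to a file and hand it to mergekit's `mergekit-yaml` command-line tool. A minimal sketch (the file name `ramius.yml` and output directory are arbitrary choices, not anything this card prescribes):

```python
# Sketch: write the SCE merge config shown above to disk and sanity-check it
# before running mergekit. File name and output directory are arbitrary.
from pathlib import Path

CONFIG = """\
models:
  - model: katanemo/Arch-Function-3B
    lambda: 1.0
    select_topk: 0.4
    weight: 0.7
  - model: cognitivecomputations/Dolphin3.0-Qwen2.5-3b
    density: 1.0
    lambda: 1.0
    select_topk: 0.6
    weight: 0.3
merge_method: sce
base_model: katanemo/Arch-Function-3B
parameters:
  int8_mask: true
  normalize: true
dtype: bfloat16
"""

Path("ramius.yml").write_text(CONFIG)

# Cheap sanity checks that don't need a YAML parser installed.
assert "merge_method: sce" in CONFIG
assert CONFIG.count("- model:") == 2

# The merge itself is a shell step (requires mergekit to be installed):
#   mergekit-yaml ramius.yml ./ramius-merged
```

`mergekit-yaml` then loads both source models and writes the merged weights to the output directory.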
## Bias, Risks, and Limitations

This model is uncensored and hallucinates frequently.

[More Information Needed]

### Recommendations

I use this with Ollama and Home Assistant via the Extended OpenAI Conversation integration.
It works best with a top_p of around 0.95 and a temperature of around 0.85.

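As a sketch of those settings in practice, here is a request to Ollama's `/api/chat` endpoint with sampling options set accordingly. The model name `ramius` is an assumption; use whatever name you registered the model under in Ollama.

```python
# Sketch: call Ramius through Ollama's /api/chat endpoint with the
# recommended sampling settings. The model name "ramius" is an assumption.
import json
import urllib.request

payload = {
    "model": "ramius",
    "messages": [{"role": "user", "content": "Turn off the kitchen lights."}],
    "stream": False,
    # Ollama takes sampling settings via the "options" field.
    "options": {"temperature": 0.85, "top_p": 0.95},
}

def chat(url: str = "http://localhost:11434/api/chat") -> dict:
    """POST the payload to a local Ollama server and return the reply."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# chat()  # needs a running Ollama server, so not invoked here
```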
I also recommend that you DO NOT put your entity states in your system prompt; instead, write functions that fetch the information on demand.
This keeps your system prompt static and more easily cached, which should reduce prompt processing time.
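To illustrate the idea, here is a function definition in the OpenAI function-calling schema that the model can invoke to look up an entity's state, so the system prompt never has to embed live states. The function name and description are illustrative, not the integration's exact configuration format.

```python
# Sketch: expose entity state through a function/tool definition instead of
# dumping states into the system prompt. Schema follows the OpenAI
# function-calling format; the name "get_entity_state" is hypothetical.
import json

get_entity_state = {
    "name": "get_entity_state",
    "description": "Return the current state of a Home Assistant entity.",
    "parameters": {
        "type": "object",
        "properties": {
            "entity_id": {
                "type": "string",
                "description": "Entity to query, e.g. 'light.kitchen'.",
            }
        },
        "required": ["entity_id"],
    },
}

# Because the system prompt no longer embeds live states, it stays
# byte-identical between turns and can be cached by the server.
print(json.dumps(get_entity_state, indent=2))
```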