---
base_model:
- Qwen/Qwen2.5-3B-Instruct
tags:
- text-generation-inference
- transformers
- qwen2
- trl
- grpo
license: apache-2.0
language:
- zho
- eng
- fra
- spa
- por
- deu
- ita
- rus
- jpn
- kor
- vie
- tha
- ara
---
# Clyrai Secure Reasoning Model (Formerly known as TBH.AI_Base_Reasoning)
- **Developed by:** Clyrai
- **License:** apache-2.0
- **Fine-tuned from:** Qwen/Qwen2.5-3B-Instruct
- **Fine-tuning Method:** GRPO (Group Relative Policy Optimization)
- **Inspired by:** DeepSeek-R1
## **Model Description**
Clyrai Secure Reasoning Model is a cutting-edge AI model designed for secure, reliable, and structured reasoning. Fine-tuned on Qwen 2.5 using GRPO, it enhances logical reasoning, decision-making, and problem-solving capabilities while maintaining a strong focus on reducing AI hallucinations and ensuring factual accuracy.
Unlike conventional language models that rely primarily on knowledge retrieval, Clyrai's model is designed to autonomously engage with complex problems, breaking them down into structured thought processes. Inspired by DeepSeek-R1, it employs advanced reinforcement learning methodologies that allow it to validate and refine its logical conclusions securely and effectively.
This model is particularly suited for tasks requiring high-level reasoning, structured analysis, and problem-solving in critical domains such as cybersecurity, finance, and research. It is ideal for professionals and organizations seeking AI solutions that prioritize security, transparency, and truthfulness.
## **Features**
- **Secure Self-Reasoning Capabilities:** Independently analyzes problems while ensuring factual consistency.
- **Reinforcement Learning with GRPO:** Fine-tuned using policy optimization techniques for logical precision.
- **Multi-Step Logical Deduction:** Breaks down complex queries into structured, step-by-step responses.
- **Industry-Ready Security Focus:** Ideal for cybersecurity, finance, and high-stakes applications requiring trust and reliability.
## **Limitations**
- Requires well-structured prompts for optimal reasoning depth.
- Not optimized for tasks requiring extensive factual recall beyond its training scope.
- Performance depends on reinforcement learning techniques and fine-tuning datasets.
## **Usage**
To use this model for secure text generation and reasoning tasks, follow the structure below:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("saishshinde15/Clyrai_Base_Reasoning")
model = AutoModelForCausalLM.from_pretrained("saishshinde15/Clyrai_Base_Reasoning")

# Move the model to GPU if available
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# System prompt enforcing the structured reasoning format
SYSTEM_PROMPT = """
Respond in the following format:
<reasoning>
...
</reasoning>
<answer>
...
</answer>
"""

# Prepare the input prompt using the chat template
text = tokenizer.apply_chat_template([
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "Solve 2x + 3 = 4."},
], tokenize=False, add_generation_prompt=True)

# Tokenize input and generate a response
input_ids = tokenizer(text, return_tensors="pt").input_ids.to(device)
output = model.generate(
    input_ids,
    do_sample=True,
    temperature=0.8,
    top_p=0.95,
    max_new_tokens=1024,
)

# Decode and print the output
output_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(output_text)
```
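Because the model emits its response inside `<reasoning>` and `<answer>` tags, the two parts can be split out with a small helper before further processing. The sketch below is illustrative (the `parse_response` helper and the sample string are not part of the model's API):

```python
import re

def parse_response(completion: str) -> dict:
    """Split a model completion into its <reasoning> and <answer> parts."""
    parts = {}
    for tag in ("reasoning", "answer"):
        match = re.search(rf"<{tag}>(.*?)</{tag}>", completion, re.DOTALL)
        # Missing tags yield None so malformed outputs are easy to detect
        parts[tag] = match.group(1).strip() if match else None
    return parts

sample = """<reasoning>
Subtract 3 from both sides: 2x = 1, so x = 1/2.
</reasoning>
<answer>
x = 1/2
</answer>"""

print(parse_response(sample))
# {'reasoning': 'Subtract 3 from both sides: 2x = 1, so x = 1/2.', 'answer': 'x = 1/2'}
```

Checking for `None` values is a simple way to reject completions that drift out of the fine-tuned format.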
<details>
<summary>Fast inference</summary>

The snippet below assumes the model was loaded with Unsloth's `FastLanguageModel`, which provides the `fast_generate` and `load_lora` methods, and that a GRPO LoRA adapter was saved locally as `grpo_saved_lora`.

```shell
pip install transformers vllm vllm[lora] torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
```

```python
from vllm import SamplingParams

text = tokenizer.apply_chat_template([
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "Solve 2x + 3 = 4."},
], tokenize=False, add_generation_prompt=True)

sampling_params = SamplingParams(
    temperature=0.8,
    top_p=0.95,
    max_tokens=1024,
)
output = model.fast_generate(
    text,
    sampling_params=sampling_params,
    lora_request=model.load_lora("grpo_saved_lora"),
)[0].outputs[0].text
print(output)
```
</details>
# Recommended Prompt
Use the following system prompt for best results; the model was fine-tuned to respond in this structure:
```text
You are a secure reasoning model developed by TBH.AI. Your role is to respond in the following structured format:
<reasoning>
...
</reasoning>
<answer>
...
</answer>
```