saishshinde15 committed on
Commit
f4d0e5f
·
verified ·
1 Parent(s): 2e5a73d

Update README.md

Files changed (1)
  1. README.md +142 -142
README.md CHANGED
@@ -1,142 +1,142 @@
1
- ---
2
- base_model:
3
- - Qwen/Qwen2.5-3B-Instruct
4
- tags:
5
- - text-generation-inference
6
- - transformers
7
- - qwen2
8
- - trl
9
- - grpo
10
- license: apache-2.0
11
- language:
12
- - zho
13
- - eng
14
- - fra
15
- - spa
16
- - por
17
- - deu
18
- - ita
19
- - rus
20
- - jpn
21
- - kor
22
- - vie
23
- - tha
24
- - ara
25
- ---
26
-
27
- # TBH.AI Secure Reasoning Model
28
-
29
- - **Developed by:** TBH.AI
30
- - **License:** apache-2.0
31
- - **Fine-tuned from:** Qwen/Qwen2.5-3B-Instruct
32
- - **Fine-tuning Method:** GRPO (General Reinforcement with Policy Optimization)
33
- - **Inspired by:** DeepSeek-R1
34
-
35
- ## **Model Description**
36
- TBH.AI Secure Reasoning Model is a cutting-edge AI model designed for secure, reliable, and structured reasoning. Fine-tuned on Qwen 2.5 using GRPO, it enhances logical reasoning, decision-making, and problem-solving capabilities while maintaining a strong focus on reducing AI hallucinations and ensuring factual accuracy.
37
-
38
- Unlike conventional language models that rely primarily on knowledge retrieval, TBH.AI's model is designed to autonomously engage with complex problems, breaking them down into structured thought processes. Inspired by DeepSeek-R1, it employs advanced reinforcement learning methodologies that allow it to validate and refine its logical conclusions securely and effectively.
39
-
40
- This model is particularly suited for tasks requiring high-level reasoning, structured analysis, and problem-solving in critical domains such as cybersecurity, finance, and research. It is ideal for professionals and organizations seeking AI solutions that prioritize security, transparency, and truthfulness.
41
-
42
- ## **Features**
43
- - **Secure Self-Reasoning Capabilities:** Independently analyzes problems while ensuring factual consistency.
44
- - **Reinforcement Learning with GRPO:** Fine-tuned using policy optimization techniques for logical precision.
45
- - **Multi-Step Logical Deduction:** Breaks down complex queries into structured, step-by-step responses.
46
- - **Industry-Ready Security Focus:** Ideal for cybersecurity, finance, and high-stakes applications requiring trust and reliability.
47
-
48
- ## **Limitations**
49
- - Requires well-structured prompts for optimal reasoning depth.
50
- - Not optimized for tasks requiring extensive factual recall beyond its training scope.
51
- - Performance depends on reinforcement learning techniques and fine-tuning datasets.
52
-
53
- ## **Usage**
54
- To use this model for secure text generation and reasoning tasks, follow the structure below:
55
- ```python
56
- from transformers import AutoTokenizer, AutoModelForCausalLM
57
- import torch
58
-
59
- # Load tokenizer and model
60
- tokenizer = AutoTokenizer.from_pretrained("saishshinde15/TBH.AI_Base_Reasoning")
61
- model = AutoModelForCausalLM.from_pretrained("saishshinde15/TBH.AI_Base_Reasoning")
62
-
63
- # Prepare input prompt using chat template
64
- SYSTEM_PROMPT = """
65
- Respond in the following format:
66
- <reasoning>
67
- ...
68
- </reasoning>
69
- <answer>
70
- ...
71
- </answer>
72
- """
73
- text = tokenizer.apply_chat_template([
74
- {"role": "system", "content": SYSTEM_PROMPT},
75
- {"role": "user", "content": "What is 2x+3=4"},
76
- ], tokenize=False, add_generation_prompt=True)
77
-
78
- # Tokenize input
79
- input_ids = tokenizer(text, return_tensors="pt").input_ids
80
-
81
- # Move to GPU if available
82
- device = "cuda" if torch.cuda.is_available() else "cpu"
83
- model.to(device)
84
- input_ids = input_ids.to(device)
85
-
86
- # Generate response
87
- from vllm import SamplingParams
88
- sampling_params = SamplingParams(
89
- temperature=0.8,
90
- top_p=0.95,
91
- max_tokens=1024,
92
- )
93
- output = model.generate(
94
- input_ids,
95
- sampling_params=sampling_params,
96
- )
97
-
98
- # Decode and print output
99
- output_text = tokenizer.decode(output[0], skip_special_tokens=True)
100
- print(output_text)
101
- ```
102
-
103
- <details>
104
- <summary>Fast inference</summary>
105
-
106
- ```python
107
- pip install transformers vllm vllm[lora] torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
108
-
109
- text = tokenizer.apply_chat_template([
110
- {"role" : "system", "content" : SYSTEM_PROMPT},
111
- {"role" : "user", "content" : "What is 2x+3=4"},
112
- ], tokenize = False, add_generation_prompt = True)
113
-
114
- from vllm import SamplingParams
115
- sampling_params = SamplingParams(
116
- temperature = 0.8,
117
- top_p = 0.95,
118
- max_tokens = 1024,
119
- )
120
- output = model.fast_generate(
121
- text,
122
- sampling_params = sampling_params,
123
- lora_request = model.load_lora("grpo_saved_lora"),
124
- )[0].outputs[0].text
125
-
126
- output
127
- ```
128
- </details>
129
-
130
- # Recommended Prompt
131
- Use the following prompt for detailed and personalized results. This is the recommended format as the model was fine-tuned to respond in this structure:
132
-
133
- ```python
134
- You are a secure reasoning model developed by TBH.AI. Your role is to respond in the following structured format:
135
-
136
- <reasoning>
137
- ...
138
- </reasoning>
139
- <answer>
140
- ...
141
- </answer>
142
- ```
 
1
+ ---
2
+ base_model:
3
+ - Qwen/Qwen2.5-3B-Instruct
4
+ tags:
5
+ - text-generation-inference
6
+ - transformers
7
+ - qwen2
8
+ - trl
9
+ - grpo
10
+ license: apache-2.0
11
+ language:
12
+ - zho
13
+ - eng
14
+ - fra
15
+ - spa
16
+ - por
17
+ - deu
18
+ - ita
19
+ - rus
20
+ - jpn
21
+ - kor
22
+ - vie
23
+ - tha
24
+ - ara
25
+ ---
26
+
27
+ # Clyrai Secure Reasoning Model
28
+
29
+ - **Developed by:** Clyrai
30
+ - **License:** apache-2.0
31
+ - **Fine-tuned from:** Qwen/Qwen2.5-3B-Instruct
32
+ - **Fine-tuning Method:** GRPO (Group Relative Policy Optimization)
33
+ - **Inspired by:** DeepSeek-R1
34
+
35
+ ## **Model Description**
36
+ Clyrai Secure Reasoning Model is designed for secure, reliable, and structured reasoning. Fine-tuned from Qwen2.5-3B-Instruct using GRPO, it targets logical reasoning, decision-making, and problem-solving, with a focus on reducing hallucinations and improving factual accuracy.
37
+
38
+ Unlike conventional language models that rely primarily on knowledge retrieval, the Clyrai model is designed to engage with complex problems autonomously, breaking them down into structured thought processes. Inspired by DeepSeek-R1, it uses reinforcement learning to validate and refine its logical conclusions.
39
+
40
+ This model is particularly suited for tasks requiring high-level reasoning, structured analysis, and problem-solving in critical domains such as cybersecurity, finance, and research. It is ideal for professionals and organizations seeking AI solutions that prioritize security, transparency, and truthfulness.
41
+
42
+ ## **Features**
43
+ - **Secure Self-Reasoning Capabilities:** Independently analyzes problems while ensuring factual consistency.
44
+ - **Reinforcement Learning with GRPO:** Fine-tuned using policy optimization techniques for logical precision.
45
+ - **Multi-Step Logical Deduction:** Breaks down complex queries into structured, step-by-step responses.
46
+ - **Industry-Ready Security Focus:** Ideal for cybersecurity, finance, and high-stakes applications requiring trust and reliability.
47
+
48
+ ## **Limitations**
49
+ - Requires well-structured prompts for optimal reasoning depth.
50
+ - Not optimized for tasks requiring extensive factual recall beyond its training scope.
51
+ - Performance depends on reinforcement learning techniques and fine-tuning datasets.
52
+
53
+ ## **Usage**
54
+ To use this model for secure text generation and reasoning tasks, follow the structure below:
55
+ ```python
56
+ from transformers import AutoTokenizer, AutoModelForCausalLM
57
+ import torch
58
+
59
+ # Load tokenizer and model
60
+ tokenizer = AutoTokenizer.from_pretrained("saishshinde15/Clyrai_Base_Reasoning")
61
+ model = AutoModelForCausalLM.from_pretrained("saishshinde15/Clyrai_Base_Reasoning")
62
+
63
+ # Prepare input prompt using chat template
64
+ SYSTEM_PROMPT = """
65
+ Respond in the following format:
66
+ <reasoning>
67
+ ...
68
+ </reasoning>
69
+ <answer>
70
+ ...
71
+ </answer>
72
+ """
73
+ text = tokenizer.apply_chat_template([
74
+ {"role": "system", "content": SYSTEM_PROMPT},
75
+ {"role": "user", "content": "What is 2x+3=4"},
76
+ ], tokenize=False, add_generation_prompt=True)
77
+
78
+ # Tokenize input
79
+ input_ids = tokenizer(text, return_tensors="pt").input_ids
80
+
81
+ # Move to GPU if available
82
+ device = "cuda" if torch.cuda.is_available() else "cpu"
83
+ model.to(device)
84
+ input_ids = input_ids.to(device)
85
+
86
+ # Generate response
87
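+ # Note: transformers' generate() does not accept vLLM's SamplingParams
88
+ # (that class belongs to the vLLM path under "Fast inference"),
89
+ # so the same sampling settings are passed as keyword arguments.
90
+ output = model.generate(
91
+ input_ids,
92
+ max_new_tokens=1024,
93
+ do_sample=True,  # needed for temperature and top_p to take effect
94
+ temperature=0.8,
95
+ top_p=0.95,
96
+ )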
97
+
98
+ # Decode and print output
99
+ output_text = tokenizer.decode(output[0], skip_special_tokens=True)
100
+ print(output_text)
101
+ ```
102
+
103
+ <details>
104
+ <summary>Fast inference</summary>
105
+
106
+ ```python
107
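+ # Run in a shell, not in Python: pip install transformers vllm vllm[lora] torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu118
108
+ # Assumption: fast_generate() and load_lora() are Unsloth methods, so `model` here must be loaded via Unsloth's FastLanguageModel with fast_inference=True.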
109
+ text = tokenizer.apply_chat_template([
110
+ {"role" : "system", "content" : SYSTEM_PROMPT},
111
+ {"role" : "user", "content" : "What is 2x+3=4"},
112
+ ], tokenize = False, add_generation_prompt = True)
113
+
114
+ from vllm import SamplingParams
115
+ sampling_params = SamplingParams(
116
+ temperature = 0.8,
117
+ top_p = 0.95,
118
+ max_tokens = 1024,
119
+ )
120
+ output = model.fast_generate(
121
+ text,
122
+ sampling_params = sampling_params,
123
+ lora_request = model.load_lora("grpo_saved_lora"),
124
+ )[0].outputs[0].text
125
+
126
+ output
127
+ ```
128
+ </details>
129
+
130
+ ## **Recommended Prompt**
131
+ Use the following prompt for detailed and personalized results. This is the recommended format as the model was fine-tuned to respond in this structure:
132
+
133
+ ```text
134
+ You are a secure reasoning model developed by TBH.AI. Your role is to respond in the following structured format:
135
+
136
+ <reasoning>
137
+ ...
138
+ </reasoning>
139
+ <answer>
140
+ ...
141
+ </answer>
142
+ ```
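The recommended prompt asks the model to wrap its output in `<reasoning>` and `<answer>` tags, so downstream code will usually want to separate the two parts. The following is a minimal sketch of one way to do that, assuming the completion follows the format. The helper name `parse_response` and the sample string are hypothetical and not part of the model repository.

```python
import re
from typing import Optional, Tuple

# Hypothetical helper (not shipped with the model): extract the two tagged
# sections from a completion that follows the recommended format.
_TAG = r"<{name}>\s*(.*?)\s*</{name}>"


def parse_response(text: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (reasoning, answer); a part is None if its tags are missing."""

    def grab(name: str) -> Optional[str]:
        # DOTALL lets the section body span multiple lines.
        match = re.search(_TAG.format(name=name), text, flags=re.DOTALL)
        return match.group(1) if match else None

    return grab("reasoning"), grab("answer")


# Illustrative completion, written by hand for this example.
sample = (
    "<reasoning>\n2x = 1, so x = 0.5\n</reasoning>\n"
    "<answer>\nx = 0.5\n</answer>"
)
reasoning, answer = parse_response(sample)
print(answer)
```

Returning `None` for a missing section, rather than raising, lets the caller decide how to handle completions that drift from the format.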