Improve language tag

by lbourdois - opened Apr 28, 2025
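
As a rough illustration of what the change amounts to, the sketch below pairs each three-letter ISO 639-3 code that the diff adds with its two-letter ISO 639-1 code and English name, then renders a `language:` list in the shape used by the README front matter. The `LANGUAGES` table and the `front_matter_language_block` helper are hypothetical names introduced here for illustration; they are not part of the PR or of any Hub tooling.

```python
# Hypothetical lookup table: ISO 639-3 code -> (ISO 639-1 code, English name).
# The thirteen keys are the codes that appear in the diff's new front matter.
LANGUAGES = {
    "zho": ("zh", "Chinese"),
    "eng": ("en", "English"),
    "fra": ("fr", "French"),
    "spa": ("es", "Spanish"),
    "por": ("pt", "Portuguese"),
    "deu": ("de", "German"),
    "ita": ("it", "Italian"),
    "rus": ("ru", "Russian"),
    "jpn": ("ja", "Japanese"),
    "kor": ("ko", "Korean"),
    "vie": ("vi", "Vietnamese"),
    "tha": ("th", "Thai"),
    "ara": ("ar", "Arabic"),
}


def front_matter_language_block(codes):
    """Render a YAML `language:` list for the given ISO 639-3 codes."""
    unknown = [code for code in codes if code not in LANGUAGES]
    if unknown:
        raise ValueError(f"unrecognized language codes: {unknown}")
    return "language:\n" + "\n".join(f"- {code}" for code in codes)


if __name__ == "__main__":
    # Prints the same list of codes, in the same order, as the new front matter.
    print(front_matter_language_block(list(LANGUAGES)))
```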

base: refs/heads/main ← from: refs/pr/1

README.md +142 -130

@@ -1,130 +1,142 @@
----
-base_model:
-- Qwen/Qwen2.5-3B-Instruct
-tags:
-- text-generation-inference
-- transformers
-- qwen2
-- trl
-- grpo
-license: apache-2.0
-language:
-- en
----
-# TBH.AI Secure Reasoning Model
-- **Developed by:** TBH.AI
-- **License:** apache-2.0
-- **Fine-tuned from:** Qwen/Qwen2.5-3B-Instruct
-- **Fine-tuning Method:** GRPO (General Reinforcement with Policy Optimization)
-- **Inspired by:** DeepSeek-R1
-## **Model Description**
-TBH.AI Secure Reasoning Model is a cutting-edge AI model designed for secure, reliable, and structured reasoning. Fine-tuned on Qwen 2.5 using GRPO, it enhances logical reasoning, decision-making, and problem-solving capabilities while maintaining a strong focus on reducing AI hallucinations and ensuring factual accuracy.
-Unlike conventional language models that rely primarily on knowledge retrieval, TBH.AI's model is designed to autonomously engage with complex problems, breaking them down into structured thought processes. Inspired by DeepSeek-R1, it employs advanced reinforcement learning methodologies that allow it to validate and refine its logical conclusions securely and effectively.
-This model is particularly suited for tasks requiring high-level reasoning, structured analysis, and problem-solving in critical domains such as cybersecurity, finance, and research. It is ideal for professionals and organizations seeking AI solutions that prioritize security, transparency, and truthfulness.
-## **Features**
-- **Secure Self-Reasoning Capabilities:** Independently analyzes problems while ensuring factual consistency.
-- **Reinforcement Learning with GRPO:** Fine-tuned using policy optimization techniques for logical precision.
-- **Multi-Step Logical Deduction:** Breaks down complex queries into structured, step-by-step responses.
-- **Industry-Ready Security Focus:** Ideal for cybersecurity, finance, and high-stakes applications requiring trust and reliability.
-## **Limitations**
-- Requires well-structured prompts for optimal reasoning depth.
-- Not optimized for tasks requiring extensive factual recall beyond its training scope.
-- Performance depends on reinforcement learning techniques and fine-tuning datasets.
-## **Usage**
-To use this model for secure text generation and reasoning tasks, follow the structure below:
-```python
-from transformers import AutoTokenizer, AutoModelForCausalLM
-import torch
-# Load tokenizer and model
-tokenizer = AutoTokenizer.from_pretrained("saishshinde15/TBH.AI_Base_Reasoning")
-model = AutoModelForCausalLM.from_pretrained("saishshinde15/TBH.AI_Base_Reasoning")
-# Prepare input prompt using chat template
-SYSTEM_PROMPT = """
-Respond in the following format:
-<reasoning>
-...
-</reasoning>
-<answer>
-...
-</answer>
-"""
-text = tokenizer.apply_chat_template([
-    {"role": "system", "content": SYSTEM_PROMPT},
-    {"role": "user", "content": "What is 2x+3=4"},
-], tokenize=False, add_generation_prompt=True)
-# Tokenize input
-input_ids = tokenizer(text, return_tensors="pt").input_ids
-# Move to GPU if available
-device = "cuda" if torch.cuda.is_available() else "cpu"
-model.to(device)
-input_ids = input_ids.to(device)
-# Generate response
-from vllm import SamplingParams
-sampling_params = SamplingParams(
-    temperature=0.8,
-    top_p=0.95,
-    max_tokens=1024,
-)
-output = model.generate(
-    input_ids,
-    sampling_params=sampling_params,
-)
-# Decode and print output
-output_text = tokenizer.decode(output[0], skip_special_tokens=True)
-print(output_text)
-```
-<details>
-<summary>Fast inference</summary>
-```python
-pip install transformers vllm vllm[lora] torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
-text = tokenizer.apply_chat_template([
-    {"role" : "system", "content" : SYSTEM_PROMPT},
-    {"role" : "user", "content" : "What is 2x+3=4"},
-], tokenize = False, add_generation_prompt = True)
-from vllm import SamplingParams
-sampling_params = SamplingParams(
-    temperature = 0.8,
-    top_p = 0.95,
-    max_tokens = 1024,
-)
-output = model.fast_generate(
-    text,
-    sampling_params = sampling_params,
-    lora_request = model.load_lora("grpo_saved_lora"),
-)[0].outputs[0].text
-output
-```
-</details>
-# Recommended Prompt
-Use the following prompt for detailed and personalized results. This is the recommended format as the model was fine-tuned to respond in this structure:
-```python
-You are a secure reasoning model developed by TBH.AI. Your role is to respond in the following structured format:
-<reasoning>
-...
-</reasoning>
-<answer>
-...
-</answer>
-```

+---
+base_model:
+- Qwen/Qwen2.5-3B-Instruct
+tags:
+- text-generation-inference
+- transformers
+- qwen2
+- trl
+- grpo
+license: apache-2.0
+language:
+- zho
+- eng
+- fra
+- spa
+- por
+- deu
+- ita
+- rus
+- jpn
+- kor
+- vie
+- tha
+- ara
+---
+# TBH.AI Secure Reasoning Model
+- **Developed by:** TBH.AI
+- **License:** apache-2.0
+- **Fine-tuned from:** Qwen/Qwen2.5-3B-Instruct
+- **Fine-tuning Method:** GRPO (General Reinforcement with Policy Optimization)
+- **Inspired by:** DeepSeek-R1
+## **Model Description**
+TBH.AI Secure Reasoning Model is a cutting-edge AI model designed for secure, reliable, and structured reasoning. Fine-tuned on Qwen 2.5 using GRPO, it enhances logical reasoning, decision-making, and problem-solving capabilities while maintaining a strong focus on reducing AI hallucinations and ensuring factual accuracy.
+Unlike conventional language models that rely primarily on knowledge retrieval, TBH.AI's model is designed to autonomously engage with complex problems, breaking them down into structured thought processes. Inspired by DeepSeek-R1, it employs advanced reinforcement learning methodologies that allow it to validate and refine its logical conclusions securely and effectively.
+This model is particularly suited for tasks requiring high-level reasoning, structured analysis, and problem-solving in critical domains such as cybersecurity, finance, and research. It is ideal for professionals and organizations seeking AI solutions that prioritize security, transparency, and truthfulness.
+## **Features**
+- **Secure Self-Reasoning Capabilities:** Independently analyzes problems while ensuring factual consistency.
+- **Reinforcement Learning with GRPO:** Fine-tuned using policy optimization techniques for logical precision.
+- **Multi-Step Logical Deduction:** Breaks down complex queries into structured, step-by-step responses.
+- **Industry-Ready Security Focus:** Ideal for cybersecurity, finance, and high-stakes applications requiring trust and reliability.
+## **Limitations**
+- Requires well-structured prompts for optimal reasoning depth.
+- Not optimized for tasks requiring extensive factual recall beyond its training scope.
+- Performance depends on reinforcement learning techniques and fine-tuning datasets.
+## **Usage**
+To use this model for secure text generation and reasoning tasks, follow the structure below:
+```python
+from transformers import AutoTokenizer, AutoModelForCausalLM
+import torch
+# Load tokenizer and model
+tokenizer = AutoTokenizer.from_pretrained("saishshinde15/TBH.AI_Base_Reasoning")
+model = AutoModelForCausalLM.from_pretrained("saishshinde15/TBH.AI_Base_Reasoning")
+# Prepare input prompt using chat template
+SYSTEM_PROMPT = """
+Respond in the following format:
+<reasoning>
+...
+</reasoning>
+<answer>
+...
+</answer>
+"""
+text = tokenizer.apply_chat_template([
+    {"role": "system", "content": SYSTEM_PROMPT},
+    {"role": "user", "content": "What is 2x+3=4"},
+], tokenize=False, add_generation_prompt=True)
+# Tokenize input
+input_ids = tokenizer(text, return_tensors="pt").input_ids
+# Move to GPU if available
+device = "cuda" if torch.cuda.is_available() else "cpu"
+model.to(device)
+input_ids = input_ids.to(device)
+# Generate response
+from vllm import SamplingParams
+sampling_params = SamplingParams(
+    temperature=0.8,
+    top_p=0.95,
+    max_tokens=1024,
+)
+output = model.generate(
+    input_ids,
+    sampling_params=sampling_params,
+)
+# Decode and print output
+output_text = tokenizer.decode(output[0], skip_special_tokens=True)
+print(output_text)
+```
+<details>
+<summary>Fast inference</summary>
+```python
+pip install transformers vllm vllm[lora] torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
+text = tokenizer.apply_chat_template([
+    {"role" : "system", "content" : SYSTEM_PROMPT},
+    {"role" : "user", "content" : "What is 2x+3=4"},
+], tokenize = False, add_generation_prompt = True)
+from vllm import SamplingParams
+sampling_params = SamplingParams(
+    temperature = 0.8,
+    top_p = 0.95,
+    max_tokens = 1024,
+)
+output = model.fast_generate(
+    text,
+    sampling_params = sampling_params,
+    lora_request = model.load_lora("grpo_saved_lora"),
+)[0].outputs[0].text
+output
+```
+</details>
+# Recommended Prompt
+Use the following prompt for detailed and personalized results. This is the recommended format as the model was fine-tuned to respond in this structure:
+```python
+You are a secure reasoning model developed by TBH.AI. Your role is to respond in the following structured format:
+<reasoning>
+...
+</reasoning>
+<answer>
+...
+</answer>
+```
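
The README's usage snippet passes a vLLM `SamplingParams` object to a Transformers model's `generate`, which Transformers is not expected to accept. The following is a minimal sketch, not the author's method, of a Transformers-only variant that uses the recommended prompt and then splits the reply into its `<reasoning>` and `<answer>` parts. The helper names `generate` and `parse_response` are hypothetical, and the sampling values are copied from the README. The model id is left as a parameter because the id in the README may differ from the repository's current name.

```python
import re

# System prompt taken from the README's "Recommended Prompt" section.
SYSTEM_PROMPT = """You are a secure reasoning model developed by TBH.AI. Your role is to respond in the following structured format:
<reasoning>
...
</reasoning>
<answer>
...
</answer>
"""

_TAG = r"<{name}>\s*(.*?)\s*</{name}>"


def parse_response(text):
    """Return the reasoning and answer sections, or None where a tag is missing."""
    result = {}
    for name in ("reasoning", "answer"):
        match = re.search(_TAG.format(name=name), text, flags=re.DOTALL)
        result[name] = match.group(1) if match else None
    return result


def generate(question, model_id, max_new_tokens=1024):
    """Generate a reply with Transformers only (hypothetical helper)."""
    # Imported lazily so parse_response stays usable without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    model.to("cuda" if torch.cuda.is_available() else "cpu")

    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": question},
    ]
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)

    # Sampling settings mirror the values in the README's SamplingParams.
    outputs = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=0.8,
        top_p=0.95,
    )
    prompt_length = inputs["input_ids"].shape[-1]
    return tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)


if __name__ == "__main__":
    # Parsing a hand-written reply; no model download is needed for this part.
    sample = (
        "<reasoning>\nSubtract 3 from both sides: 2x = 1, so x = 1/2.\n</reasoning>\n"
        "<answer>\nx = 1/2\n</answer>"
    )
    print(parse_response(sample)["answer"])
```

To run the model itself, one would call something like `parse_response(generate("What is 2x+3=4", "saishshinde15/TBH.AI_Base_Reasoning"))`, substituting the repository's current id if it has been renamed.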