alphaaico committed on
Commit 840dd86 · verified · 1 Parent(s): 78472ea

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -33,13 +33,13 @@ datasets:
 
  ## Overview
 
- Welcome to the next evolution of AI reasoning! Reason-With-Choice-3B is not just another fine-tuned model, its a game-changer. It doesn't just generate reasoning, it chooses whether reasoning is even necessary before delivering an answer. This self-reflective capability allows it to introspect, analyze, and adapt to the complexity of each question, ensuring the most efficient and insightful response possible.
+ Welcome to the next evolution of AI reasoning! Reason-With-Choice-3B is not just another fine-tuned model, it's a game-changer. It doesn't just generate reasoning, it chooses whether reasoning is even necessary before delivering an answer. This self-reflective capability allows it to introspect, analyze, and adapt to the complexity of each question, ensuring the most efficient and insightful response possible.
 
  Think about it: most AI models blindly generate reasoning even when unnecessary, leading to bloated, redundant responses. Not this one. With its built-in decision-making, Reason-With-Choice-3B determines if deep reasoning is needed or if a direct answer will suffice—bringing unparalleled efficiency and intelligence to your AI-driven applications.
 
  ## Key Highlights
  - Reasoning & Self-Reflection: The model first decides if reasoning is necessary and then either provides step-by-step logic or directly answers the question.
- - Structured Output: Responses follow a strict format with <think>, <reflection>, and <answer> sections, ensuring clarity and interpretability.
+ - Structured Output: Responses follow a strict format with `<think>`, `<reflection>`, and `<answer>` sections, ensuring clarity and interpretability.
  - Optimized Training: Trained using GRPO (Guided Reward Policy Optimization) to enforce structured responses and improve decision-making.
  - Efficient Inference: Fine-tuned with Unsloth & Hugging Face’s TRL, ensuring faster inference speeds and optimized resource utilization.
 
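
The structured output described in the README could be consumed with a small parser. The sketch below is illustrative only. The tag names `think`, `reflection`, and `answer` come from the README, but the use of paired closing tags, the possibility that a direct answer omits the reasoning sections, and the helper name `parse_response` are all assumptions, not documented behavior of the model.

```python
import re

# Tag names are taken from the README; everything else here is an assumption.
SECTION_TAGS = ("think", "reflection", "answer")


def parse_response(text: str) -> dict:
    """Return the content of each tagged section, or None if the section is absent.

    Hypothetical helper: assumes each section is wrapped in a paired
    <tag>...</tag> block, which the README does not state explicitly.
    """
    sections = {}
    for tag in SECTION_TAGS:
        # Non-greedy match so only the first block for each tag is captured.
        match = re.search(rf"<{tag}>(.*?)</{tag}>", text, flags=re.DOTALL)
        sections[tag] = match.group(1).strip() if match else None
    return sections


# Made-up sample responses, not real model output.
with_reasoning = (
    "<think>2 + 2 is simple arithmetic.</think>\n"
    "<reflection>No further steps needed.</reflection>\n"
    "<answer>4</answer>"
)
direct_answer = "<answer>Paris</answer>"

print(parse_response(with_reasoning))
print(parse_response(direct_answer))
```

Returning `None` for a missing section lets calling code distinguish a direct answer from one that went through reasoning, without assuming which sections the model emits in each case.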