ziadrone/oneplusaries2

Safetensors

qwen3

ziadrone committed on Jun 3, 2025

Commit 9d8010b (verified) · Parent: 980df7c

Upload model card

Files changed (1)

README.md +10 -52

@@ -1,56 +1,14 @@
----
-license: apache-2.0
-tags:
-- fine-tuned
-- text-generation
-- qwen
-# Add your base model tag e.g., - qwen3-1.7b
-- oneplusaries2
-# Add task-specific tags:
-# - math-reasoning
-# - tree-of-thoughts
-# - custom-pipeline
-pipeline_tag: text-generation
----
-# Fine-tuned Model: ziadrone/oneplusaries2
-This model is a fine-tuned version of `Qwen/Qwen3-1.7B`.
-It has undergone a custom fine-tuning process which may include techniques like Tree-of-Thoughts data generation and/or specific policy optimization methods (e.g., GRPO).
-## Fine-tuning Details
-- **Base Model**: `Qwen/Qwen3-1.7B`
-- **Fine-tuning Data Source**: Data was likely generated or selected based on problems from sources like `HuggingFaceH4/MATH-500` and/or other custom datasets, processed to align with a structured reasoning format.
-  - The SFT/generated dataset associated with this model (if pushed) might be found at: [huggingface.co/datasets/ziadrone/dataset-for-oneplusaries2](https://huggingface.co/datasets/ziadrone/dataset-for-oneplusaries2)
-- **Training Objective**: To improve performance on tasks requiring step-by-step reasoning and to adhere to specific structured output formats (e.g., involving `<reasoning>` and `<answer>` tags).
-## Intended Uses & Limitations
-This model is the result of an experimental fine-tuning process. Its performance should be carefully evaluated for your specific use case. It is primarily aimed at tasks that benefit from detailed, structured reasoning.
-## How to Use
-```python
-from transformers import AutoTokenizer, AutoModelForCausalLM
-model_id = "ziadrone/oneplusaries2"
-tokenizer = AutoTokenizer.from_pretrained(model_id)
-model = AutoModelForCausalLM.from_pretrained(model_id)
-# To use with a GPU:
-# model.to("cuda")
-# Example prompt structure (adapt to your model's training):
-# SYSTEM_PROMPT = "Your system prompt here..." # The system prompt used during training
-# user_problem = "Your problem statement here..."
-# messages = [
-#     {"role": "system", "content": SYSTEM_PROMPT},
-#     {"role": "user", "content": user_problem}
-# ]
-# input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
-# inputs = tokenizer(input_text, return_tensors="pt") # .to("cuda" if using GPU)
-# outputs = model.generate(**inputs, max_new_tokens=512, pad_token_id=tokenizer.pad_token_id or tokenizer.eos_token_id)
-# response_text = tokenizer.decode(outputs, skip_special_tokens=True)
-# # Note: The response_text might include the prompt depending on generation settings.
-# # You might need to slice it to get only the generated part.
-# # generated_output = response_text[len(input_text):] if response_text.startswith(input_text) else response_text
-# print(response_text) ```

+# ToT-Reasoner-Qwen3-1.7B
+## Model Description
+Fine-tuned `ziadrone/oneplusaries1` using Supervised Fine-Tuning (SFT) on `open-r1/Mixture-of-Thoughts` (math split). Optimized for mathematical reasoning.
+## Training Data
+- **Source**: `open-r1/Mixture-of-Thoughts` (math split, up to 50 samples).
+- **Format**: Prompts with `<reasoning>...</reasoning><answer>...</answer>` structure.
+## Fine-Tuning Process
+- **Method**: SFT with learning rate=1e-5, 3 epochs, batch size=1.
+- **Setup**: Google Colab Pro with T4 GPU.
+## Usage
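
The hunk ends at the `## Usage` heading without showing an example. As a minimal sketch, assuming the model emits the `<reasoning>...</reasoning><answer>...</answer>` structure named under Training Data, a caller might split a completion into its two parts as follows. The helper `parse_structured_output` and the sample string are hypothetical and not part of the repository.

```python
import re
from typing import NamedTuple, Optional


class StructuredOutput(NamedTuple):
    """Holds the two tagged parts of a completion (None if a tag is missing)."""
    reasoning: Optional[str]
    answer: Optional[str]


def _extract(tag: str, text: str) -> Optional[str]:
    # Non-greedy match so only the first <tag>...</tag> pair is captured;
    # DOTALL lets the reasoning span multiple lines.
    match = re.search(rf"<{tag}>(.*?)</{tag}>", text, flags=re.DOTALL)
    return match.group(1).strip() if match else None


def parse_structured_output(text: str) -> StructuredOutput:
    """Hypothetical helper: pull the reasoning and answer out of a completion."""
    return StructuredOutput(_extract("reasoning", text), _extract("answer", text))


# Illustrative string only, not real model output.
sample = "<reasoning>2 + 2 equals 4.</reasoning><answer>4</answer>"
parsed = parse_structured_output(sample)
print(parsed.answer)
```

Returning `None` for a missing tag, rather than raising, lets the caller decide how to handle completions that do not follow the format, which seems plausible for a model trained on up to 50 samples.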