DeepSeek-R1-Dyck-Finetuned
Model Description
This model is a fine-tuned version of unsloth/DeepSeek-R1-Distill-Llama-8B specifically optimized for Dyck language bracket completion tasks. The model has been trained to complete incomplete Dyck bracket sequences by tracking the stack of open brackets and generating the appropriate closing brackets.
Key Features
- Reasoning Capability: Generates step-by-step reasoning inside <think> blocks before providing the final answer
- Dyck Language Completion: Accurately completes bracket sequences for 8 different bracket types: (), [], {}, <>, ⟨⟩, ⟦⟧, ⦃⦄, ⦅⦆
- LoRA Fine-tuning: Uses Low-Rank Adaptation (LoRA) for efficient training
- High Accuracy: Trained on 60k diverse Dyck sequence examples
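The completion task the model learns is itself deterministic. For reference, a minimal stack-tracking implementation of the same procedure (an illustrative helper, not part of the model) looks like this:

```python
# Reference implementation of the stack-based completion the model is trained to mimic.
PAIRS = {"(": ")", "[": "]", "{": "}", "<": ">",
         "⟨": "⟩", "⟦": "⟧", "⦃": "⦄", "⦅": "⦆"}
CLOSERS = {v: k for k, v in PAIRS.items()}

def complete_dyck(prefix: str) -> str:
    """Return prefix plus the closing brackets needed to balance it."""
    stack = []
    for ch in prefix:
        if ch in PAIRS:
            stack.append(ch)            # push opener
        elif ch in CLOSERS:
            if not stack or stack[-1] != CLOSERS[ch]:
                raise ValueError(f"mismatched closer {ch!r}")
            stack.pop()                 # matched closer
    # pop remaining openers in reverse order and emit their closers
    return prefix + "".join(PAIRS[c] for c in reversed(stack))

print(complete_dyck("([{<"))  # → ([{<>}])
```

The model is trained to reproduce this pop-in-reverse-order reasoning in natural language inside its <think> block.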
Training Details
Training Data
- Dataset: 60k Dyck language sequences
- Train/Val Split: 95%/5%
- Format: Chat template with system/user/assistant messages
- Reasoning: All samples include <think> reasoning blocks
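An illustrative training record in this chat format (the field layout is an assumption based on the description above; the system prompt is taken from the Usage section):

```python
# One hypothetical training example in the system/user/assistant chat format.
example = {
    "messages": [
        {"role": "system",
         "content": "You are a logic engine. Complete the Dyck bracket sequence "
                    "by tracking the stack of open brackets."},
        {"role": "user", "content": "([{<"},
        {"role": "assistant",
         "content": "<think>\nPush '(', '[', '{', '<'; "
                    "pop in reverse order to close.\n</think>\n([{<>}])"},
    ]
}
# The final answer follows the closing </think> tag.
print(example["messages"][2]["content"].split("</think>")[-1].strip())
```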
Training Configuration
- Base Model: unsloth/DeepSeek-R1-Distill-Llama-8B
- LoRA Rank: 32 (attention layers only)
- LoRA Alpha: 64
- LoRA Dropout: 0.25
- Learning Rate: 3e-6
- Batch Size: 4 per device × 32 gradient-accumulation steps (effective batch: 128)
- Epochs: 4
- Warmup: 30% of total steps
- Gradient Clipping: 0.05
- Optimizer: AdamW
- Scheduler: Linear
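The training script itself is not published; the following sketch shows how the hyperparameters above would map onto Unsloth and Hugging Face TrainingArguments (illustrative only, not the authors' code; `output_dir` and dataset wiring are placeholders):

```python
from unsloth import FastLanguageModel
from transformers import TrainingArguments

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/DeepSeek-R1-Distill-Llama-8B",
    max_seq_length=2048,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=32,                                                     # LoRA rank
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention layers only
    lora_alpha=64,
    lora_dropout=0.25,
)
args = TrainingArguments(
    output_dir="outputs",               # placeholder
    per_device_train_batch_size=4,
    gradient_accumulation_steps=32,     # effective batch: 128
    learning_rate=3e-6,
    num_train_epochs=4,
    warmup_ratio=0.30,
    max_grad_norm=0.05,                 # strict gradient clipping
    optim="adamw_torch",
    lr_scheduler_type="linear",
    bf16=True,
)
```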
Training Hardware
- GPU: Single 40 GB GPU
- Precision: bfloat16
- Training Time: ~6-8 hours
Usage
Installation
pip install unsloth transformers
Loading the Model
from unsloth import FastLanguageModel
# Load LoRA adapters
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="akashdutta1030/dddd",
    max_seq_length=2048,
    dtype=None,
    load_in_4bit=False,  # Use True for 4-bit quantization
)
FastLanguageModel.for_inference(model)
Inference Example
import torch

messages = [
    {
        "role": "system",
        "content": "You are a logic engine. Complete the Dyck bracket sequence by tracking the stack of open brackets."
    },
    {
        "role": "user",
        "content": "([{<"
    }
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([prompt], return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.1,
        top_p=0.95,
        do_sample=True,
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=False)
print(response)
Expected Output Format
The model generates responses in the following format:
<think>
1. Input sequence: (, [, {, <
2. Maintain a stack of opening brackets:
- Push '(' -> Stack: ['(']
- Push '[' -> Stack: ['(', '[']
- Push '{' -> Stack: ['(', '[', '{']
- Push '<' -> Stack: ['(', '[', '{', '<']
3. To close the sequence, pop from the stack in reverse order:
- Pop '<' -> Closing: '>'
- Pop '{' -> Closing: '}'
- Pop '[' -> Closing: ']'
- Pop '(' -> Closing: ')'
4. Appending closing brackets to input: ([{<>}])
</think>
([{<>}])
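Downstream use usually needs only the final answer, not the reasoning. One way to strip the <think>…</think> block from a raw generation (a simple helper, assuming the output format shown above):

```python
import re

def extract_answer(response: str) -> str:
    """Drop the <think>...</think> reasoning block and return the final answer."""
    # Remove the reasoning block (non-greedy, DOTALL so it spans multiple lines)
    answer = re.sub(r"<think>.*?</think>", "", response, flags=re.DOTALL)
    # Strip any leftover special tokens such as <|eot_id|>, then whitespace
    answer = re.sub(r"<\|[^|]*\|>", "", answer)
    return answer.strip()

raw = "<think>\n1. Push '(' ...\n4. Append closers.\n</think>\n([{<>}])"
print(extract_answer(raw))  # → ([{<>}])
```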
Model Architecture
- Base Architecture: Llama-based (DeepSeek-R1)
- Parameters: 8B base model
- LoRA Parameters: ~167M trainable parameters (1.8% of base model)
- Target Modules: Attention layers only (q_proj, k_proj, v_proj, o_proj)
Performance
Training and validation were monitored throughout:
- Training Loss: decreased smoothly across all 4 epochs
- Validation Loss: evaluated every 200 steps on the held-out 5% split
- Gradient Stability: maintained with strict gradient clipping (0.05)
- Reasoning Quality: the model generates detailed step-by-step reasoning
Limitations
- The model is specifically trained for Dyck language bracket completion
- Performance may vary on sequences with very deep nesting (>20 levels)
- Requires proper formatting with chat template for best results
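Because accuracy can degrade at deep nesting, model outputs should be verified rather than trusted. A small checker (illustrative, with an optional depth bound reflecting the limitation above):

```python
PAIRS = {"(": ")", "[": "]", "{": "}", "<": ">",
         "⟨": "⟩", "⟦": "⟧", "⦃": "⦄", "⦅": "⦆"}

def is_balanced(seq: str, max_depth=None) -> bool:
    """Check that seq is a valid Dyck word, optionally bounding nesting depth."""
    closers = {v: k for k, v in PAIRS.items()}
    stack = []
    for ch in seq:
        if ch in PAIRS:
            stack.append(ch)
            if max_depth is not None and len(stack) > max_depth:
                return False            # deeper than the trusted nesting range
        elif ch in closers:
            if not stack or stack.pop() != closers[ch]:
                return False            # wrong or unmatched closer
    return not stack                    # balanced only if every opener was closed

print(is_balanced("([{<>}])"))  # → True
print(is_balanced("([)]"))      # → False
```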
Citation
If you use this model, please cite:
@misc{deepseek-r1-dyck-finetuned,
title={DeepSeek-R1-Dyck-Finetuned: Bracket Completion Model},
author={Fine-tuned on DeepSeek-R1-Distill-Llama-8B},
year={2024},
howpublished={\url{https://huggingface.co/akashdutta1030/dddd}}
}
License
This model is licensed under the Apache 2.0 license.
Acknowledgments
- Base model: DeepSeek-R1-Distill-Llama-8B
- Training framework: Unsloth
- Fine-tuning approach: LoRA (Low-Rank Adaptation)