|
|
--- |
|
|
license: mit |
|
|
datasets: |
|
|
- Anthropic/AnthropicInterviewer |
|
|
- openai/gsm8k |
|
|
- HuggingFaceH4/MATH-500 |
|
|
- CyberNative/Code_Vulnerability_Security_DPO |
|
|
- glaiveai/glaive-function-calling-v2 |
|
|
language: |
|
|
- en |
|
|
metrics: |
|
|
- accuracy |
|
|
base_model: |
|
|
- Qwen/Qwen2.5-Coder-7B-Instruct |
|
|
library_name: transformers |
|
|
tags: |
|
|
- code |
|
|
- agent |
|
|
--- |
|
|
# amara-o1 |
|
|
|
|
|
<div align="center"> |
|
|
|
|
|
<img src="https://i.postimg.cc/BZPP4RbY/amarao1.png" alt="amara-o1 banner" width="800"/> |
|
|
|
|
|
### A fine-tuned coding model built on Qwen for advanced problem-solving
|
|
|
|
|
[](https://opensource.org/licenses/MIT)


[](https://huggingface.co/ramdev12345/amara-o1)
|
|
|
|
|
[Demo](#usage) | [Training](#training) | [Benchmarks](#performance) | [Limitations](#limitations) |
|
|
|
|
|
</div> |
|
|
|
|
|
--- |
|
|
|
|
|
## Model Details |
|
|
|
|
|
**amara-o1** is a specialized coding assistant fine-tuned from Qwen2.5-Coder, optimized for: |
|
|
- 🧮 Complex algorithmic problem solving |
|
|
- 🔐 Secure code generation and vulnerability detection |
|
|
- 📊 Mathematical reasoning and computation |
|
|
- 💡 Multi-step reasoning for challenging tasks |
|
|
|
|
|
| Attribute | Details | |
|
|
|-----------|---------| |
|
|
| **Base Model** | Qwen/Qwen2.5-Coder-7B-Instruct | |
|
|
| **Parameters** | 7B | |
|
|
| **Training Method** | QLoRA (4-bit quantization) | |
|
|
| **LoRA Rank** | 64 | |
|
|
| **Context Length** | 32,768 tokens | |
|
|
| **License** | MIT | |
|
|
| **Languages** | Python, JavaScript, C++, Java, and 90+ more | |
|
|
|
|
|
--- |
|
|
|
|
|
## What Makes amara-o1 Different? |
|
|
|
|
|
amara-o1 has been fine-tuned on a carefully curated dataset combining: |
|
|
|
|
|
1. **🏆 Competitive Programming** - 5,000+ problems from Code Contests |
|
|
2. **🧮 Advanced Mathematics** - MATH-500 dataset for quantitative reasoning |
|
|
3. **🔐 Security-First Coding** - Vulnerability detection and secure programming patterns |
|
|
4. **💭 Deep Reasoning** - Anthropic's interview transcripts for complex problem decomposition |
|
|
|
|
|
This multi-domain training enables amara-o1 to: |
|
|
- Generate production-ready, secure code |
|
|
- Solve competitive programming challenges |
|
|
- Handle complex mathematical computations |
|
|
- Break down ambiguous problems systematically |
|
|
|
|
|
--- |
|
|
|
|
|
## Usage |
|
|
|
|
|
### Quick Start |
|
|
|
|
|
```python |
|
|
from transformers import AutoTokenizer, AutoModelForCausalLM |
|
|
import torch |
|
|
|
|
|
# Load model and tokenizer |
|
|
model_name = "ramdev12345/amara-o1" |
|
|
tokenizer = AutoTokenizer.from_pretrained(model_name) |
|
|
model = AutoModelForCausalLM.from_pretrained( |
|
|
model_name, |
|
|
torch_dtype=torch.bfloat16, |
|
|
device_map="auto" |
|
|
) |
|
|
|
|
|
# Generate code |
|
|
prompt = """<|im_start|>user |
|
|
Write a Python function to find the longest palindromic substring in a string using dynamic programming.<|im_end|> |
|
|
<|im_start|>assistant
"""
|
|
|
|
|
inputs = tokenizer(prompt, return_tensors="pt").to(model.device) |
|
|
outputs = model.generate( |
|
|
**inputs, |
|
|
max_new_tokens=512, |
|
|
temperature=0.7, |
|
|
top_p=0.9, |
|
|
do_sample=True |
|
|
) |
|
|
|
|
|
response = tokenizer.decode(outputs[0], skip_special_tokens=True) |
|
|
print(response) |
|
|
``` |
|
|
|
|
|
### With vLLM (Recommended for Production) |
|
|
|
|
|
```python |
|
|
from vllm import LLM, SamplingParams |
|
|
|
|
|
llm = LLM(model="ramdev12345/amara-o1") |
|
|
sampling_params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=512) |
|
|
|
|
|
prompts = [ |
|
|
"<|im_start|>user\nOptimize this bubble sort algorithm<|im_end|>\n<|im_start|>assistant" |
|
|
] |
|
|
|
|
|
outputs = llm.generate(prompts, sampling_params) |
|
|
for output in outputs: |
|
|
print(output.outputs[0].text) |
|
|
``` |
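
Recent vLLM releases also provide a chat helper that applies the model's chat template for you, avoiding hand-written ChatML markers. A brief sketch (the `LLM.chat` API depends on your vLLM version):

```python
# Assumes a recent vLLM version that exposes LLM.chat (applies the chat template).
messages = [{"role": "user", "content": "Optimize this bubble sort algorithm"}]
outputs = llm.chat(messages, sampling_params)
print(outputs[0].outputs[0].text)
```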
|
|
|
|
|
### Chat Template |
|
|
|
|
|
```python |
|
|
messages = [ |
|
|
{"role": "user", "content": "Write a binary search tree implementation in Python"} |
|
|
] |
|
|
|
|
|
text = tokenizer.apply_chat_template( |
|
|
messages, |
|
|
tokenize=False, |
|
|
add_generation_prompt=True |
|
|
) |
|
|
|
|
|
inputs = tokenizer([text], return_tensors="pt").to(model.device) |
|
|
outputs = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
|
|
``` |
|
|
|
|
|
--- |
|
|
|
|
|
## Training |
|
|
|
|
|
### Training Configuration |
|
|
|
|
|
| Parameter | Value | |
|
|
|-----------|-------| |
|
|
| Training Method | Supervised Fine-Tuning (SFT) with QLoRA | |
|
|
| Quantization | 4-bit NF4 | |
|
|
| LoRA Rank | 64 | |
|
|
| LoRA Alpha | 16 | |
|
|
| Batch Size | 1 (per device) | |
|
|
| Gradient Accumulation | 8 steps | |
|
|
| Learning Rate | 2e-4 | |
|
|
| LR Schedule | Cosine with warmup | |
|
|
| Epochs | 2 | |
|
|
| Training Examples | ~7,000 | |
|
|
| Hardware | 1x NVIDIA A100 80GB | |
|
|
| Training Time | ~3 hours | |
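
As a rough guide, these hyperparameters map onto TRL's `SFTConfig` as sketched below. This is an illustration of the configuration, not the actual training script; `output_dir` and `warmup_ratio` are hypothetical values:

```python
# Rough mapping of the hyperparameter table onto TRL's SFTConfig.
# output_dir and warmup_ratio are hypothetical; the real script is unpublished.
from trl import SFTConfig

training_args = SFTConfig(
    output_dir="amara-o1-sft",           # hypothetical
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,       # effective batch size of 8
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,                   # "with warmup"; exact ratio not stated
    num_train_epochs=2,
    bf16=True,
)
```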
|
|
|
|
|
### Training Datasets |
|
|
|
|
|
| Dataset | Examples | Purpose | |
|
|
|---------|----------|---------| |
|
|
| DeepMind Code Contests | 5,000 | Algorithmic problem solving | |
|
|
| MATH-500 | 500 | Mathematical reasoning | |
|
|
| Code Vulnerability Security DPO | 1,000 | Secure coding practices | |
|
|
| Anthropic Interviews | 1,000 | Complex reasoning patterns | |
|
|
|
|
|
### LoRA Target Modules |
|
|
|
|
|
``` |
|
|
q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
|
|
``` |
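
For readers who want to reproduce a similar setup, here is a minimal sketch of how the quantization and LoRA settings above might be expressed with `bitsandbytes` and PEFT. This is illustrative only (the actual training script is not published, and `lora_dropout` is an assumed value):

```python
# Illustrative QLoRA setup matching the configuration tables above;
# not the published training script. lora_dropout is an assumption.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # 4-bit NF4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-7B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=64,                                   # LoRA rank (from the table)
    lora_alpha=16,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.05,                      # assumed; not stated above
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```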
|
|
|
|
|
--- |
|
|
|
|
|
## Performance |
|
|
|
|
|
amara-o1 shows strong capabilities across a range of coding tasks; the table below gives a qualitative summary:
|
|
|
|
|
### Coding Capabilities |
|
|
|
|
|
| Task Type | Performance | Notes | |
|
|
|-----------|-------------|-------| |
|
|
| Algorithm Implementation | ⭐⭐⭐⭐⭐ | Excellent on competitive programming | |
|
|
| Code Security | ⭐⭐⭐⭐⭐ | Trained on vulnerability detection | |
|
|
| Math Problems | ⭐⭐⭐⭐ | Strong symbolic reasoning | |
|
|
| Code Explanation | ⭐⭐⭐⭐⭐ | Clear, detailed breakdowns | |
|
|
| Multi-language Support | ⭐⭐⭐⭐ | 90+ programming languages | |
|
|
|
|
|
### Example Outputs |
|
|
|
|
|
**Input:** "Write a function to detect if a linked list has a cycle" |
|
|
|
|
|
**Output:** |
|
|
```python |
|
|
def has_cycle(head): |
|
|
""" |
|
|
Detect cycle in linked list using Floyd's algorithm. |
|
|
Time: O(n), Space: O(1) |
|
|
""" |
|
|
if not head or not head.next: |
|
|
return False |
|
|
|
|
|
slow = head |
|
|
fast = head.next |
|
|
|
|
|
while slow != fast: |
|
|
if not fast or not fast.next: |
|
|
return False |
|
|
slow = slow.next |
|
|
fast = fast.next.next |
|
|
|
|
|
return True |
|
|
``` |
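
For reference, a minimal harness to exercise the generated function (this is illustrative test code, not part of the model output; the `ListNode` class is assumed):

```python
# Illustrative harness for the function above; not model output.
class ListNode:
    def __init__(self, val):
        self.val = val
        self.next = None

a, b, c = ListNode(1), ListNode(2), ListNode(3)
a.next, b.next, c.next = b, c, a   # 1 -> 2 -> 3 -> back to 1 (cycle)
print(has_cycle(a))                # True

c.next = None                      # break the cycle
print(has_cycle(a))                # False
```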
|
|
|
|
|
--- |
|
|
|
|
|
## Limitations |
|
|
|
|
|
While amara-o1 is a powerful coding assistant, users should be aware of: |
|
|
|
|
|
- **Not a Replacement for Testing**: Always test generated code thoroughly |
|
|
- **Security**: Review security-critical code manually |
|
|
- **Domain Expertise**: May require human oversight for specialized domains |
|
|
- **Hallucinations**: Like all LLMs, may occasionally generate incorrect information |
|
|
- **License Compliance**: Ensure generated code complies with your licensing requirements |
|
|
- **Bias**: May reflect biases present in training data |
|
|
|
|
|
--- |
|
|
|
|
|
## Ethical Considerations |
|
|
|
|
|
### Intended Use |
|
|
|
|
|
✅ **Recommended Uses:** |
|
|
- Educational programming assistance |
|
|
- Code prototyping and rapid development |
|
|
- Algorithm implementation |
|
|
- Security vulnerability analysis |
|
|
- Code review and optimization |
|
|
|
|
|
❌ **Not Recommended:** |
|
|
- Generating malicious code |
|
|
- Bypassing security measures |
|
|
- Automating critical systems without human oversight |
|
|
- Legal or financial decision-making |
|
|
|
|
|
### Bias and Safety |
|
|
|
|
|
amara-o1 has been trained on diverse coding datasets, but may still reflect biases in: |
|
|
- Programming paradigm preferences |
|
|
- Language-specific idioms |
|
|
- Solution approaches |
|
|
|
|
|
Users should: |
|
|
- Review outputs for appropriateness |
|
|
- Apply domain expertise |
|
|
- Follow security best practices |
|
|
- Test thoroughly before deployment |
|
|
|
|
|
--- |
|
|
|
|
|
## System Requirements |
|
|
|
|
|
### Minimum Requirements |
|
|
|
|
|
| Component | Requirement | |
|
|
|-----------|-------------| |
|
|
| GPU Memory | ~16GB in bf16; ~8GB with 4-bit quantization |
|
|
| RAM | 32GB recommended | |
|
|
| Storage | 15GB for model files | |
|
|
|
|
|
### Recommended Setup |
|
|
|
|
|
- **GPU**: NVIDIA A100, A6000, or RTX 4090 |
|
|
- **Inference**: Use vLLM or TGI for production |
|
|
- **Quantization**: 4-bit or 8-bit for resource constraints |
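
For GPUs at the lower end of these requirements, the model can be loaded in 4-bit; a minimal sketch (assumes the `bitsandbytes` package is installed):

```python
# Minimal 4-bit loading sketch for memory-constrained GPUs.
# Assumes the bitsandbytes package is installed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained("ramdev12345/amara-o1")
model = AutoModelForCausalLM.from_pretrained(
    "ramdev12345/amara-o1",
    quantization_config=bnb,
    device_map="auto",
)
```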
|
|
|
|
|
--- |
|
|
|
|
|
## Citation |
|
|
|
|
|
If you use amara-o1 in your research or applications, please cite: |
|
|
|
|
|
```bibtex |
|
|
@misc{amara-o1-2024, |
|
|
title={amara-o1: A Fine-tuned Coding Model for Advanced Problem Solving}, |
|
|
author={ramdev12345}, |
|
|
year={2024}, |
|
|
howpublished={\url{https://huggingface.co/ramdev12345/amara-o1}}, |
|
|
} |
|
|
``` |
|
|
|
|
|
### Base Model Citation |
|
|
|
|
|
```bibtex |
|
|
@article{qwen2.5, |
|
|
title={Qwen2.5-Coder Technical Report}, |
|
|
author={Qwen Team}, |
|
|
journal={arXiv preprint}, |
|
|
year={2024} |
|
|
} |
|
|
``` |
|
|
|
|
|
--- |
|
|
|
|
|
## License |
|
|
|
|
|
This model is released under the **MIT License**. See [LICENSE](LICENSE) for details. |
|
|
|
|
|
Note that the base model (Qwen2.5-Coder-7B-Instruct) is distributed under the Apache 2.0 license; use of this fine-tune should also comply with its terms.
|
|
|
|
|
--- |
|
|
|
|
|
## Acknowledgments |
|
|
|
|
|
- **Base Model**: Qwen Team for Qwen2.5-Coder |
|
|
- **Training Datasets**: DeepMind, Hugging Face, CyberNative, Anthropic |
|
|
- **Infrastructure**: Modal Labs for training infrastructure |
|
|
- **Framework**: Hugging Face Transformers, PEFT, TRL |
|
|
|
|
|
--- |
|
|
|
|
|
## Contact & Support |
|
|
|
|
|
- **Issues**: [GitHub Issues](https://github.com/ramdev2025/amara-o1/issues) |
|
|
- **Discussions**: [Hugging Face Discussions](https://huggingface.co/ramdev12345/amara-o1/discussions) |
|
|
- **Email**: [ramdevcalope2015@gmail.com](mailto:ramdevcalope2015@gmail.com)
|
|
|
|
|
**Open to work opportunities! Feel free to reach out via email.**
|
|
--- |
|
|
|
|
|
<div align="center"> |
|
|
|
|
|
**Built with 💻 for the coding community** |
|
|
|
|
|
⭐ Star this repo | 🐛 Report bugs | 🤝 Contribute |
|
|
|
|
|
</div> |