|
|
--- |
|
|
license: mit |
|
|
datasets: |
|
|
- Anthropic/AnthropicInterviewer |
|
|
- openai/gsm8k |
|
|
- HuggingFaceH4/MATH-500 |
|
|
- CyberNative/Code_Vulnerability_Security_DPO |
|
|
- glaiveai/glaive-function-calling-v2 |
|
|
language: |
|
|
- en |
|
|
metrics: |
|
|
- accuracy |
|
|
base_model: |
|
|
- Qwen/Qwen2.5-Coder-7B-Instruct |
|
|
library_name: transformers |
|
|
tags: |
|
|
- code |
|
|
- agent |
|
|
--- |
|
|
# amara-o1 |
|
|
|
|
|
<div align="center"> |
|
|
|
|
|
<img src="https://i.postimg.cc/BZPP4RbY/amarao1.png" alt="amara-o1 banner" width="800"/> |
|
|
|
|
|
### A fine-tuned coding model built on Qwen for advanced problem-solving
|
|
|
|
|
[](https://opensource.org/licenses/MIT)


[](https://huggingface.co/ramdev12345/amara-o1)
|
|
|
|
|
[Demo](#usage) | [Training](#training) | [Benchmarks](#performance) | [Limitations](#limitations) |
|
|
|
|
|
</div> |
|
|
|
|
|
--- |
|
|
|
|
|
## Model Details |
|
|
|
|
|
**amara-o1** is a specialized coding assistant fine-tuned from Qwen2.5-Coder, optimized for: |
|
|
- 🧮 Complex algorithmic problem solving |
|
|
- 🔐 Secure code generation and vulnerability detection |
|
|
- 📊 Mathematical reasoning and computation |
|
|
- 💡 Multi-step reasoning for challenging tasks |
|
|
|
|
|
| Attribute | Details | |
|
|
|-----------|---------| |
|
|
| **Base Model** | Qwen/Qwen2.5-Coder-7B-Instruct | |
|
|
| **Parameters** | 7B | |
|
|
| **Training Method** | QLoRA (4-bit quantization) | |
|
|
| **LoRA Rank** | 64 | |
|
|
| **Context Length** | 32,768 tokens | |
|
|
| **License** | MIT | |
|
|
| **Languages** | Python, JavaScript, C++, Java, and 90+ more | |
|
|
|
|
|
--- |
|
|
|
|
|
## What Makes amara-o1 Different? |
|
|
|
|
|
amara-o1 has been fine-tuned on a carefully curated dataset combining: |
|
|
|
|
|
1. **🏆 Competitive Programming** - 5,000+ problems from Code Contests |
|
|
2. **🧮 Advanced Mathematics** - MATH-500 dataset for quantitative reasoning |
|
|
3. **🔐 Security-First Coding** - Vulnerability detection and secure programming patterns |
|
|
4. **💭 Deep Reasoning** - Anthropic's interview transcripts for complex problem decomposition |
|
|
|
|
|
This multi-domain training enables amara-o1 to: |
|
|
- Generate production-ready, secure code |
|
|
- Solve competitive programming challenges |
|
|
- Handle complex mathematical computations |
|
|
- Break down ambiguous problems systematically |
|
|
|
|
|
--- |
|
|
|
|
|
## Usage |
|
|
|
|
|
### Quick Start |
|
|
|
|
|
```python |
|
|
from transformers import AutoTokenizer, AutoModelForCausalLM |
|
|
import torch |
|
|
|
|
|
# Load model and tokenizer |
|
|
model_name = "ramdev12345/amara-o1" |
|
|
tokenizer = AutoTokenizer.from_pretrained(model_name) |
|
|
model = AutoModelForCausalLM.from_pretrained( |
|
|
model_name, |
|
|
torch_dtype=torch.bfloat16, |
|
|
device_map="auto" |
|
|
) |
|
|
|
|
|
# Generate code |
|
|
prompt = """<|im_start|>user |
|
|
Write a Python function to find the longest palindromic substring in a string using dynamic programming.<|im_end|> |
|
|
<|im_start|>assistant
"""
|
|
|
|
|
inputs = tokenizer(prompt, return_tensors="pt").to(model.device) |
|
|
outputs = model.generate( |
|
|
**inputs, |
|
|
max_new_tokens=512, |
|
|
temperature=0.7, |
|
|
top_p=0.9, |
|
|
do_sample=True |
|
|
) |
|
|
|
|
|
response = tokenizer.decode(outputs[0], skip_special_tokens=True) |
|
|
print(response) |
|
|
``` |
|
|
|
|
|
### With vLLM (Recommended for Production) |
|
|
|
|
|
```python |
|
|
from vllm import LLM, SamplingParams |
|
|
|
|
|
llm = LLM(model="ramdev12345/amara-o1") |
|
|
sampling_params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=512) |
|
|
|
|
|
prompts = [ |
|
|
"<|im_start|>user\nOptimize this bubble sort algorithm<|im_end|>\n<|im_start|>assistant" |
|
|
] |
|
|
|
|
|
outputs = llm.generate(prompts, sampling_params) |
|
|
for output in outputs: |
|
|
print(output.outputs[0].text) |
|
|
``` |
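
Recent vLLM releases also provide a chat helper that applies the model's chat template for you, avoiding hand-written ChatML markers. A brief sketch (the `LLM.chat` API depends on your vLLM version):

```python
# Assumes a recent vLLM version that exposes LLM.chat (applies the chat template).
messages = [{"role": "user", "content": "Optimize this bubble sort algorithm"}]
outputs = llm.chat(messages, sampling_params)
print(outputs[0].outputs[0].text)
```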
|
|
|
|
|
### Chat Template |
|
|
|
|
|
```python |
|
|
messages = [ |
|
|
{"role": "user", "content": "Write a binary search tree implementation in Python"} |
|
|
] |
|
|
|
|
|
text = tokenizer.apply_chat_template( |
|
|
messages, |
|
|
tokenize=False, |
|
|
add_generation_prompt=True |
|
|
) |
|
|
|
|
|
inputs = tokenizer([text], return_tensors="pt").to(model.device) |
|
|
outputs = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
|
|
``` |
|
|
|
|
|
--- |
|
|
|
|
|
## Training |
|
|
|
|
|
### Training Configuration |
|
|
|
|
|
| Parameter | Value | |
|
|
|-----------|-------| |
|
|
| Training Method | Supervised Fine-Tuning (SFT) with QLoRA | |
|
|
| Quantization | 4-bit NF4 | |
|
|
| LoRA Rank | 64 | |
|
|
| LoRA Alpha | 16 | |
|
|
| Batch Size | 1 (per device) | |
|
|
| Gradient Accumulation | 8 steps | |
|
|
| Learning Rate | 2e-4 | |
|
|
| LR Schedule | Cosine with warmup | |
|
|
| Epochs | 2 | |
|
|
| Training Examples | ~7,000 | |
|
|
| Hardware | 1x NVIDIA A100 80GB | |
|
|
| Training Time | ~3 hours | |
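
As a rough guide, these hyperparameters map onto TRL's `SFTConfig` as sketched below. This is an illustration of the configuration, not the actual training script; `output_dir` and `warmup_ratio` are hypothetical values:

```python
# Rough mapping of the hyperparameter table onto TRL's SFTConfig.
# output_dir and warmup_ratio are hypothetical; the real script is unpublished.
from trl import SFTConfig

training_args = SFTConfig(
    output_dir="amara-o1-sft",           # hypothetical
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,       # effective batch size of 8
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,                   # "with warmup"; exact ratio not stated
    num_train_epochs=2,
    bf16=True,
)
```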
|
|
|
|
|
### Training Datasets |
|
|
|
|
|
| Dataset | Examples | Purpose | |
|
|
|---------|----------|---------| |
|
|
| DeepMind Code Contests | 5,000 | Algorithmic problem solving | |
|
|
| MATH-500 | 500 | Mathematical reasoning | |
|
|
| Code Vulnerability Security DPO | 1,000 | Secure coding practices | |
|
|
| Anthropic Interviews | 1,000 | Complex reasoning patterns | |
|
|
|
|
|
### LoRA Target Modules |
|
|
|
|
|
``` |
|
|
q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
|
|
``` |
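
For readers who want to reproduce a similar setup, here is a minimal sketch of how the quantization and LoRA settings above might be expressed with `bitsandbytes` and PEFT. This is illustrative only (the actual training script is not published, and `lora_dropout` is an assumed value):

```python
# Illustrative QLoRA setup matching the configuration tables above;
# not the published training script. lora_dropout is an assumption.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # 4-bit NF4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-7B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=64,                                   # LoRA rank (from the table)
    lora_alpha=16,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.05,                      # assumed; not stated above
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```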
|
|
|
|
|
--- |
|
|
|
|
|
## Performance |
|
|
|
|
|
amara-o1 shows strong capabilities across a range of coding tasks; the table below gives a qualitative summary:
|
|
|
|
|
### Coding Capabilities |
|
|
|
|
|
| Task Type | Performance | Notes | |
|
|
|-----------|-------------|-------| |
|
|
| Algorithm Implementation | ⭐⭐⭐⭐⭐ | Excellent on competitive programming | |
|
|
| Code Security | ⭐⭐⭐⭐⭐ | Trained on vulnerability detection | |
|
|
| Math Problems | ⭐⭐⭐⭐ | Strong symbolic reasoning | |
|
|
| Code Explanation | ⭐⭐⭐⭐⭐ | Clear, detailed breakdowns | |
|
|
| Multi-language Support | ⭐⭐⭐⭐ | 90+ programming languages | |
|
|
|
|
|
### Example Outputs |
|
|
|
|
|
**Input:** "Write a function to detect if a linked list has a cycle" |
|
|
|
|
|
**Output:** |
|
|
```python |
|
|
def has_cycle(head): |
|
|
""" |
|
|
Detect cycle in linked list using Floyd's algorithm. |
|
|
Time: O(n), Space: O(1) |
|
|
""" |
|
|
if not head or not head.next: |
|
|
return False |
|
|
|
|
|
slow = head |
|
|
fast = head.next |
|
|
|
|
|
while slow != fast: |
|
|
if not fast or not fast.next: |
|
|
return False |
|
|
slow = slow.next |
|
|
fast = fast.next.next |
|
|
|
|
|
return True |
|
|
``` |
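
For reference, a minimal harness to exercise the generated function (this is illustrative test code, not part of the model output; the `ListNode` class is assumed):

```python
# Illustrative harness for the function above; not model output.
class ListNode:
    def __init__(self, val):
        self.val = val
        self.next = None

a, b, c = ListNode(1), ListNode(2), ListNode(3)
a.next, b.next, c.next = b, c, a   # 1 -> 2 -> 3 -> back to 1 (cycle)
print(has_cycle(a))                # True

c.next = None                      # break the cycle
print(has_cycle(a))                # False
```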
|
|
|
|
|
--- |
|
|
|
|
|
## Limitations |
|
|
|
|
|
While amara-o1 is a powerful coding assistant, users should be aware of: |
|
|
|
|
|
- **Not a Replacement for Testing**: Always test generated code thoroughly |
|
|
- **Security**: Review security-critical code manually |
|
|
- **Domain Expertise**: May require human oversight for specialized domains |
|
|
- **Hallucinations**: Like all LLMs, may occasionally generate incorrect information |
|
|
- **License Compliance**: Ensure generated code complies with your licensing requirements |
|
|
- **Bias**: May reflect biases present in training data |
|
|
|
|
|
--- |
|
|
|
|
|
## Ethical Considerations |
|
|
|
|
|
### Intended Use |
|
|
|
|
|
✅ **Recommended Uses:** |
|
|
- Educational programming assistance |
|
|
- Code prototyping and rapid development |
|
|
- Algorithm implementation |
|
|
- Security vulnerability analysis |
|
|
- Code review and optimization |
|
|
|
|
|
❌ **Not Recommended:** |
|
|
- Generating malicious code |
|
|
- Bypassing security measures |
|
|
- Automating critical systems without human oversight |
|
|
- Legal or financial decision-making |
|
|
|
|
|
### Bias and Safety |
|
|
|
|
|
amara-o1 has been trained on diverse coding datasets, but may still reflect biases in: |
|
|
- Programming paradigm preferences |
|
|
- Language-specific idioms |
|
|
- Solution approaches |
|
|
|
|
|
Users should: |
|
|
- Review outputs for appropriateness |
|
|
- Apply domain expertise |
|
|
- Follow security best practices |
|
|
- Test thoroughly before deployment |
|
|
|
|
|
--- |
|
|
|
|
|
## System Requirements |
|
|
|
|
|
### Minimum Requirements |
|
|
|
|
|
| Component | Requirement | |
|
|
|-----------|-------------| |
|
|
| GPU Memory | ~16GB in bf16; ~8GB with 4-bit quantization |
|
|
| RAM | 32GB recommended | |
|
|
| Storage | 15GB for model files | |
|
|
|
|
|
### Recommended Setup |
|
|
|
|
|
- **GPU**: NVIDIA A100, A6000, or RTX 4090 |
|
|
- **Inference**: Use vLLM or TGI for production |
|
|
- **Quantization**: 4-bit or 8-bit for resource constraints |
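
For GPUs at the lower end of these requirements, the model can be loaded in 4-bit; a minimal sketch (assumes the `bitsandbytes` package is installed):

```python
# Minimal 4-bit loading sketch for memory-constrained GPUs.
# Assumes the bitsandbytes package is installed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained("ramdev12345/amara-o1")
model = AutoModelForCausalLM.from_pretrained(
    "ramdev12345/amara-o1",
    quantization_config=bnb,
    device_map="auto",
)
```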
|
|
|
|
|
--- |
|
|
|
|
|
## Citation |
|
|
|
|
|
If you use amara-o1 in your research or applications, please cite: |
|
|
|
|
|
```bibtex |
|
|
@misc{amara-o1-2024, |
|
|
title={amara-o1: A Fine-tuned Coding Model for Advanced Problem Solving}, |
|
|
author={ramdev12345}, |
|
|
year={2024}, |
|
|
howpublished={\url{https://huggingface.co/ramdev12345/amara-o1}}, |
|
|
} |
|
|
``` |
|
|
|
|
|
### Base Model Citation |
|
|
|
|
|
```bibtex |
|
|
@article{qwen2.5, |
|
|
title={Qwen2.5-Coder Technical Report}, |
|
|
author={Qwen Team}, |
|
|
journal={arXiv preprint}, |
|
|
year={2024} |
|
|
} |
|
|
``` |
|
|
|
|
|
--- |
|
|
|
|
|
## License |
|
|
|
|
|
This model is released under the **MIT License**. See [LICENSE](LICENSE) for details. |
|
|
|
|
|
Note that the base model (Qwen2.5-Coder-7B-Instruct) is distributed under the Apache 2.0 license; use of this fine-tune should also comply with its terms.
|
|
|
|
|
--- |
|
|
|
|
|
## Acknowledgments |
|
|
|
|
|
- **Base Model**: Qwen Team for Qwen2.5-Coder |
|
|
- **Training Datasets**: DeepMind, Hugging Face, CyberNative, Anthropic |
|
|
- **Infrastructure**: Modal Labs for training infrastructure |
|
|
- **Framework**: Hugging Face Transformers, PEFT, TRL |
|
|
|
|
|
--- |
|
|
|
|
|
## Contact & Support |
|
|
|
|
|
- **Issues**: [GitHub Issues](https://github.com/ramdev2025/amara-o1/issues) |
|
|
- **Discussions**: [Hugging Face Discussions](https://huggingface.co/ramdev12345/amara-o1/discussions) |
|
|
- **Email**: [ramdevcalope2015@gmail.com](mailto:ramdevcalope2015@gmail.com)
|
|
|
|
|
**Open to work opportunities! Feel free to reach out via email.**
|
|
--- |
|
|
|
|
|
<div align="center"> |
|
|
|
|
|
**Built with 💻 for the coding community** |
|
|
|
|
|
⭐ Star this repo | 🐛 Report bugs | 🤝 Contribute |
|
|
|
|
|
</div> |