AiCIPPY-Coder
The Agentic Coding Intelligence behind AiCIPPY
by AiVedha · AiVibe Software Services Private Limited
aicippy.com · aivedha.ai · aivibe.cloud · PyPI
Highlights
We are releasing AiCIPPY-Coder — the open-weight coding intelligence model powering the AiCIPPY agent platform. Built for real-world agentic software development, this model is the foundation of AiCIPPY's CLI and IDE-integrated coding workflows.
- Efficient Yet Powerful: With only 3B activated parameters (80B total), AiCIPPY-Coder delivers performance comparable to models with 10–20x more active parameters — making it highly cost-effective for production agent deployment at scale.
- Advanced Agentic Capabilities: Trained with an elaborate agentic recipe, the model excels at long-horizon reasoning, complex multi-step tool usage, and graceful recovery from execution failures — essential for robust real-world coding tasks.
- Seamless IDE and CLI Integration: A native 256K context window, combined with full adaptability to diverse scaffold templates, enables plug-and-play integration with CLI agents (including AiCIPPY CLI), VS Code extensions, and platforms such as Cline, Kilo, Trae, and others.
Model Overview
AiCIPPY-Coder carries the following architecture:
| Property | Value |
|---|---|
| Model Type | Causal Language Model |
| Training Stage | Pretraining & Post-training |
| Total Parameters | 80B |
| Activated Parameters | 3B |
| Non-Embedding Parameters | 79B |
| Hidden Dimension | 2048 |
| Number of Layers | 48 |
| Context Length | 262,144 tokens (native) |
| Thinking Mode | Non-thinking (no <think> blocks) |
Architecture Details:
- Hybrid Layout: 12 × (3 × (Gated DeltaNet → MoE) → 1 × (Gated Attention → MoE))
- Gated Attention: 16 heads for Q, 2 for KV, Head Dim 256, RoPE Dim 64
- Gated DeltaNet: 32 heads for V, 16 for QK, Head Dim 128
- Mixture of Experts: 512 total experts, 10 activated, 1 shared, Expert Intermediate Dim 512
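As a sanity check on the layout, the short sketch below expands the hybrid pattern into a per-layer list and counts the layers. It assumes each of the 12 repeated blocks holds three Gated DeltaNet layers followed by one Gated Attention layer, each paired with an MoE feed-forward. The function and variable names are illustrative and not part of any released code.

```python
from collections import Counter

def expand_layout(repeats=12, deltanet_per_block=3, attention_per_block=1):
    """Expand the hybrid layout into an ordered list of token-mixer types."""
    block = (["gated_deltanet"] * deltanet_per_block
             + ["gated_attention"] * attention_per_block)
    return block * repeats

layers = expand_layout()
counts = Counter(layers)

print(len(layers))                 # total number of layers
print(counts["gated_deltanet"])    # linear-attention (Gated DeltaNet) layers
print(counts["gated_attention"])   # full-attention (Gated Attention) layers
```

Under this reading the pattern yields 48 layers in total (36 Gated DeltaNet and 12 Gated Attention), which agrees with the layer count in the table.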
Note: This model operates in non-thinking mode only. It does not generate <think></think> blocks in its output, so setting enable_thinking=False is not required.
Quickstart
Ensure you are using the latest version of transformers before proceeding.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "aivedha/aicippy-Coder"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)

# Prepare input
prompt = "Write a quick sort algorithm."
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=65536,
)

# Keep only the newly generated tokens (drop the prompt prefix)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
content = tokenizer.decode(output_ids, skip_special_tokens=True)
print("AiCIPPY-Coder:", content)
Note: If you encounter out-of-memory (OOM) issues, reduce the context length, for example to 32,768 tokens.
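To see roughly how context length affects memory, the sketch below estimates the key/value cache size from the architecture figures in the Model Overview (2 KV heads, head dimension 256, and 12 Gated Attention layers out of 48). It assumes 2-byte (bf16/fp16) cache entries and that only the Gated Attention layers keep a per-token cache, while the Gated DeltaNet layers hold a fixed-size state. It ignores weights and activations, so treat it as a back-of-the-envelope figure rather than a measurement.

```python
def kv_cache_bytes(context_tokens, attn_layers=12, kv_heads=2,
                   head_dim=256, bytes_per_value=2, batch_size=1):
    """Estimate KV-cache size, counting the full-attention layers only."""
    # Factor 2 covers one key vector and one value vector per head.
    per_token = 2 * attn_layers * kv_heads * head_dim * bytes_per_value
    return batch_size * context_tokens * per_token

GIB = 1024 ** 3
for tokens in (262_144, 32_768):
    print(f"{tokens} tokens: {kv_cache_bytes(tokens) / GIB:.2f} GiB")
```

Under these assumptions the cache is about 6 GiB per sequence at 262,144 tokens and 0.75 GiB at 32,768 tokens.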
For local use, AiCIPPY-Coder is compatible with Ollama, LMStudio, MLX-LM, llama.cpp, and KTransformers.
Deployment
AiCIPPY-Coder can be served via SGLang or vLLM as an OpenAI-compatible API endpoint, the same interface used by the AiCIPPY production platform.
SGLang
SGLang is a fast serving framework for large language and vision language models.
pip install 'sglang[all]>=v0.5.8'
Launch the server with 256K context using tensor parallelism:
python -m sglang.launch_server \
--model aivedha/aicippy-Coder \
--port 30000 \
--tp-size 2 \
--tool-call-parser aicippy-coder
Note: If the server fails to start, reduce the context length with --context-length 32768.
API endpoint available at: http://localhost:30000/v1
vLLM
vLLM is a high-throughput, memory-efficient inference and serving engine for LLMs.
pip install 'vllm>=0.15.0'
Launch with 256K context:
vllm serve aivedha/aicippy-Coder \
--port 8000 \
--tensor-parallel-size 2 \
--enable-auto-tool-choice \
--tool-call-parser aicippy-coder
Note: If the server fails to start, reduce the context length to 32768 (in vLLM, via --max-model-len 32768).
API endpoint available at: http://localhost:8000/v1
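Once either server is running, the endpoint can be exercised with a plain HTTP request. The sketch below uses only the Python standard library and builds a chat-completions request without sending it, so the payload can be inspected first. The helper names, the `max_tokens` default, and the placeholder `EMPTY` API key are illustrative assumptions. Pass `base_url="http://localhost:30000/v1"` for the SGLang server.

```python
import json
import urllib.request

def build_chat_request(prompt, base_url="http://localhost:8000/v1",
                       model="aivedha/aicippy-Coder", max_tokens=1024):
    """Build (but do not send) a POST request for /chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Authorization": "Bearer EMPTY"},
        method="POST",
    )

def send(request, timeout=600):
    """Send the request and return the assistant message text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]

request = build_chat_request("Write a quick sort algorithm.")
print(request.full_url)
# With a server running, call: print(send(request))
```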
Agentic Coding with AiCIPPY-Coder
AiCIPPY-Coder is purpose-built for tool-calling agentic workflows. Define tools and invoke them directly:
# Tool implementation
def square_the_number(input_num: float) -> float:
    # The parameter name matches "input_num" in the tool definition
    return input_num ** 2
# Tool definition
tools = [
    {
        "type": "function",
        "function": {
            "name": "square_the_number",
            "description": "Returns the square of the given number.",
            "parameters": {
                "type": "object",
                "required": ["input_num"],
                "properties": {
                    "input_num": {
                        "type": "number",
                        "description": "The number to be squared."
                    }
                }
            }
        }
    }
]

from openai import OpenAI

# Point to your AiCIPPY-Coder local endpoint
client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="EMPTY"
)

messages = [{"role": "user", "content": "Square the number 1024"}]
completion = client.chat.completions.create(
    messages=messages,
    model="aivedha/aicippy-Coder",
    max_tokens=65536,
    tools=tools,
)
print(completion.choices[0])
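The example above prints the raw choice. When the model decides to use the tool, that choice carries a tool call rather than a final answer. A minimal sketch of the next step follows: look up the requested function, run it with the JSON-encoded arguments, and build the tool message to append to the conversation before calling the endpoint again. `TOOL_REGISTRY` and `run_tool_call` are hypothetical helper names, and the dictionary shape assumes the OpenAI-style chat format (`id`, `function.name`, `function.arguments`).

```python
import json

def square_the_number(input_num: float) -> float:
    return input_num ** 2

# Hypothetical registry mapping tool names to local Python callables.
TOOL_REGISTRY = {"square_the_number": square_the_number}

def run_tool_call(tool_call: dict) -> dict:
    """Execute one OpenAI-style tool call and return the 'tool' reply message."""
    name = tool_call["function"]["name"]
    try:
        arguments = json.loads(tool_call["function"]["arguments"])
        result = TOOL_REGISTRY[name](**arguments)
        content = json.dumps({"result": result})
    except Exception as error:
        # Report the failure as text so the model can see it and recover.
        content = json.dumps({"error": f"{type(error).__name__}: {error}"})
    return {"role": "tool", "tool_call_id": tool_call["id"], "content": content}

# Stand-in for the dict form of completion.choices[0].message.tool_calls[0]
example_call = {
    "id": "call_0",
    "type": "function",
    "function": {"name": "square_the_number",
                 "arguments": '{"input_num": 1024}'},
}
reply = run_tool_call(example_call)
print(reply["content"])
```

Returning errors as tool output, instead of raising, keeps the loop running so the model gets a chance to correct its arguments on the next turn.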
Best Practices
For optimal generation quality, use the following sampling parameters:
| Parameter | Recommended Value |
|---|---|
| temperature | 1.0 |
| top_p | 0.95 |
| top_k | 40 |
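One possible way to apply these values is sketched below. The OpenAI Python client has no `top_k` argument on `chat.completions.create`, so the sketch passes it through `extra_body`. This assumes the serving backend accepts `top_k` as an extra request field, as vLLM documents for its OpenAI-compatible server. For `model.generate` in transformers, sampling has to be switched on with `do_sample=True`. The helper names are illustrative.

```python
# Recommended sampling values from the table above.
SAMPLING = {"temperature": 1.0, "top_p": 0.95, "top_k": 40}

def openai_sampling_kwargs(sampling=SAMPLING):
    """Keyword arguments for client.chat.completions.create(...)."""
    standard = {k: v for k, v in sampling.items()
                if k in ("temperature", "top_p")}
    # Anything the client does not accept directly goes into extra_body.
    extra = {k: v for k, v in sampling.items() if k not in standard}
    return {**standard, "extra_body": extra}

def transformers_sampling_kwargs(sampling=SAMPLING):
    """Keyword arguments for model.generate(...), with sampling enabled."""
    return {"do_sample": True, **sampling}

print(openai_sampling_kwargs())
print(transformers_sampling_kwargs())
```

The resulting dictionaries can be unpacked into the calls shown earlier, for example `client.chat.completions.create(..., **openai_sampling_kwargs())`.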
About AiCIPPY
AiCIPPY is AiVibe's production-grade agentic coding platform — available as a CLI tool on PyPI and deployable on AWS Bedrock. It combines multi-LLM orchestration, persistent memory via DynamoDB, WebSocket streaming, and enterprise SSO via AWS Cognito.
- Platform: aicippy.com
- CLI: pip install aicippy
- Organisation: AiVibe Software Services Private Limited, Chennai, India
About AiVedha
AiVedha (aivedha.ai) is AiVibe's AI-powered cybersecurity audit and compliance platform — available on AWS Marketplace (prod-kulys2bmix2nm). AiVedha and AiCIPPY together form the core of AiVibe's enterprise AI product portfolio.
License
This model is released under the Apache 2.0 License. See LICENSE for full terms.
The underlying architecture is derived from Qwen3-Coder-Next (Qwen Team, Alibaba Cloud), used in accordance with its Apache 2.0 license terms.
Citation
If you use AiCIPPY-Coder in your research or products, please cite:
@misc{aivibe_aicippy_coder_2026,
  title  = {AiCIPPY-Coder: Agentic Coding Intelligence by AiVedha},
  author = {{AiVibe Software Services Private Limited}},
  year   = {2026},
  url    = {https://huggingface.co/aivedha/aicippy-Coder}
}