About LumiChats
LumiChats is revolutionizing AI access for students, developers, and creators worldwide. Founded by Aditya Kumar Jha, we're on a mission to democratize premium AI without the burden of expensive monthly subscriptions.
🌟 Our Vision
No more choosing between food and AI tools. No more paying for 30 days when you need 10. Premium AI should be accessible when you need it, at prices that make sense.
💎 What Makes Us Different
- ₹69/Day Pricing: Pay only on days you use it
- 39+ AI Models: Claude, GPT-5, Gemini, Qwen, DeepSeek & more
- 1M Tokens Daily: A large daily token allowance for intensive work
- Zero Setup: We handle all infrastructure & GPUs
- Student-First: Built for intense work bursts, not 24/7 usage
🎓 Average Student Saves ₹1,200-2,600 Monthly
| Usage | Scenario | Savings |
|---|---|---|
| 8 Days/Month | Light exam period | 84% savings |
| 12 Days/Month | Average usage | 77% savings |
| 20 Days/Month | Heavy project work | 61% savings |
Model Overview
🎯 Specialized for Function Calling & Tool Use
Lumichat Coder v2.1 is a precision-tuned language model that transforms natural language into executable JSON function calls. Built on the powerful Qwen2.5-Coder-1.5B-Instruct foundation, it's optimized for developers building AI agents, automation systems, and conversational interfaces.
⭐ Key Features
🔍 What is Tool Calling?
Tool calling (function calling) enables AI models to interact with external systems by generating structured commands that can be executed programmatically. Instead of just text responses, the model outputs JSON specifying:
- ✅ Which function to call → Intelligent tool selection
- ✅ What arguments to pass → Proper parameter extraction
- ✅ Expected data format → Type-safe execution
Perfect for: AI agents, workflow automation, conversational UIs, API orchestration, data processing pipelines, and intelligent assistants.
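To make the execution side concrete, here is a minimal sketch of how an application might run a tool call once the model has produced it. The `TOOL_REGISTRY` dictionary and the `dispatch` helper are hypothetical names chosen for illustration, and the output string assumes the `name`/`arguments` list format used in the examples in this card.

```python
import json

def get_vector_sum(a, b):
    """Element-wise sum of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return [x + y for x, y in zip(a, b)]

# Hypothetical registry mapping tool names to Python callables.
TOOL_REGISTRY = {"get_vector_sum": get_vector_sum}

def dispatch(model_output: str) -> list:
    """Parse a JSON list of tool calls and run each one in order."""
    calls = json.loads(model_output)
    results = []
    for call in calls:
        func = TOOL_REGISTRY[call["name"]]  # raises KeyError for unknown tools
        results.append(func(**call["arguments"]))
    return results

# Assumed model output for the vector-sum query used elsewhere in this card.
model_output = (
    '[{"name": "get_vector_sum", '
    '"arguments": {"a": [1, -1, 2], "b": [3, 0, -4]}}]'
)
print(dispatch(model_output))  # [[4, -1, -2]]
```

Keeping the registry explicit means the model can only ever trigger functions the application has chosen to expose.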
Core Capabilities
1️⃣ Function/Tool Calling
The model's primary strength: identifying appropriate tools and formatting arguments into executable JSON.
📝 User Query:
🤖 Model Output (JSON):
🔧 Programmatic Execution:
result = get_vector_sum([1, -1, 2], [3, 0, -4])
# Result: [4, -1, -2]
2️⃣ Multi-Tool Orchestration
Handles complex queries that require multiple function calls, executed either in sequence or in parallel.
Example: Chained Operations
# User: "Calculate the mean of [10, 20, 30] then find its square root"
# Model Output:
[
  {
    "name": "calculate_mean",
    "arguments": {"values": [10, 20, 30]}
  },
  {
    "name": "calculate_sqrt",
    "arguments": {"value": "{{RESULT_0}}"}
  }
]
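The `{{RESULT_0}}` placeholder has to be resolved by the calling application before the second function runs. The sketch below shows one way this could work, assuming placeholders always have the form `{{RESULT_n}}` and appear only as top-level argument values. The `TOOLS` table, `resolve`, and `run_chain` are hypothetical helpers, not part of the model or any library.

```python
import math
import re

# Hypothetical implementations of the two tools named in the example.
TOOLS = {
    "calculate_mean": lambda values: sum(values) / len(values),
    "calculate_sqrt": lambda value: math.sqrt(value),
}

# Matches a whole-string placeholder such as "{{RESULT_0}}".
PLACEHOLDER = re.compile(r"^\{\{RESULT_(\d+)\}\}$")

def resolve(value, results):
    """Swap a placeholder string for the result of an earlier call."""
    if isinstance(value, str):
        match = PLACEHOLDER.match(value)
        if match:
            return results[int(match.group(1))]
    return value

def run_chain(calls):
    """Execute calls in order, feeding earlier results into later arguments."""
    results = []
    for call in calls:
        args = {k: resolve(v, results) for k, v in call["arguments"].items()}
        results.append(TOOLS[call["name"]](**args))
    return results

calls = [
    {"name": "calculate_mean", "arguments": {"values": [10, 20, 30]}},
    {"name": "calculate_sqrt", "arguments": {"value": "{{RESULT_0}}"}},
]
results = run_chain(calls)
print(results)  # results[0] is 20.0; results[1] is sqrt(20), about 4.472
```

Placeholders nested inside lists or objects would need a recursive version of `resolve`.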
3️⃣ API Integration
Generate properly formatted API calls from natural language queries.
Example: Weather API
# User: "Get current weather for New York City in Celsius"
# Model Output:
[
  {
    "name": "get_weather",
    "arguments": {
      "location": "New York City",
      "unit": "celsius"
    }
  }
]
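As an illustration of the last step, the sketch below turns a `get_weather` tool call into a request URL. The endpoint `https://api.example.com/v1/weather`, the set of allowed units, and the `build_weather_url` helper are all assumptions made for this example; a real weather API will have its own parameter names.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and unit whitelist, for illustration only.
BASE_URL = "https://api.example.com/v1/weather"
ALLOWED_UNITS = {"celsius", "fahrenheit"}

def build_weather_url(tool_call: dict) -> str:
    """Build a GET URL from a get_weather tool call."""
    if tool_call["name"] != "get_weather":
        raise ValueError(f"unexpected tool: {tool_call['name']}")
    args = tool_call["arguments"]
    unit = args.get("unit", "celsius")
    if unit not in ALLOWED_UNITS:
        raise ValueError(f"unsupported unit: {unit}")
    # urlencode escapes spaces and special characters in the location.
    query = urlencode({"location": args["location"], "unit": unit})
    return f"{BASE_URL}?{query}"

url = build_weather_url(
    {
        "name": "get_weather",
        "arguments": {"location": "New York City", "unit": "celsius"},
    }
)
print(url)
# https://api.example.com/v1/weather?location=New+York+City&unit=celsius
```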
4️⃣ Database Operations
Translate complex queries into structured database operations.
Example: User Filtering
# User: "Find all users who registered after Jan 1, 2024 and are from India"
# Model Output:
[
  {
    "name": "query_database",
    "arguments": {
      "table": "users",
      "filters": {
        "registration_date": {
          "operator": "greater_than",
          "value": "2024-01-01"
        },
        "country": {
          "operator": "equals",
          "value": "India"
        }
      }
    }
  }
]
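A structured filter like this still has to be translated into an actual query. The following sketch shows one possible translation into parameterized SQL. The `OPERATORS` mapping, the `ALLOWED_COLUMNS` whitelist, and `build_query` are hypothetical, and the `less_than` operator is an assumed extension not shown in the example above.

```python
# Hypothetical operator mapping and schema whitelist for illustration.
OPERATORS = {"equals": "=", "greater_than": ">", "less_than": "<"}
ALLOWED_COLUMNS = {"users": {"registration_date", "country"}}

def build_query(arguments: dict):
    """Translate query_database arguments into SQL text plus parameters."""
    table = arguments["table"]
    if table not in ALLOWED_COLUMNS:
        raise ValueError(f"unknown table: {table}")
    clauses, params = [], []
    for column, condition in arguments["filters"].items():
        if column not in ALLOWED_COLUMNS[table]:
            raise ValueError(f"unknown column: {column}")
        sql_operator = OPERATORS[condition["operator"]]
        # Values travel as bound parameters, never as SQL text.
        clauses.append(f"{column} {sql_operator} ?")
        params.append(condition["value"])
    sql = f"SELECT * FROM {table}"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params

sql, params = build_query(
    {
        "table": "users",
        "filters": {
            "registration_date": {"operator": "greater_than", "value": "2024-01-01"},
            "country": {"operator": "equals", "value": "India"},
        },
    }
)
print(sql)     # SELECT * FROM users WHERE registration_date > ? AND country = ?
print(params)  # ['2024-01-01', 'India']
```

Whitelisting table and column names matters here because identifiers cannot be passed as bound parameters.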
5️⃣ File Operations
📝 Query:
🤖 Output:
6️⃣ Complex Multi-Step Workflow
📝 Query:
🤖 Output:
Performance
🎯 Accuracy Metrics
| Score | Description |
|---|---|
| 99.8% | With grammar constraints |
| 96.5% | Correct function chosen |
| 94.2% | Properly formatted args |
| 92.1% | Context maintained |
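This card does not describe how these figures were measured. As a rough sketch of how similar metrics could be computed on your own test set, the hypothetical `evaluate` function below compares raw model outputs against expected calls. The metric names and the rule that only the first call is scored are assumptions.

```python
import json

def evaluate(predictions, references):
    """Score raw model outputs against expected tool calls.

    predictions: list of raw model output strings
    references: list of dicts with "name" and "arguments"
    """
    valid = correct_tool = correct_args = 0
    for raw, expected in zip(predictions, references):
        try:
            calls = json.loads(raw)
        except json.JSONDecodeError:
            continue  # unparseable output counts against every metric
        if not (isinstance(calls, list) and calls and isinstance(calls[0], dict)):
            continue  # parseable, but not a list of call objects
        valid += 1
        if calls[0].get("name") == expected["name"]:
            correct_tool += 1
            if calls[0].get("arguments") == expected["arguments"]:
                correct_args += 1
    total = len(references)
    return {
        "json_validity": valid / total,
        "tool_selection": correct_tool / total,
        "argument_accuracy": correct_args / total,
    }

expected_call = {"name": "get_weather", "arguments": {"location": "Berlin"}}
predictions = [
    '[{"name": "get_weather", "arguments": {"location": "Berlin"}}]',  # all correct
    '[{"name": "get_weather", "arguments": {"location": "Rome"}}]',    # wrong args
    '[{"name": "get_time", "arguments": {}}]',                         # wrong tool
    '{"name": "get_weather"',                                          # invalid JSON
]
scores = evaluate(predictions, [expected_call] * 4)
print(scores)
# {'json_validity': 0.75, 'tool_selection': 0.5, 'argument_accuracy': 0.25}
```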
⚡ Inference Speed
| Hardware | Tokens/Second | Average Latency |
|---|---|---|
| | ~145 tok/s | |
| | ~95 tok/s | |
| | ~42 tok/s | |
📊 Comparison to Base Model
| Metric | Base Qwen2.5-Coder | Lumichat Coder v2.1 |
|---|---|---|
| Correct function chosen | 78% | 96.5% 🎯 |
| JSON validity (with grammar constraints) | 85% | 99.8% ✨ |
| Context maintained | 65% | 92.1% 🚀 |
💾 Memory Requirements
- Full precision
- Quantized 8-bit
- Quantized 4-bit
Deploy on consumer hardware: the 4-bit quantized version runs on GPUs with just 1GB VRAM.
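For a back-of-the-envelope check, weight memory can be estimated from the 1.54B parameter count listed in the Technical Summary. The sketch below covers model weights only, under the assumption of 4, 2, 1, and 0.5 bytes per parameter. Actual usage is higher once activations, the KV cache, and quantization metadata are included.

```python
PARAMS = 1.54e9  # parameter count from the Technical Summary table

# Assumed storage cost per parameter for each precision.
BYTES_PER_PARAM = {
    "fp32": 4.0,
    "bf16/fp16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}

def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9

for precision, size in BYTES_PER_PARAM.items():
    print(f"{precision}: ~{weight_memory_gb(PARAMS, size):.2f} GB")
# fp32: ~6.16 GB
# bf16/fp16: ~3.08 GB
# int8: ~1.54 GB
# int4: ~0.77 GB
```

By this estimate the 4-bit weights alone take roughly 0.77 GB, so how much headroom remains within 1GB depends on context length and runtime overhead.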
Training Details
🏗️ Base Model
Built on unsloth/Qwen2.5-Coder-1.5B-Instruct, which is based on:
📚 Training Data
🎯 Specialization
🔧 Fine-Tuning Methodology
- Diverse tool schemas, real-world examples
- 2x faster training, 60% less memory
- Parameter-efficient, fast adaptation
- Rigorous testing, quality assurance
🚀 Infrastructure
- ⚡ Hardware: Enterprise-grade GPUs
- 💰 Cost Efficiency: Unsloth optimization
Limitations
🚫 Known Limitations
🔧 Technical Constraints
🎯 Domain Specificity
❌ Not Recommended For
- Conversational tasks without tool calling: use the base Qwen2.5-Coder-Instruct model instead
- Long-form content creation: not the model's strength; use creative-focused models
- Tasks requiring more than 32K tokens: consider models with larger context windows
- Real-time streaming: the model is optimized for batch processing instead
🔒 Safety Considerations
⚠️ CRITICAL: Always validate model outputs before execution.
- Implement proper sandboxing for code execution to prevent unauthorized access
- Add rate limiting for API calls to prevent abuse
- Validate user permissions before executing sensitive operations
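One way to approach output validation is to check every generated call against the tool schemas before anything runs. The sketch below is a minimal, hypothetical `validate_call` helper that covers tool names, required arguments, and top-level JSON types only. A production system would more likely use a full JSON Schema validator and add permission checks.

```python
# Maps JSON Schema type names to the Python types json.loads produces.
JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
    "array": list,
    "object": dict,
}

def validate_call(call: dict, tools: list) -> list:
    """Return a list of problems; an empty list means the call passed."""
    schemas = {tool["name"]: tool["parameters"] for tool in tools}
    name = call.get("name")
    if name not in schemas:
        return [f"unknown tool: {name!r}"]
    args = call.get("arguments")
    if not isinstance(args, dict):
        return ["arguments must be an object"]
    schema = schemas[name]
    errors = []
    for key in schema.get("required", []):
        if key not in args:
            errors.append(f"missing required argument: {key}")
    for key, value in args.items():
        spec = schema.get("properties", {}).get(key)
        if spec is None:
            errors.append(f"unexpected argument: {key}")
            continue
        type_name = spec.get("type")
        expected = JSON_TYPES.get(type_name)
        # bool is a subclass of int in Python, so reject it for numeric types.
        bool_mismatch = isinstance(value, bool) and type_name != "boolean"
        if expected is not None and (bool_mismatch or not isinstance(value, expected)):
            errors.append(f"argument {key!r} should be of type {type_name}")
    return errors

tools = [
    {
        "name": "get_vector_sum",
        "description": "Calculate the sum of two vectors",
        "parameters": {
            "type": "object",
            "properties": {
                "a": {"type": "array", "items": {"type": "number"}},
                "b": {"type": "array", "items": {"type": "number"}},
            },
            "required": ["a", "b"],
        },
    }
]

good = {"name": "get_vector_sum", "arguments": {"a": [1, -1, 2], "b": [3, 0, -4]}}
bad = {"name": "get_vector_sum", "arguments": {"a": "1, -1, 2"}}
unknown = {"name": "delete_all_files", "arguments": {}}

print(validate_call(good, tools))     # []
print(validate_call(bad, tools))      # two problems: missing b, wrong type for a
print(validate_call(unknown, tools))  # ["unknown tool: 'delete_all_files'"]
```

Only calls that come back with an empty error list should be passed on to execution.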
Ethical Considerations
✅ Intended Use Cases
💚 Appropriate Uses
🚫 Inappropriate Uses
⚖️ Bias & Fairness
This model inherits biases from:
- Qwen2.5-Coder training data (predominantly code repositories), which may reflect coding community biases
- Function-calling dataset (curated by LumiChats), where efforts were made to ensure diversity
🔍 We Recommend:
- Test on diverse inputs before deployment
- Implement human-in-the-loop review for critical decisions
- Audit regularly for unexpected behaviors
Citation
📚 Academic & Research Use
If you use Lumichat Coder v2.1 in your research or applications, please cite:
This Model:
@misc{lumichat-coder-v2.1,
author = {Jha, Aditya Kumar and LumiChats},
title = {Lumichat Coder v2.1: Advanced Function-Calling Language Model},
year = {2025},
publisher = {HuggingFace},
howpublished = {\url{https://huggingface.co/lumichats/lumichat-coder-v2.1}},
}
Base Model (Qwen2.5-Coder):
@article{hui2024qwen2,
title={Qwen2.5-Coder Technical Report},
author={Hui, Binyuan and Yang, Jian and Cui, Zeyu and Yang, Jiaxi and Liu, Dayiheng and Zhang, Lei and Liu, Tianyu and Zhang, Jiajun and Yu, Bowen and Dang, Kai and others},
journal={arXiv preprint arXiv:2409.12186},
year={2024}
}
@article{qwen2,
title={Qwen2 Technical Report},
author={An Yang and Baosong Yang and Binyuan Hui and Bo Zheng and Bowen Yu and Chang Zhou and Chengpeng Li and Chengyuan Li and Dayiheng Liu and Fei Huang and Guanting Dong and Haoran Wei and Huan Lin and Jialong Tang and Jialin Wang and Jian Yang and Jianhong Tu and Jianwei Zhang and Jianxin Ma and Jin Xu and Jingren Zhou and Jinze Bai and Jinzheng He and Junyang Lin and Kai Dang and Keming Lu and Keqin Chen and Kexin Yang and Mei Li and Mingfeng Xue and Na Ni and Pei Zhang and Peng Wang and Ru Peng and Rui Men and Ruize Gao and Runji Lin and Shijie Wang and Shuai Bai and Sinan Tan and Tianhang Zhu and Tianhao Li and Tianyu Liu and Wenbin Ge and Xiaodong Deng and Xiaohuan Zhou and Xingzhang Ren and Xinyu Zhang and Xipin Wei and Xuancheng Ren and Yang Fan and Yang Yao and Yichang Zhang and Yu Wan and Yunfei Chu and Yuqiong Liu and Zeyu Cui and Zhenru Zhang and Zhihao Fan},
journal={arXiv preprint arXiv:2407.10671},
year={2024}
}
License
⚖️ Apache 2.0 License
This model is released under the Apache 2.0 License, inherited from the Qwen2.5-Coder base model.
Copyright 2025 LumiChats (Aditya Kumar Jha)
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
📜 What This Means for You
✅ You CAN:
📋 You MUST:
Acknowledgments
💝 Special Thanks
🌟 Core Technologies
- For the exceptional Qwen2.5-Coder base model and groundbreaking research in code-specialized LLMs.
- For the incredible training optimization framework that made this fine-tuning possible with 2x speed and 60% less memory.
- For hosting infrastructure, the transformers library, and fostering the open-source AI community.
🎯 Community Support
- Early testers, feedback providers, and the amazing LumiChats community who helped shape this model.
- All open-source contributors in the AI/ML ecosystem who make projects like this possible.
- For inspiring us to make AI accessible and affordable for everyone.
🛠️ Built With
Contact & Support
🌐 LumiChats Resources
- Use the model card discussion tab for bug reports & feedback
- Share your ideas for improvements and new capabilities
- Check this README for comprehensive usage guides
👨‍💻 About the Founder
Aditya Kumar Jha
Founder of LumiChats • Passionate about democratizing AI access
Mission: Make premium AI accessible to students, developers, and creators worldwide—without subscription fatigue or wasted money. Pay only when your brain needs a boost. 🧠
Model Card Information
📋 Technical Summary
| Attribute | Value |
|---|---|
| Developed by | Aditya Kumar Jha / LumiChats |
| Model type | Causal Language Model (Function Calling Specialist) |
| Language(s) | English (primary) |
| License | Apache 2.0 |
| Fine-tuned from | unsloth/Qwen2.5-Coder-1.5B-Instruct |
| Model size | 1.54B parameters (1.31B non-embedding) |
| Context length | 32,768 tokens |
| Architecture | Transformer (GQA, RoPE, SwiGLU, RMSNorm) |
| Training framework | Unsloth (2x faster, 60% less VRAM) |
| Specialization | Function calling, tool use, JSON generation |
Quick Links
🔗 Essential Resources
- Premium AI at ₹39/day
- Download & documentation
- Original foundation
- Optimization toolkit
Why Choose Lumichat Coder?
🎯 The Function-Calling Specialist
- Built for Production: Not a general-purpose model trying to do everything. Specifically engineered for tool calling with 96.5% accuracy and 99.8% JSON validity. Deploy with confidence in customer-facing applications.
- Fast & Efficient: 2x faster inference than standard fine-tuning. 60% less memory consumption means you can deploy on consumer GPUs. No enterprise hardware budgets required.
- Student & Developer Friendly: From LumiChats, the platform that saves students ₹1,200-2,600 monthly on AI costs. Open source and Apache 2.0 licensed: free to use, modify, and commercialize.
- Grammar-Constrained Generation: Uses transformers-CFG for guaranteed output. No more parsing errors or malformed JSON. 99.8% validity rate in production. Reliable automation you can trust.
Technical Specifications
📊 Model Architecture
🏗️ Foundation
📈 Scale
🎯 Capacity
⚙️ Components
💾 Supported Formats
- Fast, safe model loading
- Native framework support
- Flexible deployment
🔌 Inference Engines
Usage
📦 Installation
pip install torch transformers accelerate unsloth transformers-cfg
🚀 Basic Inference
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
import json

# Load model and tokenizer
model_name = "lumichats/lumichat-coder-v2.1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Define your available tools
tools = [
    {
        "name": "get_vector_sum",
        "description": "Calculate the sum of two vectors",
        "parameters": {
            "type": "object",
            "properties": {
                "a": {"type": "array", "items": {"type": "number"}},
                "b": {"type": "array", "items": {"type": "number"}}
            },
            "required": ["a", "b"]
        }
    }
]

# Create prompt
user_query = "Find the sum of a = [1, -1, 2] and b = [3, 0, -4]"
prompt = f"""Available tools:
{json.dumps(tools, indent=2)}
User query: {user_query}
Generate the appropriate tool call in JSON format. Only output valid JSON.
"""

# Generate (a low temperature keeps sampling close to greedy decoding)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.1,
    do_sample=True
)

# Decode only the newly generated tokens so the prompt is not echoed back
prompt_length = inputs["input_ids"].shape[1]
json_str = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True).strip()

# Parse the tool call; json.loads raises json.JSONDecodeError on malformed output
tool_call = json.loads(json_str)
print(json.dumps(tool_call, indent=2))
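Without grammar constraints, the generated text may wrap the JSON in extra words, which makes a bare `json.loads` call fail. The sketch below shows one defensive approach using the standard library's `json.JSONDecoder.raw_decode`. The `extract_first_json` helper and the sample `generated` string are hypothetical.

```python
import json

def extract_first_json(text: str):
    """Return the first JSON array or object embedded in text, or None."""
    decoder = json.JSONDecoder()
    for start, char in enumerate(text):
        if char in "[{":
            try:
                # raw_decode parses one JSON value starting at `start`
                # and ignores whatever text follows it.
                value, _end = decoder.raw_decode(text, start)
                return value
            except json.JSONDecodeError:
                continue  # not valid JSON from here; try the next bracket
    return None

# Hypothetical raw output with chatter around the tool call.
generated = (
    "Sure! Here is the call:\n"
    '[{"name": "get_vector_sum", "arguments": {"a": [1, -1, 2], "b": [3, 0, -4]}}]\n'
    "Let me know if you need anything else."
)
tool_call = extract_first_json(generated)
print(tool_call[0]["name"])  # get_vector_sum
```

Note that the helper returns the first parseable bracketed value, so bracketed text that precedes the real tool call would be picked up instead.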
🎯 Grammar-Constrained Decoding
from transformers_cfg.grammar_utils import IncrementalGrammarConstraint
from transformers_cfg.generation.logits_process import GrammarConstrainedLogitsProcessor

# Define the JSON grammar. A raw string keeps the \" and \t escapes intact
# so the grammar parser, not Python, interprets them.
json_grammar = r"""
root   ::= calls
calls  ::= "[" ws object (ws "," ws object)* ws "]"
object ::= "{" ws "\"name\"" ws ":" ws string ws "," ws "\"arguments\"" ws ":" ws dict ws "}"
dict   ::= "{" ws (string ws ":" ws value (ws "," ws string ws ":" ws value)*)? ws "}"
array  ::= "[" ws (value (ws "," ws value)*)? ws "]"
value  ::= string | number | array | dict | "true" | "false" | "null"
string ::= "\"" [^"]* "\""
number ::= "-"? [0-9]+ ("." [0-9]+)?
ws     ::= [ \t\n\r]*
"""

# Create grammar constraint
grammar = IncrementalGrammarConstraint(json_grammar, "root", tokenizer)
grammar_processor = GrammarConstrainedLogitsProcessor(grammar)

# Generate with constraint (reuses `inputs` from Basic Inference)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    logits_processor=[grammar_processor],
    do_sample=True,  # temperature only takes effect when sampling is enabled
    temperature=0.1
)
🌐 FastAPI Integration
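from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import json

# Assumes `tokenizer` and `model` are already loaded as in Basic Inference
app = FastAPI()

class ToolCallRequest(BaseModel):
    query: str
    tools: list

# A plain `def` endpoint runs in FastAPI's thread pool, so the blocking
# generate() call does not stall the event loop.
@app.post("/tool-call")
def generate_tool_call(request: ToolCallRequest):
    prompt = f"""Available tools:
{json.dumps(request.tools, indent=2)}
User query: {request.query}
Generate the appropriate tool call in JSON format.
"""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.1, do_sample=True)
    # Decode only the newly generated tokens, not the echoed prompt
    prompt_length = inputs["input_ids"].shape[1]
    json_str = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True).strip()
    try:
        tool_call = json.loads(json_str)
    except json.JSONDecodeError:
        # Report malformed model output as an error response instead of crashing
        raise HTTPException(status_code=502, detail="Model returned invalid JSON")
    return {"tool_call": tool_call}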
⚙️ Advanced: Streaming Responses
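from transformers import TextIteratorStreamer
from threading import Thread

# skip_prompt=True streams only newly generated text, not the input prompt
streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

# Unpack the tokenized inputs (input_ids, attention_mask) into the generate() kwargs
generation_kwargs = dict(
    **inputs,
    streamer=streamer,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.1
)

# Run generation in a background thread so this thread can consume the stream
thread = Thread(target=model.generate, kwargs=generation_kwargs)
thread.start()
for new_text in streamer:
    print(new_text, end="", flush=True)
thread.join()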
Examples
1️⃣ Mathematical Operations
📝 Query:
🤖 Output:
2️⃣ Data Processing
📝 Query:
🤖 Output:
3️⃣ Email Automation
📝 Query:
🤖 Output:
4️⃣ Database Queries
📝 Query:
🤖 Output:
Ready to Build?
🚀 Start Building AI Agents Today
Precision tool calling • 99.8% JSON validity • 32K context • Apache 2.0 licensed
- Efficient deployment
- Extended conversations
- Commercial use allowed
💡 Perfect For
- 🤖 AI Agents: Autonomous systems that interact with tools
- ⚙️ Automation: Workflow orchestration & data processing
- 🔌 API Integration: Natural language to API calls
- 💬 Chat Interfaces: Conversational UIs with actions
🌟 Join the Community
Model tree for adityakum667388/lumichat_coder-v2.1
Base model
Qwen/Qwen2.5-1.5B