SANCHARI v0.1 (Investor Preview)
Sanchari is an upcoming instruction-following AI foundation model designed for Indian users, multilingual applications, and next-generation AI assistants.
This repository is an investor preview.
No model weights are uploaded yet.
Training begins once project funding is approved.
Vision
To build India's most practical multilingual AI model, optimized for:
- Smart assistants
- Real-time Q&A
- Summarization
- Content generation
- Business automation
Current Status (v0.1)
- Repository created
- Model card published
- Demo placeholder will be added
- Data licensing & compute setup pending
- Training begins after funding
Planned Model Family
Sanchari-S (200–350M)
- First lightweight prototype
- Fast inference
- Suitable for apps & APIs
Sanchari-M (1–3B)
- Stronger reasoning
- Better instruction-following
Sanchari-L (7B+)
- Full foundation model
- Enterprise-grade multilingual intelligence
Roadmap Overview
Phase 1 (0–3 months)
- Dataset acquisition
- Tokenizer creation
- Train Sanchari-S
- Publish evaluation & demo
Phase 2 (3–9 months)
- Train Sanchari-M
- Safety testing
- API + product demo
Phase 3 (9–18 months)
- Train Sanchari-L
- Optimization
- Market launch
Market Opportunity
India has 1.4 billion people speaking dozens of languages, yet most AI models are optimized for Western datasets. Sanchari focuses on:
- Indian English, Telugu, Hindi
- Local accents
- Local knowledge
- Culturally aligned reasoning
- Vernacular business workflows
Target Markets:
- Enterprises adopting AI
- Customer support automation
- Healthcare conversational assistants
- FinTech support & KYC automation
- Education & e-learning
- Government services (Digital India)
Projected TAM (India AI Assistants): $3.5B+ by 2027
Competitive Advantage
Sanchari is designed specifically for Indian users, unlike global models trained mostly on Western data.
Key differentiators:
- Native support for Telugu + Hindi + Indian English
- Dataset curated for Indian knowledge, culture, and business workflows
- Lightweight model versions for on-device and low-compute deployment
- Faster inference
- Lower cost for Indian startups
- Can be embedded into apps & enterprise workflows
- Privacy-friendly deployment options
Technical Architecture (High-Level)
Tokenizer
- Multilingual tokenizer optimized for Indic languages
- Handles mixed-script text (English + Indic)
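For intuition, the subword approach such a tokenizer typically relies on can be sketched as a toy byte-pair-encoding merge loop. This is an illustrative, stdlib-only sketch, not the actual Sanchari tokenizer; the corpus and merge count are invented for the example.

```python
# Toy BPE merge loop (stdlib only) showing how a subword tokenizer can
# treat mixed-script text (English + Telugu) uniformly: both scripts are
# just sequences of characters whose frequent pairs get merged.
from collections import Counter

def train_bpe(corpus: list[str], num_merges: int) -> list[tuple[str, str]]:
    """Learn up to `num_merges` pair merges from a whitespace-split corpus."""
    # Represent each word as a tuple of single characters, with counts.
    words = Counter(tuple(word) for text in corpus for word in text.split())
    merges = []
    for _ in range(num_merges):
        # Count all adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for word, freq in words.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Rewrite every word with the chosen pair fused into one symbol.
        rewritten = Counter()
        for word, freq in words.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            rewritten[tuple(out)] += freq
        words = rewritten
    return merges

corpus = ["hello telugu", "hello world", "నమస్తే telugu"]
merges = train_bpe(corpus, 5)
print(merges)
```

Because merges are learned from character statistics rather than hand-written rules, the same loop handles Latin and Telugu text side by side; a production tokenizer adds byte fallback, normalization, and a much larger merge table.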
Model Family
- Sanchari-S (200–350M) – prototype
- Sanchari-M (1–3B) – mid-range
- Sanchari-L (7B+) – flagship foundation model
Training Stack
- PyTorch + DeepSpeed
- FlashAttention
- LoRA adapters for efficient instruction tuning
- Multi-GPU distributed training
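As a note on why LoRA adapters make instruction tuning efficient: LoRA freezes a weight matrix W (d_out × d_in) and trains only two low-rank factors B (d_out × r) and A (r × d_in), so trainable parameters per matrix drop from d_out·d_in to r·(d_out + d_in). The hidden size and rank below are illustrative assumptions, not Sanchari's actual configuration.

```python
# Back-of-the-envelope LoRA parameter count (illustrative numbers only).
def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    # One low-rank pair: B is (d_out x rank), A is (rank x d_in).
    return rank * (d_out + d_in)

d = 2048                 # hidden size of a hypothetical small model
full = d * d             # full fine-tuning of one square projection matrix
lora = lora_trainable_params(d, d, rank=8)
print(f"full: {full:,}  lora: {lora:,}  ratio: {full // lora}x")
```

At rank 8 and hidden size 2048, each adapted matrix trains 32,768 parameters instead of ~4.2 million, a 128× reduction, which is what makes instruction tuning feasible on modest GPU budgets.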
Funding Plan (Seed: ₹25,00,000)
Where the funds go:
Category                             Cost
Multilingual licensed datasets       ₹6,00,000
Compute for training S, M models     ₹12,00,000
Storage, inference, and deployment   ₹3,00,000
Evaluation, safety testing           ₹1,00,000
Team & operations                    ₹3,00,000
Total                                ₹25,00,000
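The allocation above can be sanity-checked in a few lines; the dictionary simply transcribes the table's amounts in rupees.

```python
# Verify that the budget categories sum to the ₹25,00,000 seed round.
budget = {
    "Multilingual licensed datasets": 6_00_000,
    "Compute for training S, M models": 12_00_000,
    "Storage, inference, and deployment": 3_00_000,
    "Evaluation, safety testing": 1_00_000,
    "Team & operations": 3_00_000,
}
total = sum(budget.values())
assert total == 25_00_000
print(f"Total: ₹{total:,}")
```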
Deliverables to Investors:
- Checkpoints for Sanchari-S and M
- Evaluation results
- Demo API
- Weekly updates
Founder
Srikanth B. is an AI and product innovator focused on practical, multilingual AI solutions for India, with experience across product development, engineering leadership, and AI adoption for scalable business use cases.
Contact
Founder: Srikanth
Email: boorgalasrikanth@gmail.com