SANCHARI — v0.1 (Investor Preview)

Sanchari is an upcoming instruction-following AI foundation model designed for Indian users, multilingual applications, and next-generation AI assistants.

This repository is an investor preview.
No model weights are uploaded yet.
Training begins once project funding is approved.


🚀 Vision

To build India's most practical multilingual AI model, optimized for:

  • Smart assistants
  • Real-time Q&A
  • Summarization
  • Content generation
  • Business automation

📌 Current Status (v0.1)

  • Repository created
  • Model card published
  • Demo placeholder will be added
  • Data licensing & compute setup pending
  • Training begins after funding

🧠 Planned Model Family

Sanchari-S (200–350M)

  • First lightweight prototype
  • Fast inference
  • Suitable for apps & APIs

Sanchari-M (1–3B)

  • Stronger reasoning
  • Better instruction-following

Sanchari-L (7B+)

  • Full foundation model
  • Enterprise-grade multilingual intelligence

πŸ› οΈ Roadmap Overview

Phase 1 (0–3 months)

  • Dataset acquisition
  • Tokenizer creation
  • Train Sanchari-S
  • Publish evaluation & demo

Phase 2 (3–9 months)

  • Train Sanchari-M
  • Safety testing
  • API + product demo

Phase 3 (9–18 months)

  • Train Sanchari-L
  • Optimization
  • Market launch

📈 Market Opportunity

India has 1.4 billion people speaking dozens of languages, yet most AI models are optimized for Western datasets. Sanchari focuses on:

  • Indian English, Telugu, and Hindi
  • Local accents
  • Local knowledge
  • Culturally aligned reasoning
  • Vernacular business workflows

Target Markets:

  • Enterprises adopting AI
  • Customer support automation
  • Healthcare conversational assistants
  • FinTech support & KYC automation
  • Education & e-learning
  • Government services (Digital India)

Projected TAM (India AI Assistants): $3.5B+ by 2027


⚑ Competitive Advantage

Sanchari is designed specifically for Indian users, unlike global models trained mostly on Western data.

Key differentiators:

  • Native support for Telugu, Hindi, and Indian English
  • Dataset curated for Indian knowledge, culture, and business workflows
  • Lightweight model versions for on-device and low-compute deployment
  • Faster inference
  • Lower cost for Indian startups
  • Embeddable in apps & enterprise workflows
  • Privacy-friendly deployment options


🔧 Technical Architecture (High-Level)

Tokenizer

  • Multilingual tokenizer optimized for Indic languages
  • Handles mixed-script text (English + Indic)
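Mixed-script handling can be illustrated with a small sketch. The Unicode block ranges below are standard, but the helper functions themselves are purely illustrative and not part of Sanchari's actual tokenizer:

```python
# Illustrative sketch: segmenting code-switched text into script runs
# before subword tokenization. The helpers are hypothetical, not
# Sanchari's real tokenizer; only the Unicode block ranges are standard.

def script_of(ch: str) -> str:
    """Classify a single character by Unicode block."""
    cp = ord(ch)
    if 0x0900 <= cp <= 0x097F:
        return "Devanagari"   # Hindi
    if 0x0C00 <= cp <= 0x0C7F:
        return "Telugu"
    if ch.isascii() and ch.isalpha():
        return "Latin"        # English
    return "Other"            # digits, punctuation, whitespace, etc.

def script_runs(text: str):
    """Group consecutive characters that share the same script."""
    runs = []
    for ch in text:
        s = script_of(ch)
        if runs and runs[-1][0] == s:
            runs[-1] = (s, runs[-1][1] + ch)
        else:
            runs.append((s, ch))
    return runs

if __name__ == "__main__":
    for script, chunk in script_runs("నమస్కారం means hello"):
        print(script, repr(chunk))
```

A production tokenizer would instead learn subword units directly from mixed-script corpora, but segmenting by script like this is a common preprocessing baseline.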

Model Family

  • Sanchari-S (200–350M) — prototype
  • Sanchari-M (1–3B) — mid-range
  • Sanchari-L (7B+) — flagship foundation model

Training Stack

  • PyTorch + DeepSpeed
  • FlashAttention
  • LoRA adapters for efficient instruction tuning
  • Multi-GPU distributed training
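The LoRA choice in the stack above can be motivated with quick arithmetic: instead of updating a full d×d weight matrix, LoRA trains two low-rank factors of shapes d×r and r×d. The dimensions below are illustrative, not Sanchari's actual configuration:

```python
# Back-of-the-envelope: trainable parameters for full fine-tuning vs. LoRA
# on a single d x d projection matrix. Dimensions are hypothetical.

d = 4096   # hidden size (illustrative)
r = 8      # LoRA rank (illustrative)

full_params = d * d          # update the entire matrix
lora_params = d * r + r * d  # low-rank factors B (d x r) and A (r x d)

print(f"full fine-tune: {full_params:,} params")            # 16,777,216
print(f"LoRA (r={r}):   {lora_params:,} params")            # 65,536
print(f"reduction:      {full_params // lora_params}x")     # 256x
```

This roughly 256x reduction in trainable parameters per matrix is what makes instruction tuning feasible on the modest compute budget described below.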


💰 Funding Plan (Seed: ₹25,00,000)

Where the funds go:

  Category                              Cost
  Multilingual licensed datasets        ₹6,00,000
  Compute for training S and M models   ₹12,00,000
  Storage, inference, and deployment    ₹3,00,000
  Evaluation & safety testing           ₹1,00,000
  Team & operations                     ₹3,00,000
  Total                                 ₹25,00,000
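As a quick sanity check, the budget line items do sum to the seed amount (amounts in INR, written with Indian-style digit grouping):

```python
# Sanity check (arithmetic only): budget line items vs. the seed amount.
budget_inr = {
    "Multilingual licensed datasets":      6_00_000,
    "Compute for training S and M models": 12_00_000,
    "Storage, inference, and deployment":  3_00_000,
    "Evaluation & safety testing":         1_00_000,
    "Team & operations":                   3_00_000,
}

total = sum(budget_inr.values())
print(f"Total: Rs {total:,}")  # Total: Rs 2,500,000 (i.e. Rs 25,00,000)
```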

Deliverables to Investors:

  • Checkpoints for Sanchari-S and Sanchari-M
  • Evaluation results
  • Demo API
  • Weekly updates


👤 Founder

Srikanth B. is an AI and product innovator focused on practical, multilingual AI solutions for India, with experience across product development, engineering leadership, and AI adoption for scalable business use cases.

Email: boorgalasrikanth@gmail.com


📩 Contact

Founder: Srikanth
Email: boorgalasrikanth@gmail.com
