# Models for Echo Application
This repository contains LiteRT-compatible language model variants used by the AI engine of the Echo application.
All models are optimized and validated specifically for the LiteRT-based framework on which the application's AI engine is built.
The variants listed below are standard, stable, and fully validated for chat functionality.
Repository link:
https://huggingface.co/ANISH-j/models-for-echo-application/tree/main
## Supported Model Variants
### 1. Gemma3-1B-IT_multi-prefill-seq_q4_ekv4096.litertlm
- Model family: Gemma 3
- Size: 1B parameters
- Quantization: Q4
- Format: LiteRT model (`.litertlm`)
- KV cache: extended KV (4096 tokens)
- Features:
- Multi-prefill sequence support
- Optimized memory usage
- Efficient long-context chat handling
Recommended for:
Chat scenarios requiring longer conversational context with optimized KV-cache performance.
### 2. gemma3-1b-it-int4.task
- Model family: Gemma 3
- Size: 1B parameters
- Quantization: INT4
- Format: LiteRT task model (`.task`)
- Features:
- Low-latency inference
- Compact model size
- Stable real-time chat performance
Recommended for:
Low-resource or latency-sensitive chat applications.
## Framework Compatibility
- Compatible with LiteRT runtime
- Tested with the Echo application AI engine
- Designed for instruction-tuned (IT) chat behavior
- Not intended for direct PyTorch or TensorFlow usage without conversion
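For illustration, here is a minimal sketch of loading the `.task` variant on Android via Google's MediaPipe LLM Inference API. This is an assumption, not the documented Echo integration: the helper class name, on-device model path, and token limit below are all hypothetical, and the snippet requires the `com.google.mediapipe:tasks-genai` Android dependency plus an Android `Context`.

```java
import android.content.Context;
import com.google.mediapipe.tasks.genai.llminference.LlmInference;
import com.google.mediapipe.tasks.genai.llminference.LlmInference.LlmInferenceOptions;

// Hypothetical helper class; sketch only, not the Echo engine's actual code.
public class EchoLlm {
    static String generate(Context context, String prompt) {
        // Assumes the .task file was copied to the device beforehand
        // (the path below is a placeholder, not part of this repository).
        LlmInferenceOptions options = LlmInferenceOptions.builder()
                .setModelPath("/data/local/tmp/llm/gemma3-1b-it-int4.task")
                .setMaxTokens(512)
                .build();
        LlmInference llm = LlmInference.createFromOptions(context, options);
        return llm.generateResponse(prompt);
    }
}
```

Because the `.litertlm` variant also targets the LiteRT runtime, the same loading pattern should apply to it, subject to the engine version supporting that format.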
## Repository Structure
```
models-for-echo-application/
├── Gemma3-1B-IT_multi-prefill-seq_q4_ekv4096.litertlm
├── gemma3-1b-it-int4.task
└── README.md
```
## License
Licensed under the Apache License 2.0.
You may use, modify, and distribute these models in compliance with the license.